A discriminant function approach to ecological site classification in northern New England
James M. Fincher; Marie-Louise Smith
1994-01-01
Describes one approach to ecologically based classification of upland forest community types of the White and Green Mountain physiographic regions. The classification approach is based on an intensive statistical analysis of the relationship between the communities and soil-site factors. Discriminant functions useful in distinguishing between types based on soil-site...
MODEL-BASED CLUSTERING FOR CLASSIFICATION OF AQUATIC SYSTEMS AND DIAGNOSIS OF ECOLOGICAL STRESS
Clustering approaches were developed using the classification likelihood, the mixture likelihood, and a randomization approach with a model index. Using a clustering approach based on the mixture and classification likelihoods, we have developed an algorithm that...
NASA Astrophysics Data System (ADS)
Chen, Y.; Luo, M.; Xu, L.; Zhou, X.; Ren, J.; Zhou, J.
2018-04-01
The RF method based on grid-search parameter optimization achieved a classification accuracy of 88.16% in the classification of images with multiple feature variables. This accuracy was higher than that of SVM and ANN under the same feature variables. In terms of efficiency, the RF classification method also performs better than SVM and ANN, and it is more capable of handling multidimensional feature variables. Combining the RF method with an object-based analysis approach improved the classification accuracy further. The multiresolution segmentation approach, based on ESP scale parameter optimization, was used to obtain six scales for image segmentation; when the segmentation scale was 49, the classification accuracy reached its highest value of 89.58%, which is 1.42% higher than that of pixel-based classification (88.16%). Therefore, the RF classification method combined with an object-based analysis approach can achieve relatively high accuracy in the classification and extraction of land use information for industrial and mining reclamation areas. Moreover, interpretation of remotely sensed imagery using the proposed method could provide technical support and a theoretical reference for remotely sensed monitoring of land reclamation.
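The grid-search parameter optimization this abstract describes can be sketched generically. Here `evaluate`, the parameter names (`n_trees`, `max_depth`) and the toy accuracy surface are illustrative assumptions, not the authors' actual RF setup:

```python
from itertools import product

def grid_search(evaluate, grid):
    """Exhaustively try every parameter combination and keep the best.

    evaluate: callable mapping a parameter dict to a validation accuracy.
    grid: dict of parameter name -> list of candidate values.
    """
    best_params, best_score = None, float("-inf")
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy stand-in for cross-validated RF accuracy (hypothetical response surface).
def toy_accuracy(params):
    return (0.88
            - 0.01 * abs(params["n_trees"] - 500) / 100
            - 0.01 * abs(params["max_depth"] - 10))

grid = {"n_trees": [100, 300, 500], "max_depth": [5, 10, 20]}
best, score = grid_search(toy_accuracy, grid)
```

In a real application `toy_accuracy` would be replaced by cross-validated Random Forest accuracy on the training samples.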
The development of a classification schema for arts-based approaches to knowledge translation.
Archibald, Mandy M; Caine, Vera; Scott, Shannon D
2014-10-01
Arts-based approaches to knowledge translation are emerging as powerful interprofessional strategies with potential to facilitate evidence uptake, communication, knowledge, attitude, and behavior change across healthcare provider and consumer groups. These strategies are in the early stages of development. To date, no classification system for arts-based knowledge translation exists, which limits development and understandings of effectiveness in evidence syntheses. We developed a classification schema of arts-based knowledge translation strategies based on two mechanisms by which these approaches function: (a) the degree of precision in key message delivery, and (b) the degree of end-user participation. We demonstrate how this classification is necessary to explore how context, time, and location shape arts-based knowledge translation strategies. Classifying arts-based knowledge translation strategies according to their core attributes extends understandings of the appropriateness of these approaches for various healthcare settings and provider groups. The classification schema developed may enhance understanding of how, where, and for whom arts-based knowledge translation approaches are effective, and enable theorizing of essential knowledge translation constructs, such as the influence of context, time, and location on utilization strategies. The classification schema developed may encourage systematic inquiry into the effectiveness of these approaches in diverse interprofessional contexts. © 2014 Sigma Theta Tau International.
Anavi, Yaron; Kogan, Ilya; Gelbart, Elad; Geva, Ofer; Greenspan, Hayit
2015-08-01
In this work various approaches are investigated for X-ray image retrieval and specifically chest pathology retrieval. Given a query image taken from a data set of 443 images, the objective is to rank images according to similarity. Different features, including binary features, texture features, and deep learning (CNN) features are examined. In addition, two approaches are investigated for the retrieval task. One approach is based on the distance of image descriptors using the above features (hereon termed the "descriptor"-based approach); the second approach ("classification"-based approach) is based on a probability descriptor, generated by a pair-wise classification of each two classes (pathologies) and their decision values using an SVM classifier. Best results are achieved using deep learning features in a classification scheme.
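The "descriptor"-based retrieval approach, ranking database images by the distance between their feature descriptors and the query's, can be sketched as follows; the image names and 3-D descriptors are hypothetical:

```python
import math

def rank_by_descriptor(query, database):
    """Rank database images by the Euclidean distance of their feature
    descriptors to the query descriptor (closest first)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return sorted(database, key=lambda item: dist(item[1], query))

# Hypothetical 3-D descriptors for four chest X-ray images.
db = [("img_a", (0.9, 0.1, 0.0)),
      ("img_b", (0.1, 0.9, 0.2)),
      ("img_c", (0.8, 0.2, 0.1)),
      ("img_d", (0.0, 0.0, 1.0))]
query = (1.0, 0.0, 0.0)
ranking = [name for name, _ in rank_by_descriptor(query, db)]
```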
A Systematic Approach to Subgroup Classification in Intellectual Disability
ERIC Educational Resources Information Center
Schalock, Robert L.; Luckasson, Ruth
2015-01-01
This article describes a systematic approach to subgroup classification based on a classification framework and sequential steps involved in the subgrouping process. The sequential steps are stating the purpose of the classification, identifying the classification elements, using relevant information, and using clearly stated and purposeful…
EXTENDING AQUATIC CLASSIFICATION TO THE LANDSCAPE SCALE HYDROLOGY-BASED STRATEGIES
Aquatic classification of single water bodies (lakes, wetlands, estuaries) is often based on geologic origin, while stream classification has relied on multiple factors related to landform, geomorphology, and soils. We have developed an approach to aquatic classification based o...
Malay sentiment analysis based on combined classification approaches and Senti-lexicon algorithm.
Al-Saffar, Ahmed; Awang, Suryanti; Tao, Hai; Omar, Nazlia; Al-Saiagh, Wafaa; Al-Bared, Mohammed
2018-01-01
Sentiment analysis techniques are increasingly exploited to categorize opinion text into one or more predefined sentiment classes for the creation and automated maintenance of review-aggregation websites. In this paper, a Malay sentiment analysis classification model is proposed to improve classification performance based on semantic orientation and machine learning approaches. First, a total of 2,478 Malay sentiment-lexicon phrases and words are assigned synonyms and stored with the help of more than one native Malay speaker, and the polarity of each is manually allotted a score. In addition, the supervised machine learning approaches and the lexicon knowledge method are combined for Malay sentiment classification, with thirteen features evaluated. Finally, three individual classifiers and a combined classifier are used to evaluate the classification accuracy. In the experiments, a wide range of comparative tests is conducted on a Malay Reviews Corpus (MRC), demonstrating that feature extraction improves the performance of Malay sentiment analysis based on the combined classification. The results depend on three factors: the features, the number of features, and the classification approach.
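A minimal sketch of the lexicon side of such a combined approach, with a few hypothetical lexicon entries and hand-allotted polarity scores (not taken from the paper's 2,478-entry lexicon):

```python
def lexicon_polarity(tokens, lexicon):
    """Sum the manually allotted polarity scores of every token found in
    the sentiment lexicon; the sign of the total gives the predicted class."""
    score = sum(lexicon.get(t, 0.0) for t in tokens)
    if score > 0:
        return "positive", score
    if score < 0:
        return "negative", score
    return "neutral", score

# Hypothetical Malay lexicon entries with hand-assigned polarity scores.
lexicon = {"bagus": 2.0, "cantik": 1.5, "buruk": -2.0, "lambat": -1.0}
label, score = lexicon_polarity("filem ini bagus dan cantik".split(), lexicon)
```

In the combined model, such a lexicon score would be one feature among the thirteen fed to the machine learning classifiers.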
NASA Astrophysics Data System (ADS)
Bialas, James; Oommen, Thomas; Rebbapragada, Umaa; Levin, Eugene
2016-07-01
Object-based approaches to the segmentation and classification of remotely sensed images yield more promising results than pixel-based approaches. However, the development of an object-based approach presents challenges in terms of algorithm selection and parameter tuning. Subjective methods are often used, but yield less than optimal results. Objective methods are warranted, especially for rapid deployment in time-sensitive applications such as earthquake damage assessment. Herein, we used a systematic approach to evaluate object-based image segmentation and machine learning algorithms for the classification of earthquake damage in remotely sensed imagery. We tested a variety of algorithms and parameters on post-event aerial imagery for the 2011 earthquake in Christchurch, New Zealand. Results were compared against manually selected test cases representing different classes, allowing us to evaluate the effectiveness of the segmentation and classification of different classes and to compare different levels of multistep image segmentation. Our classifier is compared against recent pixel-based and object-based classification studies of post-event earthquake damage imagery. Our results show an improvement over both pixel-based and object-based methods for classifying earthquake damage in high-resolution, post-event imagery.
Yang, Wen; Zhu, Jin-Yong; Lu, Kai-Hong; Wan, Li; Mao, Xiao-Hua
2014-06-01
Appropriate schemes for the classification of freshwater phytoplankton are prerequisites and important tools for revealing phytoplankton succession and studying freshwater ecosystems. An alternative approach, the functional group of freshwater phytoplankton, has been proposed and developed owing to the deficiencies of Linnaean and molecular identification in ecological applications. The functional group of phytoplankton is a classification scheme based on autecology. In this study, the theoretical basis and classification criteria of the functional group (FG), morpho-functional group (MFG) and morphology-based functional group (MBFG) schemes are summarized, along with their merits and demerits. FG is considered the optimal classification approach for aquatic ecology research and aquatic environment evaluation. The application status of FG is introduced, and the evaluation standards and problems of two FG-based water quality assessment approaches, the Q and QR index methods, are briefly discussed.
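The Q index mentioned above weights the factor F assigned to each functional group by that group's share of total biomass. A minimal sketch, with hypothetical biomass values and F factors:

```python
def q_index(biomass_by_group, f_factor):
    """Padisak-style assemblage index Q: the biomass-weighted mean of the
    factor F (0-5) assigned to each functional group present.

    Q = sum_i (n_i / N) * F_i, where n_i is the biomass of group i and
    N the total biomass; Q near 5 indicates excellent water quality."""
    total = sum(biomass_by_group.values())
    return sum((b / total) * f_factor[g] for g, b in biomass_by_group.items())

# Hypothetical sample: biomass (mg/L) per FG codon and assigned F factors.
biomass = {"X1": 2.0, "J": 1.0, "Lo": 1.0}
factors = {"X1": 5, "J": 2, "Lo": 4}
q = q_index(biomass, factors)
```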
Classification-Based Spatial Error Concealment for Visual Communications
NASA Astrophysics Data System (ADS)
Chen, Meng; Zheng, Yefeng; Wu, Min
2006-12-01
In an error-prone transmission environment, error concealment is an effective technique for reconstructing damaged visual content. Due to large variations in image characteristics, different concealment approaches are necessary to accommodate the differing nature of the lost image content. In this paper, we address this issue and propose using classification to integrate state-of-the-art error concealment techniques. The proposed approach takes advantage of multiple concealment algorithms and adaptively selects a suitable algorithm for each damaged image area. With growing awareness that sender and receiver systems should be jointly designed for efficient and reliable multimedia communications, we propose a set of classification-based block concealment schemes, including receiver-side classification, sender-side attachment, and sender-side embedding. Our experimental results provide extensive performance comparisons and demonstrate that the proposed classification-based error concealment approaches outperform conventional approaches.
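The adaptive selection the paper describes can be sketched as a classifier that routes each damaged block to a concealment algorithm. The variance threshold and the two strategies below are illustrative assumptions, not the paper's actual scheme:

```python
def conceal_block(neighbors):
    """Pick a concealment strategy for a lost block from the statistics of
    its neighboring pixels: smooth regions get mean-value filling, while
    high-activity regions get a (stub) edge-directed interpolation."""
    mean = sum(neighbors) / len(neighbors)
    var = sum((p - mean) ** 2 for p in neighbors) / len(neighbors)
    if var < 100.0:               # smooth area: simple interpolation suffices
        return "mean_fill", mean
    return "edge_directed", mean  # textured area: structure-aware method

# Neighboring pixel values of a lost block in a smooth image region.
method, value = conceal_block([120, 122, 119, 121])
```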
Knowledge-based approach to video content classification
NASA Astrophysics Data System (ADS)
Chen, Yu; Wong, Edward K.
2001-01-01
A framework for video content classification using a knowledge-based approach is proposed. This approach is motivated by the fact that videos are rich in semantic content, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand sides of rules contain high-level and low-level features, while the right-hand sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball, and football. We use MYCIN's inexact reasoning method for combining evidence and for handling the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, demonstrating the validity of the proposed approach.
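MYCIN's combination rule for certainty factors, which the system uses to merge evidence from multiple fired rules, is well defined and can be stated compactly:

```python
def combine_cf(a, b):
    """MYCIN's rule for combining two certainty factors in [-1, 1]:
    reinforcing positive evidence, reinforcing negative evidence,
    or conflicting evidence of mixed sign."""
    if a >= 0 and b >= 0:
        return a + b * (1 - a)
    if a < 0 and b < 0:
        return a + b * (1 + a)
    return (a + b) / (1 - min(abs(a), abs(b)))

# Two rules both supporting the class "news" with CF 0.6 and 0.4.
cf = combine_cf(0.6, 0.4)
```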
Slabbinck, Bram; Waegeman, Willem; Dawyndt, Peter; De Vos, Paul; De Baets, Bernard
2010-01-30
Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms to FAME data or to 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data: 16S rRNA gene sequence data is used for phylogenetic tree inference, and the corresponding binary tree splits are learned from FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the resolution of FAME data for the discrimination of bacterial species. In summary, phylogenetic learning allows us to situate and evaluate FAME-based bacterial species classification in a more informative context.
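The 'phylogenetic learning' idea, a classifier at each internal tree node choosing a branch until a species leaf is reached, can be sketched as follows. The tree, the fatty-acid feature names and the thresholds are toy stand-ins for the trained Random Forest node models:

```python
# Each internal node holds a decision function (a stand-in for a trained
# per-node Random Forest on FAME profiles) and two subtrees; leaves are
# species names.
def classify_hierarchical(sample, tree):
    """Descend the phylogenetic tree, letting each node's classifier pick
    the branch, until a species leaf is reached."""
    while isinstance(tree, dict):
        branch = "left" if tree["split"](sample) else "right"
        tree = tree[branch]
    return tree

# Toy tree splitting on two hypothetical fatty-acid abundances.
tree = {"split": lambda s: s["16:0"] > 0.3,
        "left": {"split": lambda s: s["18:1"] > 0.2,
                 "left": "Bacillus subtilis",
                 "right": "Bacillus cereus"},
        "right": "Pseudomonas putida"}
species = classify_hierarchical({"16:0": 0.4, "18:1": 0.1}, tree)
```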
Na, X D; Zang, S Y; Wu, C S; Li, W L
2015-11-01
Knowledge of the spatial extent of forested wetlands is essential to many studies, including wetland functioning assessment, greenhouse gas flux estimation, and identification of suitable wildlife habitat. To discriminate forested wetlands from adjacent land cover types, researchers have applied image analysis techniques to numerous remotely sensed data sets. While these have had some success, there is still no consensus on the optimal approach for mapping forested wetlands. To address this problem, we examined two machine learning approaches, the random forest (RF) and K-nearest neighbor (KNN) algorithms, and applied them in both pixel-based and object-based classification frameworks. The RF and KNN models were constructed using predictors derived from Landsat 8 imagery, Radarsat-2 advanced synthetic aperture radar (SAR), and topographical indices. The results show that the object-based classifications performed better than per-pixel classifications using the same algorithm (RF) in terms of overall accuracy, and the difference in their kappa coefficients is statistically significant (p<0.01). There were noticeable omission errors for forested and herbaceous wetlands in the per-pixel classifications using the RF algorithm. For the object-based image analysis, there were also statistically significant differences (p<0.01) in kappa coefficient between the results of the RF and KNN algorithms. The object-based classification using RF provided a more visually adequate distribution of the land cover types of interest, while the object-based classification using KNN showed noticeable commission errors for forested wetlands and omission errors for agricultural land. This research demonstrates that object-based classification with RF using optical, radar, and topographical data improves land cover mapping accuracy and provides a feasible approach to discriminating forested wetlands from other land cover types in forested areas.
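The kappa coefficient used above to compare classifications is Cohen's chance-corrected agreement, computed from a confusion matrix:

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: reference,
    columns: prediction): observed agreement corrected for the agreement
    expected by chance from the row and column marginals."""
    n = sum(sum(row) for row in confusion)
    po = sum(confusion[i][i] for i in range(len(confusion))) / n
    pe = sum((sum(confusion[i]) / n) * (sum(r[i] for r in confusion) / n)
             for i in range(len(confusion)))
    return (po - pe) / (1 - pe)

# Toy 2-class accuracy assessment (e.g. wetland vs non-wetland pixels).
kappa = cohens_kappa([[45, 5],
                      [10, 40]])
```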
Younghak Shin; Balasingham, Ilangko
2017-07-01
Colonoscopy is the standard method for polyp screening by highly trained physicians. Polyps missed during colonoscopy are a potential risk factor for colorectal cancer. In this study, we investigate an automatic polyp classification framework. We aim to compare two different approaches: a hand-crafted feature method and a convolutional neural network (CNN) based deep learning method. Combined shape and color features are used for hand-crafted feature extraction, and a support vector machine (SVM) is adopted for classification. For the CNN approach, a deep learning framework with three convolution and pooling layers is used for classification. The proposed framework is evaluated using three public polyp databases. The experimental results show that the CNN based deep learning framework achieves better classification performance than the hand-crafted feature based methods, reaching over 90% classification accuracy, sensitivity, specificity and precision.
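The four reported metrics (accuracy, sensitivity, specificity, precision) all derive from the binary confusion counts; the counts below are illustrative, not the paper's results:

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and precision from the four
    outcome counts of a binary polyp/non-polyp classifier."""
    return {"accuracy": (tp + tn) / (tp + fp + tn + fn),
            "sensitivity": tp / (tp + fn),   # recall on true polyps
            "specificity": tn / (tn + fp),   # recall on non-polyps
            "precision": tp / (tp + fp)}

# Hypothetical test-set counts for a polyp classifier.
m = binary_metrics(tp=90, fp=8, tn=92, fn=10)
```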
ERIC Educational Resources Information Center
Hsu, Ting-Chia
2016-01-01
In this study, a peer assessment system using the grid-based knowledge classification approach was developed to improve students' performance during computer skills training. To evaluate the effectiveness of the proposed approach, an experiment was conducted in a computer skills certification course. The participants were divided into three…
Gold-standard for computer-assisted morphological sperm analysis.
Chang, Violeta; Garcia, Alejandra; Hitschfeld, Nancy; Härtel, Steffen
2017-04-01
Published algorithms for classification of human sperm heads are based on relatively small image databases that are not open to the public, and thus no direct comparison is available for competing methods. We describe a gold-standard for morphological sperm analysis (SCIAN-MorphoSpermGS), a dataset of sperm head images with expert-classification labels in one of the following classes: normal, tapered, pyriform, small or amorphous. This gold-standard is for evaluating and comparing known techniques and future improvements to present approaches for classification of human sperm heads for semen analysis. Although this paper does not provide a computational tool for morphological sperm analysis, we present a set of experiments comparing common sperm head description and classification techniques. This classification baseline is intended as a reference for future improvements to present approaches for human sperm head classification. The gold-standard provides a label for each sperm head, obtained by majority voting among experts. The classification baseline compares four supervised learning methods (1-Nearest Neighbor, naive Bayes, decision trees and Support Vector Machine (SVM)) and three shape-based descriptors (Hu moments, Zernike moments and Fourier descriptors), reporting the accuracy and the true positive rate for each experiment. We used Fleiss' kappa coefficient to evaluate the inter-expert agreement, and Fisher's exact test for inter-expert variability and for statistically significant differences between descriptors and learning techniques. Our results confirm the high degree of inter-expert variability in morphological sperm analysis. Regarding the classification baseline, we show that none of the standard descriptors or classification approaches is well suited to the problem of sperm head classification. We discovered that the correct classification rate was highly variable when trying to discriminate among non-normal sperm heads. Using the Fourier descriptor and SVM, we achieved the best mean correct classification rate: only 49%. We conclude that the SCIAN-MorphoSpermGS will provide a standard tool for evaluating characterization and classification approaches for human sperm heads. Indeed, there is a clear need for a specific shape-based descriptor for human sperm heads and a specific classification approach to tackle the problem of high variability within subcategories of abnormal sperm cells. Copyright © 2017 Elsevier Ltd. All rights reserved.
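The majority-voting step that produces the gold-standard labels can be sketched directly; the tie handling shown (returning no label) is an assumption, since the abstract does not specify it:

```python
from collections import Counter

def majority_label(expert_labels):
    """Gold-standard label for one sperm head: the class chosen by the
    majority of experts; ties are reported as unresolved (None)."""
    counts = Counter(expert_labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None                      # no majority among the experts
    return counts[0][0]

# Three hypothetical expert votes for one sperm head image.
label = majority_label(["normal", "tapered", "normal"])
```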
NASA Astrophysics Data System (ADS)
Liu, Haijian; Wu, Changshan
2018-06-01
Crown-level tree species classification is a challenging task due to the spectral similarity among different tree species. Shadow, underlying objects, and other materials within a crown may decrease the purity of extracted crown spectra and further reduce classification accuracy. To address this problem, an innovative pixel-weighting approach was developed for tree species classification at the crown level. The method utilized high-density discrete LiDAR data for individual tree delineation and Airborne Imaging Spectrometer for Applications (AISA) hyperspectral imagery for pure crown-scale spectra extraction. Specifically, three steps were included: 1) individual tree identification using LiDAR data; 2) pixel-weighted representative crown spectra calculation using hyperspectral imagery, with pixel-based illuminated-leaf fractions, estimated using a linear spectral mixture analysis (LSMA), employed as weighting factors; and 3) representative-spectra-based tree species classification using a support vector machine (SVM) approach. Analysis of the results suggests that the developed pixel-weighting approach (OA = 82.12%, Kc = 0.74) performed better than treetop-based (OA = 70.86%, Kc = 0.58) and pixel-majority methods (OA = 72.26%, Kc = 0.62) in terms of classification accuracy. McNemar tests indicated that the differences in accuracy between the pixel-weighting and treetop-based approaches, and between the pixel-weighting and pixel-majority approaches, were statistically significant.
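The pixel-weighted representative spectrum of step 2 is a weighted mean of a crown's pixel spectra, with each pixel's LSMA-estimated illuminated-leaf fraction as its weight. A two-band toy sketch:

```python
def weighted_crown_spectrum(pixel_spectra, leaf_fractions):
    """Representative crown spectrum as the weighted mean of the crown's
    pixel spectra, using each pixel's illuminated-leaf fraction as its
    weight so that shaded/background pixels contribute less."""
    total = sum(leaf_fractions)
    n_bands = len(pixel_spectra[0])
    return [sum(w * s[b] for w, s in zip(leaf_fractions, pixel_spectra)) / total
            for b in range(n_bands)]

# Two-band toy crown: a sunlit-leaf pixel and a mostly shadowed pixel.
spectrum = weighted_crown_spectrum([[0.5, 0.8], [0.1, 0.2]], [0.9, 0.1])
```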
Cost-effectiveness of a classification-based system for sub-acute and chronic low back pain.
Apeldoorn, Adri T; Bosmans, Judith E; Ostelo, Raymond W; de Vet, Henrica C W; van Tulder, Maurits W
2012-07-01
Identifying relevant subgroups of patients with low back pain (LBP) is considered important to guide physical therapy practice and to improve outcomes. The aim of the present study was to assess the cost-effectiveness of a modified version of Delitto's classification-based treatment approach compared with usual physical therapy care in patients with sub-acute and chronic LBP, with 1 year of follow-up. All patients were classified using the modified version of Delitto's classification-based system and then randomly assigned to receive either classification-based treatment or usual physical therapy care. The main clinical outcomes measured were global perceived effect, intensity of pain, functional disability and quality of life. Costs were measured from a societal perspective. Multiple imputation was used for missing data. Uncertainty surrounding cost differences and incremental cost-effectiveness ratios was estimated using bootstrapping. Cost-effectiveness planes and cost-effectiveness acceptability curves were estimated. In total, 156 patients were included. The outcome analyses showed a significantly better result for global perceived effect, favoring the classification-based approach, and no differences between the groups on pain, disability or quality-adjusted life-years. Mean total societal costs for the classification-based group were
Choice-Based Conjoint Analysis: Classification vs. Discrete Choice Models
NASA Astrophysics Data System (ADS)
Giesen, Joachim; Mueller, Klaus; Taneva, Bilyana; Zolliker, Peter
Conjoint analysis is a family of techniques that originated in psychology and later became popular in market research. The main objective of conjoint analysis is to measure an individual's or a population's preferences on a class of options that can be described by parameters and their levels. We consider preference data obtained in choice-based conjoint analysis studies, where one observes test persons' choices on small subsets of the options. There are many ways to analyze choice-based conjoint analysis data. Here we discuss the intuition behind a classification-based approach, and compare it to one based on statistical assumptions (discrete choice models) and to a regression approach. Our comparison on real and synthetic data indicates that the classification approach outperforms the discrete choice models.
A Two-Step Classification Approach to Distinguishing Similar Objects in Mobile LIDAR Point Clouds
NASA Astrophysics Data System (ADS)
He, H.; Khoshelham, K.; Fraser, C.
2017-09-01
Nowadays, lidar is widely used in cultural heritage documentation, urban modeling, and driverless car technology for its fast and accurate 3D scanning ability. However, full exploitation of the potential of point cloud data for efficient and automatic object recognition remains elusive. Recently, feature-based methods have become very popular in object recognition on account of their good performance in capturing object details. Compared with global features describing the whole shape of an object, local features recording fractional details are more discriminative and are applicable to object classes with considerable similarity. In this paper, we propose a two-step classification approach based on point feature histograms and the bag-of-features method for automatic recognition of similar objects in mobile lidar point clouds. Lamp posts, street lights and traffic signs are grouped into one category in the first classification step because of their similarity to one another compared with trees and vehicles. A finer classification of the lamp posts, street lights and traffic signs, based on the result of the first step, is implemented in the second step. The proposed two-step classification approach is shown to yield a considerable improvement over the conventional one-step classification approach.
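The two-step scheme, coarse grouping followed by a finer classifier for the pole-like group, can be sketched with stub decision functions standing in for the trained bag-of-features models (the feature names and thresholds are illustrative):

```python
def two_step_classify(features, coarse, fine):
    """First assign a coarse class; objects in the 'pole-like' group
    (lamp post / street light / traffic sign) then receive a finer label
    from a second, specialised classifier."""
    group = coarse(features)
    if group == "pole-like":
        return fine(features)
    return group

# Stub classifiers standing in for the trained first- and second-step models.
coarse = lambda f: "pole-like" if f["height"] > 2 and f["width"] < 1 else "tree"
fine = lambda f: "traffic sign" if f["planarity"] > 0.8 else "street light"
label = two_step_classify({"height": 4.0, "width": 0.4, "planarity": 0.9},
                          coarse, fine)
```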
Rogiers, Bart; Mallants, Dirk; Batelaan, Okke; Gedeon, Matej; Huysmans, Marijke; Dassargues, Alain
2017-01-01
Cone penetration testing (CPT) is one of the most efficient and versatile methods currently available for geotechnical, lithostratigraphic and hydrogeological site characterization. Currently available methods for soil behaviour type classification (SBT) of CPT data however have severe limitations, often restricting their application to a local scale. For parameterization of regional groundwater flow or geotechnical models, and delineation of regional hydro- or lithostratigraphy, regional SBT classification would be very useful. This paper investigates the use of model-based clustering for SBT classification, and the influence of different clustering approaches on the properties and spatial distribution of the obtained soil classes. We additionally propose a methodology for automated lithostratigraphic mapping of regionally occurring sedimentary units using SBT classification. The methodology is applied to a large CPT dataset, covering a groundwater basin of ~60 km2 with predominantly unconsolidated sandy sediments in northern Belgium. Results show that the model-based approach is superior in detecting the true lithological classes when compared to more frequently applied unsupervised classification approaches or literature classification diagrams. We demonstrate that automated mapping of lithostratigraphic units using advanced SBT classification techniques can provide a large gain in efficiency, compared to more time-consuming manual approaches and yields at least equally accurate results. PMID:28467468
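As a loose illustration of the model-based clustering idea in this record, the sketch below fits a two-component one-dimensional Gaussian mixture by EM to an invented soil parameter and labels each observation by its most probable component. It is a minimal stand-in, not the classification/mixture-likelihood machinery or the automated mapping workflow of the paper:

```python
import math

def em_gmm_1d(xs, n_iter=50):
    """Two-component 1-D Gaussian mixture fitted by EM; returns component
    means and the most-probable-component label for each (sorted) point."""
    xs = sorted(xs)
    half = len(xs) // 2
    # crude initialisation: split at the median
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        resp = []
        for x in xs:
            like = [pi[k] / math.sqrt(2 * math.pi * var[k])
                    * math.exp(-((x - mu[k]) ** 2) / (2 * var[k])) for k in (0, 1)]
            s = sum(like)
            resp.append([l / s for l in like])
        # M-step: re-estimate weights, means and variances
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2
                             for r, x in zip(resp, xs)) / nk, 1e-6)
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return mu, labels

# invented CPT-derived values for two well-separated soil behaviour types
soundings = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
means, labels = em_gmm_1d(soundings)
print([round(m, 2) for m in means], labels)
```

Each class here is a fitted probability model rather than a distance-based partition, which is the essential difference from the unsupervised approaches the paper compares against.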
Quantum Ensemble Classification: A Sampling-Based Learning Control Approach.
Chen, Chunlin; Dong, Daoyi; Qi, Bo; Petersen, Ian R; Rabitz, Herschel
2017-06-01
Quantum ensemble classification (QEC) has significant applications in discrimination of atoms (or molecules), separation of isotopes, and quantum information extraction. However, quantum mechanics forbids deterministic discrimination among nonorthogonal states. The classification of inhomogeneous quantum ensembles is very challenging, since there exist variations in the parameters characterizing the members within different classes. In this paper, we recast QEC as a supervised quantum learning problem. A systematic classification methodology is presented by using a sampling-based learning control (SLC) approach for quantum discrimination. The classification task is accomplished via simultaneously steering members belonging to different classes to their corresponding target states (e.g., mutually orthogonal states). First, a new discrimination method is proposed for two similar quantum systems. Then, an SLC method is presented for QEC. Numerical results demonstrate the effectiveness of the proposed approach for the binary classification of two-level quantum ensembles and the multiclass classification of multilevel quantum ensembles.
An objective and parsimonious approach for classifying natural flow regimes at a continental scale
NASA Astrophysics Data System (ADS)
Archfield, S. A.; Kennen, J.; Carlisle, D.; Wolock, D.
2013-12-01
Hydroecological stream classification--the process of grouping streams by similar hydrologic responses and, thereby, similar aquatic habitat--has been widely accepted and is often one of the first steps towards developing ecological flow targets. Despite its importance, the last national classification of streamgauges was completed about 20 years ago. A new classification of 1,534 streamgauges in the contiguous United States is presented using a novel and parsimonious approach to understand similarity in ecological streamflow response. This new classification approach uses seven fundamental daily streamflow statistics (FDSS) rather than winnowing down an uncorrelated subset from 200 or more ecologically relevant streamflow statistics (ERSS) commonly used in hydroecological classification studies. The results of this investigation demonstrate that the distributions of 33 tested ERSS are consistently different among the classes derived from the seven FDSS. It is further shown that classification based solely on the 33 ERSS generally does a poorer job in grouping similar streamgauges than the classification based on the seven FDSS. This new classification approach has the additional advantages of overcoming some of the subjectivity associated with the selection of the classification variables and provides a set of robust continental-scale classes of US streamgauges.
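The statistic-then-group workflow can be miniaturized as follows. The three statistics and the class centroids below are invented stand-ins (the paper uses seven specific FDSS and derives its classes from the data rather than from fixed centroids):

```python
import math

def flow_stats(daily_flows):
    """A few illustrative daily streamflow statistics:
    mean, coefficient of variation, and lag-1 autocorrelation."""
    n = len(daily_flows)
    mean = sum(daily_flows) / n
    var = sum((q - mean) ** 2 for q in daily_flows) / n
    cv = math.sqrt(var) / mean
    num = sum((daily_flows[i] - mean) * (daily_flows[i + 1] - mean)
              for i in range(n - 1))
    r1 = num / (n * var) if var > 0 else 0.0
    return (mean, cv, r1)

def group_gauges(gauges, centroids):
    """Assign each gauge to the nearest class centroid in statistic space."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return {name: min(centroids, key=lambda c: dist(flow_stats(q), centroids[c]))
            for name, q in gauges.items()}

# toy records: a stable baseflow-dominated gauge and a flashy one
gauges = {"stable": [10, 11, 10, 9, 10, 11, 10, 9],
          "flashy": [1, 50, 2, 40, 1, 60, 2, 45]}
centroids = {"perennial": (10.0, 0.1, 0.0),
             "intermittent-flashy": (25.0, 1.0, -0.5)}
print(group_gauges(gauges, centroids))
```

The parsimony argument of the paper is that a handful of such fundamental statistics separates gauges as well as, or better than, winnowing hundreds of ecologically relevant ones.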
Two Approaches to Estimation of Classification Accuracy Rate under Item Response Theory
ERIC Educational Resources Information Center
Lathrop, Quinn N.; Cheng, Ying
2013-01-01
Within the framework of item response theory (IRT), there are two recent lines of work on the estimation of classification accuracy (CA) rate. One approach estimates CA when decisions are made based on total sum scores, the other based on latent trait estimates. The former is referred to as the Lee approach, and the latter, the Rudner approach,…
Classification of cancerous cells based on the one-class problem approach
NASA Astrophysics Data System (ADS)
Murshed, Nabeel A.; Bortolozzi, Flavio; Sabourin, Robert
1996-03-01
One of the most important factors in reducing the effect of cancerous diseases is early diagnosis, which requires a good and robust method. With the advancement of computer technologies and digital image processing, the development of a computer-based system has become feasible. In this paper, we introduce a new approach for the detection of cancerous cells. This approach is based on the one-class problem, through which the classification system need only be trained with patterns of cancerous cells. This reduces the burden of the training task by about 50%. Based on this approach, a computer-based classification system built on Fuzzy ARTMAP neural networks is developed. Experiments were performed using a set of 542 patterns taken from a sample of breast cancer. The results show 98% correct identification of cancerous cells and 95% correct identification of non-cancerous cells.
Griffiths, Jason I.; Fronhofer, Emanuel A.; Garnier, Aurélie; Seymour, Mathew; Altermatt, Florian; Petchey, Owen L.
2017-01-01
The development of video-based monitoring methods allows for rapid, dynamic and accurate monitoring of individuals or communities, compared to slower traditional methods, with far-reaching ecological and evolutionary applications. Large amounts of data are generated using video-based methods, which can be effectively processed using machine learning (ML) algorithms into meaningful ecological information. ML uses user-defined classes (e.g. species), derived from a subset (i.e. training data) of video-observed quantitative features (e.g. phenotypic variation), to infer classes in subsequent observations. However, phenotypic variation often changes due to environmental conditions, which may lead to poor classification if environmentally induced variation in phenotypes is not accounted for. Here we describe a framework for classifying species under changing environmental conditions based on random forest classification. A sliding window approach was developed that restricts the temporal and environmental conditions used for training to improve the classification. We tested our approach by applying the classification framework to experimental data. The experiment used a set of six ciliate species to monitor changes in community structure and behavior over hundreds of generations, in dozens of species combinations and across a temperature gradient. Differences in biotic and abiotic conditions caused simplistic classification approaches to be unsuccessful. In contrast, the sliding window approach allowed classification to be highly successful, as phenotypic differences driven by environmental change could be captured by the classifier. Importantly, classification using the random forest algorithm showed comparable success when validated against traditional, slower, manual identification. Our framework allows for reliable classification in dynamic environments, and may help to improve strategies for long-term monitoring of species in changing environments.
Our classification pipeline can be applied in fields assessing species community dynamics, such as eco-toxicology, ecology and evolutionary ecology. PMID:28472193
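The sliding-window restriction can be illustrated independently of the random forest itself. In this toy sketch a 1-nearest-neighbour rule stands in for the forest, and all data are invented; the point is only that training records are restricted to a window around the query's conditions, so a drifting phenotype is re-learned locally:

```python
def sliding_window_classify(history, query_time, query_feature, window=2):
    """history: list of (time, feature, species) records. Train only on
    records whose time lies within +/- window of the query time, then
    classify with 1-NN (a stand-in for the paper's random forest)."""
    local = [(f, s) for t, f, s in history if abs(t - query_time) <= window]
    return min(local, key=lambda fs: abs(fs[0] - query_feature))[1]

# Phenotype (e.g. cell size) of species A drifts upward over time,
# eventually overlapping where species B used to lie.
history = [(0, 1.0, "A"), (0, 3.0, "B"),
           (5, 3.0, "A"), (5, 5.0, "B")]

# A feature value of 3.0 means "B" early on but "A" later:
print(sliding_window_classify(history, query_time=0, query_feature=3.0))
print(sliding_window_classify(history, query_time=5, query_feature=3.0))
```

A classifier trained on all records at once would have to treat the feature value 3.0 as ambiguous; the window resolves it by context.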
NASA Technical Reports Server (NTRS)
Tarabalka, Y.; Tilton, J. C.; Benediktsson, J. A.; Chanussot, J.
2012-01-01
The Hierarchical SEGmentation (HSEG) algorithm, which combines region object finding with region object clustering, has given good performance for multi- and hyperspectral image analysis. This technique produces at its output a hierarchical set of image segmentations. The automated selection of a single segmentation level is often necessary. We propose and investigate the use of automatically selected markers for this purpose. In this paper, a novel Marker-based HSEG (M-HSEG) method for spectral-spatial classification of hyperspectral images is proposed. Two classification-based approaches for automatic marker selection are adapted and compared for this purpose. Then, a novel constrained marker-based HSEG algorithm is applied, resulting in a spectral-spatial classification map. Three different implementations of the M-HSEG method are proposed and their performances in terms of classification accuracies are compared. The experimental results, presented for three hyperspectral airborne images, demonstrate that the proposed approach yields accurate segmentation and classification maps, and thus is attractive for remote sensing image analysis.
Statistical inference for remote sensing-based estimates of net deforestation
Ronald E. McRoberts; Brian F. Walters
2012-01-01
Statistical inference requires expression of an estimate in probabilistic terms, usually in the form of a confidence interval. An approach to constructing confidence intervals for remote sensing-based estimates of net deforestation is illustrated. The approach is based on post-classification methods using two independent forest/non-forest classifications because...
NASA Astrophysics Data System (ADS)
Juniati, E.; Arrofiqoh, E. N.
2017-09-01
Land cover information can be extracted from remote sensing data by digital classification. In practice, many analysts prefer visual interpretation to retrieve land cover information; however, visual interpretation is strongly influenced by the subjectivity and knowledge of the interpreter, and it is time-consuming. Digital classification can be performed in several ways, depending on the mapping approach and the assumptions made about the data distribution. This study compared several classification methods applied to different data types for the same location. The data used were Landsat 8 satellite imagery, SPOT 6 imagery and orthophotos. In practice, these data are used to produce land cover maps at 1:50,000 scale for Landsat, 1:25,000 scale for SPOT and 1:5,000 scale for orthophotos, but with visual interpretation to retrieve the information. Maximum likelihood classification (MLC), a pixel-based parametric approach, and artificial neural network classifiers, a pixel-based non-parametric approach, were applied to these data. In addition, the study applied object-based classifiers to the same data. The classification scheme implemented is the land cover classification of the Indonesian topographic map. The classifiers were applied to each data source to recognize land cover patterns and to assess the consistency of the land cover maps produced from each dataset. Finally, the study analyses the benefits and limitations of the methods.
An Extended Spectral-Spatial Classification Approach for Hyperspectral Data
NASA Astrophysics Data System (ADS)
Akbari, D.
2017-11-01
In this paper, an extended classification approach for hyperspectral imagery based on both spectral and spatial information is proposed. The spatial information is obtained by an enhanced marker-based minimum spanning forest (MSF) algorithm. Three different methods of dimension reduction are first used to obtain the subspace of hyperspectral data: (1) unsupervised feature extraction methods, including principal component analysis (PCA), independent component analysis (ICA), and minimum noise fraction (MNF); (2) supervised feature extraction methods, including decision boundary feature extraction (DBFE), discriminant analysis feature extraction (DAFE), and nonparametric weighted feature extraction (NWFE); (3) a genetic algorithm (GA). The spectral features obtained are then fed into the enhanced marker-based MSF classification algorithm. In the enhanced MSF algorithm, the markers are extracted from the classification maps obtained by both the SVM and watershed segmentation algorithms. To evaluate the proposed approach, it is tested on the Pavia University hyperspectral data. Experimental results show that the proposed approach using the GA achieves an overall accuracy approximately 8% higher than that of the original MSF-based algorithm.
An objective and parsimonious approach for classifying natural flow regimes at a continental scale
Archfield, Stacey A.; Kennen, Jonathan G.; Carlisle, Daren M.; Wolock, David M.
2014-01-01
Hydro-ecological stream classification-the process of grouping streams by similar hydrologic responses and, by extension, similar aquatic habitat-has been widely accepted and is considered by some to be one of the first steps towards developing ecological flow targets. A new classification of 1543 streamgauges in the contiguous USA is presented by use of a novel and parsimonious approach to understand similarity in ecological streamflow response. This novel classification approach uses seven fundamental daily streamflow statistics (FDSS) rather than winnowing down an uncorrelated subset from 200 or more ecologically relevant streamflow statistics (ERSS) commonly used in hydro-ecological classification studies. The results of this investigation demonstrate that the distributions of 33 tested ERSS are consistently different among the classification groups derived from the seven FDSS. It is further shown that classification based solely on the 33 ERSS generally does a poorer job in grouping similar streamgauges than the classification based on the seven FDSS. This new classification approach has the additional advantages of overcoming some of the subjectivity associated with the selection of the classification variables and provides a set of robust continental-scale classes of US streamgauges. Published 2013. This article is a U.S. Government work and is in the public domain in the USA.
Selecting reusable components using algebraic specifications
NASA Technical Reports Server (NTRS)
Eichmann, David A.
1992-01-01
A significant hurdle confronts the software reuser attempting to select candidate components from a software repository - discriminating between those components without resorting to inspection of the implementation(s). We outline a mixed classification/axiomatic approach to this problem based upon our lattice-based faceted classification technique and Guttag and Horning's algebraic specification techniques. This approach selects candidates by natural language-derived classification, by their interfaces, using signatures, and by their behavior, using axioms. We briefly outline our problem domain and related work. Lattice-based faceted classifications are described; the reader is referred to surveys of the extensive literature for algebraic specification techniques. Behavioral support for reuse queries is presented, followed by the conclusions.
2015-01-01
Background TNM staging plays a critical role in the evaluation and management of a range of different types of cancers. The conventional combinatorial approach to the determination of an anatomic stage relies on the identification of distinct tumor (T), node (N), and metastasis (M) classifications to generate a TNM grouping. This process is inherently inefficient due to the need for scrupulous review of the criteria specified for each classification to ensure accurate assignment. An exclusionary approach to TNM staging based on sequential constraint of options may serve to minimize the number of classifications that need to be reviewed to accurately determine an anatomic stage. Objective Our aim was to evaluate the usability and utility of a Web-based app configured to demonstrate an exclusionary approach to TNM staging. Methods Internal medicine residents, surgery residents, and oncology fellows engaged in clinical training were asked to evaluate a Web-based app developed as an instructional aid incorporating (1) an exclusionary algorithm that polls tabulated classifications and sorts them into ranked order based on frequency counts, (2) reconfiguration of classification criteria to generate disambiguated yes/no questions that function as selection and exclusion prompts, and (3) a selectable grid of TNM groupings that provides dynamic graphic demonstration of the effects of sequentially selecting or excluding specific classifications. Subjects were asked to evaluate the performance of this app after completing exercises simulating the staging of different types of cancers encountered during training. Results Survey responses indicated high levels of agreement with statements supporting the usability and utility of this app. 
Subjects reported that its user interface provided a clear display with intuitive controls and that the exclusionary approach to TNM staging it demonstrated represented an efficient process of assignment that helped to clarify distinctions between tumor, node, and metastasis classifications. High overall usefulness ratings were bolstered by supplementary comments suggesting that this app might be readily adopted for use in clinical practice. Conclusions A Web-based app that utilizes an exclusionary algorithm to prompt the assignment of tumor, node, and metastasis classifications may serve as an effective instructional aid demonstrating an efficient and informative approach to TNM staging. PMID:28410163
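The exclusionary mechanic described here can be sketched with a toy stage table (the groupings below are illustrative, not a real AJCC table): polling counts how often each classification appears among the remaining groupings, and each "no" answer excludes every grouping containing that classification:

```python
from collections import Counter

# Invented TNM stage table for illustration only.
stage_table = {"I":   ("T1", "N0", "M0"),
               "II":  ("T2", "N0", "M0"),
               "III": ("T2", "N1", "M0"),
               "IV":  ("T2", "N1", "M1")}

def ranked_classifications(stages):
    """Poll the remaining groupings; most frequent classifications first,
    so the question likely to prune the most options is asked early."""
    counts = Counter(c for tnm in stages.values() for c in tnm)
    return [c for c, _ in counts.most_common()]

def exclude(stages, classification):
    """Remove all stage groupings containing the excluded classification."""
    return {s: tnm for s, tnm in stages.items() if classification not in tnm}

remaining = exclude(stage_table, "M1")   # "no" to distant metastasis
remaining = exclude(remaining, "N1")     # "no" to nodal involvement
print(sorted(remaining))
```

Two yes/no answers constrain the toy table from four groupings to two, which is the efficiency gain the app's sequential-constraint design aims at.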
NASA Astrophysics Data System (ADS)
Belciug, Smaranda; Serbanescu, Mircea-Sebastian
2015-09-01
Feature selection is considered a key factor in classification/decision problems. It is currently used in designing intelligent decision systems to choose the features that allow the best performance. This paper proposes a regression-based approach to select the most important predictors in order to significantly increase the classification performance. Application to breast cancer detection and recurrence using publicly available datasets proved the efficiency of this technique.
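One simple way to realize regression-based predictor selection, sketched with invented data: score each feature by the absolute univariate correlation with the class label and keep the top k. The abstract does not specify its exact regression procedure, so this scoring rule is an assumed stand-in:

```python
def corr(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def select_features(X, y, k):
    """X: list of samples (each a list of feature values). Keep the k
    features with the largest absolute correlation with the label y."""
    n_feat = len(X[0])
    scores = [abs(corr([row[j] for row in X], y)) for j in range(n_feat)]
    return sorted(range(n_feat), key=lambda j: -scores[j])[:k]

# feature 0 tracks the label, feature 1 is noise, feature 2 anti-tracks it
X = [[1.0, 0.3, 9.0], [2.0, 0.1, 8.0], [8.0, 0.2, 2.0], [9.0, 0.4, 1.0]]
y = [0, 0, 1, 1]
print(select_features(X, y, k=2))
```

The downstream classifier is then trained only on the retained predictor columns.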
Waring, R; Knight, R
2013-01-01
Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of theoretically differing classification systems have been proposed, based on either an aetiological (medical) approach, a descriptive-linguistic approach or a processing approach. This review describes the supporting evidence for, and provides a critical evaluation of, the current childhood SSD classification systems. Descriptions of the major specific approaches to classification are reviewed and research papers supporting the reliability and validity of the systems are evaluated. Three specific paediatric SSD classification systems are identified as potentially useful in classifying children with SSD into homogeneous subgroups: the aetiologic-based Speech Disorders Classification System, the descriptive-linguistic Differential Diagnosis system, and the processing-based Psycholinguistic Framework. The Differential Diagnosis system has a growing body of empirical support from clinical population studies, cross-language error pattern studies and treatment efficacy studies. The Speech Disorders Classification System is currently a research tool with eight proposed subgroups. The Psycholinguistic Framework is a potential bridge linking cause and surface-level speech errors. There is a need for a universally agreed-upon classification system that is useful to clinicians and researchers. The resulting classification system needs to be robust, reliable and valid. A universal classification system would allow for improved tailoring of treatments to subgroups of SSD, which may, in turn, lead to improved treatment efficacy. © 2012 Royal College of Speech and Language Therapists.
A hybrid sensing approach for pure and adulterated honey classification.
Subari, Norazian; Mohamad Saleh, Junita; Md Shakaff, Ali Yeon; Zakaria, Ammar
2012-10-17
This paper presents a comparison between data from single-modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure from adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single-modality data.
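The low-level fusion step described here amounts to normalizing each modality's feature block and concatenating the blocks before classification. The sketch below uses invented sensor values, and min-max scaling is an assumed choice; the normalization actually paired with the Stepwise LDA method may differ:

```python
def normalise(v):
    """Min-max scale one modality's feature block to [0, 1]."""
    lo, hi = min(v), max(v)
    return [(x - lo) / (hi - lo) for x in v]

def low_level_fusion(enose, ftir):
    """Normalise each sensor block separately, then concatenate,
    so neither modality dominates purely through its units."""
    return normalise(enose) + normalise(ftir)

enose = [2.0, 4.0, 3.0]              # invented e-nose sensor responses
ftir = [10.0, 50.0, 30.0, 20.0]      # invented FTIR band intensities
fused = low_level_fusion(enose, ftir)
print(fused)
```

The fused vector is what a classifier such as LDA would then be trained on; intermediate-level fusion would instead combine features extracted per modality.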
Salvador-Carulla, Luis; Bertelli, Marco; Martinez-Leal, Rafael
2018-03-01
To increase the expert knowledge-base on intellectual developmental disorders (IDDs) by investigating the typology trajectories of consensus formation in the classification systems up to the 11th edition of the International Classification of Diseases (ICD-11). This expert review combines an analysis of key recent literature with a revision of the consensus formation and contestation in the expert committees contributing to the classification systems since the 1950s. Historically, two main approaches have contributed to the development of this knowledge-base: a neurodevelopmental-clinical approach and a psychoeducational-social approach. These approaches show a complex interaction throughout the history of IDD and have had a diverse influence on its classification. Although in theory the Diagnostic and Statistical Manual (DSM)-5 and the ICD adhere to the neurodevelopmental-clinical model, the new definition in the ICD-11 follows a restrictive normality approach to intelligence quotient and to the measurement of adaptive behaviour. On the contrary, DSM-5 is closer to the recommendations made by the WHO 'Working Group on Mental Retardation' for ICD-11 for an integrative approach. A cyclical pattern of consensus formation has been identified in IDD. The revision of the three major classification systems in the last decade has increased the terminological and conceptual variability and the overall scientific contestation on IDD.
A fuzzy hill-climbing algorithm for the development of a compact associative classifier
NASA Astrophysics Data System (ADS)
Mitra, Soumyaroop; Lam, Sarah S.
2012-02-01
Classification, a data mining technique, has widespread applications including medical diagnosis, targeted marketing, and others. Knowledge discovery from databases in the form of association rules is one of the important data mining tasks. An integrated approach, classification based on association rules, has drawn the attention of the data mining community over the last decade. While attention has been mainly focused on increasing classifier accuracies, little effort has been devoted to building interpretable and less complex models. This paper discusses the development of a compact associative classification model using a hill-climbing approach and fuzzy sets. The proposed methodology builds the rule base by selecting rules that contribute towards increasing training accuracy, thus balancing classification accuracy against the number of classification association rules. The results indicated that the proposed associative classification model can achieve competitive accuracies on benchmark datasets with continuous attributes and lends better interpretability when compared with other rule-based systems.
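The hill-climbing construction of a compact rule base can be caricatured as a greedy loop: add the candidate association rule that most improves training accuracy, and stop when none helps. The rules, data and default-class fallback below are invented, and the fuzzy-set component is omitted:

```python
def accuracy(rules, default, data):
    """Training accuracy of an ordered rule list with a default class."""
    def predict(x):
        for cond, label in rules:      # first matching rule fires
            if cond <= x:              # antecedent itemset is a subset of x
                return label
        return default
    return sum(predict(x) == y for x, y in data) / len(data)

def hill_climb(candidates, default, data):
    """Greedily keep only rules that raise training accuracy."""
    chosen = []
    best = accuracy(chosen, default, data)
    improved = True
    while improved:
        improved = False
        for rule in candidates:
            if rule in chosen:
                continue
            acc = accuracy(chosen + [rule], default, data)
            if acc > best:             # keep the rule only if it helps
                best, chosen, improved = acc, chosen + [rule], True
    return chosen, best

# items are feature sets; rules are (antecedent-itemset, class) pairs
data = [({"a"}, 1), ({"a", "b"}, 1), ({"b"}, 0), ({"c"}, 0)]
candidates = [({"a"}, 1), ({"b"}, 0), ({"c"}, 1)]
rules, acc = hill_climb(candidates, default=0, data=data)
print(len(rules), acc)
```

Rules that add no accuracy never enter the model, which is how the rule base stays compact and interpretable.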
An automatic graph-based approach for artery/vein classification in retinal images.
Dashtbozorg, Behdad; Mendonça, Ana Maria; Campilho, Aurélio
2014-03-01
The classification of retinal vessels into artery/vein (A/V) is an important phase for automating the detection of vascular changes, and for the calculation of characteristic signs associated with several systemic diseases such as diabetes, hypertension, and other cardiovascular conditions. This paper presents an automatic approach for A/V classification based on the analysis of a graph extracted from the retinal vasculature. The proposed method classifies the entire vascular tree deciding on the type of each intersection point (graph nodes) and assigning one of two labels to each vessel segment (graph links). Final classification of a vessel segment as A/V is performed through the combination of the graph-based labeling results with a set of intensity features. The results of this proposed method are compared with manual labeling for three public databases. Accuracy values of 88.3%, 87.4%, and 89.8% are obtained for the images of the INSPIRE-AVR, DRIVE, and VICAVR databases, respectively. These results demonstrate that our method outperforms recent approaches for A/V classification.
Semi-supervised classification tool for DubaiSat-2 multispectral imagery
NASA Astrophysics Data System (ADS)
Al-Mansoori, Saeed
2015-10-01
This paper addresses a semi-supervised classification tool based on a pixel-based approach to multispectral satellite imagery. There are not many studies demonstrating such an algorithm for multispectral images, especially when the image consists of 4 bands (Red, Green, Blue and Near Infrared), as in DubaiSat-2 satellite images. The proposed approach utilizes both unsupervised and supervised classification schemes sequentially to identify four classes in the image, namely, water bodies, vegetation, land (developed and undeveloped areas) and paved areas (i.e. roads). The unsupervised classification concept is applied to identify two classes, water bodies and vegetation, based on a well-known index that uses the distinct wavelengths of visible and near-infrared sunlight absorbed and reflected by plants; this index is called the Normalized Difference Vegetation Index (NDVI). Afterward, the supervised classification is performed by selecting homogeneous training samples for roads and land areas. Here, a precise selection of training samples plays a vital role in the classification accuracy. Post-classification is finally performed to enhance the classification accuracy, where the classified image is sieved, clumped and filtered before producing the final output. Overall, the supervised classification approach produced higher accuracy than the unsupervised method. This paper shows current preliminary research results, which point to the effectiveness of the proposed technique.
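The unsupervised NDVI step can be sketched directly from the index's definition. The reflectance values and thresholds below are common illustrative choices, not those tuned for DubaiSat-2:

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    return (nir - red) / (nir + red)

def coarse_class(nir, red, water_thresh=-0.1, veg_thresh=0.3):
    v = ndvi(nir, red)
    if v < water_thresh:
        return "water"        # water absorbs NIR strongly
    if v > veg_thresh:
        return "vegetation"   # healthy plants reflect NIR strongly
    return "other"            # land / paved areas: left to the supervised step

pixels = [(0.05, 0.20), (0.60, 0.10), (0.30, 0.25)]  # (NIR, Red) reflectances
classes = [coarse_class(n, r) for n, r in pixels]
print(classes)
```

Pixels that fall through both thresholds are exactly the ones the subsequent supervised stage (trained samples for roads and land) has to resolve.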
Armutlu, Pelin; Ozdemir, Muhittin E; Uney-Yuksektepe, Fadime; Kavakli, I Halil; Turkay, Metin
2008-10-03
A priori analysis of the activity of drugs on the target protein by computational approaches can be useful in narrowing down drug candidates for further experimental tests. Currently, there are a large number of computational methods that predict the activity of drugs on proteins. In this study, we approach the activity prediction problem as a classification problem and aim to improve the classification accuracy by introducing an algorithm that combines partial least squares regression with a mixed-integer programming based hyper-boxes classification method, where drug molecules are classified as low active or high active according to their binding activity (IC50 values) on target proteins. We also aim to determine the most significant molecular descriptors for the drug molecules. We first apply our approach by analyzing the activities of widely known inhibitor datasets, including Acetylcholinesterase (ACHE), Benzodiazepine Receptor (BZR), Dihydrofolate Reductase (DHFR) and Cyclooxygenase-2 (COX-2), with known IC50 values. The results at this stage proved that our approach consistently gives better classification accuracies compared to 63 other reported classification methods, such as SVM and Naïve Bayes; we were able to predict the experimentally determined IC50 values with a worst-case accuracy of 96%. To further test the applicability of this approach, we created a dataset for Cytochrome P450 C17 inhibitors and then predicted their activities with 100% accuracy. Our results indicate that this approach can be utilized to predict the inhibitory effects of inhibitors based on their molecular descriptors. This approach will not only enhance the drug discovery process, but also save time and resources.
Land Cover Classification in a Complex Urban-Rural Landscape with Quickbird Imagery
Moran, Emilio Federico.
2010-01-01
High spatial resolution images have been increasingly used for urban land use/cover classification, but the high spectral variation within the same land cover, the spectral confusion among different land covers, and the shadow problem often lead to poor classification performance based on the traditional per-pixel spectral-based classification methods. This paper explores approaches to improve urban land cover classification with Quickbird imagery. Traditional per-pixel spectral-based supervised classification, incorporation of textural images and multispectral images, spectral-spatial classifier, and segmentation-based classification are examined in a relatively new developing urban landscape, Lucas do Rio Verde in Mato Grosso State, Brazil. This research shows that use of spatial information during the image classification procedure, either through the integrated use of textural and spectral images or through the use of segmentation-based classification method, can significantly improve land cover classification performance. PMID:21643433
A machine-learned computational functional genomics-based approach to drug classification.
Lötsch, Jörn; Ultsch, Alfred
2016-12-01
The public accessibility of "big data" about the molecular targets of drugs and the biological functions of genes allows novel data science-based approaches to pharmacology that link drugs directly with their effects on pathophysiologic processes. This provides a phenotypic path to drug discovery and repurposing. This paper compares the performance of a functional genomics-based criterion to the traditional drug target-based classification. Knowledge discovery in the DrugBank and Gene Ontology databases allowed the construction of a "drug target versus biological process" matrix as a combination of "drug versus genes" and "genes versus biological processes" matrices. As a canonical example, such matrices were constructed for classical analgesic drugs. These matrices were projected onto a toroid grid of 50 × 82 artificial neurons using a self-organizing map (SOM). The distance, respectively, cluster structure of the high-dimensional feature space of the matrices was visualized on top of this SOM using a U-matrix. The cluster structure emerging on the U-matrix provided a correct classification of the analgesics into two main classes of opioid and non-opioid analgesics. The classification was flawless with both the functional genomics and the traditional target-based criterion. The functional genomics approach inherently included the drugs' modulatory effects on biological processes. The main pharmacological actions known from pharmacological science were captured, e.g., actions on lipid signaling for non-opioid analgesics that comprised many NSAIDs, and actions on neuronal signal transmission for opioid analgesics. Using machine-learned techniques for computational drug classification in a comparative assessment, a functional genomics-based criterion was found to be similarly suitable for drug classification as the traditional target-based criterion.
This supports the utility of functional genomics-based approaches to computational systems pharmacology for drug discovery and repurposing.
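The matrix combination step described in this record can be illustrated with a toy example. The drugs, genes and processes below are hypothetical stand-ins, not data from DrugBank or the Gene Ontology; the sketch only shows how a binary "drug versus biological process" matrix follows from the "drug versus genes" and "genes versus biological processes" matrices:

```python
import numpy as np

# Toy illustration of the matrix combination described in the abstract.
# All drugs/genes/processes here are hypothetical examples.
drugs = ["morphine", "ibuprofen"]
genes = ["OPRM1", "PTGS1", "PTGS2"]          # column labels of D_G
processes = ["neuronal signalling", "lipid signalling"]

# Binary "drug versus genes" matrix: which genes each drug targets.
D_G = np.array([[1, 0, 0],    # morphine -> OPRM1
                [0, 1, 1]])   # ibuprofen -> PTGS1, PTGS2

# Binary "genes versus biological processes" matrix.
G_P = np.array([[1, 0],       # OPRM1 -> neuronal signalling
                [0, 1],       # PTGS1 -> lipid signalling
                [0, 1]])      # PTGS2 -> lipid signalling

# "Drug versus process": a drug touches a process if any of its
# target genes is annotated to that process.
D_P = (D_G @ G_P) > 0

for drug, row in zip(drugs, D_P):
    print(drug, [p for p, hit in zip(processes, row) if hit])
```

In the real pipeline, rows of such a matrix (with thousands of process columns) would be the feature vectors projected onto the SOM.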
ERIC Educational Resources Information Center
Kim, Jiseon
2010-01-01
Classification testing has been widely used to make categorical decisions by determining whether an examinee has a certain degree of ability required by established standards. As computer technologies have developed, classification testing has become more computerized. Several approaches have been proposed and investigated in the context of…
ERIC Educational Resources Information Center
Zero to Three: National Center for Infants, Toddlers and Families, Washington, DC.
The diagnostic framework presented in this manual seeks to address the need for a systematic, multi-disciplinary, developmentally based approach to the classification of mental health and developmental difficulties in the first 4 years of life. An introduction discusses clinical approaches to assessment and diagnosis, gives an overview of the…
NASA Astrophysics Data System (ADS)
Srinivasan, Yeshwanth; Hernes, Dana; Tulpule, Bhakti; Yang, Shuyu; Guo, Jiangling; Mitra, Sunanda; Yagneswaran, Sriraja; Nutter, Brian; Jeronimo, Jose; Phillips, Benny; Long, Rodney; Ferris, Daron
2005-04-01
Automated segmentation and classification of diagnostic markers in medical imagery are challenging tasks. Numerous algorithms for segmentation and classification based on statistical approaches of varying complexity are found in the literature. However, the design of an efficient and automated algorithm for precise classification of desired diagnostic markers is extremely image-specific. The National Library of Medicine (NLM), in collaboration with the National Cancer Institute (NCI), is creating an archive of 60,000 digitized color images of the uterine cervix. NLM is developing tools for the analysis and dissemination of these images over the Web for the study of visual features correlated with precancerous neoplasia and cancer. To enable indexing of images of the cervix, it is essential to develop algorithms for the segmentation of regions of interest, such as acetowhitened regions, and for automatic identification and classification of regions exhibiting mosaicism and punctation. The success of such algorithms depends primarily on the selection of relevant features representing the region of interest. We present statistical classification and segmentation algorithms based on color and geometric features that yield excellent identification of the regions of interest. Distinct classification of mosaic regions from non-mosaic ones has been obtained by clustering multiple geometric and color features of the segmented sections using various morphological and statistical approaches. Such automated classification methodologies will facilitate content-based image retrieval from the digital archive of the uterine cervix and have the potential to develop into an image-based screening tool for cervical cancer.
Marker-Based Hierarchical Segmentation and Classification Approach for Hyperspectral Imagery
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Tilton, James C.; Benediktsson, Jon Atli; Chanussot, Jocelyn
2011-01-01
The Hierarchical SEGmentation (HSEG) algorithm, which is a combination of hierarchical step-wise optimization and spectral clustering, has given good performance in hyperspectral image analysis. This technique produces at its output a hierarchical set of image segmentations. The automated selection of a single segmentation level is often necessary. We propose and investigate the use of automatically selected markers for this purpose. In this paper, a novel Marker-based HSEG (M-HSEG) method for spectral-spatial classification of hyperspectral images is proposed. First, pixelwise classification is performed and the most reliably classified pixels are selected as markers, with their corresponding class labels. Then, a novel constrained marker-based HSEG algorithm is applied, resulting in a spectral-spatial classification map. The experimental results show that the proposed approach yields accurate segmentation and classification maps, and thus is attractive for hyperspectral image analysis.
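The marker-selection step ("the most reliably classified pixels are selected as markers") can be sketched roughly as follows. This is a simplified stand-in with synthetic class probabilities and an assumed confidence threshold, not the authors' exact selection rule:

```python
import numpy as np

# Synthetic per-pixel class probabilities, standing in for the output
# of the pixelwise classification step.
rng = np.random.default_rng(0)
n_pixels, n_classes = 8, 3
proba = rng.dirichlet(np.ones(n_classes), size=n_pixels)

labels = proba.argmax(axis=1)      # pixelwise class decisions
confidence = proba.max(axis=1)     # reliability of each decision

# Keep only reliably classified pixels as markers, carrying their
# class labels; -1 means "no marker here". The 0.6 threshold is an
# assumption for illustration.
threshold = 0.6
markers = np.where(confidence >= threshold, labels, -1)
print(markers)
```

The constrained region-growing stage would then only merge regions consistently with these marker labels.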
NASA Astrophysics Data System (ADS)
Zagouras, Athanassios; Argiriou, Athanassios A.; Flocas, Helena A.; Economou, George; Fotopoulos, Spiros
2012-11-01
Classification of weather maps at various isobaric levels has been used for many years as a methodological tool in meteorology, climatology, atmospheric pollution studies and other fields. Initially the classification was performed manually. The criteria used by the person performing the classification are features of isobars or isopleths of geopotential height, depending on the type of maps to be classified. Although manual classifications integrate the perceptual experience and other unquantifiable qualities of the meteorology specialists involved, they are typically subjective and time consuming. In recent years, various automated, so-called objective methods for atmospheric circulation classification have been proposed. In this paper a new method of atmospheric circulation classification of isobaric maps is presented. The method is based on graph theory. It starts with an intelligent prototype selection using an over-partitioning mode of the fuzzy c-means (FCM) algorithm, proceeds to a graph formulation for the entire dataset and produces the clusters with the contemporary dominant sets clustering method. Graph theory allows a more efficient representation of spatially correlated data than the classical Euclidean space representations used in conventional classification methods. The method has been applied to the classification of 850 hPa atmospheric circulation over the Eastern Mediterranean. The evaluation of the automated methods is performed by statistical indexes; results indicate that the classification is adequately comparable with other state-of-the-art automated map classification methods, for a variable number of clusters.
NASA Astrophysics Data System (ADS)
Paul, Subir; Nagesh Kumar, D.
2018-04-01
Hyperspectral (HS) data comprise continuous spectral responses of hundreds of narrow spectral bands with very fine spectral resolution or bandwidth, which offer feature identification and classification with high accuracy. In the present study, a Mutual Information (MI) based Segmented Stacked Autoencoder (S-SAE) approach for spectral-spatial classification of HS data is proposed to reduce the complexity and computational time compared to Stacked Autoencoder (SAE) based feature extraction. A non-parametric dependency measure (MI) is proposed for spectral segmentation of the HS bands, instead of a linear, parametric dependency measure, to account for both linear and nonlinear inter-band dependency. Morphological profiles are then created from the segmented spectral features to assimilate spatial information into the spectral-spatial classification approach. Two non-parametric classifiers, Support Vector Machine (SVM) with a Gaussian kernel and Random Forest (RF), are used for classification of the three most widely used HS datasets. Results of the numerical experiments carried out in this study show that SVM with a Gaussian kernel provides better results for the Pavia University and Botswana datasets, whereas RF performs better for the Indian Pines dataset. The experiments performed with the proposed methodology provide encouraging results compared to numerous existing approaches.
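The non-parametric dependency measure at the heart of the spectral segmentation is mutual information between bands. A minimal histogram-based MI estimate, on synthetic band data, might look like this (a sketch only; the paper's estimator and segmentation procedure are more involved):

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram-based MI estimate between two flattened band images."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # empirical joint distribution
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y
    nz = pxy > 0                              # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(1)
base = rng.normal(size=4096)                  # a reference band
band_a = base + 0.1 * rng.normal(size=4096)   # strongly dependent band
band_b = rng.normal(size=4096)                # independent band

# Dependent bands score much higher MI than independent ones; this is
# the signal used to group adjacent bands into spectral segments.
print(mutual_information(base, band_a), mutual_information(base, band_b))
```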
Image search engine with selective filtering and feature-element-based classification
NASA Astrophysics Data System (ADS)
Li, Qing; Zhang, Yujin; Dai, Shengyang
2001-12-01
With the growth of the Internet and of storage capability in recent years, images have become a widespread information format on the World Wide Web. However, it has become increasingly hard to search for images of interest, and an effective image search engine for the WWW needs to be developed. In this paper we propose a selective filtering process and a novel approach to image classification based on feature elements in the image search engine we developed for the WWW. First, a selective filtering process is embedded in a general web crawler to filter out meaningless images in GIF format, using two easily obtained parameters. Our classification approach first extracts feature elements from images instead of feature vectors. Compared with feature vectors, feature elements can better capture the visual meaning of an image according to the subjective perception of human beings. Unlike traditional image classification methods, our feature-element-based approach does not calculate the distance between two vectors in feature space; instead, it tries to find associations between feature elements and the class attribute of the image. Experiments are presented to show the efficiency of the proposed approach.
Saa, Jaime F Delgado; Çetin, Müjdat
2012-04-01
We consider the problem of classification of imaginary motor tasks from electroencephalography (EEG) data for brain-computer interfaces (BCIs) and propose a new approach based on hidden conditional random fields (HCRFs). HCRFs are discriminative graphical models that are attractive for this problem because they (1) exploit the temporal structure of EEG; (2) include latent variables that can be used to model different brain states in the signal; and (3) involve learned statistical models matched to the classification task, avoiding some of the limitations of generative models. Our approach involves spatial filtering of the EEG signals and estimation of power spectra based on autoregressive modeling of temporal segments of the EEG signals. Given this time-frequency representation, we select certain frequency bands that are known to be associated with execution of motor tasks. These selected features constitute the data that are fed to the HCRF, parameters of which are learned from training data. Inference algorithms on the HCRFs are used for the classification of motor tasks. We experimentally compare this approach to the best performing methods in BCI competition IV as well as a number of more recent methods and observe that our proposed method yields better classification accuracy.
Domingo-Salvany, Antònia; Bacigalupe, Amaia; Carrasco, José Miguel; Espelt, Albert; Ferrando, Josep; Borrell, Carme
2013-01-01
In Spain, the new National Classification of Occupations (Clasificación Nacional de Ocupaciones [CNO-2011]) is substantially different to the 1994 edition, and requires adaptation of occupational social classes for use in studies of health inequalities. This article presents two proposals to measure social class: the new classification of occupational social class (CSO-SEE12), based on the CNO-2011 and a neo-Weberian perspective, and a social class classification based on a neo-Marxist approach. The CSO-SEE12 is the result of a detailed review of the CNO-2011 codes. In contrast, the neo-Marxist classification is derived from variables related to capital and organizational and skill assets. The proposed CSO-SEE12 consists of seven classes that can be grouped into a smaller number of categories according to study needs. The neo-Marxist classification consists of 12 categories in which home owners are divided into three categories based on capital goods and employed persons are grouped into nine categories composed of organizational and skill assets. These proposals are complemented by a proposed classification of educational level that integrates the various curricula in Spain and provides correspondences with the International Standard Classification of Education. Copyright © 2012 SESPAS. Published by Elsevier Espana. All rights reserved.
An ordinal classification approach for CTG categorization.
Georgoulas, George; Karvelis, Petros; Gavrilis, Dimitris; Stylios, Chrysostomos D; Nikolakopoulos, George
2017-07-01
Evaluation of the cardiotocogram (CTG) is a standard approach employed during pregnancy and delivery. However, its interpretation requires high-level expertise to decide whether a recording is Normal, Suspicious or Pathological. Therefore, a number of attempts have been made over the past three decades to develop sophisticated automated systems. These systems are usually (multiclass) classification systems that assign a category to the respective CTG. However, most of them do not take into consideration the natural ordering of the categories associated with CTG recordings. In this work, an algorithm that explicitly takes into consideration the ordering of CTG categories, based on a binary decomposition method, is investigated. Results obtained using the C4.5 decision tree as the base classifier show that the ordinal classification approach is marginally better than the traditional multiclass classification approach, which utilizes the standard C4.5 algorithm, for several performance criteria.
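The binary decomposition of an ordinal problem can be sketched in the style of Frank and Hall: one binary classifier per ordered split, with a sample's class given by how many "greater than" tests it passes. The base learner below is a trivial threshold rule standing in for the paper's C4.5 trees, and the data are synthetic:

```python
import numpy as np

# Three ordered CTG categories: Normal < Suspicious < Pathological.
CLASSES = ["Normal", "Suspicious", "Pathological"]

def fit_threshold(X, y_binary):
    # Toy base learner: pick the threshold on feature 0 that best
    # separates y_binary (stand-in for a C4.5 tree).
    best_t, best_acc = None, -1.0
    for t in np.unique(X[:, 0]):
        acc = ((X[:, 0] > t) == y_binary).mean()
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def fit_ordinal(X, y, n_classes=3):
    # One binary problem per ordinal split: predict (y > k), k = 0..n-2.
    return [fit_threshold(X, y > k) for k in range(n_classes - 1)]

def predict_ordinal(models, X):
    # A sample's class index = number of "y > k" tests it passes.
    votes = np.stack([X[:, 0] > t for t in models], axis=1)
    return votes.sum(axis=1)

X = np.array([[1.0], [2.0], [5.0], [6.0], [9.0], [10.0]])
y = np.array([0, 0, 1, 1, 2, 2])
models = fit_ordinal(X, y)
print([CLASSES[i] for i in predict_ordinal(models, X)])
# ['Normal', 'Normal', 'Suspicious', 'Suspicious', 'Pathological', 'Pathological']
```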
A Hybrid Sensing Approach for Pure and Adulterated Honey Classification
Subari, Norazian; Saleh, Junita Mohamad; Shakaff, Ali Yeon Md; Zakaria, Ammar
2012-01-01
This paper presents a comparison between data from single-modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples with 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure from adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single-modality data. PMID:23202033
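The low-level fusion mentioned in this record amounts to normalizing each instrument's feature matrix and concatenating them before classification. A minimal sketch with synthetic e-nose and FTIR values (the actual feature sets and normalization used in the paper may differ):

```python
import numpy as np

# Synthetic stand-ins: one row per honey sample.
enose = np.array([[0.2, 1.5, 3.0],
                  [0.4, 1.1, 2.5]])   # e-nose sensor responses
ftir = np.array([[10.0, 55.0],
                 [12.0, 60.0]])       # FTIR absorbances

def normalize(X):
    # Min-max normalization per feature column to [0, 1].
    span = X.max(axis=0) - X.min(axis=0)
    return (X - X.min(axis=0)) / np.where(span == 0, 1, span)

# Low-level fusion: concatenate normalized features from both
# instruments into a single feature vector per sample.
fused = np.hstack([normalize(enose), normalize(ftir)])
print(fused.shape)  # (2, 5)
```

A classifier such as LDA would then be trained on the fused rows.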
Automated classification of cell morphology by coherence-controlled holographic microscopy
NASA Astrophysics Data System (ADS)
Strbkova, Lenka; Zicha, Daniel; Vesely, Pavel; Chmelik, Radim
2017-08-01
In the last few years, classification of cells by machine learning has become frequently used in biology. However, most of the approaches are based on morphometric (MO) features, which are not quantitative in terms of cell mass. This may result in poor classification accuracy. Here, we study the potential contribution of coherence-controlled holographic microscopy, enabling quantitative phase imaging, to the classification of cell morphologies. We compare our approach with the commonly used method based on MO features. We tested both classification approaches in an experiment with nutritionally deprived cancer tissue cells, while employing several supervised machine learning algorithms. Most of the classifiers provided higher performance when quantitative phase features were employed. Based on the results, it can be concluded that the quantitative phase features played an important role in improving the performance of the classification. The methodology could be a valuable aid in refining the monitoring of live cells in an automated fashion. We believe that coherence-controlled holographic microscopy, as a tool for quantitative phase imaging, offers all the preconditions for accurate automated analysis of live cell behavior while enabling noninvasive label-free imaging with sufficient contrast and high spatiotemporal phase sensitivity.
Gordijn, Sanne J; Korteweg, Fleurisca J; Erwich, Jan Jaap H M; Holm, Jozien P; van Diem, Mariet Th; Bergman, Klasien A; Timmer, Albertus
2009-06-01
Many classification systems for perinatal mortality are available, all with their own strengths and weaknesses: none of them has been universally accepted. We present a systematic multilayered approach for the analysis of perinatal mortality based on information related to the moment of death, the conditions associated with death and the underlying cause of death, using a combination of representatives of existing classification systems. We compared the existing classification systems regarding their definition of the perinatal period, level of complexity, inclusion of maternal, foetal and/or placental factors, and whether they take a clinical or pathological viewpoint. Furthermore, we allocated the classification systems to one of three categories, 'when', 'what' or 'why', depending on whether the allocation of individual cases of perinatal mortality is based on the moment of death ('when'), the clinical conditions associated with death ('what'), or the underlying cause of death ('why'). A multilayered approach to the analysis and classification of perinatal mortality is possible by using combinations of existing systems, for example the Wigglesworth or Nordic-Baltic ('when'), ReCoDe ('what') and Tulip ('why') classification systems. This approach is useful not only for in-depth analysis of perinatal mortality in the developed world but also for analysis of perinatal mortality in developing countries, where resources to investigate death are often limited.
Multi-label literature classification based on the Gene Ontology graph.
Jin, Bo; Muller, Brian; Zhai, Chengxiang; Lu, Xinghua
2008-12-08
The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in the literature, which urges the development of text mining approaches to facilitate the process by automatically extracting Gene Ontology annotation from the literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. In this research, we investigated methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, by utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. By utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate protein annotation based on the literature.
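One simple way the Gene Ontology graph structure can be exploited, consistent with the ontology's true-path rule, is to propagate each predicted term to all of its ancestors. The toy ontology below is hypothetical, and the propagation is only a sketch of this idea, not the stochastic or hierarchical algorithms studied in the paper:

```python
# Toy GO-like DAG: child term -> list of parent terms (hypothetical).
PARENTS = {
    "lipid signalling": ["signalling"],
    "neuronal signalling": ["signalling"],
    "signalling": ["biological process"],
    "biological process": [],
}

def propagate(labels):
    """Close a predicted label set under the ancestor relation:
    annotating a term implies annotating all of its ancestors."""
    closed = set()
    stack = list(labels)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(PARENTS.get(term, []))
    return closed

print(sorted(propagate({"lipid signalling"})))
# ['biological process', 'lipid signalling', 'signalling']
```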
Caumes, Géraldine; Borrel, Alexandre; Abi Hussein, Hiba; Camproux, Anne-Claude; Regad, Leslie
2017-09-01
Small molecules interact with their protein targets on surface cavities known as binding pockets. Pocket-based approaches are very useful in all phases of drug design. Their first step is estimating the binding pocket based on protein structure. The available pocket-estimation methods produce different pockets for the same target. The aim of this work is to investigate the effects of different pocket-estimation methods on the results of pocket-based approaches. We focused on the effect of three pocket-estimation methods on a pocket-ligand (PL) classification. This pocket-based approach is useful for understanding the correspondence between the pocket and ligand spaces and for developing pharmacological profiling models. We found that pocket-estimation methods yield different binding pockets in terms of boundaries and properties. These differences are responsible for the variation in the PL classification results, which can have an impact on the detected correspondence between pocket and ligand profiles. Thus, we highlight the importance of the choice of pocket-estimation method in pocket-based approaches. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Comparison of Sub-pixel Classification Approaches for Crop-specific Mapping
The Moderate Resolution Imaging Spectroradiometer (MODIS) data has been increasingly used for crop mapping and other agricultural applications. Phenology-based classification approaches using the NDVI (Normalized Difference Vegetation Index) 16-day composite (250 m) data product...
NASA Astrophysics Data System (ADS)
Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryojiro; Kanematsu, Masayuki; Fujita, Hiroshi
2012-03-01
We aim to use a new texton-based texture classification method for the classification of pulmonary emphysema in computed tomography (CT) images of the lungs. Different from conventional computer-aided diagnosis (CAD) methods for pulmonary emphysema classification, in this paper the texton dictionary is first learned by applying sparse representation (SR) to image patches in the training dataset. Then the SR coefficients of the test images over the dictionary are used to construct histograms as texture representations. Finally, classification is performed using a nearest neighbor classifier with a histogram dissimilarity measure as the distance. The proposed approach is tested on 3840 annotated regions of interest consisting of normal tissue and mild, moderate and severe pulmonary emphysema of three subtypes. The performance of the proposed system, with an accuracy of about 88%, is higher than that of a state-of-the-art method based on basic rotation-invariant local binary pattern histograms and a texture classification method based on texton learning by k-means, which performs almost the best among other approaches in the literature.
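The final stage, nearest neighbor classification with a histogram dissimilarity measure, can be sketched as follows. The chi-square dissimilarity used here is one common choice and the histograms are synthetic stand-ins for texton count histograms; the paper's exact measure may differ:

```python
import numpy as np

def chi2(h1, h2, eps=1e-10):
    """Chi-square dissimilarity between two normalized histograms."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def nn_classify(hist, train_hists, train_labels):
    """Nearest neighbor under the histogram dissimilarity."""
    dists = [chi2(hist, h) for h in train_hists]
    return train_labels[int(np.argmin(dists))]

# Synthetic texton histograms, one training example per class.
train_hists = np.array([[0.7, 0.2, 0.1],   # "normal" texture profile
                        [0.1, 0.3, 0.6]])  # "emphysema" texture profile
train_labels = ["normal", "emphysema"]

query = np.array([0.6, 0.3, 0.1])          # closer to the normal profile
print(nn_classify(query, train_hists, train_labels))  # normal
```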
A neural network approach to cloud classification
NASA Technical Reports Server (NTRS)
Lee, Jonathan; Weger, Ronald C.; Sengupta, Sailes K.; Welch, Ronald M.
1990-01-01
It is shown that, using high-spatial-resolution data, very high cloud classification accuracies can be obtained with a neural network approach. A texture-based neural network classifier using only single-channel visible Landsat MSS imagery achieves an overall cloud identification accuracy of 93 percent. Cirrus can be distinguished from boundary layer cloudiness with an accuracy of 96 percent, without the use of an infrared channel. Stratocumulus is retrieved with an accuracy of 92 percent, cumulus at 90 percent. The use of the neural network does not improve cirrus classification accuracy; rather, its main effect is an improved separation between stratocumulus and cumulus cloudiness. While most cloud classification algorithms rely on linear parametric schemes, the present study is based on a nonlinear, nonparametric four-layer neural network approach. A three-layer neural network architecture, the nonparametric K-nearest neighbor approach, and the linear stepwise discriminant analysis procedure are compared. A significant finding is that markedly higher accuracies are attained with the nonparametric approaches using only 20 percent of the database as training data, compared to 67 percent of the database in the linear approach.
Combining multiple decisions: applications to bioinformatics
NASA Astrophysics Data System (ADS)
Yukinawa, N.; Takenouchi, T.; Oba, S.; Ishii, S.
2008-01-01
Multi-class classification is one of the fundamental tasks in bioinformatics and typically arises in cancer diagnosis studies by gene expression profiling. This article reviews two recent approaches to multi-class classification by combining multiple binary classifiers, which are formulated based on a unified framework of error-correcting output coding (ECOC). The first approach is to construct a multi-class classifier in which each binary classifier to be aggregated has a weight value to be optimally tuned based on the observed data. In the second approach, misclassification of each binary classifier is formulated as a bit inversion error with a probabilistic model by making an analogy to the context of information transmission theory. Experimental studies using various real-world datasets including cancer classification problems reveal that both of the new methods are superior or comparable to other multi-class classification methods.
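The ECOC framework underlying both reviewed approaches can be sketched as follows: each class is assigned a codeword, each codeword bit defines a binary problem, and decoding picks the class whose codeword is nearest (in Hamming distance) to the binary classifiers' outputs. The codebook below is a toy example with minimum pairwise distance 3, so any single "bit inversion error" is corrected:

```python
import numpy as np

# Toy codebook: 3 classes encoded over 5 binary classifiers.
# Minimum pairwise Hamming distance is 3, so one flipped bit is corrected.
CODEBOOK = np.array([
    [0, 0, 0, 0, 0],   # class 0
    [1, 1, 1, 0, 0],   # class 1
    [0, 0, 1, 1, 1],   # class 2
])

def ecoc_decode(binary_outputs):
    """Assign the class whose codeword is Hamming-nearest to the
    outputs of the binary classifiers."""
    dists = (CODEBOOK != np.asarray(binary_outputs)).sum(axis=1)
    return int(dists.argmin())

print(ecoc_decode([1, 1, 1, 0, 0]))  # clean outputs -> 1
print(ecoc_decode([1, 0, 1, 0, 0]))  # bit 1 inverted, still -> 1
```

The first reviewed approach would additionally weight each binary classifier; the second models such bit inversions probabilistically, by analogy with a noisy communication channel.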
CW-SSIM kernel based random forest for image classification
NASA Astrophysics Data System (ADS)
Fan, Guangzhe; Wang, Zhou; Wang, Jiheng
2010-07-01
The complex wavelet structural similarity (CW-SSIM) index has been proposed as a powerful image similarity metric that is robust to translation, scaling and rotation of images, but how to employ it in image classification applications has not been deeply investigated. In this paper, we incorporate CW-SSIM as a kernel function into a random forest learning algorithm. This leads to a novel image classification approach that does not require a feature extraction or dimension reduction stage at the front end. We use hand-written digit recognition as an example to demonstrate our algorithm. We compare the performance of the proposed approach with random forest learning based on other kernels, including the widely adopted Gaussian and inner product kernels. Empirical evidence shows that the proposed method is superior in its classification power. We also compared our proposed approach with the direct random forest method without a kernel and with the popular kernel-learning method, the support vector machine. Our test results based on both simulated and real-world data suggest that the proposed approach outperforms traditional methods without the feature selection procedure.
A Bio-Inspired Herbal Tea Flavour Assessment Technique
Zakaria, Nur Zawatil Isqi; Masnan, Maz Jamilah; Zakaria, Ammar; Shakaff, Ali Yeon Md
2014-01-01
Herbal-based products are becoming a widespread production trend among manufacturers for the domestic and international markets. As production increases to meet market demand, it is crucial for manufacturers to ensure that their products meet specific criteria and fulfil the intended quality determined by the quality controller. One famous herbal-based product is herbal tea. This paper investigates bio-inspired flavour assessments in a data fusion framework involving an e-nose and an e-tongue. The objectives are to attain good classification of different types and brands of herbal tea, classification of different flavour masking effects and, finally, classification of different concentrations of herbal tea. Two data fusion levels were employed in this research: low-level and intermediate-level data fusion. Four classification approaches (LDA, SVM, KNN and PNN) were examined in search of the best classifier to achieve the research objectives. In order to evaluate the classifiers' performance, error estimators based on k-fold cross validation and leave-one-out were applied. Classification based on GC-MS TIC data was also included as a comparison to the classification performance of the fusion approaches. Generally, KNN outperformed the other classification techniques for the three flavour assessments in both low-level and intermediate-level data fusion. However, the classification results based on GC-MS TIC data were varied. PMID:25010697
Visual brain activity patterns classification with simultaneous EEG-fMRI: A multimodal approach.
Ahmad, Rana Fayyaz; Malik, Aamir Saeed; Kamel, Nidal; Reza, Faruque; Amin, Hafeez Ullah; Hussain, Muhammad
2017-01-01
Classification of visual information from brain activity data is a challenging task. Many studies reported in the literature are based on brain activity patterns from either fMRI or EEG/MEG alone. EEG and fMRI are considered two complementary neuroimaging modalities in terms of their temporal and spatial resolution for mapping brain activity. To obtain high spatial and temporal resolution of the brain at the same time, simultaneous EEG-fMRI seems fruitful. In this article, we propose a new method based on simultaneous EEG-fMRI data and a machine learning approach to classify visual brain activity patterns. We acquired EEG-fMRI data simultaneously from ten healthy human participants by showing them visual stimuli. A data fusion approach is used to merge the EEG and fMRI data, and a machine learning classifier is used for classification. Results showed that superior classification performance was achieved with simultaneous EEG-fMRI data compared to EEG or fMRI data alone. This shows that the multimodal approach improved classification accuracy compared with other approaches reported in the literature. The proposed simultaneous EEG-fMRI approach for classifying brain activity patterns can be helpful to predict or fully decode brain activity patterns.
Le, Long N; Jones, Douglas L
2018-03-01
Audio classification techniques often depend on the availability of a large labeled training dataset for successful performance. However, in many application domains of audio classification (e.g., wildlife monitoring), obtaining labeled data is still a costly and laborious process. Motivated by this observation, a technique is proposed to efficiently learn a clean template from a few labeled, but likely corrupted (by noise and interference), data samples. This learning can be done efficiently via tensorial dynamic time warping on articulation-index-based time-frequency representations of audio data. The learned template can then be used in audio classification following the standard template-based approach. Experimental results show that the proposed approach outperforms both (1) a recurrent neural network approach and (2) the state of the art in template-based approaches on a wildlife detection application with few training samples.
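The template-based classification step can be illustrated with classic (non-tensorial) dynamic time warping on 1-D sequences; the templates, labels, and signal here are synthetic stand-ins, not the articulation-index representations or wildlife data of the paper:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def classify(signal, templates):
    """Standard template-based approach: assign the label of the DTW-nearest template."""
    return min(templates, key=lambda lbl: dtw_distance(signal, templates[lbl]))

templates = {
    "ramp": np.linspace(0.0, 1.0, 20),
    "pulse": np.concatenate([np.linspace(0.0, 1.0, 10), np.linspace(1.0, 0.0, 10)]),
}
noisy_ramp = np.linspace(0.0, 1.0, 25) + 0.02 * np.sin(np.arange(25))
print(classify(noisy_ramp, templates))  # "ramp"
```

DTW tolerates differing lengths and local time shifts, which is why the learned template can match noisy, misaligned field recordings.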
Ortega, Julio; Asensio-Cubero, Javier; Gan, John Q; Ortiz, Andrés
2016-07-15
Brain-computer interfacing (BCI) applications based on the classification of electroencephalographic (EEG) signals require solving high-dimensional pattern classification problems with such a relatively small number of training patterns that curse-of-dimensionality problems usually arise. Multiresolution analysis (MRA) has useful properties for signal analysis in both the temporal and spectral domains and has been broadly used in the BCI field. However, MRA usually increases the dimensionality of the input data. Therefore, approaches to feature selection or feature dimensionality reduction should be considered to improve the performance of MRA-based BCI. This paper investigates feature selection in MRA-based frameworks for BCI. Several wrapper approaches to evolutionary multiobjective feature selection are proposed with different classifier structures. They are evaluated by comparison with baseline methods using sparse representation of features or no feature selection. Statistical analysis, applying the Kolmogorov-Smirnov and Kruskal-Wallis tests to the mean Kappa values obtained on the test patterns for each approach, demonstrates advantages of the proposed approaches. In comparison with the baseline MRA approach used in previous studies, the proposed evolutionary multiobjective feature selection approaches provide similar or even better classification performance, with a significant reduction in the number of features that need to be computed.
NASA Technical Reports Server (NTRS)
Benediktsson, Jon A.; Swain, Philip H.; Ersoy, Okan K.
1990-01-01
Neural network learning procedures and statistical classification methods are applied and compared empirically in the classification of multisource remote sensing and geographic data. Statistical multisource classification by means of a method based on Bayesian classification theory is also investigated and modified. The modifications permit control of the influence of the data sources involved in the classification process. Reliability measures are introduced to rank the quality of the data sources, and the sources are then weighted according to these rankings in the statistical multisource classification. Four data sources are used in experiments: Landsat MSS data and three forms of topographic data (elevation, slope, and aspect). Experimental results show that the two approaches have unique advantages and disadvantages in this classification application.
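The reliability-weighted statistical combination described above can be sketched as a weighted sum of per-source class log-likelihoods plus a log prior. The numbers and source names below are invented for illustration; the paper's actual likelihood models and weighting scheme are not reproduced here:

```python
import numpy as np

def weighted_multisource(log_likelihoods, weights, priors):
    """Rank classes by log prior plus reliability-weighted sum of per-source
    class log-likelihoods; return the index of the winning class."""
    score = np.log(np.asarray(priors, dtype=float))
    for ll, w in zip(log_likelihoods, weights):
        score = score + w * np.asarray(ll, dtype=float)
    return int(np.argmax(score))

# Source 1 (e.g. spectral data) is judged reliable and favours class 0;
# source 2 (e.g. a topographic layer) is down-weighted and favours class 1.
ll_src1 = np.log([0.9, 0.1])
ll_src2 = np.log([0.2, 0.8])
print(weighted_multisource([ll_src1, ll_src2], [1.0, 0.2], [0.5, 0.5]))  # 0
```

Raising a likelihood to a reliability weight is multiplication in the log domain, so an unreliable source can be attenuated without being discarded.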
NASA Astrophysics Data System (ADS)
Ghaffarian, S.; Ghaffarian, S.
2014-08-01
This paper presents a novel approach to building detection that automates the training-area collection stage of supervised classification. The method is based on the fact that a 3D building structure should cast a shadow under suitable imaging conditions. Therefore, the methodology begins with detecting and masking out shadow areas using the luminance component of the LAB color space, which indicates the lightness of the image, and a novel double-thresholding technique. The training areas for supervised classification are then selected by automatically determining a buffer zone on each building whose shadow is detected, using the shadow shape and the sun illumination direction. Thereafter, by calculating statistical values for each buffer zone collected from the building areas, an Improved Parallelepiped Supervised Classification is executed to detect the buildings. Standard-deviation thresholding is applied to the parallelepiped classification method to improve its accuracy. Finally, simple morphological operations are conducted to remove noise and increase the accuracy of the results. The experiments were performed on a set of high-resolution Google Earth images. The performance of the proposed approach was assessed by comparing its results with reference data using well-known quality measurements (Precision, Recall and F1-score) to evaluate pixel-based and object-based performance. Evaluation of the results illustrates that buildings detected from dense and suburban districts with diverse characteristics and color combinations using the proposed method achieve 88.4 % and 85.3 % overall pixel-based and object-based precision, respectively.
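The double-thresholding step on the lightness channel can be sketched as hysteresis thresholding: very dark pixels seed the shadow mask, and moderately dark pixels join only if connected to a seed. This is a minimal sketch with assumptions: a luma weighting stands in for a full RGB-to-LAB conversion, and the thresholds are illustrative, not the paper's values:

```python
import numpy as np

def luminance(rgb):
    """Luma approximation of the LAB lightness channel (assumption: a weighted
    RGB sum stands in for L without a full colour-space conversion)."""
    return rgb @ np.array([0.2126, 0.7152, 0.0722])

def double_threshold_shadow_mask(rgb, t_low=0.2, t_high=0.35, n_iter=10):
    """Pixels darker than t_low are shadow seeds; pixels darker than t_high
    join the mask if 4-connected (transitively) to a seed."""
    L = luminance(rgb)
    strong = L < t_low
    weak = L < t_high
    mask = strong.copy()
    for _ in range(n_iter):  # grow seeds into the weak region
        grown = mask.copy()
        grown[1:, :] |= mask[:-1, :]
        grown[:-1, :] |= mask[1:, :]
        grown[:, 1:] |= mask[:, :-1]
        grown[:, :-1] |= mask[:, 1:]
        new = grown & weak
        if (new == mask).all():
            break
        mask = new
    return mask
```

On a synthetic patch with a dark shadow core and an attached penumbra pixel, the mask grows from the core into the penumbra but ignores an unconnected dim pixel elsewhere.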
Towards the use of similarity distances to music genre classification: A comparative study.
Goienetxea, Izaro; Martínez-Otzeta, José María; Sierra, Basilio; Mendialdua, Iñigo
2018-01-01
Music genre classification is a challenging research problem, for which open questions remain regarding classification approach, music piece representation, distances between/within genres, and so on. In this paper an investigation of the classification of generated music pieces is performed, based on the following idea: if closely related known pieces are grouped into sets -or clusters- and a new song is then generated automatically so that it is somehow "inspired" by each set, the new song should be more likely to be classified as belonging to the set that inspired it, under the same distance used to separate the clusters. Different music piece representations and distances among pieces are used; the obtained results are promising and indicate the appropriateness of the approach even in such a subjective area as music genre classification.
A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification
NASA Astrophysics Data System (ADS)
Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun
2016-12-01
Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.
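The Adaboost assembly step above (combining weak classifiers into a strong one by reweighting misclassified samples) can be sketched as follows. As an assumption for brevity, decision stumps stand in for the paper's 15 BP neural networks as the weak learners, and the data are synthetic; the MapReduce parallelization is not shown:

```python
import numpy as np

class Stump:
    """Stand-in weak learner (the paper uses small BP networks instead)."""
    def fit(self, X, y, w):
        best = (np.inf, 0, 0.0, 1)
        for f in range(X.shape[1]):
            for t in np.unique(X[:, f]):
                for s in (1, -1):
                    pred = np.where(X[:, f] < t, -s, s)
                    err = w[pred != y].sum()
                    if err < best[0]:
                        best = (err, f, t, s)
        self.err, self.f, self.t, self.s = best
        return self

    def predict(self, X):
        return np.where(X[:, self.f] < self.t, -self.s, self.s)

def adaboost(X, y, rounds=15):
    """AdaBoost: fit a weak learner, up-weight its mistakes, repeat;
    the strong classifier is an alpha-weighted vote of the weak ones."""
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(rounds):
        h = Stump().fit(X, y, w)
        pred = h.predict(X)
        err = max(w[pred != y].sum(), 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        ensemble.append((alpha, h))
    return ensemble

def predict(ensemble, X):
    return np.sign(sum(a * h.predict(X) for a, h in ensemble))

# Interval concept: no single stump solves it, but the weighted vote does.
X = np.linspace(0, 1, 20).reshape(-1, 1)
y = np.where((X[:, 0] > 0.3) & (X[:, 0] < 0.7), 1, -1)
ens = adaboost(X, y, rounds=12)
print((predict(ens, X) == y).mean())  # high training accuracy after several rounds
```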
Tsai, Yu Hsin; Stow, Douglas; Weeks, John
2013-01-01
The goal of this study was to map and quantify the number of newly constructed buildings in Accra, Ghana between 2002 and 2010 based on high spatial resolution satellite image data. Two semi-automated feature detection approaches for detecting and mapping newly constructed buildings based on QuickBird very high spatial resolution satellite imagery were analyzed: (1) post-classification comparison; and (2) bi-temporal layerstack classification. Feature Analyst software based on a spatial contextual classifier and ENVI Feature Extraction that uses a true object-based image analysis approach of image segmentation and segment classification were evaluated. Final map products representing new building objects were compared and assessed for accuracy using two object-based accuracy measures, completeness and correctness. The bi-temporal layerstack method generated more accurate results compared to the post-classification comparison method due to less confusion with background objects. The spectral/spatial contextual approach (Feature Analyst) outperformed the true object-based feature delineation approach (ENVI Feature Extraction) due to its ability to more reliably delineate individual buildings of various sizes. Semi-automated, object-based detection followed by manual editing appears to be a reliable and efficient approach for detecting and enumerating new building objects. A bivariate regression analysis was performed using neighborhood-level estimates of new building density regressed on a census-derived measure of socio-economic status, yielding an inverse relationship with R2 = 0.31 (n = 27; p = 0.00). The primary utility of the new building delineation results is to support spatial analyses of land cover and land use and demographic change. PMID:24415810
Islam, Md Rabiul; Tanaka, Toshihisa; Molla, Md Khademul Islam
2018-05-08
When designing a multiclass motor imagery-based brain-computer interface (MI-BCI), the so-called tangent space mapping (TSM) method, which exploits the geometric structure of covariance matrices, is an effective technique. This paper introduces a TSM-based method for finding accurate operational frequency bands related to the brain activities associated with MI tasks. A multichannel electroencephalogram (EEG) signal is decomposed into multiple subbands, and tangent features are then estimated on each subband. An effective algorithm based on mutual information analysis is implemented to select subbands containing features capable of improving motor imagery classification accuracy. The features of the selected subbands are combined to form the feature space. A principal component analysis-based approach is employed to reduce the feature dimension, and classification is then accomplished by a support vector machine (SVM). Offline analysis demonstrates that the proposed multiband tangent space mapping with subband selection (MTSMS) approach outperforms state-of-the-art methods, achieving the highest average classification accuracy for all datasets (BCI competition datasets 2a, IIIa, IIIb, and dataset JK-HH1). The increased classification accuracy of MI tasks with the proposed MTSMS approach can enable effective implementation of BCI. The mutual information-based subband selection method is implemented to tune the operational frequency bands to represent actual motor imagery tasks.
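The core tangent space mapping operation, projecting symmetric positive-definite covariance matrices onto the tangent space at a reference point, can be sketched with plain eigendecompositions. This is a minimal sketch of the standard TSM formula (vectorized logm of the whitened covariance); the subband decomposition, mutual-information selection, PCA, and SVM stages are omitted:

```python
import numpy as np

def _sym_pow(S, p):
    """Matrix power of a symmetric positive-definite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs * vals ** p) @ vecs.T

def _logm(S):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs * np.log(vals)) @ vecs.T

def tangent_features(covs, ref):
    """Project SPD covariance matrices to the tangent space at `ref`:
    S_i -> upper-triangular vectorization of logm(ref^{-1/2} S_i ref^{-1/2})."""
    inv_sqrt = _sym_pow(ref, -0.5)
    iu = np.triu_indices(ref.shape[0])
    feats = []
    for S in covs:
        T = _logm(inv_sqrt @ S @ inv_sqrt)
        feats.append(T[iu])
    return np.array(feats)
```

By construction, the reference matrix itself maps to the zero vector: the tangent features measure deviation from the reference (usually the geometric mean of the training covariances).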
NASA Technical Reports Server (NTRS)
Myint, Soe W.; Mesev, Victor; Quattrochi, Dale; Wentz, Elizabeth A.
2013-01-01
Remote sensing methods used to generate base maps to analyze the urban environment rely predominantly on digital sensor data from space-borne platforms. This is due in part to new sources of high spatial resolution data covering the globe, a variety of multispectral and multitemporal sources, sophisticated statistical and geospatial methods, and compatibility with GIS data sources and methods. The goal of this chapter is to review the four groups of classification methods for digital sensor data from space-borne platforms: per-pixel, sub-pixel, object-based (spatial-based), and geospatial methods. Per-pixel methods are widely used methods that classify pixels into distinct categories based solely on the spectral and ancillary information within that pixel. They are used for everything from simple calculations of environmental indices (e.g., NDVI) to sophisticated expert systems that assign urban land covers. Researchers recognize, however, that even with the smallest pixel size the spectral information within a pixel is really a combination of multiple urban surfaces. Sub-pixel classification methods therefore aim to statistically quantify the mixture of surfaces to improve overall classification accuracy. While within-pixel variations exist, there is also significant evidence that groups of nearby pixels have similar spectral information and therefore belong to the same classification category. Object-oriented methods have emerged that group pixels prior to classification based on spectral similarity and spatial proximity. Classification accuracy using object-based methods shows significant success and promise for numerous urban applications. Like the object-oriented methods that recognize the importance of spatial proximity, geospatial methods for urban mapping also utilize neighboring pixels in the classification process.
The primary difference though is that geostatistical methods (e.g., spatial autocorrelation methods) are utilized during both the pre- and post-classification steps. Within this chapter, each of the four approaches is described in terms of scale and accuracy classifying urban land use and urban land cover; and for its range of urban applications. We demonstrate the overview of four main classification groups in Figure 1 while Table 1 details the approaches with respect to classification requirements and procedures (e.g., reflectance conversion, steps before training sample selection, training samples, spatial approaches commonly used, classifiers, primary inputs for classification, output structures, number of output layers, and accuracy assessment). The chapter concludes with a brief summary of the methods reviewed and the challenges that remain in developing new classification methods for improving the efficiency and accuracy of mapping urban areas.
Bennet, Jaison; Ganaprakasam, Chilambuchelvan Arul; Arputharaj, Kannan
2014-01-01
In the past, cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability. The recent arrival of DNA microarray technology has enabled the concurrent monitoring of thousands of gene expressions on a single chip, stimulating progress in cancer classification. In this paper, we propose a hybrid approach for microarray data classification based on k-nearest neighbor (KNN), naive Bayes, and support vector machine (SVM) classifiers. Feature selection prior to classification plays a vital role, and a feature selection technique combining the discrete wavelet transform (DWT) and a moving window technique (MWT) is used. The performance of the proposed method is compared with that of the conventional classifiers: support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets, and the results indicate that the ensemble approach produces higher classification accuracy than the conventional classifiers. This work serves as an automated system for the classification of cancer that can be applied by doctors in real cases, a boon to the medical community, and it further reduces the misclassification of cancers, which is critical in cancer detection.
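Two pieces of the pipeline above lend themselves to short sketches: a one-level Haar DWT as a feature-reducing transform, and a majority vote combining the labels of several base classifiers. Both are generic textbook versions, illustrative of, but not taken from, the authors' implementation:

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar discrete wavelet transform: the approximation
    coefficients (scaled pairwise sums) act as a halved, smoothed feature vector."""
    x = x[: len(x) // 2 * 2].reshape(-1, 2)
    return (x[:, 0] + x[:, 1]) / np.sqrt(2)

def majority_vote(predictions):
    """Combine integer label votes from several classifiers (ensemble decision)."""
    predictions = np.asarray(predictions)
    return np.array([np.bincount(col).argmax() for col in predictions.T])

# Votes from three hypothetical base classifiers (e.g. KNN, naive Bayes, SVM)
votes = [np.array([0, 1, 1]), np.array([0, 1, 0]), np.array([1, 1, 0])]
print(majority_vote(votes))  # [0 1 0]
```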
Bottai, Matteo; Tjärnlund, Anna; Santoni, Giola; Werth, Victoria P; Pilkington, Clarissa; de Visser, Marianne; Alfredsson, Lars; Amato, Anthony A; Barohn, Richard J; Liang, Matthew H; Aggarwal, Rohit; Arnardottir, Snjolaug; Chinoy, Hector; Cooper, Robert G; Danko, Katalin; Dimachkie, Mazen M; Feldman, Brian M; García-De La Torre, Ignacio; Gordon, Patrick; Hayashi, Taichi; Katz, James D; Kohsaka, Hitoshi; Lachenbruch, Peter A; Lang, Bianca A; Li, Yuhui; Oddis, Chester V; Olesinka, Marzena; Reed, Ann M; Rutkowska-Sak, Lidia; Sanner, Helga; Selva-O’Callaghan, Albert; Wook Song, Yeong; Ytterberg, Steven R; Miller, Frederick W; Rider, Lisa G; Lundberg, Ingrid E; Amoruso, Maria
2017-01-01
Objective To describe the methodology used to develop new classification criteria for adult and juvenile idiopathic inflammatory myopathies (IIMs) and their major subgroups. Methods An international, multidisciplinary group of myositis experts produced a set of 93 potentially relevant variables to be tested for inclusion in the criteria. Rheumatology, dermatology, neurology and paediatric clinics worldwide collected data on 976 IIM cases (74% adults, 26% children) and 624 non-IIM comparator cases with mimicking conditions (82% adults, 18% children). The participating clinicians classified each case as IIM or non-IIM. Generally, the classification of any given patient was based on few variables, leaving the remaining variables unmeasured. We investigated the strength of the association between all variables and between these and the disease status as determined by the physician. We considered three approaches: (1) a probability-score approach, (2) a sum-of-items approach and (3) a classification-tree approach. Results The approaches yielded several candidate models that were scrutinised with respect to statistical performance and clinical relevance. The probability-score approach showed superior statistical performance and clinical practicability and was therefore preferred over the others. We developed a classification tree for subclassification of patients with IIM. A calculator for electronic devices, such as computers and smartphones, facilitates the use of the European League Against Rheumatism/American College of Rheumatology (EULAR/ACR) classification criteria. Conclusions The new EULAR/ACR classification criteria provide a patient's probability of having IIM for use in clinical and research settings. The probability is based on a score obtained by summing the weights associated with a set of criteria items. PMID:29177080
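The probability-score mechanism (sum the weights of the criteria items present, then map the score to a probability) can be sketched with a logistic link. The item names, weights, and intercept below are invented purely for illustration and are NOT the published EULAR/ACR coefficients:

```python
import math

def iim_probability(items, weights, intercept):
    """Probability-score approach (sketch): sum the weights of the criteria
    items that are present, then map the score to a probability with a
    logistic link. All numbers here are hypothetical."""
    score = intercept + sum(weights[k] for k, present in items.items() if present)
    return 1 / (1 + math.exp(-score))

# Hypothetical criteria items and weights (illustration only)
weights = {"age_onset": 1.2, "heliotrope_rash": 3.0, "ck_elevated": 1.4}
p_hi = iim_probability({"heliotrope_rash": True, "ck_elevated": True, "age_onset": False},
                       weights, -2.0)
p_lo = iim_probability({"heliotrope_rash": False, "ck_elevated": True, "age_onset": False},
                       weights, -2.0)
print(p_hi, p_lo)  # more positive items -> higher probability
```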
Histogram Curve Matching Approaches for Object-based Image Classification of Land Cover and Land Use
Toure, Sory I.; Stow, Douglas A.; Weeks, John R.; Kumar, Sunil
2013-01-01
The classification of image-objects is usually done using parametric statistical measures of central tendency and/or dispersion (e.g., mean or standard deviation). The objectives of this study were to analyze digital-number histograms of image objects and to evaluate classification measures exploiting the characteristic signatures of such histograms. Two histogram-matching classifiers were evaluated and compared to the standard nearest-neighbor-to-mean classifier. An ADS40 airborne multispectral image of San Diego, California was used to assess the utility of curve-matching classifiers in a geographic object-based image analysis (GEOBIA) approach. The classifications were performed with datasets having 0.5 m, 2.5 m, and 5 m spatial resolutions. Results show that histograms are reliable features for characterizing classes. Both histogram-matching classifiers consistently performed better than the one based on the standard nearest-neighbor-to-mean rule. The highest classification accuracies were produced with images having 2.5 m spatial resolution. PMID:24403648
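The advantage of matching whole histogram curves rather than means can be sketched with histogram intersection as the similarity measure (one common curve-matching choice, used here as an assumption; the paper's specific matching functions may differ). Two classes with identical means but different spreads are indistinguishable to a nearest-mean rule yet trivially separated by their histograms:

```python
import numpy as np

def hist_intersection(h1, h2):
    """Similarity of two normalized histograms: sum of bin-wise minima (1 = identical)."""
    return np.minimum(h1, h2).sum()

def make_hist(vals, bins):
    h, _ = np.histogram(vals, bins=bins)
    return h / h.sum()

def classify_object(pixel_values, class_histograms, bins):
    """Assign the class whose reference histogram best matches the object's
    digital-number histogram (curve matching, instead of nearest mean)."""
    h = make_hist(pixel_values, bins)
    return max(class_histograms, key=lambda c: hist_intersection(h, class_histograms[c]))

bins = np.linspace(0.0, 1.0, 6)
# Bimodal "road" class and unimodal "roof" class share the same mean (0.5)
road_ref = np.array([0.05] * 50 + [0.95] * 50)
roof_ref = np.array([0.5] * 100)
refs = {"road": make_hist(road_ref, bins), "roof": make_hist(roof_ref, bins)}
obj = np.array([0.1] * 50 + [0.9] * 50)
print(classify_object(obj, refs, bins))  # prints "road"
```

A nearest-mean classifier sees both references at mean 0.5 and cannot separate them; the histogram shapes do.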
Building a common pipeline for rule-based document classification.
Patterson, Olga V; Ginter, Thomas; DuVall, Scott L
2013-01-01
Instance-based classification of clinical text is a widely used natural language processing task employed as a step for patient classification, document retrieval, or information extraction. Rule-based approaches rely on concept identification and context analysis in order to determine the appropriate class. We propose a five-step process that enables even small research teams to develop simple but powerful rule-based NLP systems by taking advantage of a common UIMA AS-based pipeline for classification. Our proposed methodology, coupled with the general-purpose solution, provides researchers with access to the data locked in clinical text in cases of limited human resources and compact timelines.
An Active Learning Framework for Hyperspectral Image Classification Using Hierarchical Segmentation
NASA Technical Reports Server (NTRS)
Zhang, Zhou; Pasolli, Edoardo; Crawford, Melba M.; Tilton, James C.
2015-01-01
Augmenting spectral data with spatial information for image classification has recently gained significant attention, as classification accuracy can often be improved by extracting spatial information from neighboring pixels. In this paper, we propose a new framework in which active learning (AL) and hierarchical segmentation (HSeg) are combined for spectral-spatial classification of hyperspectral images. The spatial information is extracted from a best segmentation obtained by pruning the HSeg tree using a new supervised strategy. The best segmentation is updated at each iteration of the AL process, thus taking advantage of informative labeled samples provided by the user. The proposed strategy incorporates spatial information in two ways: 1) concatenating the extracted spatial features and the original spectral features into a stacked vector and 2) extending the training set using a self-learning-based semi-supervised learning (SSL) approach. Finally, the two strategies are combined within an AL framework. The proposed framework is validated with two benchmark hyperspectral datasets. Higher classification accuracies are obtained by the proposed framework with respect to five other state-of-the-art spectral-spatial classification approaches. Moreover, the effectiveness of the proposed pruning strategy is also demonstrated relative to the approaches based on a fixed segmentation.
A semi-supervised classification algorithm using the TAD-derived background as training data
NASA Astrophysics Data System (ADS)
Fan, Lei; Ambeau, Brittany; Messinger, David W.
2013-05-01
In general, spectral image classification algorithms fall into one of two categories: supervised and unsupervised. In unsupervised approaches, the algorithm automatically identifies clusters in the data without a priori information about those clusters (except perhaps the expected number of them). Supervised approaches require an analyst to identify training data from which to learn the characteristics of the clusters, so that all other pixels can then be classified into one of the pre-defined groups. The classification algorithm presented here is a semi-supervised approach based on the Topological Anomaly Detection (TAD) algorithm. The TAD algorithm defines background components based on a mutual k-nearest-neighbor graph model of the data, along with a spectral connected-components analysis. Here, the largest components produced by TAD are used as regions of interest (ROIs), or training data, for a supervised classification scheme. By combining those ROIs with a Gaussian Maximum Likelihood (GML) or a Minimum Distance to the Mean (MDM) algorithm, we achieve a semi-supervised classification method. We test this classification algorithm on data collected by the HyMap sensor over the Cooke City, MT area and on the University of Pavia scene.
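The final supervised stage is simple once training regions exist: in the MDM variant, class means are computed from the TAD-derived regions and every pixel is assigned to the nearest mean. A minimal sketch with made-up class names and spectra (not the HyMap or Pavia data):

```python
import numpy as np

def mdm_classify(X, class_means):
    """Minimum Distance to the Mean: label each pixel (row of X) with the
    class whose mean spectrum is nearest in Euclidean distance."""
    labels = sorted(class_means)
    d = np.stack([np.linalg.norm(X - class_means[k], axis=1) for k in labels])
    return np.array(labels)[d.argmin(axis=0)]

# Hypothetical means derived from two background components (training regions)
class_means = {"water": np.array([0.1, 0.1]), "veg": np.array([0.2, 0.8])}
X = np.array([[0.12, 0.08],   # near the water mean
              [0.19, 0.75]])  # near the vegetation mean
print(mdm_classify(X, class_means))
```

The GML variant would replace the Euclidean distance with a Mahalanobis-style likelihood using per-class covariances.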
Solti, Imre; Cooke, Colin R; Xia, Fei; Wurfel, Mark M
2009-11-01
This paper compares the performance of keyword and machine learning-based chest x-ray report classification for Acute Lung Injury (ALI). ALI mortality is approximately 30 percent. High mortality is, in part, a consequence of delayed manual chest x-ray classification. An automated system could reduce the time to recognize ALI and lead to reductions in mortality. For our study, 96 and 857 chest x-ray reports in two corpora were labeled by domain experts for ALI. We developed a keyword and a Maximum Entropy-based classification system. Word unigram and character n-grams provided the features for the machine learning system. The Maximum Entropy algorithm with character 6-gram achieved the highest performance (Recall=0.91, Precision=0.90 and F-measure=0.91) on the 857-report corpus. This study has shown that for the classification of ALI chest x-ray reports, the machine learning approach is superior to the keyword based system and achieves comparable results to highest performing physician annotators.
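The two systems being compared rest on simple primitives: a keyword trigger for the baseline, and character n-gram counts as features for the Maximum Entropy (multinomial logistic regression) classifier. A sketch of both primitives, with an invented trigger phrase and report text for illustration; the classifier training itself is omitted:

```python
from collections import Counter

def char_ngrams(text, n=6):
    """Character n-gram counts: the feature representation that performed
    best for the machine-learning classifier in the study."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def keyword_classify(report, keywords):
    """Baseline keyword system: positive if any trigger phrase appears."""
    report = report.lower()
    return any(k in report for k in keywords)

# Hypothetical trigger phrase and report text (illustration only)
triggers = ["bilateral infiltrate"]
print(keyword_classify("Diffuse bilateral infiltrates noted.", triggers))  # True
print(char_ngrams("Bilateral", n=6).most_common(2))
```

Character n-grams are robust to the misspellings and run-together words common in dictated radiology reports, which is part of why they outperform whole-word features.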
Begum, Shahina; Barua, Shaibal; Ahmed, Mobyen Uddin
2014-07-03
Today, clinicians often diagnose and classify diseases based on information collected from several physiological sensor signals. However, sensor signals can easily be corrupted by noise or interference, and due to large individual variations the sensitivity of different physiological sensors can also vary. Therefore, fusing multiple sensor signals is valuable for providing more robust and reliable decisions. This paper demonstrates a physiological sensor signal classification approach using sensor signal fusion and case-based reasoning. The proposed approach has been evaluated for classifying individuals as Stressed or Relaxed using sensor data fusion. Physiological sensor signals, i.e., Heart Rate (HR), Finger Temperature (FT), Respiration Rate (RR), Carbon dioxide (CO2) and Oxygen Saturation (SpO2), were collected during the data collection phase. Here, sensor fusion has been done in two different ways: (i) decision-level fusion using features extracted through traditional approaches; and (ii) data-level fusion using features extracted by means of Multivariate Multiscale Entropy (MMSE). Case-Based Reasoning (CBR) is applied for the classification of the signals. The experimental results show that the proposed system classified individuals as Stressed or Relaxed with 87.5% accuracy compared to an expert in the domain. It thus shows promising results in the psychophysiological domain, and it may be possible to adapt this approach to other relevant healthcare systems.
NASA Astrophysics Data System (ADS)
Li, Long; Solana, Carmen; Canters, Frank; Kervyn, Matthieu
2017-10-01
Mapping lava flows using satellite images is an important application of remote sensing in volcanology. Several volcanoes have been mapped through remote sensing using a wide range of data, from optical to thermal infrared and radar images, with techniques such as manual mapping, supervised/unsupervised classification, and elevation subtraction. So far, spectral mapping applications have mainly focused on traditional pixel-based classifiers, without much investigation into the added value of object-based approaches or the advantages of machine learning algorithms. In this study, Nyamuragira, characterized by a series of >20 overlapping lava flows erupted over the last century, was used as a case study. The random forest classifier was tested to map lava flows based on pixels and objects. Image classification was conducted for the 20 individual flows and for 8 groups of flows of similar age using a Landsat 8 image and a DEM of the volcano, both at 30-meter spatial resolution. Results show that object-based classification produces maps with continuous and homogeneous lava surfaces, in agreement with the physical characteristics of lava flows, whereas lava flows mapped through pixel-based classification are heterogeneous and fragmented, including considerable "salt-and-pepper" noise. In terms of accuracy, both pixel-based and object-based classification perform well, but the former results in higher accuracies than the latter, except for mapping lava flow age groups without using topographic features. It is concluded that despite spectral similarity, lava flows of contrasting age can be well discriminated and mapped by means of image classification. The classification approach demonstrated in this study requires only easily accessible image data and can be applied to other volcanoes as well, provided there is sufficient information to calibrate the mapping.
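The reason object-based maps look homogeneous while pixel-based ones show salt-and-pepper noise can be illustrated with the object-level decision rule itself: every pixel in a segment receives the segment's majority label. This sketch applies that rule to already-predicted pixel labels (a simplification; the paper classifies segment features directly with a random forest):

```python
import numpy as np

def object_majority_filter(pixel_labels, segment_ids):
    """Object-based smoothing: give every pixel in a segment the segment's
    majority pixel label, removing isolated 'salt-and-pepper' pixels.
    Works on flattened label/segment arrays of equal length."""
    out = pixel_labels.copy()
    for seg in np.unique(segment_ids):
        m = segment_ids == seg
        out[m] = np.bincount(pixel_labels[m]).argmax()
    return out

# One stray label (2) inside segment 0 gets absorbed by the majority label (1)
pixel_labels = np.array([1, 1, 2, 1, 3, 3, 3, 3])
segment_ids  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(object_majority_filter(pixel_labels, segment_ids))  # [1 1 1 1 3 3 3 3]
```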
Classification Framework for ICT-Based Learning Technologies for Disabled People
ERIC Educational Resources Information Center
Hersh, Marion
2017-01-01
The paper presents the first systematic approach to the classification of inclusive information and communication technologies (ICT)-based learning technologies and ICT-based learning technologies for disabled people which covers both assistive and general learning technologies, is valid for all disabled people and considers the full range of…
Faust, Kevin; Xie, Quin; Han, Dominick; Goyle, Kartikay; Volynskaya, Zoya; Djuric, Ugljesa; Diamandis, Phedias
2018-05-16
There is growing interest in utilizing artificial intelligence, and particularly deep learning, for computer vision in histopathology. While accumulating studies highlight expert-level performance of convolutional neural networks (CNNs) on focused classification tasks, most studies rely on probability distribution scores with empirically defined cutoff values based on post-hoc analysis. More generalizable tools that allow humans to visualize histology-based deep learning inferences and decision making are scarce. Here, we leverage t-distributed Stochastic Neighbor Embedding (t-SNE) to reduce dimensionality and depict how CNNs organize histomorphologic information. Unique to our workflow, we develop a quantitative and transparent approach to visualizing classification decisions prior to softmax compression. By discretizing the relationships between classes on the t-SNE plot, we show we can superimpose randomly sampled regions of test images and use their distribution to render statistically driven classifications. Therefore, in addition to providing intuitive outputs for human review, this visual approach can carry out automated and objective multi-class classifications similar to more traditional and less-transparent categorical probability distribution scores. Importantly, this novel classification approach is driven by a priori statistically defined cutoffs. It therefore serves as a generalizable classification and anomaly detection tool less reliant on post-hoc tuning. Routine incorporation of this convenient approach for quantitative visualization and error reduction in histopathology aims to accelerate early adoption of CNNs into generalized real-world applications where unanticipated and previously untrained classes are often encountered.
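A loose sketch of the underlying idea, classifying a point by the label distribution of its neighbours in a t-SNE embedding. The feature vectors are synthetic stand-ins for CNN activations, and the workflow is greatly simplified relative to the paper's pipeline.

```python
# Embed feature vectors with t-SNE, then label a point by a majority vote
# over its nearest embedded neighbours (a crude analogue of discretizing
# class regions on the t-SNE plot).
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(1)
feats = np.vstack([rng.normal(0, 1, (30, 16)),   # class 0 "activations"
                   rng.normal(4, 1, (30, 16))])  # class 1 "activations"
labels = np.array([0] * 30 + [1] * 30)

emb = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(feats)

def classify(idx, k=5):
    """Majority label among the k nearest embedded neighbours of point idx."""
    d = np.linalg.norm(emb - emb[idx], axis=1)
    nearest = np.argsort(d)[1:k + 1]          # skip the point itself
    return np.bincount(labels[nearest]).argmax()

pred = classify(0)
```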
NASA Astrophysics Data System (ADS)
Bellón, Beatriz; Bégué, Agnès; Lo Seen, Danny; Lebourgeois, Valentine; Evangelista, Balbino Antônio; Simões, Margareth; Demonte Ferraz, Rodrigo Peçanha
2018-06-01
Cropping systems' maps at fine scale over large areas provide key information for further agricultural production and environmental impact assessments, and thus represent a valuable tool for effective land-use planning. There is, therefore, a growing interest in mapping cropping systems in an operational manner over large areas, and remote sensing approaches based on vegetation index time series analysis have proven to be an efficient tool. However, supervised pixel-based approaches are commonly adopted, requiring resource-consuming field campaigns to gather training data. In this paper, we present a new object-based unsupervised classification approach tested on an annual MODIS 16-day composite Normalized Difference Vegetation Index time series and a Landsat 8 mosaic of the State of Tocantins, Brazil, for the 2014-2015 growing season. Two variants of the approach are compared: a hyperclustering approach and a landscape-clustering approach, in which the study area is first stratified into landscape units on which the clustering is then performed. The main cropping systems of Tocantins, characterized by the crop types and cropping patterns, were efficiently mapped with the landscape-clustering approach. Results show that stratification prior to clustering significantly improves the classification accuracies for underrepresented and sparsely distributed cropping systems. This study illustrates the potential of unsupervised classification for large area cropping systems' mapping and contributes to the development of generic tools for supporting large-scale agricultural monitoring across regions.
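The unsupervised clustering of vegetation-index time series can be sketched with k-means; the two synthetic NDVI profiles below (one vs two green-up peaks per season) are invented for illustration and only mimic single- and double-cropping signatures.

```python
# Cluster annual NDVI profiles (one row per object) with k-means as an
# unsupervised stand-in for the hyperclustering step.
import numpy as np
from sklearn.cluster import KMeans

t = np.linspace(0, 1, 23)                  # 23 MODIS 16-day composites
single = np.sin(np.pi * t)                 # one green-up peak per season
double = np.abs(np.sin(2 * np.pi * t))     # two peaks (double cropping)
rng = np.random.default_rng(2)
profiles = np.vstack([single + rng.normal(0, 0.05, (40, 23)),
                      double + rng.normal(0, 0.05, (40, 23))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profiles)
labels = km.labels_
```

The landscape-clustering variant would simply run this per landscape unit instead of once over the whole area.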
Mode of Action (MOA) Assignment Classifications for Ecotoxicology: An Evaluation of approaches
The mode of toxic action (MOA) is recognized as a key determinant of chemical toxicity and as an alternative to chemical class-based predictive toxicity modeling. However, MOA classification has never been standardized in ecotoxicology, and a comprehensive comparison of classific...
Integrated Low-Rank-Based Discriminative Feature Learning for Recognition.
Zhou, Pan; Lin, Zhouchen; Zhang, Chao
2016-05-01
Feature learning plays a central role in pattern recognition. In recent years, many representation-based feature learning methods have been proposed and have achieved great success in many applications. However, these methods perform feature learning and subsequent classification in two separate steps, which may not be optimal for recognition tasks. In this paper, we present a supervised low-rank-based approach for learning discriminative features. By integrating latent low-rank representation (LatLRR) with a ridge regression-based classifier, our approach combines feature learning with classification, so that the regulated classification error is minimized. In this way, the extracted features are more discriminative for the recognition tasks. Our approach benefits from a recent discovery on the closed-form solutions to noiseless LatLRR. When there is noise, a robust Principal Component Analysis (PCA)-based denoising step can be added as preprocessing. When the scale of a problem is large, we utilize a fast randomized algorithm to speed up the computation of robust PCA. Extensive experimental results demonstrate the effectiveness and robustness of our method.
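The ridge-regression-based classifier mentioned above can be sketched in closed form: regress one-hot labels on the features with an L2 penalty and classify by the largest predicted score. The features here are synthetic rather than LatLRR outputs.

```python
# Closed-form ridge-regression classifier: W = (X'X + lam I)^-1 X'Y,
# prediction = argmax over the columns of XW.
import numpy as np

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(-1, 0.3, (50, 5)),  # class 0 features
               rng.normal(1, 0.3, (50, 5))])  # class 1 features
y = np.array([0] * 50 + [1] * 50)
Y = np.eye(2)[y]                              # one-hot targets

lam = 0.1                                     # ridge penalty (assumed value)
W = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ Y)
pred = (X @ W).argmax(axis=1)
acc = (pred == y).mean()
```

In the paper this regression term is minimized jointly with the low-rank representation, so the learned features themselves become easier to separate.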
NASA Astrophysics Data System (ADS)
Broderick, Ciaran; Fealy, Rowan
2013-04-01
Circulation type classifications (CTCs) compiled as part of the COST733 Action, entitled 'Harmonisation and Application of Weather Type Classifications for European Regions', are examined for their synoptic and climatological applicability to Ireland based on their ability to characterise surface temperature and precipitation. In all, 16 different objective classification schemes, representative of four different methodological approaches to circulation typing (optimization algorithms, threshold-based methods, eigenvector techniques and leader algorithms), are considered. Several statistical metrics which variously quantify the ability of CTCs to discretize daily data into well-defined homogeneous groups are used to evaluate and compare different approaches to synoptic typing. The records from 14 meteorological stations located across the island of Ireland are used in the study. The results indicate that while it was not possible to identify a single optimum classification or approach to circulation typing (conditional on the location and surface variables considered), a number of general assertions regarding the performance of different schemes can be made. The findings for surface temperature indicate that those classifications based on predefined thresholds (e.g. Litynski, GrossWetterTypes and original Lamb Weather Type) perform well, as do the Kruizinga and Lund classification schemes. Similarly, for precipitation, predefined-type classifications return high skill scores, as do those classifications derived using some optimization procedure (e.g. SANDRA, Self Organizing Maps and K-Means clustering). For both temperature and precipitation the results generally indicate that the classifications perform best for the winter season, reflecting the closer coupling between large-scale circulation and surface conditions during this period. 
In contrast to the findings for temperature, spatial patterns in the performance of classifications were more evident for precipitation. In the case of this variable those more westerly synoptic stations open to zonal airflow and less influenced by regional scale forcings generally exhibited a stronger link with large-scale circulation.
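One common way such classifications are scored, sketched below, is the explained variation of a surface variable given type membership (between-type variance over total variance). This is an assumed metric for illustration, not necessarily the exact statistic used in the study, and the types and temperatures are synthetic.

```python
# Explained variation: 1 - (pooled within-type variance / total variance).
# High values mean the circulation types discretize the surface variable
# into well-separated homogeneous groups.
import numpy as np

def explained_variation(values, types):
    total = np.var(values)
    within = sum((types == t).mean() * np.var(values[types == t])
                 for t in np.unique(types))
    return 1.0 - within / total

rng = np.random.default_rng(4)
types = rng.integers(0, 4, 500)                 # 4 synthetic circulation types
temps = types * 2.0 + rng.normal(0, 1.0, 500)   # type membership shifts the mean
ev = explained_variation(temps, types)
```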
Locality-preserving sparse representation-based classification in hyperspectral imagery
NASA Astrophysics Data System (ADS)
Gao, Lianru; Yu, Haoyang; Zhang, Bing; Li, Qingting
2016-10-01
This paper proposes to combine locality-preserving projections (LPP) and sparse representation (SR) for hyperspectral image classification. The LPP is first used to reduce the dimensionality of all the training and testing data by finding the optimal linear approximations to the eigenfunctions of the Laplace Beltrami operator on the manifold, where the high-dimensional data lies. Then, SR codes the projected testing pixels as sparse linear combinations of all the training samples to classify the testing pixels by evaluating which class leads to the minimum approximation error. The integration of LPP and SR represents an innovative contribution to the literature. The proposed approach, called locality-preserving SR-based classification, addresses the imbalance between high dimensionality of hyperspectral data and the limited number of training samples. Experimental results on three real hyperspectral data sets demonstrate that the proposed approach outperforms the original counterpart, i.e., SR-based classification.
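The SR classification rule, choosing the class whose training samples best reconstruct a test pixel, can be sketched as follows. Ordinary least squares stands in for the sparse coding step, and the spectra are synthetic; in the paper the pixels would first be projected by LPP.

```python
# Classify a test pixel by minimum reconstruction residual over per-class
# dictionaries whose columns are training samples.
import numpy as np

rng = np.random.default_rng(5)
train = {0: rng.normal(0, 1, (5, 10)),   # 5 training "spectra" per class,
         1: rng.normal(3, 1, (5, 10))}   # 10 bands each
test_pixel = rng.normal(3, 1, 10)        # drawn from class 1

def residual(samples, x):
    """||x - A a||, with a the least-squares code of x in dictionary A."""
    A = samples.T                        # columns are training samples
    a, *_ = np.linalg.lstsq(A, x, rcond=None)
    return np.linalg.norm(x - A @ a)

pred = min(train, key=lambda c: residual(train[c], test_pixel))
```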
NASA Astrophysics Data System (ADS)
Pipaud, Isabel; Lehmkuhl, Frank
2017-09-01
In the field of geomorphology, automated extraction and classification of landforms is one of the most active research areas. Until the late 2000s, this task has primarily been tackled using pixel-based approaches. As these methods consider pixels and pixel neighborhoods as the sole basic entities for analysis, they cannot account for the irregular boundaries of real-world objects. Object-based analysis frameworks emerging from the field of remote sensing have been proposed as an alternative approach, and were successfully applied in case studies falling in the domains of both general and specific geomorphology. In this context, the a priori selection of scale parameters or bandwidths is crucial for the segmentation result, because inappropriate parametrization will either result in over-segmentation or insufficient segmentation. In this study, we describe a novel supervised method for delineation and classification of alluvial fans, and assess its applicability using a SRTM 1″ DEM scene depicting a section of the north-eastern Mongolian Altai, located in northwest Mongolia. The approach is premised on the application of mean-shift segmentation and the use of a one-class support vector machine (SVM) for classification. To consider variability in terms of alluvial fan dimension and shape, segmentation is performed repeatedly for different weightings of the incorporated morphometric parameters as well as different segmentation bandwidths. The final classification layer is obtained by selecting, for each real-world object, the most appropriate segmentation result according to fuzzy membership values derived from the SVM classification. Our results show that mean-shift segmentation and SVM-based classification provide an effective framework for delineation and classification of a particular landform. 
Variable bandwidths and terrain parameter weightings were identified as being crucial for consideration of intra-class variability, and, in turn, for a constantly high segmentation quality. Our analysis further reveals that incorporation of morphometric parameters quantifying specific morphological aspects of a landform is indispensable for developing an accurate classification scheme. Alluvial fans exhibiting accentuated composite morphologies were identified as a major challenge for automatic delineation, as they cannot be fully captured by a single segmentation run. There is, however, a high probability that this shortcoming can be overcome by enhancing the presented approach with a routine merging fan sub-entities based on their spatial relationships.
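The one-class SVM classification step can be sketched as below; the two morphometric features and their values are invented, standing in for the slope- and shape-type parameters computed per segment.

```python
# Train a one-class SVM only on examples of the target landform
# (alluvial-fan segments); anything dissimilar is flagged as an outlier.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(6)
fans = rng.normal([2.0, 0.5], 0.1, (60, 2))   # e.g. mean slope, curvature
other = rng.normal([5.0, 3.0], 0.1, (10, 2))  # non-fan segments

ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(fans)
pred_fan = ocsvm.predict(fans)                # +1 = inlier (fan-like)
pred_other = ocsvm.predict(other)             # -1 = outlier
```

One-class training matches the setting described above, where only the target landform is labeled and "everything else" is too heterogeneous to model explicitly.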
A Transform-Based Feature Extraction Approach for Motor Imagery Tasks Classification
Khorshidtalab, Aida; Mesbah, Mostefa; Salami, Momoh J. E.
2015-01-01
In this paper, we present a new motor imagery classification method in the context of electroencephalography (EEG)-based brain–computer interface (BCI). This method uses a signal-dependent orthogonal transform, referred to as linear prediction singular value decomposition (LP-SVD), for feature extraction. The transform defines the mapping as the left singular vectors of the LP coefficient filter impulse response matrix. Using a logistic tree-based model classifier, the extracted features are classified into one of four motor imagery movements. The proposed approach was first benchmarked against two related state-of-the-art feature extraction approaches, namely, discrete cosine transform (DCT) and adaptive autoregressive (AAR)-based methods. By achieving an accuracy of 67.35%, the LP-SVD approach outperformed the other approaches by large margins (25% compared with DCT and 6% compared with AAR-based methods). To further improve the discriminatory capability of the extracted features and reduce the computational complexity, we enlarged the extracted feature subset by incorporating two extra features, namely, the Q- and Hotelling's T^2 statistics of the transformed EEG, and introduced a new EEG channel selection method. The performance of the EEG classification based on the expanded feature set and channel selection method was compared with that of a number of state-of-the-art classification methods previously reported with the BCI IIIa competition data set. Our method came second with an average accuracy of 81.38%. PMID:27170898
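A rough sketch of LP-SVD-style feature extraction under stated assumptions: the model order, impulse-response truncation length, and filter-matrix construction below are guesses for illustration, not the paper's exact settings.

```python
# Fit linear-prediction (LP) coefficients to a signal, form the LP filter's
# lower-triangular (Toeplitz) impulse-response matrix, and project the
# signal onto its leading left singular vectors.
import numpy as np

rng = np.random.default_rng(7)
x = np.sin(0.3 * np.arange(128)) + 0.1 * rng.normal(size=128)

p = 4                                          # LP model order (assumed)
A = np.column_stack([x[p - k - 1:len(x) - k - 1] for k in range(p)])
coefs, *_ = np.linalg.lstsq(A, x[p:], rcond=None)

# Impulse response of the LP synthesis filter, truncated to 32 taps.
n_taps = 32
h = np.zeros(n_taps)
h[0] = 1.0
for n in range(1, n_taps):
    h[n] = sum(coefs[k] * h[n - k - 1] for k in range(min(p, n)))

# Lower-triangular Toeplitz filter matrix H[i, j] = h[i - j].
H = np.zeros((n_taps, n_taps))
for i in range(n_taps):
    for j in range(i + 1):
        H[i, j] = h[i - j]

U, s, Vt = np.linalg.svd(H)
features = U[:, :4].T @ x[:n_taps]             # project onto left singular vectors
```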
A novel underwater dam crack detection and classification approach based on sonar images
Shi, Pengfei; Fan, Xinnan; Ni, Jianjun; Khan, Zubair; Li, Min
2017-01-01
Underwater dam crack detection and classification based on sonar images is a challenging task because underwater environments are complex and because cracks are quite random and diverse in nature. Furthermore, obtainable sonar images are of low resolution. To address these problems, a novel underwater dam crack detection and classification approach based on sonar imagery is proposed. First, the sonar images are divided into image blocks. Second, a clustering analysis of a 3-D feature space is used to obtain the crack fragments. Third, the crack fragments are connected using an improved tensor voting method. Fourth, a minimum spanning tree is used to obtain the crack curve. Finally, an improved evidence theory combined with fuzzy rule reasoning is proposed to classify the cracks. Experimental results show that the proposed approach is able to detect underwater dam cracks and classify them accurately and effectively under complex underwater environments. PMID:28640925
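The minimum-spanning-tree step that links crack fragments into a curve can be sketched as follows; the fragment centroids are made-up coordinates, and the tensor-voting and evidence-theory stages are omitted.

```python
# Connect detected crack-fragment centroids with a minimum spanning tree
# so the overall crack curve can be traced along its edges.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import cdist

frags = np.array([[0.0, 0.0], [1.0, 0.2], [2.0, 0.1],
                  [3.0, 0.4], [1.5, 3.0]])     # toy fragment centroids
dists = cdist(frags, frags)                     # pairwise distances
mst = minimum_spanning_tree(dists).toarray()
edges = np.argwhere(mst > 0)                    # (n-1) connected pairs
```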
Sarkar, Sankho Turjo; Bhondekar, Amol P; Macaš, Martin; Kumar, Ritesh; Kaur, Rishemjit; Sharma, Anupma; Gulati, Ashu; Kumar, Amod
2015-11-01
The paper presents a novel encoding scheme for neuronal code generation for odour recognition using an electronic nose (EN). This scheme is based on channel encoding using multiple Gaussian receptive fields superimposed over the temporal EN responses. The encoded data is further applied to a spiking neural network (SNN) for pattern classification. Two forms of SNN, a back-propagation based SpikeProp and a dynamic evolving SNN are used to learn the encoded responses. The effects of information encoding on the performance of SNNs have been investigated. Statistical tests have been performed to determine the contribution of the SNN and the encoding scheme to overall odour discrimination. The approach has been implemented in odour classification of orthodox black tea (Kangra-Himachal Pradesh Region) thereby demonstrating a biomimetic approach for EN data analysis.
Multiple Spectral-Spatial Classification Approach for Hyperspectral Data
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Benediktsson, Jon Atli; Chanussot, Jocelyn; Tilton, James C.
2010-01-01
A new multiple classifier approach for spectral-spatial classification of hyperspectral images is proposed. Several classifiers are used independently to classify an image. For every pixel, if all the classifiers have assigned this pixel to the same class, the pixel is kept as a marker, i.e., a seed of the spatial region, with the corresponding class label. We propose to use spectral-spatial classifiers at the preliminary step of the marker selection procedure, each of them combining the results of a pixel-wise classification and a segmentation map. Different segmentation methods based on dissimilar principles lead to different classification results. Furthermore, a minimum spanning forest is built, where each tree is rooted on a classification-driven marker and forms a region in the spectral-spatial classification map. Experimental results are presented for two hyperspectral airborne images. The proposed method significantly improves classification accuracies, when compared to previously proposed classification techniques.
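The marker-selection rule (keep a pixel only where all classifiers agree) reduces to a few lines; the label maps below are toy 3x3 examples, with -1 meaning "no marker".

```python
# Keep a pixel as a marker only if every independent classifier assigned
# it the same class label.
import numpy as np

maps = np.stack([
    np.array([[1, 1, 2], [1, 2, 2], [3, 3, 2]]),   # classifier A
    np.array([[1, 1, 2], [2, 2, 2], [3, 3, 3]]),   # classifier B
    np.array([[1, 2, 2], [1, 2, 2], [3, 3, 2]]),   # classifier C
])

agree = (maps == maps[0]).all(axis=0)
markers = np.where(agree, maps[0], -1)             # -1 = no marker
```

The surviving markers then seed the minimum spanning forest that grows the final spatial regions.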
Sheets, H David; Covino, Kristen M; Panasiewicz, Joanna M; Morris, Sara R
2006-01-01
Background Geometric morphometric methods of capturing information about curves or outlines of organismal structures may be used in conjunction with canonical variates analysis (CVA) to assign specimens to groups or populations based on their shapes. This methodological paper examines approaches to optimizing the classification of specimens based on their outlines. This study examines the performance of four approaches to the mathematical representation of outlines and two different approaches to curve measurement as applied to a collection of feather outlines. A new approach to the dimension reduction necessary to carry out a CVA on this type of outline data with modest sample sizes is also presented, and its performance is compared to two other approaches to dimension reduction. Results Two semi-landmark-based methods, bending energy alignment and perpendicular projection, are shown to produce roughly equal rates of classification, as do elliptical Fourier methods and the extended eigenshape method of outline measurement. Rates of classification were not highly dependent on the number of points used to represent a curve or the manner in which those points were acquired. The new approach to dimensionality reduction, which utilizes a variable number of principal component (PC) axes, produced higher cross-validation assignment rates than either the standard approach of using a fixed number of PC axes or a partial least squares method. Conclusion Classification of specimens based on feather shape was not highly dependent on the details of the method used to capture shape information. The choice of dimensionality reduction approach was more of a factor, and the cross-validation rate of assignment may be optimized using the variable number of PC axes method presented herein. PMID:16978414
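The variable-number-of-PC-axes idea can be sketched as a cross-validated search; synthetic "shape" vectors stand in for the outline data, and LDA stands in for the CVA step.

```python
# For each candidate number of PC axes, score a PCA -> LDA pipeline by
# cross-validation and keep the axis count with the best assignment rate.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(8)
X = np.vstack([rng.normal(0, 1, (30, 20)),   # group 0 "outline" vectors
               rng.normal(1, 1, (30, 20))])  # group 1 "outline" vectors
y = np.array([0] * 30 + [1] * 30)

scores = {k: cross_val_score(make_pipeline(PCA(n_components=k),
                                           LinearDiscriminantAnalysis()),
                             X, y, cv=5).mean()
          for k in (2, 5, 10)}
best_k = max(scores, key=scores.get)
```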
Lu, Yingjie
2013-01-01
To facilitate patient involvement in online health communities and help patients obtain the informational and emotional support they need, a topic identification approach is proposed in this paper for automatically identifying the topics of health-related messages in an online health community, thus assisting patients in reaching the most relevant messages for their queries efficiently. A feature-based classification framework was presented for automatic topic identification in our study. We first collected messages related to some predefined topics in an online health community. Then we combined three different types of features, n-gram-based features, domain-specific features and sentiment features, to build four feature sets for health-related text representation. Finally, three different text classification techniques, C4.5, Naïve Bayes and SVM, were adopted to evaluate our topic classification model. By comparing different feature sets and different classification techniques, we found that n-gram-based features, domain-specific features and sentiment features were all effective in distinguishing different types of health-related topics. In addition, a feature reduction technique based on information gain was also effective in improving topic classification performance. In terms of classification techniques, SVM outperformed C4.5 and Naïve Bayes significantly. The experimental results demonstrated that the proposed approach could identify the topics of online health-related messages efficiently.
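A minimal sketch of the n-gram branch of the framework: unigram+bigram TF-IDF features feeding a linear SVM. The domain-specific and sentiment features are omitted, and the toy messages and topic labels are invented.

```python
# TF-IDF n-gram features + linear SVM for health-message topic labelling.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

msgs = ["what dose of insulin should I take",
        "insulin dose before meals",
        "feeling anxious and stressed lately",
        "stress keeps me awake at night"]
topics = ["treatment", "treatment", "emotional", "emotional"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(msgs, topics)
pred = clf.predict(["how much insulin per day"])[0]
```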
NASA Astrophysics Data System (ADS)
Tarai, Madhumita; Kumar, Keshav; Divya, O.; Bairi, Partha; Mishra, Kishor Kumar; Mishra, Ashok Kumar
2017-09-01
The present work compares dissimilarity-based and covariance-based unsupervised chemometric classification approaches using total synchronous fluorescence spectroscopy data sets acquired for cumin and non-cumin based herbal preparations. The conventional decomposition method involves eigenvalue-eigenvector analysis of the covariance of the data set and finds the factors that can explain the overall major sources of variation present in the data set. The conventional approach does this irrespective of the fact that the samples belong to intrinsically different groups and hence leads to poor class separation. The present work shows that classification of such samples can be optimized by performing the eigenvalue-eigenvector decomposition on the pair-wise dissimilarity matrix.
Vulnerable land ecosystems classification using spatial context and spectral indices
NASA Astrophysics Data System (ADS)
Ibarrola-Ulzurrun, Edurne; Gonzalo-Martín, Consuelo; Marcello, Javier
2017-10-01
Natural habitats are exposed to growing pressure due to intensification of land use and tourism development. Thus, obtaining information on the vegetation is necessary for conservation and management projects. In this context, remote sensing is an important tool for monitoring and managing habitats, with classification being a crucial stage. The majority of image classification techniques are based upon the pixel-based approach. An alternative is the object-based (OBIA) approach, in which a previous segmentation step merges image pixels to create objects that are then classified. In addition, improved results may be gained by incorporating additional spatial information and specific spectral indices into the classification process. The main goal of this work was to implement and assess object-based classification techniques on very-high resolution imagery incorporating spectral indices and contextual spatial information in the classification models. The study area was Teide National Park in the Canary Islands (Spain), using Worldview-2 orthoready imagery. In the classification model, two common indices were selected, the Normalized Difference Vegetation Index (NDVI) and the Optimized Soil Adjusted Vegetation Index (OSAVI), as well as two specific Worldview-2 sensor indices, the Worldview Vegetation Index and the Worldview Soil Index. To include the contextual information, Grey Level Co-occurrence Matrices (GLCM) were used. The classification was performed by training a Support Vector Machine with a sufficient and representative number of vegetation samples (Spartocytisus supranubius, Pterocephalus lasiospermus, Descurainia bourgaeana and Pinus canariensis) as well as urban, road and bare soil classes. Confusion matrices were computed to evaluate the results from each classification model, with the highest overall accuracy (90.07%) obtained by combining both Worldview indices with GLCM dissimilarity.
Cloud field classification based on textural features
NASA Technical Reports Server (NTRS)
Sengupta, Sailes Kumar
1989-01-01
An essential component in global climate research is accurate cloud cover and type determination. Of the two approaches to texture-based classification (statistical and structural), only the former is effective in the classification of natural scenes such as land, ocean, and atmosphere. In the statistical approach that was adopted, parameters characterizing the stochastic properties of the spatial distribution of grey levels in an image are estimated and then used as features for cloud classification. Two types of textural measures were used. One is based on the distribution of the grey level difference vector (GLDV), and the other on a set of textural features derived from the MaxMin cooccurrence matrix (MMCM). The GLDV method looks at the difference D of grey levels at pixels separated by a horizontal distance d and computes several statistics based on this distribution. These are then used as features in subsequent classification. The MaxMin textural features, on the other hand, are based on the MMCM, a matrix whose (I,J)th entry gives the relative frequency of occurrences of the grey level pair (I,J) that are consecutive and thresholded local extremes separated by a given pixel distance d. Textural measures are then computed based on this matrix in much the same manner as is done in texture computation using the grey level cooccurrence matrix. The database consists of 37 cloud field scenes from LANDSAT imagery using a near IR visible channel. The classification algorithm used is the well known Stepwise Discriminant Analysis. The overall accuracy was estimated by the percentage of correct classifications in each case. It turns out that both types of classifiers, at their best combination of features, and at any given spatial resolution give approximately the same classification accuracy. 
A neural network based classifier with a feed forward architecture and a back propagation training algorithm is used to increase the classification accuracy, using these two classes of features. Preliminary results based on the GLDV textural features alone look promising.
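GLDV-style features reduce to simple statistics of horizontal grey-level differences; the image and offset below are toy values, and only two of the usual statistics are shown.

```python
# Grey level difference vector (GLDV) features for a horizontal offset d:
# take absolute grey-level differences and summarize their distribution.
import numpy as np

rng = np.random.default_rng(9)
img = rng.integers(0, 8, (16, 16))     # small synthetic grey-level image

d = 1                                  # horizontal pixel offset
diff = np.abs(img[:, d:].astype(int) - img[:, :-d].astype(int)).ravel()
gldv_mean = diff.mean()                # mean difference
gldv_contrast = (diff.astype(float) ** 2).mean()   # second moment
```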
Application of Wavelet Transform for PDZ Domain Classification
Daqrouq, Khaled; Alhmouz, Rami; Balamesh, Ahmed; Memic, Adnan
2015-01-01
PDZ domains have been identified as part of an array of signaling proteins that are often unrelated, except for the well-conserved structural PDZ domain they contain. These domains have been linked to many disease processes including common avian influenza, as well as very rare conditions such as Fraser and Usher syndromes. Historically, based on the interactions and the nature of the bonds they form, PDZ domains have most often been classified into one of three classes (class I, class II and a third group of others, class III), a grouping directly dependent on their binding partner. In this study, we report on three unique feature extraction approaches based on the bigram and trigram occurrence and existence rearrangements within the domain's primary amino acid sequences in assisting PDZ domain classification. Feature extraction methods based on the wavelet packet transform (WPT) and Shannon entropy, denoted wavelet entropy (WE), were proposed. Using 115 unique human and mouse PDZ domains, the existence rearrangement approach yielded a high recognition rate (78.34%), which outperformed our occurrence-rearrangement-based method. With a validation technique, the recognition rate was 81.41%. The method reported for PDZ domain classification from primary sequences proved to be an encouraging approach for obtaining consistent classification results. We anticipate that by increasing the database size, we can further improve feature extraction and correct classification. PMID:25860375
A structural classification for inland northwest forest vegetation.
Kevin L. O' Hara; Penelope A. Latham; Paul Hessburg; Bradley G. Smith
1996-01-01
Existing approaches to vegetation classification range from those based on potential vegetation to others based on existing vegetation composition, or existing structural or physiognomic characteristics. Examples of these classifications are numerous and, in some cases, date back hundreds of years (Mueller-Dombois and Ellenberg 1974). Small-scale or stand level...
Automatic Classification Using Supervised Learning in a Medical Document Filtering Application.
ERIC Educational Resources Information Center
Mostafa, J.; Lam, W.
2000-01-01
Presents a multilevel model of the information filtering process that permits document classification. Evaluates a document classification approach based on a supervised learning algorithm, measures the accuracy of the algorithm in a neural network that was trained to classify medical documents on cell biology, and discusses filtering…
A hybrid clustering and classification approach for predicting crash injury severity on rural roads.
Hasheminejad, Seyed Hessam-Allah; Zahedi, Mohsen; Hasheminejad, Seyed Mohammad Hossein
2018-03-01
As a threat to transportation systems, traffic crashes have a wide range of social consequences for governments. Traffic crashes are increasing in developing countries, and Iran, as a developing country, is not immune to this risk. There are several studies in the literature that predict traffic crash severity based on artificial neural networks (ANNs), support vector machines and decision trees. This paper attempts to investigate the crash injury severity of rural roads by using a hybrid clustering and classification approach to compare the performance of classification algorithms before and after applying the clustering. In this paper, a novel rule-based genetic algorithm (GA) is proposed to predict crash injury severity, which is evaluated by performance criteria in comparison with classification algorithms like ANN. The results obtained from analysis of 13,673 crashes (5600 property damage, 778 fatal crashes, 4690 slight injuries and 2605 severe injuries) on rural roads in Tehran Province of Iran during 2011-2013 revealed that the proposed GA method outperforms other classification algorithms based on classification metrics like precision (86%), recall (88%) and accuracy (87%). Moreover, the proposed GA method has the highest level of interpretation, is easy to understand and provides feedback to analysts.
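The hybrid cluster-then-classify idea can be sketched as follows; decision trees stand in for the paper's rule-based GA, and the crash features and severity labels are synthetic.

```python
# First cluster crashes into homogeneous groups, then train a separate
# classifier per cluster and route each new record through its cluster's
# model.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(10)
X = np.vstack([rng.normal(0, 1, (100, 3)),   # e.g. speed, time, road width
               rng.normal(5, 1, (100, 3))])
y = rng.integers(0, 2, 200)                  # e.g. severe vs slight injury

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
models = {c: DecisionTreeClassifier(random_state=0)
             .fit(X[km.labels_ == c], y[km.labels_ == c])
          for c in (0, 1)}

x_new = X[:1]                                # route a record to its cluster
pred = models[km.predict(x_new)[0]].predict(x_new)[0]
```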
Real-time classification of vehicles by type within infrared imagery
NASA Astrophysics Data System (ADS)
Kundegorski, Mikolaj E.; Akçay, Samet; Payen de La Garanderie, Grégoire; Breckon, Toby P.
2016-10-01
Real-time classification of vehicles into sub-category types poses a significant challenge within infra-red imagery due to the high levels of intra-class variation in thermal vehicle signatures caused by aspects of design, current operating duration and ambient thermal conditions. Despite these challenges, infra-red sensing offers significant generalized target object detection advantages in terms of all-weather operation and invariance to visual camouflage techniques. This work investigates the accuracy of a number of real-time object classification approaches for this task within the wider context of an existing initial object detection and tracking framework. Specifically we evaluate the use of traditional feature-driven bag of visual words and histogram of oriented gradient classification approaches against modern convolutional neural network architectures. Furthermore, we use classical photogrammetry, within the context of current target detection and classification techniques, as a means of approximating 3D target position within the scene based on this vehicle type classification. Based on photogrammetric estimation of target position, we then illustrate the use of regular Kalman filter based tracking operating on actual 3D vehicle trajectories. Results are presented using a conventional thermal-band infra-red (IR) sensor arrangement where targets are tracked over a range of evaluation scenarios.
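The photogrammetric range estimate behind the 3D positioning reduces to the pinhole relation Z = f * H / h once the vehicle type (and hence its approximate real height) is known. All numbers below are illustrative assumptions, not values from the paper.

```python
# Pinhole-model range estimate: a target of known real height H appearing
# h pixels tall with focal length f (in pixels) lies at distance Z.
f_px = 1000.0   # focal length in pixels (assumed)
H_m = 1.5       # typical car height in metres (assumed, from classified type)
h_px = 50.0     # observed bounding-box height in pixels

Z_m = f_px * H_m / h_px
```

This per-frame estimate is what the Kalman filter then smooths into a 3D trajectory.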
Improved Hierarchical Optimization-Based Classification of Hyperspectral Images Using Shape Analysis
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Tilton, James C.
2012-01-01
A new spectral-spatial method for classification of hyperspectral images is proposed. The HSegClas method is based on the integration of probabilistic classification and shape analysis within the hierarchical step-wise optimization algorithm. First, probabilistic support vector machines classification is applied. Then, at each iteration two neighboring regions with the smallest Dissimilarity Criterion (DC) are merged, and classification probabilities are recomputed. The important contribution of this work consists in estimating a DC between regions as a function of statistical, classification and geometrical (area and rectangularity) features. Experimental results are presented on a 102-band ROSIS image of the Center of Pavia, Italy. The developed approach yields more accurate classification results when compared to previously proposed methods.
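The iterative merging step described above can be sketched as follows; the dissimilarity criterion here is a placeholder (absolute mean-intensity difference), not the paper's full statistical, classification, and geometrical DC:

```python
# One hierarchical step-wise optimization iteration: among neighboring
# regions, merge the pair with the smallest dissimilarity criterion (DC).

def merge_step(regions, neighbors, dc):
    """regions: {id: list of pixel values}; neighbors: set of (i, j) pairs, i < j."""
    i, j = min(neighbors, key=lambda p: dc(regions[p[0]], regions[p[1]]))
    regions[i] = regions[i] + regions[j]   # merge j into i
    del regions[j]
    new_neighbors = set()                  # redirect j's adjacencies to i
    for a, b in neighbors:
        a, b = (i if a == j else a), (i if b == j else b)
        if a != b:
            new_neighbors.add((min(a, b), max(a, b)))
    return regions, new_neighbors

def mean_diff(r1, r2):
    # placeholder DC: difference of region mean intensities
    return abs(sum(r1) / len(r1) - sum(r2) / len(r2))

regions = {0: [10, 12], 1: [11, 13], 2: [40, 42]}
neighbors = {(0, 1), (1, 2)}
regions, neighbors = merge_step(regions, neighbors, mean_diff)
print(sorted(regions))  # [0, 2]: regions 0 and 1 were the most similar pair
```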
Task-Driven Dictionary Learning Based on Mutual Information for Medical Image Classification.
Diamant, Idit; Klang, Eyal; Amitai, Michal; Konen, Eli; Goldberger, Jacob; Greenspan, Hayit
2017-06-01
We present a novel variant of the bag-of-visual-words (BoVW) method for automated medical image classification. Our approach improves the BoVW model by learning a task-driven dictionary of the most relevant visual words per task using a mutual information-based criterion. Additionally, we generate relevance maps to visualize and localize the decision of the automatic classification algorithm. These maps demonstrate how the algorithm works and show the spatial layout of the most relevant words. We applied our algorithm to three different tasks: chest x-ray pathology identification (of four pathologies: cardiomegaly, enlarged mediastinum, right consolidation, and left consolidation), liver lesion classification into four categories in computed tomography (CT) images and benign/malignant clusters of microcalcifications (MCs) classification in breast mammograms. Validation was conducted on three datasets: 443 chest x-rays, 118 portal phase CT images of liver lesions, and 260 mammography MCs. The proposed method improves the classical BoVW method for all tested applications. For chest x-ray, area under curve of 0.876 was obtained for enlarged mediastinum identification compared to 0.855 using classical BoVW (with p-value 0.01). For MC classification, a significant improvement of 4% was achieved using our new approach (with p-value = 0.03). For liver lesion classification, an improvement of 6% in sensitivity and 2% in specificity were obtained (with p-value 0.001). We demonstrated that classification based on informative selected set of words results in significant improvement. Our new BoVW approach shows promising results in clinically important domains. Additionally, it can discover relevant parts of images for the task at hand without explicit annotations for training data. This can provide computer-aided support for medical experts in challenging image analysis tasks.
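The mutual-information criterion for ranking visual words can be illustrated with a toy joint distribution of word occurrence and class label; the probabilities below are invented for illustration, not the paper's data:

```python
# Mutual information between a binary visual-word indicator and a
# binary class label; words with higher MI are more task-relevant.
from math import log2

def mutual_information(joint):
    """joint[w][c]: P(word=w, class=c) for w, c in {0, 1}."""
    pw = [sum(joint[w]) for w in range(2)]                 # marginal over word
    pc = [joint[0][c] + joint[1][c] for c in range(2)]     # marginal over class
    mi = 0.0
    for w in range(2):
        for c in range(2):
            if joint[w][c] > 0:
                mi += joint[w][c] * log2(joint[w][c] / (pw[w] * pc[c]))
    return mi

informative = [[0.45, 0.05], [0.05, 0.45]]    # word tracks the class
uninformative = [[0.25, 0.25], [0.25, 0.25]]  # word independent of class
print(mutual_information(informative) > mutual_information(uninformative))  # True
```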
NASA Astrophysics Data System (ADS)
Sun, Ziheng; Fang, Hui; Di, Liping; Yue, Peng
2016-09-01
Fully automatic image classification without any input parameter values has long been an unattainable goal for remote sensing experts, who often spend hours tuning the input parameters of classification algorithms to obtain the best results. With the rapid development of knowledge engineering and cyberinfrastructure, many data processing and knowledge reasoning capabilities have become accessible, shareable, and interoperable online. Building on these improvements, this paper presents the idea of parameterless automatic classification, which requires only an image and automatically outputs a labeled vector; no parameters or operations are needed from end consumers. An approach is proposed to realize this idea. It adopts an ontology database to store experts' experience in tuning classifier parameters, a sample database to record training samples of image segments, geoprocessing Web services as functional blocks for the basic classification steps, and workflow technology to turn the overall image classification into a fully automatic process. A Web-based prototype system named PACS (Parameterless Automatic Classification System) was implemented, and a number of images were fed into it for evaluation. The results show that the approach can automatically classify remote sensing images with fairly good average accuracy, and that the classified results become more accurate as the quality of the two databases improves. Once the databases accumulate as much experience and as many samples as a human expert has, the approach should produce results of similar quality to an expert's.
Since the approach is fully automatic and parameterless, it can relieve remote sensing workers of heavy, time-consuming parameter tuning, significantly shorten the waiting time for consumers, and make it easier for them to engage in image classification activities. Currently, the approach has been used only on high-resolution optical three-band remote sensing imagery; the feasibility of applying it to other kinds of remote sensing images, or of involving additional bands in classification, will be studied in future work.
Integrative Chemical-Biological Read-Across Approach for Chemical Hazard Classification
Low, Yen; Sedykh, Alexander; Fourches, Denis; Golbraikh, Alexander; Whelan, Maurice; Rusyn, Ivan; Tropsha, Alexander
2013-01-01
Traditional read-across approaches typically rely on the chemical similarity principle to predict chemical toxicity; however, the accuracy of such predictions is often inadequate due to the underlying complex mechanisms of toxicity. Here we report on the development of a hazard classification and visualization method that draws upon both chemical structural similarity and comparisons of biological responses to chemicals measured in multiple short-term assays ("biological" similarity). The Chemical-Biological Read-Across (CBRA) approach infers each compound's toxicity from those of both chemical and biological analogs whose similarities are determined by the Tanimoto coefficient. Classification accuracy of CBRA was compared to that of classical RA and other methods using chemical descriptors alone, or in combination with biological data. Different types of adverse effects (hepatotoxicity, hepatocarcinogenicity, mutagenicity, and acute lethality) were classified using several biological data types (gene expression profiling and cytotoxicity screening). CBRA-based hazard classification exhibited consistently high external classification accuracy and applicability to diverse chemicals. Transparency of the CBRA approach is aided by the use of radial plots that show the relative contribution of analogous chemical and biological neighbors. Identification of both chemical and biological features that give rise to the high accuracy of CBRA-based toxicity prediction facilitates mechanistic interpretation of the models. PMID:23848138
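The Tanimoto coefficient used to determine analog similarity can be sketched for binary fingerprints; the bit vectors below are illustrative, not real chemical descriptors:

```python
# Tanimoto (Jaccard) similarity of two equal-length binary fingerprints:
# bits set in both, divided by bits set in either.

def tanimoto(a, b):
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0  # two empty fingerprints: identical

fp1 = [1, 1, 0, 1, 0, 1]
fp2 = [1, 0, 0, 1, 0, 1]
print(tanimoto(fp1, fp2))  # 0.75
```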
A Critical Review of Mode of Action (MOA) Assignment ...
There are various structure-based classification schemes that categorize chemicals based on mode of action (MOA), applied in both ecological and human health toxicology. With increasing calls to assess thousands of chemicals, some of which have little available information other than structure, a clear understanding of how each of these MOA schemes was devised, what information it is based on, and the limitations of each approach is critical. Several groups are developing low-tier methods to more easily classify or assess chemicals, using approaches such as the ecological threshold of concern (eco-TTC) and chemical activity. Evaluation of these approaches and determination of their domain of applicability depends in part on the MOA classification that is used. The most commonly used MOA classification schemes for ecotoxicology are Verhaar and Russom (included in ASTER), both of which predict acute aquatic toxicity MOA. Verhaar is a QSAR-based system that classifies chemicals into one of 4 classes, with a 5th class for chemicals not captured by the other 4. ASTER/Russom includes 8 classifications: narcotics (3 groups), oxidative phosphorylation uncouplers, respiratory inhibitors, electrophiles/proelectrophiles, AChE inhibitors, and CNS seizure agents. Other methodologies include TEST (Toxicity Estimation Software Tool), a computational chemistry-based application that allows prediction to one of 5 broad MOA
Guijarro, María; Pajares, Gonzalo; Herrera, P. Javier
2009-01-01
The increasing technology of high-resolution airborne image sensors, including those on board Unmanned Aerial Vehicles, demands automatic solutions for processing, either on-line or off-line, the huge amounts of image data sensed during flights. The classification of natural spectral signatures in images is one potential application. The current trend in classification is toward the combination of simple classifiers. In this paper we propose a combined strategy based on the Deterministic Simulated Annealing (DSA) framework. The simple classifiers used are the well-tested supervised parametric Bayesian estimator and Fuzzy Clustering. DSA is an optimization approach that minimizes an energy function; its main contribution is its ability to avoid local minima during the optimization process thanks to the annealing scheme. It outperforms the simple classifiers used in the combination and several combined strategies, including a scheme based on fuzzy cognitive maps and an optimization approach based on the Hopfield neural network paradigm. PMID:22399989
A hybrid MLP-CNN classifier for very fine resolution remotely sensed image classification
NASA Astrophysics Data System (ADS)
Zhang, Ce; Pan, Xin; Li, Huapeng; Gardiner, Andy; Sargent, Isabel; Hare, Jonathon; Atkinson, Peter M.
2018-06-01
The contextual-based convolutional neural network (CNN) with deep architecture and pixel-based multilayer perceptron (MLP) with shallow structure are well-recognized neural network algorithms, representing the state-of-the-art deep learning method and the classical non-parametric machine learning approach, respectively. The two algorithms, which have very different behaviours, were integrated in a concise and effective way using a rule-based decision fusion approach for the classification of very fine spatial resolution (VFSR) remotely sensed imagery. The decision fusion rules, designed primarily based on the classification confidence of the CNN, reflect the generally complementary patterns of the individual classifiers. In consequence, the proposed ensemble classifier MLP-CNN harvests the complementary results acquired from the CNN based on deep spatial feature representation and from the MLP based on spectral discrimination. Meanwhile, limitations of the CNN due to the adoption of convolutional filters such as the uncertainty in object boundary partition and loss of useful fine spatial resolution detail were compensated. The effectiveness of the ensemble MLP-CNN classifier was tested in both urban and rural areas using aerial photography together with an additional satellite sensor dataset. The MLP-CNN classifier achieved promising performance, consistently outperforming the pixel-based MLP, spectral and textural-based MLP, and the contextual-based CNN in terms of classification accuracy. This research paves the way to effectively address the complicated problem of VFSR image classification.
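One plausible form of the confidence-based decision fusion described above (a sketch, not the paper's actual rules; the 0.8 threshold is an assumption): trust the CNN label when its classification confidence is high, and fall back to the MLP label otherwise.

```python
# Rule-based decision fusion of two classifiers, keyed on the
# CNN's classification confidence. The threshold is an assumption.

def fuse(cnn_label, cnn_conf, mlp_label, threshold=0.8):
    # high CNN confidence: deep spatial features dominate;
    # otherwise defer to the MLP's spectral discrimination
    return cnn_label if cnn_conf >= threshold else mlp_label

print(fuse("building", 0.95, "road"))  # building
print(fuse("building", 0.55, "road"))  # road
```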
An important challenge for an integrative approach to developmental systems toxicology is associating putative molecular initiating events (MIEs), cell signaling pathways, cell function and modeled fetal exposure kinetics. We have developed a chemical classification model based o...
A transversal approach for patch-based label fusion via matrix completion
Sanroma, Gerard; Wu, Guorong; Gao, Yaozong; Thung, Kim-Han; Guo, Yanrong; Shen, Dinggang
2015-01-01
Recently, multi-atlas patch-based label fusion has received an increasing interest in the medical image segmentation field. After warping the anatomical labels from the atlas images to the target image by registration, label fusion is the key step to determine the latent label for each target image point. Two popular types of patch-based label fusion approaches are (1) reconstruction-based approaches that compute the target labels as a weighted average of atlas labels, where the weights are derived by reconstructing the target image patch using the atlas image patches; and (2) classification-based approaches that determine the target label as a mapping of the target image patch, where the mapping function is often learned using the atlas image patches and their corresponding labels. Both approaches have their advantages and limitations. In this paper, we propose a novel patch-based label fusion method to combine the above two types of approaches via matrix completion (and hence, we call it transversal). As we will show, our method overcomes the individual limitations of both reconstruction-based and classification-based approaches. Since the labeling confidences may vary across the target image points, we further propose a sequential labeling framework that first labels the highly confident points and then gradually labels more challenging points in an iterative manner, guided by the label information determined in the previous iterations. We demonstrate the performance of our novel label fusion method in segmenting the hippocampus in the ADNI dataset, subcortical and limbic structures in the LONI dataset, and mid-brain structures in the SATA dataset. We achieve more accurate segmentation results than both reconstruction-based and classification-based approaches. Our label fusion method is also ranked 1st in the online SATA Multi-Atlas Segmentation Challenge. PMID:26160394
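The reconstruction-based flavor of label fusion, where the target label is a weighted average of atlas labels, can be sketched as follows; the inverse-distance weighting is a simplification for illustration, not the paper's matrix-completion formulation:

```python
# Weighted-average label fusion: weight each atlas patch by its
# similarity to the target patch, then threshold the weighted vote.

def fuse_labels(target_patch, atlas_patches, atlas_labels):
    def dist(p, q):
        # squared Euclidean distance between patches
        return sum((a - b) ** 2 for a, b in zip(p, q))
    weights = [1.0 / (1e-6 + dist(target_patch, p)) for p in atlas_patches]
    score = sum(w * l for w, l in zip(weights, atlas_labels)) / sum(weights)
    return 1 if score >= 0.5 else 0

target = [0.9, 0.8, 0.9]
atlases = [[0.9, 0.8, 0.85], [0.1, 0.2, 0.1]]
labels = [1, 0]  # binary anatomical labels from each atlas
print(fuse_labels(target, atlases, labels))  # 1
```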
A novel and efficient technique for identification and classification of GPCRs.
Gupta, Ravi; Mittal, Ankush; Singh, Kuldip
2008-07-01
G-protein coupled receptors (GPCRs) play a vital role in different biological processes, such as regulation of growth, death, and metabolism of cells. GPCRs are the focus of significant amount of current pharmaceutical research since they interact with more than 50% of prescription drugs. The dipeptide-based support vector machine (SVM) approach is the most accurate technique to identify and classify the GPCRs. However, this approach has two major disadvantages. First, the dimension of dipeptide-based feature vector is equal to 400. The large dimension makes the classification task computationally and memory wise inefficient. Second, it does not consider the biological properties of protein sequence for identification and classification of GPCRs. In this paper, we present a novel-feature-based SVM classification technique. The novel features are derived by applying wavelet-based time series analysis approach on protein sequences. The proposed feature space summarizes the variance information of seven important biological properties of amino acids in a protein sequence. In addition, the dimension of the feature vector for proposed technique is equal to 35. Experiments were performed on GPCRs protein sequences available at GPCRs Database. Our approach achieves an accuracy of 99.9%, 98.06%, 97.78%, and 94.08% for GPCR superfamily, families, subfamilies, and subsubfamilies (amine group), respectively, when evaluated using fivefold cross-validation. Further, an accuracy of 99.8%, 97.26%, and 97.84% was obtained when evaluated on unseen or recall datasets of GPCR superfamily, families, and subfamilies, respectively. Comparison with dipeptide-based SVM technique shows the effectiveness of our approach.
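The wavelet-based feature idea can be illustrated as follows: encode a protein sequence with one amino-acid property, apply a one-level Haar transform, and summarize the detail coefficients by their variance. The Kyte-Doolittle hydrophobicity values are standard, but the overall setup is a simplified assumption, not the paper's exact 35-dimensional feature construction:

```python
# Sketch: variance of Haar wavelet detail coefficients over a
# property-encoded protein sequence, one feature per property.

# Kyte-Doolittle hydrophobicity values (subset, for illustration)
HYDRO = {'A': 1.8, 'R': -4.5, 'L': 3.8, 'K': -3.9, 'G': -0.4, 'V': 4.2}

def haar_detail(series):
    # one-level Haar detail: half-differences of adjacent pairs
    return [(series[i] - series[i + 1]) / 2 for i in range(0, len(series) - 1, 2)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

seq = "ARLKGVAR"
series = [HYDRO[a] for a in seq]   # sequence as a numeric time series
feature = variance(haar_detail(series))
print(feature > 0)  # True
```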
An ensemble predictive modeling framework for breast cancer classification.
Nagarajan, Radhakrishnan; Upreti, Meenakshi
2017-12-01
Molecular changes often precede clinical presentation of diseases and can be useful surrogates with potential to assist in informed clinical decision making. Recent studies have demonstrated the usefulness of modeling approaches such as classification that can predict clinical outcomes from molecular expression profiles. While useful, a majority of these approaches implicitly use all molecular markers as features in the classification process, often resulting in a sparse, high-dimensional projection of the samples with dimensionality comparable to the sample size. In this study, a variant of the recently proposed ensemble classification approach is used for predicting good- and poor-prognosis breast cancer samples from their molecular expression profiles. In contrast to traditional single and ensemble classifiers, the proposed approach uses multiple base classifiers with varying feature sets obtained from two-dimensional projection of the samples, in conjunction with a majority voting strategy for predicting the class labels. In contrast to our earlier implementation, base classifiers in the ensembles are chosen for maximal sensitivity and minimal redundancy by retaining only those with low average cosine distance. The resulting ensemble sets are subsequently modeled as undirected graphs. Performance of four different classification algorithms is shown to be better within the proposed ensemble framework than when they are used as traditional single classifier systems. Significance of a subset of genes with high-degree centrality in the network abstractions across the poor-prognosis samples is also discussed. Copyright © 2017 Elsevier Inc. All rights reserved.
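The majority-voting step used by the ensemble reduces to picking the most frequent base-classifier prediction; the votes below are illustrative:

```python
# Majority voting over base-classifier predictions.
from collections import Counter

def majority_vote(votes):
    # most_common(1) returns [(label, count)] for the top label
    return Counter(votes).most_common(1)[0][0]

votes = ["good", "poor", "good", "good", "poor"]
print(majority_vote(votes))  # good
```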
Xu, Kele; Feng, Dawei; Mi, Haibo
2017-11-23
The automatic detection of diabetic retinopathy is of vital importance, as it is the main cause of irreversible vision loss in the working-age population of the developed world. Early detection of diabetic retinopathy can be very helpful for clinical treatment; although several feature extraction approaches have been proposed, the classification task for retinal images remains tedious even for trained clinicians. Recently, deep convolutional neural networks have shown superior performance in image classification compared to previous handcrafted-feature-based methods. In this paper, we therefore explored deep convolutional neural networks for the automatic classification of diabetic retinopathy from color fundus images, and obtained an accuracy of 94.5% on our dataset, outperforming results obtained using classical approaches.
Dissimilarity representations in lung parenchyma classification
NASA Astrophysics Data System (ADS)
Sørensen, Lauge; de Bruijne, Marleen
2009-02-01
A good problem representation is important for a pattern recognition system to be successful. The traditional approach to statistical pattern recognition is feature representation. More specifically, objects are represented by a number of features in a feature vector space, and classifiers are built in this representation. This is also the general trend in lung parenchyma classification in computed tomography (CT) images, where the features often are measures on feature histograms. Instead, we propose to build normal density based classifiers in dissimilarity representations for lung parenchyma classification. This allows for the classifiers to work on dissimilarities between objects, which might be a more natural way of representing lung parenchyma. In this context, dissimilarity is defined between CT regions of interest (ROI)s. ROIs are represented by their CT attenuation histogram and ROI dissimilarity is defined as a histogram dissimilarity measure between the attenuation histograms. In this setting, the full histograms are utilized according to the chosen histogram dissimilarity measure. We apply this idea to classification of different emphysema patterns as well as normal, healthy tissue. Two dissimilarity representation approaches as well as different histogram dissimilarity measures are considered. The approaches are evaluated on a set of 168 CT ROIs using normal density based classifiers all showing good performance. Compared to using histogram dissimilarity directly as the distance in a k-nearest neighbor classifier, which achieves a classification accuracy of 92.9%, the best dissimilarity representation based classifier is significantly better with a classification accuracy of 97.0% (p = 0.046).
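One concrete histogram dissimilarity measure of the kind considered above is the chi-squared distance between normalized attenuation histograms; the choice of chi-squared and the toy histograms are illustrative:

```python
# Chi-squared distance between two normalized histograms: zero for
# identical histograms, larger for more dissimilar attenuation profiles.

def chi_squared(h1, h2):
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

emphysema = [0.1, 0.3, 0.4, 0.2]   # illustrative 4-bin attenuation histograms
healthy = [0.4, 0.3, 0.2, 0.1]
same = [0.1, 0.3, 0.4, 0.2]
print(chi_squared(emphysema, healthy) > chi_squared(emphysema, same))  # True
```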
Rule-guided human classification of Volunteered Geographic Information
NASA Astrophysics Data System (ADS)
Ali, Ahmed Loai; Falomir, Zoe; Schmid, Falko; Freksa, Christian
2017-05-01
During the last decade, web technologies and location sensing devices have evolved generating a form of crowdsourcing known as Volunteered Geographic Information (VGI). VGI acted as a platform of spatial data collection, in particular, when a group of public participants are involved in collaborative mapping activities: they work together to collect, share, and use information about geographic features. VGI exploits participants' local knowledge to produce rich data sources. However, the resulting data inherits problematic data classification. In VGI projects, the challenges of data classification are due to the following: (i) data is likely prone to subjective classification, (ii) remote contributions and flexible contribution mechanisms in most projects, and (iii) the uncertainty of spatial data and non-strict definitions of geographic features. These factors lead to various forms of problematic classification: inconsistent, incomplete, and imprecise data classification. This research addresses classification appropriateness. Whether the classification of an entity is appropriate or inappropriate is related to quantitative and/or qualitative observations. Small differences between observations may be not recognizable particularly for non-expert participants. Hence, in this paper, the problem is tackled by developing a rule-guided classification approach. This approach exploits data mining techniques of Association Classification (AC) to extract descriptive (qualitative) rules of specific geographic features. The rules are extracted based on the investigation of qualitative topological relations between target features and their context. Afterwards, the extracted rules are used to develop a recommendation system able to guide participants to the most appropriate classification. The approach proposes two scenarios to guide participants towards enhancing the quality of data classification. 
An empirical study is conducted to investigate the classification of grass-related features like forest, garden, park, and meadow. The findings of this study indicate the feasibility of the proposed approach.
Teaching/Learning Methods and Students' Classification of Food Items
ERIC Educational Resources Information Center
Hamilton-Ekeke, Joy-Telu; Thomas, Malcolm
2011-01-01
Purpose: This study aims to investigate the effectiveness of a teaching method (TLS (Teaching/Learning Sequence)) based on a social constructivist paradigm on students' conceptualisation of classification of food. Design/methodology/approach: The study compared the TLS model developed by the researcher based on the social constructivist paradigm…
A Novel Modulation Classification Approach Using Gabor Filter Network
Ghauri, Sajjad Ahmed; Qureshi, Ijaz Mansoor; Cheema, Tanveer Ahmed; Malik, Aqdas Naveed
2014-01-01
A Gabor filter network based approach is used for feature extraction and classification of digitally modulated signals by adaptively tuning the parameters of the Gabor filter network. Modulation classification is performed under additive white Gaussian noise (AWGN). The modulations considered for classification are PSK 2 to 64, FSK 2 to 64, and QAM 4 to 64. The Gabor filter network has a two-layer structure: the first (input) layer constitutes the adaptive feature extraction part, and the second layer constitutes the signal classification part. The Gabor atom parameters are tuned using the delta rule, and the Gabor filter weights are updated using the least mean square (LMS) algorithm. Simulation results show that the proposed modulation classification algorithm has high classification accuracy at low signal-to-noise ratio (SNR) on an AWGN channel. PMID:25126603
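The Gabor atoms underlying the filter network are Gaussian-windowed sinusoids parameterized by center, scale, and frequency; a sketch with illustrative parameter values:

```python
# A discrete 1-D Gabor atom: Gaussian envelope times a cosine carrier.
from math import exp, cos, pi

def gabor_atom(n, center, scale, freq, phase=0.0):
    return [exp(-((t - center) / scale) ** 2) * cos(2 * pi * freq * (t - center) + phase)
            for t in range(n)]

# Illustrative parameters; in the network these would be tuned adaptively.
atom = gabor_atom(n=64, center=32, scale=8.0, freq=0.1)
print(len(atom), max(atom))  # 64 samples, peak 1.0 at the center
```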
Emotion models for textual emotion classification
NASA Astrophysics Data System (ADS)
Bruna, O.; Avetisyan, H.; Holub, J.
2016-11-01
This paper deals with textual emotion classification which gained attention in recent years. Emotion classification is used in user experience, product evaluation, national security, and tutoring applications. It attempts to detect the emotional content in the input text and based on different approaches establish what kind of emotional content is present, if any. Textual emotion classification is the most difficult to handle, since it relies mainly on linguistic resources and it introduces many challenges to assignment of text to emotion represented by a proper model. A crucial part of each emotion detector is emotion model. Focus of this paper is to introduce emotion models used for classification. Categorical and dimensional models of emotion are explained and some more advanced approaches are mentioned.
NASA Astrophysics Data System (ADS)
Sukawattanavijit, Chanika; Srestasathiern, Panu
2017-10-01
Land use and land cover (LULC) information is significant for observing and evaluating environmental change. LULC classification using remotely sensed data is a technique widely employed at global and local scales, particularly in urban areas, which have diverse land cover types that are essential components of the urban terrain and ecosystem. At present, object-based image analysis (OBIA) is becoming widely popular for land cover classification from high-resolution imagery. COSMO-SkyMed SAR data was fused with THAICHOTE (namely, THEOS: Thailand Earth Observation Satellite) optical data for object-based land cover classification. This paper presents a comparison between object-based and pixel-based approaches to image fusion. For the per-pixel method, support vector machines (SVM) were applied to the fused image based on Principal Component Analysis (PCA). For the object-based method, classification was applied to the fused images to separate land cover classes using a nearest neighbor (NN) classifier. Finally, accuracy was assessed by comparing the land cover maps generated from the fused image dataset and from the THAICHOTE image alone. The object-based fusion of COSMO-SkyMed and THAICHOTE images demonstrated the best classification accuracies, well over 85%. Overall, object-based data fusion provided higher land cover classification accuracy than per-pixel data fusion.
NASA Astrophysics Data System (ADS)
Wu, M. F.; Sun, Z. C.; Yang, B.; Yu, S. S.
2016-11-01
In order to reduce the “salt and pepper” effect in pixel-based urban land cover classification and expand the application of multi-source data fusion in urban remote sensing, WorldView-2 imagery and airborne Light Detection and Ranging (LiDAR) data were used to improve the classification of urban land cover. An object-oriented hierarchical classification approach is proposed in our study, consisting of two hierarchies. (1) In the first hierarchy, the LiDAR Normalized Digital Surface Model (nDSM) image was segmented into objects, and NDVI, Coastal Blue, and nDSM thresholds were set for extracting building objects. (2) In the second hierarchy, after removing building objects, the WorldView-2 fused imagery obtained by Haze-ratio-based (HR) fusion was segmented, and an SVM classifier was applied to generate road/parking lot, vegetation, and bare soil objects. (3) Trees and grasslands were then split based on an nDSM threshold (2.4 meters). The results showed that, compared with pixel-based and non-hierarchical object-oriented approaches, the proposed method provided better urban land cover classification performance, with overall accuracy (OA) and overall kappa (OK) improving to 92.75% and 0.90, respectively. Furthermore, the proposed method reduced the “salt and pepper” effect of pixel-based classification, improved building extraction accuracy through LiDAR nDSM image segmentation, and reduced the confusion between trees and grasslands through the nDSM threshold.
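The threshold rules above can be sketched as a tiny decision function for vegetation objects, splitting trees from grassland at the stated 2.4 m nDSM threshold; the NDVI threshold value is an assumption for illustration:

```python
# Threshold-rule classification of an object from its NDVI and
# nDSM height. The 2.4 m split is from the study; the NDVI cutoff
# is an assumed illustrative value.

def classify_vegetation(ndvi, ndsm_height, ndvi_threshold=0.3):
    if ndvi < ndvi_threshold:
        return "non-vegetation"
    # vegetation objects taller than 2.4 m are trees, otherwise grassland
    return "tree" if ndsm_height >= 2.4 else "grassland"

print(classify_vegetation(ndvi=0.6, ndsm_height=7.5))  # tree
print(classify_vegetation(ndvi=0.5, ndsm_height=0.3))  # grassland
```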
Efficient Fingercode Classification
NASA Astrophysics Data System (ADS)
Sun, Hong-Wei; Law, Kwok-Yan; Gollmann, Dieter; Chung, Siu-Leung; Li, Jian-Bin; Sun, Jia-Guang
In this paper, we present an efficient fingerprint classification algorithm which is an essential component in many critical security application systems, e.g., systems in the e-government and e-finance domains. Fingerprint identification is one of the most important security requirements in homeland security systems such as personnel screening and anti-money laundering. The problem of fingerprint identification involves searching (matching) the fingerprint of a person against each of the fingerprints of all registered persons. To enhance performance and reliability, a common approach is to reduce the search space by first classifying the fingerprints and then performing the search in the respective class. Jain et al. proposed a fingerprint classification algorithm based on a two-stage classifier, which uses a K-nearest neighbor classifier in its first stage. The fingerprint classification algorithm is based on the fingercode representation, an encoding of fingerprints that has been demonstrated to be an effective fingerprint biometric scheme because of its ability to capture both local and global details in a fingerprint image. We enhance this approach by improving the efficiency of the K-nearest neighbor classifier for fingercode-based fingerprint classification. Our research first investigates the various fast search algorithms in vector quantization (VQ) and their potential application to fingerprint classification, and then proposes two efficient algorithms based on the pyramid-based search algorithms in VQ. Experimental results on DB1 of FVC 2004 demonstrate that our algorithms can outperform the full search algorithm and the original pyramid-based search algorithms in terms of computational efficiency without sacrificing accuracy.
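The baseline full-search K-nearest-neighbor step that the pyramid-based algorithms accelerate is an exhaustive distance scan over fingercode vectors; a sketch with illustrative codes:

```python
# Exhaustive (full-search) K-nearest-neighbor lookup over fingercode
# vectors by squared Euclidean distance; the codes are illustrative,
# real fingercodes are much higher-dimensional.

def knn_full_search(query, codes, k=3):
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    ranked = sorted(range(len(codes)), key=lambda i: sq_dist(query, codes[i]))
    return ranked[:k]  # indices of the k nearest codes

codes = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.1], [5.0, 5.0]]
print(knn_full_search([0.02, 0.02], codes, k=2))  # [0, 2]
```

Fast VQ search methods prune this scan with cheap lower bounds on the distance, returning the same neighbors with far fewer full distance computations.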
Lati, Ran N; Filin, Sagi; Aly, Radi; Lande, Tal; Levin, Ilan; Eizenberg, Hanan
2014-07-01
Weed/crop classification is considered the main problem in developing precise weed-management methodologies, because both crops and weeds share similar hues. Great effort has been invested in the development of classification models, most based on expensive sensors and complicated algorithms. However, satisfactory results are not consistently obtained due to imaging conditions in the field. We report on an innovative approach that combines advances in genetic engineering and robust image-processing methods to detect weeds and distinguish them from crop plants by manipulating the crop's leaf color. We demonstrate this on genetically modified tomato (germplasm AN-113) which expresses a purple leaf color. An autonomous weed/crop classification is performed using an invariant-hue transformation that is applied to images acquired by a standard consumer camera (visible wavelength) and handles variations in illumination intensities. The integration of these methodologies is simple and effective, and classification results were accurate and stable under a wide range of imaging conditions. Using this approach, we simplify the most complicated stage in image-based weed/crop classification models. © 2013 Society of Chemical Industry.
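The hue-based separation can be sketched by converting RGB to hue and labeling purple-hued pixels as crop and green-hued pixels as weed; the hue windows are assumptions for illustration, not the paper's invariant-hue transformation:

```python
# Per-pixel hue classification: green foliage vs. the purple-leaf
# tomato germplasm. Hue windows are assumed illustrative values.
import colorsys

def classify_pixel(r, g, b):
    h, _, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    hue_deg = h * 360
    if 90 <= hue_deg <= 150:
        return "weed"   # green foliage
    if 250 <= hue_deg <= 310:
        return "crop"   # purple-leaf tomato
    return "background"

print(classify_pixel(60, 180, 60))   # weed
print(classify_pixel(150, 60, 190))  # crop
```

Working in hue (rather than raw RGB) is what gives robustness to illumination intensity, since brightness changes move value and saturation but largely leave hue unchanged.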
Automatic evidence quality prediction to support evidence-based decision making.
Sarker, Abeed; Mollá, Diego; Paris, Cécile
2015-06-01
Evidence-based medicine practice requires practitioners to obtain the best available medical evidence, and appraise the quality of the evidence when making clinical decisions. Primarily due to the plethora of electronically available data from the medical literature, the manual appraisal of the quality of evidence is a time-consuming process. We present a fully automatic approach for predicting the quality of medical evidence in order to aid practitioners at point-of-care. Our approach extracts relevant information from medical article abstracts and utilises data from a specialised corpus to apply supervised machine learning for the prediction of the quality grades. Following an in-depth analysis of the usefulness of features (e.g., publication types of articles), they are extracted from the text via rule-based approaches and from the meta-data associated with the articles, and then applied in the supervised classification model. We propose the use of a highly scalable and portable approach using a sequence of high precision classifiers, and introduce a simple evaluation metric called average error distance (AED) that simplifies the comparison of systems. We also perform elaborate human evaluations to compare the performance of our system against human judgments. We test and evaluate our approaches on a publicly available, specialised, annotated corpus containing 1132 evidence-based recommendations. Our rule-based approach performs exceptionally well at the automatic extraction of publication types of articles, with F-scores of up to 0.99 for high-quality publication types. For evidence quality classification, our approach obtains an accuracy of 63.84% and an AED of 0.271. The human evaluations show that the performance of our system, in terms of AED and accuracy, is comparable to the performance of humans on the same data. The experiments suggest that our structured text classification framework achieves evaluation results comparable to those of human performance. 
Our overall classification approach and evaluation technique are also highly portable and can be used for various evidence grading scales. Copyright © 2015 Elsevier B.V. All rights reserved.
Tarai, Madhumita; Kumar, Keshav; Divya, O; Bairi, Partha; Mishra, Kishor Kumar; Mishra, Ashok Kumar
2017-09-05
The present work compares dissimilarity-based and covariance-based unsupervised chemometric classification approaches using total synchronous fluorescence spectroscopy data sets acquired for cumin and non-cumin based herbal preparations. The conventional decomposition method involves eigenvalue-eigenvector analysis of the covariance of the data set and finds the factors that explain the overall major sources of variation present in the data set. The conventional approach does this irrespective of the fact that the samples belong to intrinsically different groups, and hence leads to poor class separation. The present work shows that classification of such samples can be improved by performing the eigenvalue-eigenvector decomposition on the pair-wise dissimilarity matrix. Copyright © 2017 Elsevier B.V. All rights reserved.
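The contrast between the two decompositions can be sketched in NumPy: PCA diagonalizes the covariance matrix, while a dissimilarity-based alternative diagonalizes the double-centred pairwise-distance matrix (classical multidimensional scaling). This is a generic illustration of the idea, not the paper's exact procedure, and the toy data are made up:

```python
import numpy as np

def pca_scores(X, k=2):
    """Scores from eigendecomposition of the covariance matrix (PCA)."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    return Xc @ vecs[:, ::-1][:, :k]       # project onto top-k components

def dissimilarity_scores(X, k=2):
    """Scores from eigendecomposition of the double-centred pairwise
    squared-dissimilarity matrix (classical MDS)."""
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n    # centering matrix
    B = -0.5 * J @ D2 @ J                  # double-centred Gram matrix
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:k]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

# two tight groups: both decompositions should separate them on axis 1
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
p = pca_scores(X, 1).ravel()
m = dissimilarity_scores(X, 1).ravel()
```

With squared Euclidean dissimilarities the two give equivalent scores; the paper's point is that other (non-Euclidean) dissimilarity measures can then be swapped in to sharpen class separation.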
Alzheimer's Disease Detection by Pseudo Zernike Moment and Linear Regression Classification.
Wang, Shui-Hua; Du, Sidan; Zhang, Yin; Phillips, Preetha; Wu, Le-Nan; Chen, Xian-Qing; Zhang, Yu-Dong
2017-01-01
This study presents an improved method based on "Gorji et al. Neuroscience. 2015" by introducing a relatively new classifier: linear regression classification. Our method selects one axial slice from the 3D brain image and employs pseudo Zernike moments with a maximum order of 15 to extract 256 features from each image. Finally, linear regression classification is harnessed as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
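Linear regression classification works by reconstructing the query as a least-squares combination of each class's training vectors and assigning it to the class with the smallest reconstruction residual. A minimal sketch (the toy 3-D feature vectors are illustrative, not real pseudo Zernike features):

```python
import numpy as np

def lrc_predict(y, class_samples):
    """Linear regression classification (LRC): model the query vector y
    as a linear combination of each class's training vectors and pick
    the class whose least-squares reconstruction has the smallest
    residual norm."""
    best_cls, best_res = None, np.inf
    for cls, samples in class_samples.items():
        X = np.asarray(samples, float).T            # features x samples
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        residual = np.linalg.norm(y - X @ beta)
        if residual < best_res:
            best_cls, best_res = cls, residual
    return best_cls

# hypothetical 3-D feature vectors for two classes
classes = {
    "AD":      [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "healthy": [[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]],
}
print(lrc_predict(np.array([0.95, 0.05, 0.0]), classes))  # -> AD
```

In practice each class subspace would be spanned by the 256-dimensional pseudo Zernike feature vectors of its training images.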
Computational approaches for the classification of seed storage proteins.
Radhika, V; Rao, V Sree Hari
2015-07-01
Seed storage proteins comprise a major part of the protein content of the seed and play an important role in its quality. These storage proteins are important because they determine the total protein content and affect the nutritional quality and functional properties for food processing. Transgenic plants are being used to develop improved lines for incorporation into plant breeding programs, and the nutrient composition of seeds is a major target of molecular breeding programs. Hence, classification of these proteins is crucial for the development of superior varieties with improved nutritional quality. In this study we have applied machine learning algorithms for the classification of seed storage proteins. We present an algorithm based on a nearest neighbor approach for the classification of seed storage proteins and compare its performance with the decision tree J48, a multilayer perceptron (MLP) neural network, and the support vector machine (SVM) implementation libSVM. The model based on our algorithm achieves higher classification accuracy than the other methods.
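For reference, the nearest-neighbour family the authors build on can be sketched as a plain k-NN vote; this is the generic approach, not their specific algorithm, and the toy protein feature vectors are made up:

```python
import numpy as np
from collections import Counter

def knn_classify(query, X, y, k=3):
    """Plain k-nearest-neighbour classification: vote among the k
    training samples closest to the query in feature space."""
    dists = np.linalg.norm(X - query, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(y[i] for i in nearest).most_common(1)[0][0]

# hypothetical 2-D protein features (e.g. amino-acid composition fractions)
X = np.array([[0.8, 0.1], [0.7, 0.2], [0.1, 0.9], [0.2, 0.8]])
y = ["albumin", "albumin", "globulin", "globulin"]
print(knn_classify(np.array([0.75, 0.15]), X, y, k=3))  # -> albumin
```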
Data Clustering and Evolving Fuzzy Decision Tree for Data Base Classification Problems
NASA Astrophysics Data System (ADS)
Chang, Pei-Chann; Fan, Chin-Yuan; Wang, Yen-Wen
Database classification suffers from two well-known difficulties, i.e., high dimensionality and non-stationary variations within large historic data. This paper presents a hybrid classification model that integrates a case-based reasoning technique, a Fuzzy Decision Tree (FDT), and Genetic Algorithms (GA) to construct a decision-making system for data classification in various database applications. The model is mainly based on the idea that a historic database can be transformed into a smaller case base together with a group of fuzzy decision rules. As a result, the model can respond more accurately to the data being classified, using inductions from these smaller case-based fuzzy decision trees. Hit rate is applied as a performance measure, and the effectiveness of the proposed model is demonstrated by experimental comparison with other approaches on different database classification applications. The average hit rate of the proposed model is the highest among them.
Welikala, R A; Fraz, M M; Dehmeshki, J; Hoppe, A; Tah, V; Mann, S; Williamson, T H; Barman, S A
2015-07-01
Proliferative diabetic retinopathy (PDR) is a condition that carries a high risk of severe visual impairment. The hallmark of PDR is the growth of abnormal new vessels. In this paper, an automated method for the detection of new vessels from retinal images is presented. This method is based on a dual classification approach. Two vessel segmentation approaches are applied to create two separate binary vessel maps, each of which holds vital information. Local morphology features are measured from each binary vessel map to produce two separate 4-D feature vectors. Independent classification is performed for each feature vector using a support vector machine (SVM) classifier. The system then combines these individual outcomes to produce a final decision. This is followed by the creation of additional features to generate 21-D feature vectors, which feed into a genetic algorithm based feature selection approach with the objective of finding feature subsets that improve the performance of the classification. Sensitivity and specificity results using a dataset of 60 images are 0.9138 and 0.9600, respectively, on a per patch basis, and 1.000 and 0.975, respectively, on a per image basis. Copyright © 2015 Elsevier Ltd. All rights reserved.
Comparing K-mer based methods for improved classification of 16S sequences.
Vinje, Hilde; Liland, Kristian Hovde; Almøy, Trygve; Snipen, Lars
2015-07-01
The need for precise and stable taxonomic classification is highly relevant in modern microbiology. Parallel to the explosion in the amount of accessible sequence data, there has also been a shift in focus for classification methods. Previously, alignment-based methods were the most applicable tools. Now, methods based on counting K-mers by sliding windows are the most interesting classification approach with respect to both speed and accuracy. Here, we present a systematic comparison of five different K-mer based classification methods for the 16S rRNA gene. The methods differ from each other in both data usage and modelling strategies. We based our study on the commonly known and well-used naïve Bayes classifier from the RDP project; four other methods were implemented and tested on two different data sets, on full-length sequences as well as fragments of typical read length. The differences in classification error between the methods seemed to be small, but they were stable across both data sets tested. The Preprocessed nearest-neighbour (PLSNN) method performed best for full-length 16S rRNA sequences, significantly better than the naïve Bayes RDP method. On fragmented sequences the naïve Bayes Multinomial method performed best, significantly better than all other methods. For both data sets explored, and on both full-length and fragmented sequences, all five methods reached an error plateau. We conclude that no K-mer based method is universally best for classifying both full-length sequences and fragments (reads). All methods approach an error plateau, indicating that improved training data is needed to improve classification further. Classification errors occur most frequently for genera with few sequences present. For improving the taxonomy and testing new classification methods, a better, more universal and robust training data set is crucial.
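The K-mer feature extraction step shared by all the compared methods can be sketched as a sliding-window count (the sequence below is illustrative):

```python
from collections import Counter

def kmer_counts(seq, k=4):
    """Count overlapping K-mers with a sliding window -- the feature
    representation that K-mer based 16S classifiers feed into their
    statistical models (e.g. a multinomial naive Bayes)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

counts = kmer_counts("ACGTACGTAC", k=4)
# 7 overlapping windows over a 10-base sequence; "ACGT" occurs twice
```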
Haque, Mohammad Nazmul; Noman, Nasimul; Berretta, Regina; Moscato, Pablo
2016-01-01
Classification of datasets with imbalanced sample distributions has always been a challenge. In general, a popular approach for enhancing classification performance is the construction of an ensemble of classifiers. However, the performance of an ensemble is dependent on the choice of constituent base classifiers. Therefore, we propose a genetic algorithm-based search method for finding the optimum combination from a pool of base classifiers to form a heterogeneous ensemble. The algorithm, called GA-EoC, utilises 10-fold cross-validation on training data for evaluating the quality of each candidate ensemble. In order to combine the base classifiers' decisions into the ensemble's output, we used the simple and widely used majority voting approach. The proposed algorithm, along with the random sub-sampling approach to balance the class distribution, has been used for classifying class-imbalanced datasets. Additionally, if a feature set was not available, we used the (α, β) - k Feature Set method to select a better subset of features for classification. We have tested GA-EoC with three benchmark datasets from the UCI Machine Learning repository, one Alzheimer's disease dataset and a subset of the PubFig database of Columbia University. In general, the performance of the proposed method on the chosen datasets is robust and better than that of the constituent base classifiers and many other well-known ensembles. Based on our empirical study we claim that a genetic algorithm is a superior and reliable approach to heterogeneous ensemble construction, and we expect that the proposed GA-EoC would perform consistently in other cases. PMID:26764911
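The majority-voting fusion step used inside GA-EoC can be sketched as follows; the three toy base classifiers, encoded as plain functions, are hypothetical:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine base-classifier outputs by simple majority voting,
    the fusion rule used inside the GA-selected ensemble."""
    return Counter(predictions).most_common(1)[0][0]

def ensemble_predict(classifiers, x):
    """Collect each base classifier's label for x and fuse by voting."""
    return majority_vote([clf(x) for clf in classifiers])

# three hypothetical base classifiers with different decision thresholds
clfs = [lambda x: "pos" if x > 0 else "neg",
        lambda x: "pos" if x > 1 else "neg",
        lambda x: "pos" if x > -1 else "neg"]
print(ensemble_predict(clfs, 0.5))  # -> pos (2 of 3 votes)
```

The genetic algorithm's role, not shown here, is to search over which subset of a larger classifier pool to include so that this voted output performs best under cross-validation.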
Li, Zhao-Liang
2018-01-01
Few studies have examined hyperspectral remote-sensing image classification with type-II fuzzy sets. This paper addresses image classification based on a hyperspectral remote-sensing technique using an improved interval type-II fuzzy c-means (IT2FCM*) approach. In this study, in contrast to other traditional fuzzy c-means-based approaches, the IT2FCM* algorithm considers the ranking of interval numbers and the spectral uncertainty. The classification results based on a hyperspectral dataset using the FCM, IT2FCM, and the proposed improved IT2FCM* algorithms show that the IT2FCM* method delivers the best performance in terms of clustering accuracy. In this paper, in order to validate and demonstrate the separability of the IT2FCM*, four type-I fuzzy validity indexes are employed, and a comparative analysis of these fuzzy validity indexes as applied to the FCM and IT2FCM methods is made. These four indexes are also applied to datasets of different spatial and spectral resolution to analyze the effects of spectral and spatial scaling factors on the separability of the FCM, IT2FCM, and IT2FCM* methods. The results of these validity indexes from the hyperspectral datasets show that the improved IT2FCM* algorithm has the best values among the three algorithms in general. The results demonstrate that the IT2FCM* exhibits good performance in hyperspectral remote-sensing image classification because of its ability to handle hyperspectral uncertainty. PMID:29373548
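As background, the standard type-I fuzzy c-means baseline that IT2FCM and IT2FCM* extend can be sketched as a minimal NumPy implementation; the fuzzifier m = 2 and the toy data are illustrative choices, not the paper's settings:

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=50, seed=0):
    """Standard (type-I) fuzzy c-means: alternate between updating
    cluster centers from membership-weighted means and updating the
    fuzzy memberships from inverse distances."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # fuzzy memberships
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        d = np.maximum(d, 1e-12)               # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centers, U = fuzzy_c_means(X)
# each point ends up with a high membership in its nearby cluster
```

The interval type-II variants replace the single membership matrix U with an interval of memberships, which is what lets them model the spectral uncertainty discussed above.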
Zhan, Liang; Zhou, Jiayu; Wang, Yalin; Jin, Yan; Jahanshad, Neda; Prasad, Gautam; Nir, Talia M.; Leonardo, Cassandra D.; Ye, Jieping; Thompson, Paul M.; for the Alzheimer’s Disease Neuroimaging Initiative
2015-01-01
Alzheimer’s disease (AD) involves a gradual breakdown of brain connectivity, and network analyses offer a promising new approach to track and understand disease progression. Even so, our ability to detect degenerative changes in brain networks depends on the methods used. Here we compared several tractography and feature extraction methods to see which ones gave best diagnostic classification for 202 people with AD, mild cognitive impairment or normal cognition, scanned with 41-gradient diffusion-weighted magnetic resonance imaging as part of the Alzheimer’s Disease Neuroimaging Initiative (ADNI) project. We computed brain networks based on whole brain tractography with nine different methods – four of them tensor-based deterministic (FACT, RK2, SL, and TL), two orientation distribution function (ODF)-based deterministic (FACT, RK2), two ODF-based probabilistic approaches (Hough and PICo), and one “ball-and-stick” approach (Probtrackx). Brain networks derived from different tractography algorithms did not differ in terms of classification performance on ADNI, but performing principal components analysis on networks helped classification in some cases. Small differences may still be detectable in a truly vast cohort, but these experiments help assess the relative advantages of different tractography algorithms, and different post-processing choices, when used for classification. PMID:25926791
Lagrangian methods of cosmic web classification
NASA Astrophysics Data System (ADS)
Fisher, J. D.; Faltenbacher, A.; Johnson, M. S. T.
2016-05-01
The cosmic web defines the large-scale distribution of matter we see in the Universe today. Classifying the cosmic web into voids, sheets, filaments and nodes allows one to explore structure formation and the role environmental factors have on halo and galaxy properties. While existing studies of cosmic web classification concentrate on grid-based methods, this work explores a Lagrangian approach where the V-web algorithm proposed by Hoffman et al. is implemented with techniques borrowed from smoothed particle hydrodynamics. The Lagrangian approach allows one to classify individual objects (e.g. particles or haloes) based on properties of their nearest neighbours in an adaptive manner. It can be applied directly to a halo sample which dramatically reduces computational cost and potentially allows an application of this classification scheme to observed galaxy samples. Finally, the Lagrangian nature admits a straightforward inclusion of the Hubble flow negating the necessity of a visually defined threshold value which is commonly employed by grid-based classification methods.
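The underlying V-web rule, classifying an environment by how many eigenvalues of the velocity shear tensor exceed a threshold, can be sketched as below. The threshold value 0.44 is an illustrative choice from the grid-based V-web literature; as the abstract notes, the Lagrangian treatment of the Hubble flow aims to remove the need for such a visually defined value:

```python
import numpy as np

def vweb_class(shear_tensor, threshold=0.44):
    """Classify a cosmic-web environment by counting eigenvalues of the
    (symmetric) velocity shear tensor above a threshold:
    0 -> void, 1 -> sheet, 2 -> filament, 3 -> node."""
    eigvals = np.linalg.eigvalsh(shear_tensor)
    n_above = int(np.sum(eigvals > threshold))
    return ["void", "sheet", "filament", "node"][n_above]

# illustrative diagonal shear tensor with two eigenvalues above threshold
T = np.diag([0.9, 0.6, -0.2])
print(vweb_class(T))  # -> filament
```

In the Lagrangian scheme the shear tensor would be estimated per particle or halo from its nearest neighbours rather than on a grid.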
Zhao, Weixiang; Davis, Cristina E.
2011-01-01
Objective This paper introduces a modified artificial immune system (AIS)-based pattern recognition method to enhance the recognition ability of the existing conventional AIS-based classification approach, and demonstrates the superiority of the proposed new AIS-based method via two case studies of breast cancer diagnosis. Methods and materials Conventionally, the AIS approach is often coupled with the k nearest neighbor (k-NN) algorithm to form a classification method called AIS-kNN. In this paper we discuss the basic principle and possible problems of this conventional approach, and propose a new approach in which AIS is integrated with radial basis function - partial least squares regression (AIS-RBFPLS). Additionally, both AIS-based approaches are compared with two classical and powerful machine learning methods, the back-propagation neural network (BPNN) and the orthogonal radial basis function network (Ortho-RBF network). Results The diagnosis results show that: (1) both AIS-kNN and AIS-RBFPLS proved to be good machine learning methods for clinical diagnosis, but the proposed AIS-RBFPLS generated an even lower misclassification ratio, especially in the cases where the conventional AIS-kNN approach generated poor classification results because of possibly improper AIS parameters. For example, based upon the AIS memory cells of "replacement threshold = 0.3", the average misclassification ratios of the two approaches for study 1 are 3.36% (AIS-RBFPLS) and 9.07% (AIS-kNN), and the misclassification ratios for study 2 are 19.18% (AIS-RBFPLS) and 28.36% (AIS-kNN); (2) the proposed AIS-RBFPLS presented its robustness in terms of the AIS-created memory cells, showing a smaller standard deviation of the results from the multiple trials than AIS-kNN.
For example, using the result from the first set of AIS memory cells as an example, the standard deviations of the misclassification ratios for study 1 are 0.45% (AIS-RBFPLS) and 8.71% (AIS-kNN) and those for study 2 are 0.49% (AIS-RBFPLS) and 6.61% (AIS-kNN); and (3) the proposed AIS-RBFPLS classification approaches also yielded better diagnosis results than two classical neural network approaches of BPNN and Ortho-RBF network. Conclusion In summary, this paper proposed a new machine learning method for complex systems by integrating the AIS system with RBFPLS. This new method demonstrates its satisfactory effect on classification accuracy for clinical diagnosis, and also indicates its wide potential applications to other diagnosis and detection problems. PMID:21515033
Deep learning for tumor classification in imaging mass spectrometry.
Behrmann, Jens; Etmann, Christian; Boskamp, Tobias; Casadonte, Rita; Kriegsmann, Jörg; Maaß, Peter
2018-04-01
Tumor classification using imaging mass spectrometry (IMS) data has a high potential for future applications in pathology. Due to the complexity and size of the data, automated feature extraction and classification steps are required to fully process the data. Since mass spectra exhibit certain structural similarities to image data, deep learning may offer a promising strategy for classification of IMS data as it has been successfully applied to image classification. Methodologically, we propose an adapted architecture based on deep convolutional networks to handle the characteristics of mass spectrometry data, as well as a strategy to interpret the learned model in the spectral domain based on a sensitivity analysis. The proposed methods are evaluated on two algorithmically challenging tumor classification tasks and compared to a baseline approach. Competitiveness of the proposed methods is shown on both tasks by studying the performance via cross-validation. Moreover, the learned models are analyzed by the proposed sensitivity analysis revealing biologically plausible effects as well as confounding factors of the considered tasks. Thus, this study may serve as a starting point for further development of deep learning approaches in IMS classification tasks. https://gitlab.informatik.uni-bremen.de/digipath/Deep_Learning_for_Tumor_Classification_in_IMS. jbehrmann@uni-bremen.de or christianetmann@uni-bremen.de. Supplementary data are available at Bioinformatics online.
NASA Astrophysics Data System (ADS)
Gao, Lin; Cheng, Wei; Zhang, Jinhua; Wang, Jue
2016-08-01
Brain-computer interface (BCI) systems provide an alternative communication and control approach for people with limited motor function. Therefore, the feature extraction and classification approach should differentiate the relatively unusual state of motion intention from a common resting state. In this paper, we sought a novel approach for multi-class classification in BCI applications. We collected electroencephalographic (EEG) signals registered by electrodes placed over the scalp during left hand motor imagery, right hand motor imagery, and resting state for ten healthy human subjects. We proposed using the Kolmogorov complexity (Kc) for feature extraction and a multi-class Adaboost classifier with an extreme learning machine as the base classifier, in order to classify the three-class EEG samples. An average classification accuracy of 79.5% was obtained for ten subjects, which greatly outperformed commonly used approaches. Thus, it is concluded that the proposed method could improve classification performance for multi-class motor imagery tasks. It could be applied in further studies to generate control commands to initiate the movement of a robotic exoskeleton or orthosis, which would ultimately facilitate the rehabilitation of disabled people.
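Kolmogorov complexity is uncomputable, so EEG studies commonly estimate it with Lempel-Ziv (LZ76) complexity on a median-binarized signal. The abstract does not spell out its estimator, so the sketch below shows this common practical stand-in:

```python
import statistics

def lempel_ziv_complexity(s):
    """Lempel-Ziv (LZ76) complexity of a binary string: the number of
    distinct phrases found by scanning left to right and extending each
    phrase until it no longer appears earlier in the string."""
    i, c, n = 0, 0, len(s)
    while i < n:
        l = 1
        # grow the phrase until it stops occurring in the prefix
        while i + l <= n and s[i:i + l] in s[:i + l - 1]:
            l += 1
        c += 1
        i += l
    return c

def binarize(signal):
    """Threshold a signal at its median -- the usual preprocessing
    before computing LZ complexity of EEG epochs."""
    med = statistics.median(signal)
    return "".join("1" if x > med else "0" for x in signal)

print(lempel_ziv_complexity("0001101001000101"))  # -> 6
```

A random-looking signal yields many phrases (high complexity), while a regular one yields few; these scalar complexities then serve as the classifier's input features.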
Raster Vs. Point Cloud LiDAR Data Classification
NASA Astrophysics Data System (ADS)
El-Ashmawy, N.; Shaker, A.
2014-09-01
Airborne Laser Scanning systems with light detection and ranging (LiDAR) technology are among the fastest and most accurate 3D point data acquisition techniques. Generating accurate digital terrain and/or surface models (DTM/DSM) is the main application of collecting LiDAR range data. Recently, LiDAR range and intensity data have been used for land cover classification applications. Range and intensity data (the strength of the backscattered signals measured by the LiDAR system) are affected by the flying height, the ground elevation, the scanning angle and the physical characteristics of the object surfaces. These effects may lead to an uneven distribution of the point cloud, or to gaps that may affect the classification process. Researchers have investigated the conversion of LiDAR range point data to raster images for terrain modelling. Interpolation techniques have been used to achieve the best representation of surfaces and to fill the gaps between the LiDAR footprints. Interpolation methods have also been investigated for generating LiDAR range and intensity image data for land cover classification applications. In this paper, a different approach has been followed to classify the LiDAR data (range and intensity) for land cover mapping. The methodology relies on classifying the point cloud data based on their range and intensity, and then converting the classified points into a raster image. The gaps in the data are filled based on the classes of the nearest neighbour. Land cover maps are produced using two approaches: (a) the conventional raster image data based on point interpolation; and (b) the proposed point data classification. A study area covering an urban district in Burnaby, British Columbia, Canada, is selected to compare the results of the two approaches. Five different land cover classes can be distinguished in that area: buildings, roads and parking areas, trees, low vegetation (grass), and bare soil.
The results show that an improvement of around 10 % in the classification results can be achieved by using the proposed approach.
Pathological Bases for a Robust Application of Cancer Molecular Classification
Diaz-Cano, Salvador J.
2015-01-01
Any robust classification system depends on its purpose and must refer to accepted standards, its strength relying on predictive values and a careful consideration of known factors that can affect its reliability. In this context, a molecular classification of human cancer must refer to the current gold standard (histological classification) and try to improve it with key prognosticators for metastatic potential, staging and grading. Although organ-specific examples have been published based on proteomics, transcriptomics and genomics evaluations, the most popular approach uses gene expression analysis as a direct correlate of cellular differentiation, which represents the key feature of the histological classification. RNA is a labile molecule that varies significantly according to the preservation protocol, its transcription reflects the adaptation of the tumor cells to the microenvironment, it can be passed on through mechanisms of intercellular transfer of genetic information (exosomes), and it is exposed to epigenetic modifications. More robust classifications should be based on stable molecules, represented at the genetic level by DNA, to improve reliability, and their analysis must deal with the concept of intratumoral heterogeneity, which is at the origin of tumor progression and is the byproduct of the selection process during the clonal expansion and progression of neoplasms. The simultaneous analysis of multiple DNA targets and next generation sequencing offer the best practical approach for an analytical genomic classification of tumors. PMID:25898411
About decomposition approach for solving the classification problem
NASA Astrophysics Data System (ADS)
Andrianova, A. A.
2016-11-01
This article describes the features of applying a decomposition-based algorithm to the binary classification problem of constructing a linear classifier with the Support Vector Machine method. Applying decomposition reduces the volume of calculations, in particular because it opens the possibility of building parallel versions of the algorithm, which is a very important advantage for solving problems with big data. We analyze the results of computational experiments conducted using the decomposition approach. The experiments use a well-known data set for the binary classification problem.
ERIC Educational Resources Information Center
Güyer, Tolga; Aydogdu, Seyhmus
2016-01-01
This study suggests a classification model and an e-learning system based on this model for all instructional theories, approaches, models, strategies, methods, and technics being used in the process of instructional design that constitutes a direct or indirect resource for educational technology based on the theory of intuitionistic fuzzy sets…
Semi-Supervised Active Learning for Sound Classification in Hybrid Learning Environments.
Han, Wenjing; Coutinho, Eduardo; Ruan, Huabin; Li, Haifeng; Schuller, Björn; Yu, Xiaojie; Zhu, Xuan
2016-01-01
Coping with scarcity of labeled data is a common problem in sound classification tasks. Approaches for classifying sounds are commonly based on supervised learning algorithms, which require labeled data that is often scarce and leads to models that do not generalize well. In this paper, we make an efficient combination of confidence-based Active Learning and Self-Training with the aim of minimizing the need for human annotation for sound classification model training. The proposed method pre-processes the instances that are ready for labeling by calculating their classifier confidence scores; it then delivers the candidates with lower scores to human annotators, while those with high scores are automatically labeled by the machine. We demonstrate the feasibility and efficacy of this method in two practical scenarios: pool-based and stream-based processing. Extensive experimental results indicate that our approach requires significantly fewer labeled instances to reach the same performance in both scenarios compared to Passive Learning, Active Learning and Self-Training. A reduction of 52.2% in human-labeled instances is achieved in both the pool-based and stream-based scenarios on a sound classification task considering 16,930 sound instances. PMID:27627768
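The confidence-based routing at the heart of this scheme can be sketched as follows; the threshold of 0.9 and the toy posterior probabilities are illustrative, not the paper's values:

```python
import numpy as np

def split_by_confidence(probs, threshold=0.9):
    """Route instances the way a confidence-based active-learning /
    self-training loop does: predictions whose top class probability
    meets the threshold are labeled automatically (self-training),
    the rest are sent to human annotators (active learning)."""
    confidence = probs.max(axis=1)
    machine = np.where(confidence >= threshold)[0]
    human = np.where(confidence < threshold)[0]
    return machine, human

# hypothetical classifier posteriors over 3 sound classes
probs = np.array([[0.97, 0.02, 0.01],
                  [0.40, 0.35, 0.25],
                  [0.05, 0.92, 0.03]])
machine, human = split_by_confidence(probs)
# instances 0 and 2 are auto-labeled; instance 1 goes to an annotator
```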
NASA Astrophysics Data System (ADS)
Selwyn, Ebenezer Juliet; Florinabel, D. Jemi
2018-04-01
Compound image segmentation plays a vital role in the compression of computer screen images. Computer screen images are images mixed with textual, graphical, or pictorial content. In this paper, we present a comparison of two transform-based block classification approaches for compound images, using metrics such as speed of classification, precision and recall rate. Block-based classification approaches normally divide the compound images into fixed-size, non-overlapping blocks. Then a frequency transform such as the Discrete Cosine Transform (DCT) or the Discrete Wavelet Transform (DWT) is applied over each block. Mean and standard deviation are computed for each 8 × 8 block and used as the feature set to classify the compound images into text/graphics and picture/background blocks. The classification accuracy of the block-classification-based segmentation techniques is measured by evaluation metrics such as precision and recall rate. Compound images with smooth backgrounds and with complex backgrounds containing text of varying size, colour and orientation are considered for testing. Experimental evidence shows that the DWT-based segmentation provides approximately 2.3% higher recall and precision rates than DCT-based segmentation, at the cost of increased block classification time, for both smooth and complex background images.
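The DCT branch of the feature extraction described above (8 × 8 blocks, transform coefficients, mean and standard deviation per block) can be sketched as follows; the orthonormal DCT construction and the toy image are illustrative:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (rows are frequency vectors)."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] *= 1 / np.sqrt(2)
    return C * np.sqrt(2 / n)

def block_features(image):
    """Mean and standard deviation of the 2-D DCT coefficients of each
    non-overlapping 8x8 block -- the feature pair used to label a block
    as text/graphics versus picture/background."""
    C = dct_matrix(8)
    h, w = image.shape
    feats = []
    for r in range(0, h - 7, 8):
        for c in range(0, w - 7, 8):
            coeffs = C @ image[r:r + 8, c:c + 8] @ C.T   # separable 2-D DCT
            feats.append((coeffs.mean(), coeffs.std()))
    return feats

# left block: flat background; right block: alternating "text-like" columns
img = np.zeros((8, 16))
img[:, 8:] = np.tile([0, 255], (8, 4))
(flat_mu, flat_sd), (busy_mu, busy_sd) = block_features(img)
# the high-frequency block shows a far larger coefficient spread
```

A classifier then only needs a threshold (or a simple model) on this (mean, std) pair per block; the DWT branch would substitute wavelet subband coefficients for the DCT coefficients.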
Xu, Jun; Luo, Xiaofei; Wang, Guanhao; Gilmore, Hannah; Madabhushi, Anant
2016-01-01
Epithelial (EP) and stromal (ST) tissues are two tissue types found in histological images. Automated segmentation or classification of EP and ST tissues is important when developing computerized systems for analyzing the tumor microenvironment. In this paper, a Deep Convolutional Neural Network (DCNN) based feature learning approach is presented to automatically segment or classify EP and ST regions from digitized tumor tissue microarrays (TMAs). Current approaches are based on handcrafted feature representations, such as color, texture, and Local Binary Patterns (LBP), for classifying the two regions. Compared to handcrafted-feature-based approaches, which involve task-dependent representations, a DCNN is an end-to-end feature extractor that may be learned directly from the raw pixel intensity values of EP and ST tissues in a data-driven fashion. These high-level features contribute to the construction of a supervised classifier for discriminating the two types of tissues. In this work we compare DCNN-based models with three handcrafted feature extraction based approaches on two different datasets, consisting of 157 Hematoxylin and Eosin (H&E) stained images of breast cancer and 1376 immunohistochemistry (IHC) stained images of colorectal cancer, respectively. The DCNN-based feature learning approach was shown to have an F1 classification score of 85%, 89%, and 100%, accuracy (ACC) of 84%, 88%, and 100%, and Matthews Correlation Coefficient (MCC) of 86%, 77%, and 100% on the two H&E stained (NKI and VGH) and IHC stained datasets, respectively. Our DCNN-based approach was shown to outperform three handcrafted feature extraction based approaches in the classification of EP and ST regions. PMID:28154470
Thilak Krishna, Thilakam Vimal; Creusere, Charles D; Voelz, David G
2011-01-01
Polarization, a property of light that conveys information about the transverse electric field orientation, complements other attributes of electromagnetic radiation such as intensity and frequency. Using multiple passive polarimetric images, we develop an iterative, model-based approach to estimate the complex index of refraction and apply it to target classification.
A Cradle-to-Grave Integrated Approach to Using UNIFORMAT II
ERIC Educational Resources Information Center
Schneider, Richard C.; Cain, David A.
2009-01-01
The ASTM E1557/UNIFORMAT II standard is a three-level, function-oriented classification which links the schematic phase Preliminary Project Descriptions (PPD), based on Construction Standard Institute (CSI) Practice FF/180, to elemental cost estimates based on R.S. Means Cost Data. With the UNIFORMAT II Standard Classification for Building…
NASA Astrophysics Data System (ADS)
Yang, He; Ma, Ben; Du, Qian; Yang, Chenghai
2010-08-01
In this paper, we propose approaches to improve pixel-based support vector machine (SVM) classification for urban land use and land cover (LULC) mapping from airborne hyperspectral imagery with high spatial resolution. Class spatial neighborhood relationships are used to correct misclassified class pairs, such as roof and trail, or road and roof. These classes may be difficult to separate because they may have similar spectral signatures and their spatial features are not distinct enough to aid their discrimination. In addition, misclassification incurred by trivial within-class spectral variation can be corrected by using pixel connectivity information in a local window, so that spectrally homogeneous regions are well preserved. Our experimental results demonstrate the efficiency of the proposed approaches in improving classification accuracy. The overall performance is competitive with object-based SVM classification.
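The pixel-connectivity correction in a local window can be sketched as a majority filter that replaces isolated labels with the dominant label of their neighborhood. This is only the spirit of the method; the paper's class-pair correction rules are richer:

```python
# Majority-filter smoothing of a pixel classification map (illustrative
# simplification of local-window connectivity correction).
from collections import Counter

def majority_filter(labels, window=1):
    """Replace each label with the most common label in its (2w+1)^2 window."""
    h, w = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for r in range(h):
        for c in range(w):
            neigh = [labels[i][j]
                     for i in range(max(0, r - window), min(h, r + window + 1))
                     for j in range(max(0, c - window), min(w, c + window + 1))]
            out[r][c] = Counter(neigh).most_common(1)[0][0]
    return out

# A lone "roof" pixel inside a homogeneous "road" region is smoothed away.
grid = [["road"] * 3, ["road", "roof", "road"], ["road"] * 3]
smoothed = majority_filter(grid)
```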
Mammographic mass classification based on possibility theory
NASA Astrophysics Data System (ADS)
Hmida, Marwa; Hamrouni, Kamel; Solaiman, Basel; Boussetta, Sana
2017-03-01
Shape and margin features are very important for differentiating between benign and malignant masses in mammographic images. Benign masses are usually round or oval and have smooth contours, whereas malignant tumors generally have irregular shapes and appear lobulated or spiculated in their margins. This knowledge suffers from imprecision and ambiguity. Therefore, this paper deals with the problem of mass classification using shape and margin features while taking into account the uncertainty linked to the degree of truth of the available information and the imprecision related to its content. In this work, we propose a novel mass classification approach which provides a possibility-based representation of the extracted shape features and builds a possibility knowledge base in order to evaluate the possibility degrees of malignancy and benignity for each mass. For experimentation, the MIAS database was used, and the classification results show the strong performance of our approach despite its use of simple features.
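A toy illustration of the possibility-based decision: each extracted feature carries a possibility degree for benignity and malignancy, and the degrees are fused with the min operator. All membership values here are hypothetical, and the paper's knowledge base is far more elaborate:

```python
# Possibility-degree fusion sketch (min-conjunction over features).

def fuse_possibilities(feature_degrees):
    """feature_degrees: list of (pi_benign, pi_malignant) per feature."""
    pi_b = min(d[0] for d in feature_degrees)
    pi_m = min(d[1] for d in feature_degrees)
    return pi_b, pi_m

def decide(pi_b, pi_m):
    return "benign" if pi_b > pi_m else "malignant"

# Round, smooth-contoured mass: each feature favors benignity.
degrees = [(0.9, 0.2),   # shape: round
           (0.8, 0.3)]   # margin: smooth
pi_b, pi_m = fuse_possibilities(degrees)
```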
Hoang, Tuan; Tran, Dat; Huang, Xu
2013-01-01
Common Spatial Pattern (CSP) is a state-of-the-art method for feature extraction in Brain-Computer Interface (BCI) systems. However, it is designed for 2-class BCI classification problems. Current extensions of this method to multiple classes, based on subspace union and covariance matrix similarity, do not provide high performance. This paper presents a new approach to solving multi-class BCI classification problems by forming a subspace assembled from the original subspaces; the proposed method for this approach is called Approximation-based Common Principal Component (ACPC). We perform experiments on Dataset 2a from BCI Competition IV to evaluate the proposed method. This dataset was designed for motor imagery classification with 4 classes. Preliminary experiments show that the proposed ACPC feature extraction method, when combined with Support Vector Machines, outperforms CSP-based feature extraction methods on the experimental dataset.
NASA Astrophysics Data System (ADS)
Haaf, Ezra; Barthel, Roland
2016-04-01
When assessing hydrogeological conditions at the regional scale, the analyst is often confronted with uncertainty of structures, inputs and processes while having to base inference on scarce and patchy data. Haaf and Barthel (2015) proposed a concept for handling this predicament by developing a groundwater systems classification framework, where information is transferred from similar, but well-explored and better understood systems to poorly described ones. The concept is based on the central hypothesis that similar systems react similarly to the same inputs, and vice versa. It is conceptually related to PUB (Prediction in Ungauged Basins), where organization of systems and processes by quantitative methods is intended and used to improve understanding and prediction. Furthermore, using the framework it is expected that regional conceptual and numerical models can be checked or enriched by ensemble-generated data from neighborhood-based estimators. In a first step, groundwater hydrographs from a large dataset in Southern Germany are compared in an effort to identify structural similarity in groundwater dynamics. A number of approaches to grouping hydrographs, mostly based on a similarity measure, can be found in the literature, though they have previously only been used in local-scale studies. These are tested alongside different global feature extraction techniques. The resulting classifications are then compared to a visual "expert assessment"-based classification which serves as a reference. A ranking of the classification methods is carried out and differences are shown. Selected groups from the classifications are related to geological descriptors. Here we present the most promising results from a comparison of classifications based on series correlation, different series distances and series features, such as the coefficients of the discrete Fourier transform and the intrinsic mode functions of empirical mode decomposition.
Additionally, we show examples of classes corresponding to geological descriptors. Haaf, E., Barthel, R., 2015. Methods for assessing hydrogeological similarity and for classification of groundwater systems on the regional scale, EGU General Assembly 2015, Vienna, Austria.
NASA Astrophysics Data System (ADS)
Gavish, Yoni; O'Connell, Jerome; Marsh, Charles J.; Tarantino, Cristina; Blonda, Palma; Tomaselli, Valeria; Kunin, William E.
2018-02-01
The increasing need for high-quality Habitat/Land-Cover (H/LC) maps has triggered considerable research into novel machine-learning based classification models. In many cases, H/LC classes follow pre-defined hierarchical classification schemes (e.g., CORINE), in which fine H/LC categories are thematically nested within more general categories. However, none of the existing machine-learning algorithms account for this pre-defined hierarchical structure. Here we introduce a novel Random Forest (RF) based application of hierarchical classification, which fits a separate local classification model at every branching point of the thematic tree, and then integrates all the different local models into a single global prediction. We applied the hierarchical RF approach in a NATURA 2000 site in Italy, using two land-cover (CORINE, FAO-LCCS) and one habitat classification scheme (EUNIS) that differ from one another in the shape of the class hierarchy. For all three classification schemes, both the hierarchical model and a flat model alternative provided accurate predictions, with kappa values mostly above 0.9 (despite using only 2.2-3.2% of the study area as training cells). The flat approach slightly outperformed the hierarchical models when the hierarchy was relatively simple, while the hierarchical model worked better under more complex thematic hierarchies. Most misclassifications came from habitat pairs that are thematically distant yet spectrally similar. In two of the three classification schemes, the additional constraints of the hierarchical model resulted in fewer such serious misclassifications relative to the flat model. The hierarchical model also provided valuable information on variable importance, which can shed light on "black-box" machine learning algorithms like RF. We suggest various ways by which hierarchical classification models can increase the accuracy and interpretability of H/LC classification maps.
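The local-model-per-branching-point idea can be sketched as a walk down the thematic tree, letting each node's local classifier pick a child. The class names and the rule functions below are invented for illustration; the paper fits a Random Forest at each node:

```python
# Hierarchical classification sketch: one local model per branching point,
# integrated into a single global prediction by descending the class tree.

def hierarchical_predict(x, tree, local_models):
    """Walk the thematic tree, applying the local classifier at each branch."""
    node = "root"
    while node in tree:                   # non-leaf nodes have children
        node = local_models[node](x)      # local model picks a child class
    return node                           # leaf = final fine-grained class

# Two-level toy hierarchy: root -> {vegetated, built} -> fine classes.
tree = {"root": ["vegetated", "built"],
        "vegetated": ["forest", "grassland"],
        "built": ["urban", "industrial"]}
local_models = {
    "root":      lambda x: "vegetated" if x["ndvi"] > 0.3 else "built",
    "vegetated": lambda x: "forest" if x["height"] > 5 else "grassland",
    "built":     lambda x: "urban" if x["density"] > 0.5 else "industrial",
}
label = hierarchical_predict({"ndvi": 0.6, "height": 12, "density": 0.1},
                             tree, local_models)
```

A misclassification at an upper node confines errors to thematically nearby classes, which is why the hierarchy tends to avoid "serious" confusions between distant classes.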
An Optimization-based Framework to Learn Conditional Random Fields for Multi-label Classification
Naeini, Mahdi Pakdaman; Batal, Iyad; Liu, Zitao; Hong, CharmGil; Hauskrecht, Milos
2015-01-01
This paper studies the multi-label classification problem, in which data instances are associated with multiple, possibly high-dimensional, label vectors. This problem is especially challenging when labels are dependent and one cannot decompose the problem into a set of independent classification problems. To address the problem and properly represent label dependencies we propose and study a pairwise conditional random field (CRF) model. We develop a new approach for learning the structure and parameters of the CRF from data. The approach maximizes the pseudo-likelihood of observed labels and relies on fast proximal gradient descent for learning the structure and limited-memory BFGS for learning the parameters of the model. Empirical results on several datasets show that our approach outperforms several multi-label classification baselines, including recently published state-of-the-art methods. PMID:25927015
New insights into the classification and nomenclature of cortical GABAergic interneurons.
DeFelipe, Javier; López-Cruz, Pedro L; Benavides-Piccione, Ruth; Bielza, Concha; Larrañaga, Pedro; Anderson, Stewart; Burkhalter, Andreas; Cauli, Bruno; Fairén, Alfonso; Feldmeyer, Dirk; Fishell, Gord; Fitzpatrick, David; Freund, Tamás F; González-Burgos, Guillermo; Hestrin, Shaul; Hill, Sean; Hof, Patrick R; Huang, Josh; Jones, Edward G; Kawaguchi, Yasuo; Kisvárday, Zoltán; Kubota, Yoshiyuki; Lewis, David A; Marín, Oscar; Markram, Henry; McBain, Chris J; Meyer, Hanno S; Monyer, Hannah; Nelson, Sacha B; Rockland, Kathleen; Rossier, Jean; Rubenstein, John L R; Rudy, Bernardo; Scanziani, Massimo; Shepherd, Gordon M; Sherwood, Chet C; Staiger, Jochen F; Tamás, Gábor; Thomson, Alex; Wang, Yun; Yuste, Rafael; Ascoli, Giorgio A
2013-03-01
A systematic classification and accepted nomenclature of neuron types is much needed but is currently lacking. This article describes a possible taxonomical solution for classifying GABAergic interneurons of the cerebral cortex based on a novel, web-based interactive system that allows experts to classify neurons with pre-determined criteria. Using Bayesian analysis and clustering algorithms on the resulting data, we investigated the suitability of several anatomical terms and neuron names for cortical GABAergic interneurons. Moreover, we show that supervised classification models could automatically categorize interneurons in agreement with experts' assignments. These results demonstrate a practical and objective approach to the naming, characterization and classification of neurons based on community consensus.
Structure-based classification and ontology in chemistry
2012-01-01
Background Recent years have seen an explosion in the availability of data in the chemistry domain. With this information explosion, however, retrieving relevant results from the available information, and organising those results, become even harder problems. Computational processing is essential to filter and organise the available resources so as to better facilitate the work of scientists. Ontologies encode expert domain knowledge in a hierarchically organised machine-processable format. One such ontology for the chemical domain is ChEBI. ChEBI provides a classification of chemicals based on their structural features and a role or activity-based classification. An example of a structure-based class is 'pentacyclic compound' (compounds containing five-ring structures), while an example of a role-based class is 'analgesic', since many different chemicals can act as analgesics without sharing structural features. Structure-based classification in chemistry exploits elegant regularities and symmetries in the underlying chemical domain. As yet, there has been neither a systematic analysis of the types of structural classification in use in chemistry nor a comparison to the capabilities of available technologies. Results We analyze the different categories of structural classes in chemistry, presenting a list of patterns for features found in class definitions. We compare these patterns of class definition to tools which allow for automation of hierarchy construction within cheminformatics and within logic-based ontology technology, going into detail in the latter case with respect to the expressive capabilities of the Web Ontology Language and recent extensions for modelling structured objects. Finally we discuss the relationships and interactions between cheminformatics approaches and logic-based approaches. 
Conclusion Systems that perform intelligent reasoning tasks on chemistry data require a diverse set of underlying computational utilities including algorithmic, statistical and logic-based tools. For the task of automatic structure-based classification of chemical entities, essential to managing the vast swathes of chemical data being brought online, systems which are capable of hybrid reasoning combining several different approaches are crucial. We provide a thorough review of the available tools and methodologies, and identify areas of open research. PMID:22480202
Image aesthetic quality evaluation using convolution neural network embedded learning
NASA Astrophysics Data System (ADS)
Li, Yu-xin; Pu, Yuan-yuan; Xu, Dan; Qian, Wen-hua; Wang, Li-peng
2017-11-01
An embedded-learning convolutional neural network (ELCNN) based on image content is proposed in this paper to evaluate image aesthetic quality. Our approach can not only cope with the problem of small-scale data but also score image aesthetic quality. First, we compared AlexNet and VGG_S to confirm which is more suitable for this image aesthetic quality evaluation task. Second, to further boost aesthetic quality classification performance, we employ the image content to train aesthetic quality classification models; however, this makes the training samples smaller, and a single fine-tuning pass cannot make full use of the small-scale data set. Third, to solve this problem, the network is fine-tuned twice in succession, based on the aesthetic quality label and the content label respectively, and the classification probability of the trained CNN models is used to evaluate image aesthetic quality. The experiments are carried out on the small-scale Photo Quality data set. The experimental results show that the classification accuracy rates of our approach are higher than those of existing image aesthetic quality evaluation approaches.
Evaluating multimedia chemical persistence: Classification and regression tree analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bennett, D.H.; McKone, T.E.; Kastenberg, W.E.
2000-04-01
For the thousands of chemicals continuously released into the environment, it is desirable to make prospective assessments of those likely to be persistent. Widely distributed persistent chemicals are impossible to remove from the environment and remediation by natural processes may take decades, which is problematic if adverse health or ecological effects are discovered after prolonged release into the environment. A tiered approach using a classification scheme and a multimedia model for determining persistence is presented. Using specific criteria for persistence, a classification tree is developed to classify a chemical as persistent or nonpersistent based on the chemical properties. In this approach, the classification is derived from the results of a standardized unit world multimedia model. Thus, the classifications are more robust for multimedia pollutants than classifications using a single medium half-life. The method can be readily implemented and provides insight without requiring extensive and often unavailable data. This method can be used to classify chemicals when only a few properties are known and can be used to direct further data collection. Case studies are presented to demonstrate the advantages of the approach.
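A classification tree of the kind described reduces to a handful of threshold splits on chemical properties. The property names and thresholds below are invented for illustration and are not the paper's fitted tree:

```python
# Illustrative two-split decision tree in the spirit of the persistence
# classification scheme (hypothetical thresholds, not the paper's values).

def classify_persistence(half_life_days, log_kow):
    """Screen a chemical as persistent/non-persistent from two properties."""
    if half_life_days > 60:
        return "persistent"              # slow degradation alone suffices
    if half_life_days > 30 and log_kow > 5:
        return "persistent"              # moderate half-life + strong sorption
    return "non-persistent"

result = classify_persistence(half_life_days=45, log_kow=6.2)
```

The appeal of such a tree is exactly what the abstract notes: it can be applied when only a few properties are known, and the splits indicate which missing property to measure next.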
NASA Astrophysics Data System (ADS)
Guo, Yiqing; Jia, Xiuping; Paull, David
2018-06-01
The explosive availability of remote sensing images has challenged supervised classification algorithms such as Support Vector Machines (SVM), as training samples tend to be highly limited due to the expensive and laborious task of ground truthing. The temporal correlation and spectral similarity between multitemporal images have opened up an opportunity to alleviate this problem. In this study, an SVM-based Sequential Classifier Training (SCT-SVM) approach is proposed for multitemporal remote sensing image classification. The approach leverages the classifiers of previous images to reduce the number of training samples required to train the classifier for an incoming image. For each incoming image, a rough classifier is first predicted based on the temporal trend of a set of previous classifiers. The predicted classifier is then fine-tuned into a more accurate position with current training samples. This approach can be applied progressively to sequential image data, with only a small number of training samples being required from each image. Experiments were conducted with Sentinel-2A multitemporal data over an agricultural area in Australia. Results showed that the proposed SCT-SVM achieved better classification accuracies compared with two state-of-the-art model transfer algorithms. When training data are insufficient, the overall classification accuracy of the incoming image was improved from 76.18% to 94.02% with the proposed SCT-SVM, compared with that obtained without assistance from previous images. These results demonstrate that leveraging a priori information from previous images can provide advantageous assistance for later images in multitemporal image classification.
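The "rough classifier" prediction step can be sketched as extrapolating the next classifier's parameters from the trend of previous ones. Linear extrapolation from the two most recent weight vectors is a deliberate simplification of the temporal-trend idea, not the paper's exact procedure:

```python
# Sketch: predict a rough classifier for the incoming image by linearly
# extrapolating the weight vectors of the two most recent classifiers.
# Fine-tuning with the current image's few training samples would follow.

def predict_next_weights(history):
    """history: list of weight vectors for previous images, oldest first."""
    w_prev, w_last = history[-2], history[-1]
    return [2 * b - a for a, b in zip(w_prev, w_last)]  # linear extrapolation

history = [[1.0, 0.0],   # classifier weights for image t-2
           [1.5, 0.5]]   # classifier weights for image t-1
rough = predict_next_weights(history)  # starting point before fine-tuning
```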
Multi-Temporal Land Cover Classification with Long Short-Term Memory Neural Networks
NASA Astrophysics Data System (ADS)
Rußwurm, M.; Körner, M.
2017-05-01
Land cover classification (LCC) is a central and wide field of research in earth observation and has already put forth a variety of classification techniques. Many approaches are based on classification techniques considering observation at certain points in time. However, some land cover classes, such as crops, change their spectral characteristics due to environmental influences and can thus not be monitored effectively with classical mono-temporal approaches. Nevertheless, these temporal observations should be utilized to benefit the classification process. After extensive research has been conducted on modeling temporal dynamics by spectro-temporal profiles using vegetation indices, we propose a deep learning approach to utilize these temporal characteristics for classification tasks. In this work, we show how long short-term memory (LSTM) neural networks can be employed for crop identification purposes with SENTINEL 2A observations from large study areas and label information provided by local authorities. We compare these temporal neural network models, i.e., LSTM and recurrent neural network (RNN), with a classical non-temporal convolutional neural network (CNN) model and an additional support vector machine (SVM) baseline. With our rather straightforward LSTM variant, we exceeded state-of-the-art classification performance, thus opening promising potential for further research.
Scheme, Erik J; Englehart, Kevin B
2013-07-01
When controlling a powered upper limb prosthesis it is important not only to know how to move the device, but also when not to move it. A novel approach to pattern recognition control, using a selective multiclass one-versus-one classification scheme, has been shown to be capable of rejecting unintended motions. This method was shown to outperform other popular classification schemes when presented with muscle contractions that did not correspond to desired actions. In this work, a 3-D Fitts' law test is proposed as a suitable alternative to virtual limb environments for evaluating real-time myoelectric control performance. The test is used to compare the selective approach to a state-of-the-art linear discriminant analysis classification based scheme. The framework is shown to obey Fitts' law for both control schemes, producing linear regression fits with high coefficients of determination (R² > 0.936). Additional performance metrics focused on quality of control are discussed and incorporated in the evaluation. Using this framework, the selective classification based scheme is shown to produce significantly higher efficiency and completion rates, and significantly lower overshoot and stopping distances, with no significant difference in throughput.
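The throughput metric in such Fitts'-law tests is computed from an index of difficulty and a movement time. Below is the standard Shannon formulation; the paper's exact metric set may differ:

```python
# Fitts'-law index of difficulty and throughput (Shannon formulation).
import math

def index_of_difficulty(distance, width):
    """ID in bits: log2(D/W + 1) for target distance D and width W."""
    return math.log2(distance / width + 1)

def throughput(distance, width, movement_time):
    """Bits per second for one completed target acquisition."""
    return index_of_difficulty(distance, width) / movement_time

ID = index_of_difficulty(distance=70, width=10)            # 3.0 bits
tp = throughput(distance=70, width=10, movement_time=1.5)  # 2.0 bits/s
```

A control scheme obeys Fitts' law when movement time regresses linearly on ID, which is what the reported R² values quantify.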
Evaluating Chemical Persistence in a Multimedia Environment: A CART Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bennett, D.H.; McKone, T.E.; Kastenberg, W.E.
1999-02-01
For the thousands of chemicals continuously released into the environment, it is desirable to make prospective assessments of those likely to be persistent. Persistent chemicals are difficult to remove if adverse health or ecological effects are later discovered. A tiered approach using a classification scheme and a multimedia model for determining persistence is presented. Using specific criteria for persistence, a classification tree is developed to classify a chemical as "persistent" or "non-persistent" based on the chemical properties. In this approach, the classification is derived from the results of a standardized unit world multimedia model. Thus, the classifications are more robust for multimedia pollutants than classifications using a single medium half-life. The method can be readily implemented and provides insight without requiring extensive and often unavailable data. This method can be used to classify chemicals when only a few properties are known and be used to direct further data collection. Case studies are presented to demonstrate the advantages of the approach.
Automated Detection of a Crossing Contact Based on Its Doppler Shift
2009-03-01
contacts in passive sonar systems. A common approach is the application of high-gain processing followed by successive classification criteria. ... RESEARCH MOTIVATION: The trade-off between the false alarm and detection probability is fundamental in radar and sonar (Chevalier, 2002). A common ...
Improved EEG Event Classification Using Differential Energy.
Harati, A; Golmohammadi, M; Lopez, S; Obeid, I; Picone, J
2015-12-01
Feature extraction for automatic classification of EEG signals typically relies on time-frequency representations of the signal. Techniques such as cepstral-based filter banks or wavelets are popular analysis techniques in many signal processing applications, including EEG classification. In this paper, we present a comparison of a variety of approaches to estimating and postprocessing features. To further aid in the discrimination of periodic signals from aperiodic signals, we add a differential energy term. We evaluate our approaches on the TUH EEG Corpus, which is the largest publicly available EEG corpus and an exceedingly challenging task due to the clinical nature of the data. We demonstrate that a variant of a standard filter bank-based approach, coupled with first and second derivatives, provides a substantial reduction in the overall error rate. The combination of differential energy and derivatives produces a 24% absolute reduction in the error rate and improves our ability to discriminate between signal events and background noise. This relatively simple approach proves to be comparable to other popular feature extraction approaches such as wavelets, but is much more computationally efficient.
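The differential-energy idea can be sketched as a per-frame energy sequence plus a range term (max minus min energy over a window) that separates bursty, high-contrast events from steady background. Frame and window sizes below are illustrative, and the paper computes this per filter-bank channel:

```python
# Sketch of a differential energy feature: frame energies, then the
# spread between the most and least energetic frames in a window.

def frame_energies(signal, frame_len):
    """Mean squared amplitude per non-overlapping frame."""
    return [sum(v * v for v in signal[i:i + frame_len]) / frame_len
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def differential_energy(energies):
    return max(energies) - min(energies)

burst = [0, 0, 0, 0, 4, 4, 4, 4]    # energy jumps between frames
steady = [2, 2, 2, 2, 2, 2, 2, 2]   # constant energy
d_burst = differential_energy(frame_energies(burst, 4))
d_steady = differential_energy(frame_energies(steady, 4))
```

A signal event yields a large differential term while stationary background yields a value near zero, which is the discrimination the abstract describes.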
ERIC Educational Resources Information Center
Chen, Pei-Hua; Chang, Hua-Hua; Wu, Haiyan
2012-01-01
Two sampling-and-classification-based procedures were developed for automated test assembly: the Cell Only and the Cell and Cube methods. A simulation study based on a 540-item bank was conducted to compare the performance of the procedures with the performance of a mixed-integer programming (MIP) method for assembling multiple parallel test…
The fallopian canal: a comprehensive review and proposal of a new classification.
Mortazavi, M M; Latif, B; Verma, K; Adeeb, N; Deep, A; Griessenauer, C J; Tubbs, R S; Fukushima, T
2014-03-01
The facial nerve follows a complex course through the skull base. Understanding its anatomy is crucial during standard skull base approaches and resection of certain skull base tumors closely related to the nerve, especially, tumors at the cerebellopontine angle. Herein, we review the fallopian canal and its implications in surgical approaches to the skull base. Furthermore, we suggest a new classification. Based on the anatomy and literature, we propose that the meatal segment of the facial nerve be included as a component of the fallopian canal. A comprehensive knowledge of the course of the facial nerve is important to those who treat patients with pathology of or near this cranial nerve.
The minimum distance approach to classification
NASA Technical Reports Server (NTRS)
Wacker, A. G.; Landgrebe, D. A.
1971-01-01
Work to advance the state of the art of minimum distance classification is reported. This is accomplished through a combination of theoretical and comprehensive experimental investigations based on multispectral scanner data. A survey of the literature for suitable distance measures was conducted and the results of this survey are presented. It is shown that minimum distance classification, using density estimators and Kullback-Leibler numbers as the distance measure, is equivalent to a form of maximum likelihood sample classification. It is also shown that for the parametric case, minimum distance classification is equivalent to nearest neighbor classification in the parameter space.
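In its simplest parametric form, minimum distance classification assigns a sample to the class whose mean vector is nearest. The sketch below uses Euclidean distance and invented spectral values for illustration; the report's treatment covers more general distance measures:

```python
# Minimal minimum-distance (nearest class mean) classifier.

def class_mean(samples):
    """Component-wise mean of a list of feature vectors."""
    n, dims = len(samples), len(samples[0])
    return [sum(s[d] for s in samples) / n for d in range(dims)]

def min_distance_classify(x, class_means):
    """Return the class whose mean is nearest to x (squared Euclidean)."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(class_means, key=lambda c: dist2(x, class_means[c]))

# Toy two-band spectral data for two classes.
means = {"water": class_mean([[0.1, 0.2], [0.2, 0.2]]),
         "soil":  class_mean([[0.8, 0.7], [0.9, 0.9]])}
label = min_distance_classify([0.15, 0.25], means)
```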
NASA Astrophysics Data System (ADS)
Talai, Sahand; Boelmans, Kai; Sedlacik, Jan; Forkert, Nils D.
2017-03-01
Parkinsonian syndromes encompass a spectrum of neurodegenerative diseases, which can be classified into various subtypes. The differentiation of these subtypes is typically conducted based on clinical criteria. Due to the overlap of intra-syndrome symptoms, accurate differential diagnosis based on clinical guidelines remains a challenge, with failure rates of up to 25%. The aim of this study is to present an image-based classification method for patients with Parkinson's disease (PD) and patients with progressive supranuclear palsy (PSP), an atypical variant of PD. To this end, apparent diffusion coefficient (ADC) parameter maps were calculated based on diffusion-tensor magnetic resonance imaging (MRI) datasets. Mean ADC values were determined in 82 brain regions using an atlas-based approach. The extracted mean ADC values for each patient were then used as features for classification using a linear kernel support vector machine classifier. To increase the classification accuracy, a feature selection was performed, which resulted in the top 17 attributes being used as the final input features. A leave-one-out cross validation based on 56 PD and 21 PSP subjects revealed that the proposed method is capable of differentiating PD and PSP patients with an accuracy of 94.8%. In conclusion, the classification of PD and PSP patients based on ADC features obtained from diffusion MRI datasets is a promising new approach for the differentiation of Parkinsonian syndromes in the broader context of decision support systems.
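The evaluation pipeline above (17 region-wise features, linear-kernel SVM, leave-one-out cross validation) can be sketched with scikit-learn. The ADC values below are synthetic stand-ins; only the cohort sizes (56 PD, 21 PSP) and feature count mirror the abstract.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Synthetic stand-in for mean ADC values in 17 selected brain regions:
# 56 "PD" and 21 "PSP" subjects, mirroring the cohort sizes in the abstract.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.8, 0.1, (56, 17)),   # PD-like ADC profiles
               rng.normal(1.1, 0.1, (21, 17))])  # PSP-like (elevated ADC)
y = np.array([0] * 56 + [1] * 21)

clf = SVC(kernel="linear")                       # linear-kernel SVM, as in the study
scores = cross_val_score(clf, X, y, cv=LeaveOneOut())
loocv_accuracy = scores.mean()
```

Each of the 77 folds trains on 76 subjects and tests on the one held out, which is the appropriate scheme for such a small cohort.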
NASA Astrophysics Data System (ADS)
Goetz-Weiss, L. R.; Herzfeld, U. C.; Trantow, T.; Hunke, E. C.; Maslanik, J. A.; Crocker, R. I.
2016-12-01
An important problem in model-data comparison is the identification of parameters that can be extracted from observational data as well as used in numerical models, which are typically based on idealized physical processes. Here, we present a suite of approaches to characterization and classification of sea ice and land ice types, properties and provinces based on several types of remote-sensing data. Applications are given not only to illustrate the approach, but to employ it in model evaluation and in understanding physical processes. (1) In a geostatistical characterization, spatial sea-ice properties in the Chukchi and Beaufort Sea and in Elson Lagoon are derived from analysis of RADARSAT and ERS-2 SAR data. (2) The analysis is taken further by utilizing multi-parameter feature vectors as inputs for unsupervised and supervised statistical classification, which facilitates classification of different sea-ice types. (3) Characteristic sea-ice parameters, as resultant from the classification, can then be applied in model evaluation, as demonstrated for the ridging scheme of the Los Alamos sea ice model, CICE, using high-resolution altimeter and image data collected from unmanned aircraft over Fram Strait during the Characterization of Arctic Sea Ice Experiment (CASIE). The characteristic parameters chosen in this application are directly related to deformation processes, which also underlie the ridging scheme. (4) The method that is capable of the most complex classification tasks is the connectionist-geostatistical classification method. This approach has been developed to identify currently up to 18 different crevasse types in order to map progression of the surge through the complex Bering-Bagley Glacier System, Alaska, in 2011-2014. The analysis utilizes airborne altimeter data, video image data, and satellite image data. Results of the crevasse classification are compared with fracture modeling and found to match.
Quality Evaluation of Land-Cover Classification Using Convolutional Neural Network
NASA Astrophysics Data System (ADS)
Dang, Y.; Zhang, J.; Zhao, Y.; Luo, F.; Ma, W.; Yu, F.
2018-04-01
Land-cover classification is one of the most important products of earth observation. It focuses mainly on profiling the physical characteristics of the land surface with temporal and distribution attributes, and contains information on both natural and man-made coverage elements, such as vegetation, soil, glaciers, rivers, lakes, marsh wetlands and various man-made structures. In recent years, the amount of high-resolution remote sensing data has increased sharply. Accordingly, the volume of land-cover classification products has grown, as has the need to evaluate such frequently updated products, which is a considerable challenge. Conventionally, the automatic quality evaluation of land-cover classification is made through pixel-based classification algorithms, which makes the task much trickier and consequently hard to keep pace with the required updating frequency. In this paper, we propose a novel quality evaluation approach for land-cover classification using a scene classification method based on a Convolutional Neural Network (CNN) model. By learning from remote sensing data, the randomly generated kernels that serve as filter matrices evolve into operators with functions similar to hand-crafted operators, such as the Sobel or Canny operator, while other kernels learned by the CNN model are much more complex and cannot be interpreted as existing filters. The method, with a CNN as its core algorithm, serves quality-evaluation tasks well since it calculates a set of outputs that directly represent the image's membership grade in certain classes. An automatic quality evaluation approach for the land-cover DLG-DOM coupling data (DLG for Digital Line Graphic, DOM for Digital Orthophoto Map) is introduced in this paper. The robustness of the CNN model for image evaluation motivated the idea of an automatic quality evaluation approach for land-cover classification.
Based on this experiment, new ideas of quality evaluation of DLG-DOM coupling land-cover classification or other kinds of labelled remote sensing data can be further studied.
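The claim that learned CNN kernels can approximate hand-crafted edge operators is easy to make concrete: below is a minimal NumPy convolution of a step edge with the Sobel operator, the kind of filter the abstract says some kernels converge toward. The image is a toy example, not data from the paper.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Minimal 2-D 'valid'-mode cross-correlation with a 3x3 kernel."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(image[i:i + 3, j:j + 3] * kernel)
    return out

# Sobel operator: a hand-crafted edge filter that randomly initialised
# CNN kernels can evolve to approximate during training.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

# A vertical step edge: left half dark, right half bright.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
response = conv2d_valid(image, sobel_x)
# The response peaks along the edge columns and is zero in flat regions.
```

A trained CNN applies many such filters in parallel and combines their responses into class-membership scores, which is what makes it usable for quality evaluation.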
Classification via Clustering for Predicting Final Marks Based on Student Participation in Forums
ERIC Educational Resources Information Center
Lopez, M. I.; Luna, J. M.; Romero, C.; Ventura, S.
2012-01-01
This paper proposes a classification via clustering approach to predict the final marks in a university course on the basis of forum data. The objective is twofold: to determine if student participation in the course forum can be a good predictor of the final marks for the course and to examine whether the proposed classification via clustering…
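Classification via clustering, as named in the abstract, typically means clustering the data and labelling each cluster by the majority class of its members. A minimal scikit-learn sketch follows; the two "forum participation" features and pass/fail split are invented for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def classify_via_clustering(X_train, y_train, X_test, n_clusters):
    """Cluster the data, label each cluster by the majority class of its
    training members, and let test samples inherit their cluster's label."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X_train)
    cluster_label = {}
    for c in range(n_clusters):
        members = y_train[km.labels_ == c]
        cluster_label[c] = np.bincount(members).argmax() if len(members) else 0
    return np.array([cluster_label[c] for c in km.predict(X_test)])

# Toy stand-in for forum-participation features (messages sent, threads read):
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([2, 5], 0.5, (40, 2)),     # low participation -> fail
               rng.normal([10, 30], 0.5, (40, 2))])  # high participation -> pass
y = np.array([0] * 40 + [1] * 40)

pred = classify_via_clustering(X, y, X, n_clusters=2)
accuracy = (pred == y).mean()
```

The appeal of this scheme is that the cluster structure itself is informative: if clusters align poorly with final marks, participation is a weak predictor.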
Three-Way Analysis of Spectrospatial Electromyography Data: Classification and Interpretation
Kauppi, Jukka-Pekka; Hahne, Janne; Müller, Klaus-Robert; Hyvärinen, Aapo
2015-01-01
Classifying multivariate electromyography (EMG) data is an important problem in prosthesis control as well as in neurophysiological studies and diagnosis. With modern high-density EMG sensor technology, it is possible to capture the rich spectrospatial structure of the myoelectric activity. We hypothesize that multi-way machine learning methods can efficiently utilize this structure in classification as well as reveal interesting patterns in it. To this end, we investigate the suitability of existing three-way classification methods to EMG-based hand movement classification in the spectrospatial domain, as well as extend these methods by sparsification and regularization. We propose to use Fourier-domain independent component analysis as preprocessing to improve classification and interpretability of the results. In high-density EMG experiments on hand movements across 10 subjects, three-way classification yielded higher average performance compared with state-of-the-art classification based on temporal features, suggesting that the three-way analysis approach can efficiently utilize detailed spectrospatial information of high-density EMG. Phase and amplitude patterns of features selected by the classifier in finger-movement data were found to be consistent with known physiology. Thus, our approach can accurately resolve hand and finger movements on the basis of detailed spectrospatial information, and at the same time allows for physiological interpretation of the results. PMID:26039100
NASA Astrophysics Data System (ADS)
Saran, Sameer; Sterk, Geert; Kumar, Suresh
2007-10-01
Land use/cover is an important watershed surface characteristic that affects surface runoff and erosion. Many of the available hydrological models divide the watershed into Hydrological Response Units (HRUs), which are spatial units with expected similar hydrological behaviour. The division into HRUs requires good-quality spatial data on land use/cover. This paper presents different approaches to attaining an optimal land use/cover map based on remote sensing imagery for a Himalayan watershed in northern India. First, digital classifications using a maximum likelihood classifier (MLC) and a decision tree classifier were applied. The results obtained from the decision tree were better and improved further after post-classification sorting. But the obtained land use/cover map was not sufficient for the delineation of HRUs, since the agricultural land use/cover class did not discriminate between the two major crops in the area, i.e. paddy and maize. Therefore we adopted a visual classification approach using optical data alone and also fused with ENVISAT ASAR data. This second step, with a detailed classification system, resulted in better classification accuracy within the 'agricultural land' class, which will be further combined with topography and soil type to derive HRUs for physically-based hydrological modelling.
Model-based Clustering of High-Dimensional Data in Astrophysics
NASA Astrophysics Data System (ADS)
Bouveyron, C.
2016-05-01
The nature of data in Astrophysics has changed, as in other scientific fields, in the past decades due to the increase of the measurement capabilities. As a consequence, data are nowadays frequently of high dimensionality and available in mass or stream. Model-based techniques for clustering are popular tools which are renowned for their probabilistic foundations and their flexibility. However, classical model-based techniques show a disappointing behavior in high-dimensional spaces which is mainly due to their dramatical over-parametrization. The recent developments in model-based classification overcome these drawbacks and allow to efficiently classify high-dimensional data, even in the "small n / large p" situation. This work presents a comprehensive review of these recent approaches, including regularization-based techniques, parsimonious modeling, subspace classification methods and classification methods based on variable selection. The use of these model-based methods is also illustrated on real-world classification problems in Astrophysics using R packages.
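A standard entry point to the model-based clustering reviewed above is fitting Gaussian mixtures of increasing complexity and selecting the number of components by an information criterion such as BIC. The sketch below uses scikit-learn on synthetic three-group data; it illustrates the model-selection idea only, not the high-dimensional regularized variants the review covers.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic data with three well-separated Gaussian groups in 3-D.
rng = np.random.default_rng(7)
X = np.vstack([rng.normal(-4, 1, (200, 3)),
               rng.normal(0, 1, (200, 3)),
               rng.normal(4, 1, (200, 3))])

# Fit mixtures with 1..5 components; BIC trades likelihood against
# the number of parameters, penalising over-parametrized models.
bics = []
for k in range(1, 6):
    gm = GaussianMixture(n_components=k, random_state=0).fit(X)
    bics.append(gm.bic(X))
best_k = int(np.argmin(bics)) + 1
```

In the "small n / large p" regime the review targets, the full-covariance model above is exactly what over-parametrizes; parsimonious and subspace variants constrain the covariances instead.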
NASA Astrophysics Data System (ADS)
Abdullahi, Sahra; Schardt, Mathias; Pretzsch, Hans
2017-05-01
Forest structure at stand level plays a key role for sustainable forest management, since the biodiversity, productivity, growth and stability of the forest can be positively influenced by managing its structural diversity. In contrast to field-based measurements, remote sensing techniques offer a cost-efficient opportunity to collect area-wide information about forest stand structure with high spatial and temporal resolution. Especially Interferometric Synthetic Aperture Radar (InSAR), which facilitates worldwide acquisition of 3-D information independent of weather conditions and illumination, is convenient for capturing forest stand structure. This study proposes an unsupervised two-stage clustering approach for forest structure classification based on height information derived from interferometric X-band SAR data, which was performed in complex temperate forest stands of Traunstein forest (South Germany). In particular, a four-dimensional input data set composed of first-order height statistics was non-linearly projected onto a two-dimensional Self-Organizing Map, spatially ordered according to similarity (based on the Euclidean distance) in the first stage, and classified using the k-means algorithm in the second stage. The study demonstrated that X-band InSAR data exhibits considerable capabilities for forest structure classification. Moreover, the unsupervised classification approach achieved meaningful and reasonable results by means of comparison to aerial imagery and LiDAR data.
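The two-stage scheme (project 4-D height statistics to 2-D, then cluster the projection with k-means) can be sketched as follows. Since a Self-Organizing Map needs a third-party library, PCA stands in here for the non-linear SOM projection purely for illustration; the height statistics and stand types are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Hypothetical first-order height statistics per stand:
# (mean height, height std, a skewness proxy, max height)
rng = np.random.default_rng(4)
young  = rng.normal([5, 1, 0.2, 8],    0.5, (50, 4))
mature = rng.normal([20, 4, -0.1, 30], 0.5, (50, 4))
mixed  = rng.normal([12, 6, 0.5, 25],  0.5, (50, 4))
X = np.vstack([young, mature, mixed])

# Stage 1: project the 4-D feature space to 2-D (PCA as a stand-in for the SOM).
embedding = PCA(n_components=2).fit_transform(X)

# Stage 2: cluster the 2-D projection with k-means.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(embedding)
```

The two-stage design keeps the clustering step cheap and lets the projection absorb most of the structure in the data; with a real SOM, the map's topology additionally orders similar stands next to each other.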
Fast detection of tobacco mosaic virus infected tobacco using laser-induced breakdown spectroscopy
NASA Astrophysics Data System (ADS)
Peng, Jiyu; Song, Kunlin; Zhu, Hongyan; Kong, Wenwen; Liu, Fei; Shen, Tingting; He, Yong
2017-03-01
Tobacco mosaic virus (TMV) is one of the most devastating viruses to crops, which can cause severe production loss and affect the quality of products. In this study, we have proposed a novel approach to discriminate TMV-infected tobacco based on laser-induced breakdown spectroscopy (LIBS). Two different kinds of tobacco samples (fresh leaves and dried leaf pellets) were collected for spectral acquisition, and partial least squares discriminant analysis (PLS-DA) was used to establish classification models based on the full spectrum and on observed emission lines. The influences of moisture content on spectral profile, signal stability and plasma parameters (temperature and electron density) were also analysed. The results revealed that moisture content in fresh tobacco leaves would worsen the stability of the analysis and have a detrimental effect on the classification results. Good classification results were achieved based on the data from both the full spectrum and observed emission lines of dried leaves, approaching 97.2% and 88.9% in the prediction set, respectively. In addition, support vector machine (SVM) could improve the classification results and eliminate influences of moisture content. The preliminary results indicate that LIBS coupled with chemometrics could provide a fast, efficient and low-cost approach for TMV-infected disease detection in tobacco leaves.
Identification of an Efficient Gene Expression Panel for Glioblastoma Classification
Zelaya, Ivette; Laks, Dan R.; Zhao, Yining; Kawaguchi, Riki; Gao, Fuying; Kornblum, Harley I.; Coppola, Giovanni
2016-01-01
We present here a novel genetic algorithm-based random forest (GARF) modeling technique that enables a reduction in the complexity of large gene disease signatures to highly accurate, greatly simplified gene panels. When applied to 803 glioblastoma multiforme samples, this method allowed the 840-gene Verhaak et al. gene panel (the standard in the field) to be reduced to a 48-gene classifier, while retaining 90.91% classification accuracy, and outperforming the best available alternative methods. Additionally, using this approach we produced a 32-gene panel which allows for better consistency between RNA-seq and microarray-based classifications, improving cross-platform classification retention from 69.67% to 86.07%. A webpage producing these classifications is available at http://simplegbm.semel.ucla.edu. PMID:27855170
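The GARF idea of shrinking a large gene signature to a compact panel can be sketched as a genetic algorithm over binary feature masks whose fitness is random-forest cross-validation accuracy minus a size penalty. Everything below (data, gene count, GA settings) is a made-up toy, not the authors' implementation; only two of the 20 "genes" carry signal.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_genes = 20
X = rng.normal(0, 1, (120, n_genes))
y = (X[:, 2] + X[:, 7] > 0).astype(int)     # only genes 2 and 7 carry signal

def fitness(mask):
    """RF cross-validation accuracy, with a small penalty per kept gene."""
    if mask.sum() == 0:
        return 0.0
    rf = RandomForestClassifier(n_estimators=25, random_state=0)
    acc = cross_val_score(rf, X[:, mask.astype(bool)], y, cv=3).mean()
    return acc - 0.01 * mask.sum()          # favour compact panels

pop = rng.integers(0, 2, (12, n_genes))     # population of binary masks
for generation in range(10):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-6:]]          # elitist selection
    children = parents[rng.integers(0, 6, size=6)].copy()
    flips = rng.random(children.shape) < 0.05       # point mutations
    children[flips] = 1 - children[flips]
    pop = np.vstack([parents, children])

best_mask = max(pop, key=fitness)
best_score = fitness(best_mask)
```

The size penalty is what drives the reduction from a large signature toward a small panel while holding classification accuracy.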
Hierarchical vs non-hierarchical audio indexation and classification for video genres
NASA Astrophysics Data System (ADS)
Dammak, Nouha; BenAyed, Yassine
2018-04-01
In this paper, Support Vector Machines (SVMs) are used for segmenting and indexing video genres based only on audio features extracted at block level, which has the prominent asset of capturing local temporal information. The main contribution of our study is to show the strong effect on classification accuracy of using a hierarchical categorization structure based on the Mel Frequency Cepstral Coefficients (MFCC) audio descriptor. In fact, the classification covers three common video genres: sports videos, music clips and news scenes. The sub-classification may divide each genre into several multi-speaker and multi-dialect sub-genres. The validation of this approach was carried out on over 360 minutes of video, yielding a classification accuracy of over 99%.
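The hierarchical structure described above (a top-level SVM for coarse genres, then a per-genre SVM for sub-genres) can be sketched with scikit-learn. The 13-dimensional features below are synthetic stand-ins for MFCC blocks, and the two genres with two sub-genres each are invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(5)

def make_block(mean, n=30, d=13):
    """Synthetic stand-in for a block of 13 MFCC-like features."""
    return rng.normal(mean, 0.3, (n, d))

X = np.vstack([make_block(0), make_block(1),    # genre 0: two sub-genres
               make_block(4), make_block(5)])   # genre 1: two sub-genres
genre = np.array([0] * 60 + [1] * 60)
sub = np.array([0] * 30 + [1] * 30 + [2] * 30 + [3] * 30)

top = SVC(kernel="rbf").fit(X, genre)           # top-level genre classifier
experts = {g: SVC(kernel="rbf").fit(X[genre == g], sub[genre == g])
           for g in (0, 1)}                     # one refining SVM per genre

def predict_hierarchical(Xq):
    g = top.predict(Xq)
    out = np.empty(len(Xq), dtype=int)
    for gi in (0, 1):
        m = g == gi
        if m.any():
            out[m] = experts[gi].predict(Xq[m])
    return out

accuracy = (predict_hierarchical(X) == sub).mean()
```

Each expert only has to separate sub-genres within one genre, which is the usual argument for hierarchical over flat classification.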
NASA Astrophysics Data System (ADS)
Fan, Jiayuan; Tan, Hui Li; Toomik, Maria; Lu, Shijian
2016-10-01
Spatial pyramid matching has demonstrated its power for image recognition tasks by pooling features from spatially increasingly fine sub-regions. Motivated by the concept of feature pooling at multiple pyramid levels, we propose a novel spectral-spatial hyperspectral image classification approach using superpixel-based spatial pyramid representation. This technique first generates multiple superpixel maps by decreasing the superpixel number gradually along with the increased spatial regions for labelled samples. Using every superpixel map, sparse representations of pixels within every spatial region are then computed through local max pooling. Finally, features learned from training samples are aggregated and trained by a support vector machine (SVM) classifier. The proposed spectral-spatial hyperspectral image classification technique has been evaluated on two public hyperspectral datasets: the Indian Pines image, containing 16 different agricultural scene categories at a 20 m resolution acquired by AVIRIS, and the University of Pavia image, containing 9 land-use categories at a 1.3 m spatial resolution acquired by the ROSIS-03 sensor. Experimental results show significantly improved performance compared with state-of-the-art works. The major contributions of this technique are (1) a new spectral-spatial classification approach for generating feature representations of hyperspectral images, (2) a complementary yet effective feature pooling approach, i.e. the superpixel-based spatial pyramid representation used for the spatial correlation study, and (3) evaluation on two public hyperspectral image datasets with superior image classification performance.
Iliyasu, Abdullah M; Fatichah, Chastine
2017-12-19
A quantum hybrid (QH) intelligent approach that blends the adaptive search capability of the quantum-behaved particle swarm optimisation (QPSO) method with the intuitionistic rationality of the traditional fuzzy k-nearest neighbours (Fuzzy k-NN) algorithm (known simply as the Q-Fuzzy approach) is proposed for efficient feature selection and classification of cells in cervical smear (CS) images. From an initial multitude of 17 features describing the geometry, colour, and texture of the CS images, the QPSO stage of our proposed technique is used to select the best subset of features (i.e., global best particles), a pruned-down collection of seven features. Using a dataset of almost 1000 images, performance evaluation of our proposed Q-Fuzzy approach assesses the impact of our feature selection on classification accuracy by way of three experimental scenarios that are compared alongside two other approaches: the All-features approach (i.e., classification without prior feature selection) and another hybrid technique combining the standard PSO algorithm with the Fuzzy k-NN technique (the P-Fuzzy approach). In the first and second scenarios, we further divided the assessment criteria into classification accuracy based on the choice of best features and accuracy across the different categories of cervical cells. In the third scenario, we introduced new QH hybrid techniques, i.e., QPSO combined with other supervised learning methods, and compared the classification accuracy alongside our proposed Q-Fuzzy approach. Furthermore, we employed statistical approaches to establish qualitative agreement with regard to the feature selection in experimental scenarios 1 and 3. The synergy between QPSO and Fuzzy k-NN in the proposed Q-Fuzzy approach improves classification accuracy, as manifested in the reduction in the number of cell features, which is crucial for effective cervical cancer detection and diagnosis.
Consensus Classification Using Non-Optimized Classifiers.
Brownfield, Brett; Lemos, Tony; Kalivas, John H
2018-04-03
Classifying samples into categories is a common problem in analytical chemistry and other fields. Classification is usually based on only one method, but numerous classifiers are available, some complex, such as neural networks, and others simple, such as k-nearest neighbors. Regardless, most classification schemes require optimization of one or more tuning parameters for best classification accuracy, sensitivity, and specificity. A process not requiring exact selection of tuning parameter values would be useful. To improve classification, several ensemble approaches have been used in past work to combine classification results from multiple optimized single classifiers. The collection of classifications for a particular sample is then combined by a fusion process such as majority vote to form the final classification. Presented in this Article is a method to classify a sample by combining multiple classification methods without specifically classifying the sample by each method; that is, the classification methods are not optimized. The approach is demonstrated on three analytical data sets. The first is a beer authentication set with samples measured on five instruments, allowing fusion of multiple instruments in three ways. The second data set is composed of textile samples from three classes based on Raman spectra. This data set is used to demonstrate the ability to classify simultaneously with different data preprocessing strategies, thereby reducing the need to determine the ideal preprocessing method, a common prerequisite for accurate classification. The third data set contains three wine cultivars for three classes measured on 13 unique chemical and physical variables. In all cases, fusion of non-optimized classifiers improves classification. Also presented are atypical uses of Procrustes analysis and extended inverted signal correction (EISC) for distinguishing sample similarities to respective classes.
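The fusion-by-majority-vote idea can be sketched on the classic 3-cultivar wine data (13 chemical/physical variables, matching the abstract's third data set). Note this is plain majority-vote fusion of deliberately untuned classifiers, not the authors' consensus method; the member classifiers and split are illustrative choices.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3,
                                      random_state=0, stratify=y)

# Deliberately untuned member classifiers: three k-NN variants, LDA, and
# naive Bayes, each behind a standardisation step.
members = [make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
           for k in (1, 3, 5)]
members += [make_pipeline(StandardScaler(), LinearDiscriminantAnalysis()),
            make_pipeline(StandardScaler(), GaussianNB())]

# Collect each member's vote, then fuse by majority vote per sample.
votes = np.array([m.fit(Xtr, ytr).predict(Xte) for m in members])
fused = np.array([np.bincount(v).argmax() for v in votes.T])
fused_accuracy = (fused == yte).mean()
```

Because no single member's tuning parameter needs to be exactly right, the fused decision tends to be at least as reliable as the typical member.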
NASA Astrophysics Data System (ADS)
Tarando, Sebastian Roberto; Fetita, Catalin; Brillet, Pierre-Yves
2017-03-01
The infiltrative lung diseases are a class of irreversible, non-neoplastic lung pathologies requiring regular follow-up with CT imaging. Quantifying the evolution of the patient status imposes the development of automated classification tools for lung texture. Traditionally, such classification relies on a two-dimensional analysis of axial CT images. This paper proposes a cascade of the existing CNN-based CAD system, specifically tuned up. The advantage of using a deep learning approach is a better regularization of the classification output. In a preliminary evaluation, the combined approach was tested on a 13-patient database of various lung pathologies, showing an increase of 10% in True Positive Rate (TPR) with respect to the best-suited state-of-the-art CNN for this task.
A Partial Least Squares Based Procedure for Upstream Sequence Classification in Prokaryotes.
Mehmood, Tahir; Bohlin, Jon; Snipen, Lars
2015-01-01
The upstream region of coding genes is important for several reasons, for instance for locating transcription factor binding sites and start site initiation in genomic DNA. Motivated by a recently conducted study, where a multivariate approach was successfully applied to coding sequence modeling, we have introduced a partial least squares (PLS) based procedure for the classification of true upstream prokaryotic sequence from background upstream sequence. The upstream sequences of coding genes conserved across genomes were considered in the analysis, where conserved coding genes were found by using the pan-genomics concept for each prokaryotic species considered. PLS uses a position specific scoring matrix (PSSM) to study the characteristics of the upstream region. Results obtained by the PLS based method were compared with the Gini importance of random forest (RF) and with support vector machines (SVM), which are widely used methods for sequence classification. The upstream sequence classification performance was evaluated using cross validation, and the suggested approach identifies the prokaryotic upstream region significantly better than RF (p-value < 0.01) and SVM (p-value < 0.01). Further, the proposed method also produced results that concurred with known biological characteristics of the upstream region.
NASA Astrophysics Data System (ADS)
d'Oleire-Oltmanns, Sebastian; Marzolff, Irene; Tiede, Dirk; Blaschke, Thomas
2015-04-01
The need for area-wide landform mapping approaches, especially in terms of land degradation, can be ascribed to the fact that within area-wide landform mapping approaches, the (spatial) context of erosional landforms is considered by providing additional information on the physiography neighboring the distinct landform. This study presents an approach for the detection of gully-affected areas by applying object-based image analysis in the region of Taroudannt, Morocco, which is highly affected by gully erosion while simultaneously representing a major region of agro-industry with a high demand for arable land. Various sensors provide readily available high-resolution optical satellite data with a much better temporal resolution than 3-D terrain data, which led to the development of an area-wide mapping approach to extract gully-affected areas using only optical satellite imagery. The classification rule-set was developed with a clear focus on virtual spatial independence within the software environment of eCognition Developer. This allows the incorporation of knowledge about the target objects under investigation. Only optical QuickBird-2 satellite data and freely available OpenStreetMap (OSM) vector data were used as input data. The OSM vector data were incorporated in order to mask out plantations and residential areas. Optical input data are more readily available to a broad range of users than terrain data, which is considered a major advantage. The methodology additionally incorporates expert knowledge and freely available vector data in a cyclic object-based image analysis approach. This connects the two fields of geomorphology and remote sensing. The classification results allow conclusions on the current distribution of gullies.
The results of the classification were checked against manually delineated reference data incorporating expert knowledge based on several field campaigns in the area, resulting in an overall classification accuracy of 62%. The error of omission accounts for 38% and the error of commission for 16%, respectively. Additionally, a manual assessment was carried out to assess the quality of the applied classification algorithm. The limited error of omission contributes 23% to the overall error of omission, and the limited error of commission contributes 98% to the overall error of commission. This assessment improves the results and confirms the high quality of the developed approach for area-wide mapping of gully-affected areas in larger regions. In the field of landform mapping, the overall quality of the classification results is often assessed with more than one method to incorporate all aspects adequately.
NASA Astrophysics Data System (ADS)
Pedersen, G. B. M.
2016-02-01
A new object-oriented approach is developed to classify glaciovolcanic landforms (Procedure A) and their landform element boundaries (Procedure B). It utilizes the principle that glaciovolcanic edifices are geomorphometrically distinct from lava shields and plains (Pedersen and Grosse, 2014), and the approach is tested on data from Reykjanes Peninsula, Iceland. The outlined procedures utilize slope and profile curvature attribute maps (20 m/pixel), and the classified results are evaluated quantitatively through error matrix maps (Procedure A) and visual inspection (Procedure B). In Procedure A, the highest obtained accuracy is 94.1%, but even simple mapping procedures provide good results (> 90% accuracy). Successful classification of glaciovolcanic landform element boundaries (Procedure B) is also achieved, and this technique has the potential to delineate the transition from intraglacial to subaerial volcanic activity in orthographic view. This object-oriented approach based on geomorphometry overcomes issues with vegetation cover, which has typically been problematic for classification schemes utilizing spectral data. Furthermore, it handles complex edifice outlines well and is easily incorporated into a GIS environment, where results can be edited or fused with other mapping results. The approach outlined here is designed to map glaciovolcanic edifices within the Icelandic neovolcanic zone but may also be applied to similar subaerial or submarine volcanic settings, where steep volcanic edifices are surrounded by flat plains.
This study applied a phenology-based land-cover classification approach across the Laurentian Great Lakes Basin (GLB) using time-series data consisting of 23 Moderate Resolution Imaging Spectroradiometer (MODIS) Normalized Difference Vegetation Index (NDVI) composite images (250 ...
GPU-Based Point Cloud Superpositioning for Structural Comparisons of Protein Binding Sites.
Leinweber, Matthias; Fober, Thomas; Freisleben, Bernd
2018-01-01
In this paper, we present a novel approach to solve the labeled point cloud superpositioning problem for performing structural comparisons of protein binding sites. The solution is based on a parallel evolution strategy that operates on large populations and runs on GPU hardware. The proposed evolution strategy reduces the likelihood of getting stuck in a local optimum of the multimodal real-valued optimization problem represented by labeled point cloud superpositioning. The performance of the GPU-based parallel evolution strategy is compared to a previously proposed CPU-based sequential approach for labeled point cloud superpositioning, indicating that the GPU-based parallel evolution strategy leads to qualitatively better results and significantly shorter runtimes, with speed improvements of up to a factor of 1,500 for large populations. Binary classification tests based on the ATP, NADH, and FAD protein subsets of CavBase, a database containing putative binding sites, show average classification rate improvements from about 92 percent (CPU) to 96 percent (GPU). Further experiments indicate that the proposed GPU-based labeled point cloud superpositioning approach can be superior to traditional protein comparison approaches based on sequence alignments.
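The core optimization in the abstract, an evolution strategy searching rigid transforms that superpose point clouds, can be sketched in a 2-D toy version: each individual encodes (angle, tx, ty) and fitness is the RMSD after transforming one cloud onto the other. This is a minimal sequential (mu + lambda) ES, not the authors' GPU-parallel, labeled variant; all sizes and rates are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(0, 1, (30, 2))                      # reference point cloud
theta_true = 0.7
R = np.array([[np.cos(theta_true), -np.sin(theta_true)],
              [np.sin(theta_true),  np.cos(theta_true)]])
B = A @ R.T + np.array([1.5, -0.5])                # rotated + shifted copy

def rmsd(params):
    """Root-mean-square deviation of A, transformed by params, from B."""
    th, tx, ty = params
    Rt = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
    return np.sqrt(((A @ Rt.T + [tx, ty] - B) ** 2).mean())

mu, lam, sigma = 5, 30, 0.5
pop = rng.normal(0, 1, (mu, 3))                    # individuals: (angle, tx, ty)
for gen in range(60):
    # Mutation: each child is a Gaussian perturbation of a random parent.
    children = pop[rng.integers(0, mu, lam)] + rng.normal(0, sigma, (lam, 3))
    union = np.vstack([pop, children])
    pop = union[np.argsort([rmsd(p) for p in union])[:mu]]  # (mu+lam) selection
    sigma *= 0.95                                  # slow step-size decay

best_rmsd = rmsd(pop[0])
```

The large populations in the paper serve exactly the purpose visible even in this toy: more children per generation reduce the chance of settling into a local optimum of the multimodal alignment landscape.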
A new epileptic seizure classification based exclusively on ictal semiology.
Lüders, H; Acharya, J; Baumgartner, C; Benbadis, S; Bleasel, A; Burgess, R; Dinner, D S; Ebner, A; Foldvary, N; Geller, E; Hamer, H; Holthausen, H; Kotagal, P; Morris, H; Meencke, H J; Noachtar, S; Rosenow, F; Sakamoto, A; Steinhoff, B J; Tuxhorn, I; Wyllie, E
1999-03-01
Historically, seizure semiology was the main feature in the differential diagnosis of epileptic syndromes. With the development of clinical EEG, the definition of electroclinical complexes became an essential tool to define epileptic syndromes, particularly focal epileptic syndromes. Modern advances in diagnostic technology, particularly in neuroimaging and molecular biology, now permit better definitions of epileptic syndromes. At the same time detailed studies showed that there does not necessarily exist a one-to-one relationship between epileptic seizures or electroclinical complexes and epileptic syndromes. These developments call for the reintroduction of an epileptic seizure classification based exclusively on clinical semiology, similar to the seizure classifications which were used by neurologists before the introduction of the modern diagnostic methods. This classification of epileptic seizures should always be complemented by an epileptic syndrome classification based on all the available clinical information (clinical history, neurological exam, ictal semiology, EEG, anatomical and functional neuroimaging, etc.). Such an approach is more consistent with mainstream clinical neurology and would avoid the current confusion between the classification of epileptic seizures (which in the International Seizure Classification is actually a classification of electroclinical complexes) and the classification of epileptic syndromes.
NASA Astrophysics Data System (ADS)
Krumpe, Tanja; Walter, Carina; Rosenstiel, Wolfgang; Spüler, Martin
2016-08-01
Objective. In this study, the feasibility of detecting a P300 via an asynchronous classification mode in a reactive EEG-based brain-computer interface (BCI) was evaluated. The P300 is one of the most popular BCI control signals and therefore used in many applications, mostly for active communication purposes (e.g. P300 speller). As the majority of all systems work with a stimulus-locked mode of classification (synchronous), the field of applications is limited. A new approach needs to be applied in a setting in which a stimulus-locked classification cannot be used due to the fact that the presented stimuli cannot be controlled or predicted by the system. Approach. A continuous observation task requiring the detection of outliers was implemented to test such an approach. The study was divided into an offline and an online part. Main results. Both parts of the study revealed that an asynchronous detection of the P300 can successfully be used to detect single events with high specificity. It also revealed that no significant difference in performance was found between the synchronous and the asynchronous approach. Significance. The results encourage the use of an asynchronous classification approach in suitable applications without a potential loss in performance.
NASA Astrophysics Data System (ADS)
Dekavalla, Maria; Argialas, Demetre
2017-07-01
The analysis of undersea topography and geomorphological features provides necessary information to related disciplines and many applications. The development of an automated knowledge-based classification approach of undersea topography and geomorphological features is challenging due to their multi-scale nature. The aim of the study is to develop and evaluate an automated knowledge-based OBIA approach to: i) decompose the global undersea topography to multi-scale regions of distinct morphometric properties, and ii) assign the derived regions to characteristic geomorphological features. First, the global undersea topography was decomposed through the SRTM30_PLUS bathymetry data to the so-called morphometric objects of discrete morphometric properties and spatial scales defined by data-driven methods (local variance graphs and nested means) and multi-scale analysis. The derived morphometric objects were combined with additional relative topographic position information computed with a self-adaptive pattern recognition method (geomorphons), and auxiliary data and were assigned to characteristic undersea geomorphological feature classes through a knowledge base, developed from standard definitions. The decomposition of the SRTM30_PLUS data to morphometric objects was considered successful for the requirements of maximizing intra-object and inter-object heterogeneity, based on the near-zero values of Moran's I and the low values of the weighted variance index. The knowledge-based classification approach was tested for its transferability in six case studies of various tectonic settings and achieved the efficient extraction of 11 undersea geomorphological feature classes. The classification results for the six case studies were compared with the digital global seafloor geomorphic features map (GSFM).
The 11 undersea feature classes and their producer's accuracies with respect to the relevant GSFM areas were Basin (95%), Continental Shelf (94.9%), Trough (88.4%), Plateau (78.9%), Continental Slope (76.4%), Trench (71.2%), Abyssal Hill (62.9%), Abyssal Plain (62.4%), Ridge (49.8%), Seamount (48.8%) and Continental Rise (25.4%). The knowledge-based OBIA classification approach was considered transferable, since the percentages of spatial and thematic agreement between most of the classified undersea feature classes and the GSFM exhibited low deviations across the six case studies.
Tan, Joon Liang; Khang, Tsung Fei; Ngeow, Yun Fong; Choo, Siew Woh
2013-12-13
Mycobacterium abscessus is a rapidly growing mycobacterium that is often associated with human infections. The taxonomy of this species has undergone several revisions and is still being debated. In this study, we sequenced the genomes of 12 M. abscessus strains and used phylogenomic analysis to perform subspecies classification. A data mining approach was used to rank and select informative genes based on the relative entropy metric for the construction of a phylogenetic tree. The resulting tree topology was similar to that generated using the concatenation of five classical housekeeping genes: rpoB, hsp65, secA, recA and sodA. Additional support for the reliability of the subspecies classification came from the analysis of erm41 and ITS gene sequences, single nucleotide polymorphism (SNP)-based classification, and strain clustering demonstrated by a variable number tandem repeat (VNTR) assay and a multilocus sequence analysis (MLSA). We subsequently found that the concatenation of a minimal set of three median-ranked genes: DNA polymerase III subunit alpha (polC), 4-hydroxy-2-ketovalerate aldolase (Hoa) and cell division protein FtsZ (ftsZ), is sufficient to recover the same tree topology. PCR assays designed specifically for these genes showed that all three genes could be amplified in the reference strain of M. abscessus ATCC 19977T. This study provides proof of concept that a whole-genome sequence-based data mining approach can provide confirmatory evidence of the phylogenetic informativeness of existing markers, as well as lead to the discovery of a more economical and informative set of markers that produces similar subspecies classification in M. abscessus. The systematic procedure used in this study to choose the informative minimal set of gene markers can potentially be applied to species or subspecies classification of other bacteria.
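The ranking idea can be sketched with a minimal relative-entropy (Kullback-Leibler) score on nucleotide frequencies; the gene names reuse the paper's markers, but the sequences and the uniform background are toy assumptions, not data from the study:

```python
import math
from collections import Counter

def relative_entropy(seq, background=None):
    """KL divergence of a sequence's nucleotide frequencies against a
    background distribution (uniform over A, C, G, T by default)."""
    counts = Counter(seq)
    total = sum(counts.values())
    background = background or {b: 0.25 for b in "ACGT"}
    return sum((c / total) * math.log2((c / total) / background[b])
               for b, c in counts.items())

# Toy "alignments"; in the study, genes are ranked genome-wide by this metric.
genes = {"polC": "ACGTACGTAACC", "ftsZ": "AAAAAACCGGTT", "hoa": "ACGTACGTACGT"}
ranking = sorted(genes, key=lambda g: relative_entropy(genes[g]), reverse=True)
```

A perfectly uniform sequence scores zero, so more compositionally skewed (and here, notionally more informative) genes rank first.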
Understanding similarity of groundwater systems with empirical copulas
NASA Astrophysics Data System (ADS)
Haaf, Ezra; Kumar, Rohini; Samaniego, Luis; Barthel, Roland
2016-04-01
Within the classification framework for groundwater systems that aims at identifying similarity of hydrogeological systems and transferring information from a well-observed to an ungauged system (Haaf and Barthel, 2015; Haaf and Barthel, 2016), we propose a copula-based method for describing groundwater-system similarity. Copulas are an emerging method in the hydrological sciences that make it possible to model the dependence structure of two groundwater level time series, independently of the effects of their marginal distributions. This study is based on Samaniego et al. (2010), which described an approach for calculating dissimilarity measures from bivariate empirical copula densities of streamflow time series. Subsequently, streamflow is predicted in ungauged basins by transferring properties from similar catchments. The proposed approach is innovative because copula-based similarity has not yet been applied to groundwater systems. Here we estimate the pairwise dependence structure of 600 wells in Southern Germany using 10 years of weekly groundwater level observations. Based on these empirical copulas, dissimilarity measures are estimated, such as the copula's lower- and upper-corner cumulated probabilities and copula-based Spearman's rank correlation, as proposed by Samaniego et al. (2010). For the characterization of groundwater systems, copula-based metrics are compared with dissimilarities obtained from precipitation signals corresponding to the presumed area of influence of each groundwater well. This promising approach provides a new tool for advancing similarity-based classification of groundwater system dynamics. Haaf, E., Barthel, R., 2015. Methods for assessing hydrogeological similarity and for classification of groundwater systems on the regional scale, EGU General Assembly 2015, Vienna, Austria. Haaf, E., Barthel, R., 2016.
An approach for classification of hydrogeological systems at the regional scale based on groundwater hydrographs, EGU General Assembly 2016, Vienna, Austria. Samaniego, L., Bardossy, A., Kumar, R., 2010. Streamflow prediction in ungauged catchments using copula-based dissimilarity measures. Water Resources Research, 46. DOI:10.1029/2008wr007695
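A minimal sketch of the empirical-copula machinery, under the assumption that a single lower-corner probability is used as the dissimilarity (the actual study combines several such measures):

```python
def pseudo_obs(x):
    """Rank-transform a series to (0,1): the empirical copula's marginals,
    which strip out the effect of the marginal distributions."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    for r, i in enumerate(order, start=1):
        ranks[i] = r / (len(x) + 1)
    return ranks

def empirical_copula(u, v, a, b):
    """C_n(a, b): fraction of observation pairs with u_i <= a and v_i <= b."""
    return sum(1 for ui, vi in zip(u, v) if ui <= a and vi <= b) / len(u)

def lower_corner_dissimilarity(x, y, a=0.25):
    """Distance of the lower-corner probability from the comonotone case
    C(a, a) = a; the corner level a is an illustrative choice."""
    u, v = pseudo_obs(x), pseudo_obs(y)
    return abs(a - empirical_copula(u, v, a, a))
```

Two wells whose low-water periods coincide score near zero; anti-dependent series score the maximum, a.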
A study of earthquake-induced building detection by object oriented classification approach
NASA Astrophysics Data System (ADS)
Sabuncu, Asli; Damla Uca Avci, Zehra; Sunar, Filiz
2017-04-01
Among the natural hazards, earthquakes are the most destructive disasters and cause huge loss of life, heavy infrastructure damage, and great financial losses every year all around the world. According to earthquake statistics, more than a million earthquakes occur worldwide each year, equal to roughly two earthquakes per minute. Since 2001, natural disasters have caused more than 780,000 deaths, and approximately 60% of this mortality is due to earthquakes. A great earthquake took place at 38.75 N, 43.36 E in Van Province, in the eastern part of Turkey, on October 23, 2011. After this earthquake, 604 people died and about 4000 buildings were seriously damaged or collapsed. In recent years, the use of the object-oriented classification approach based on different object features, such as spectral, textural, shape, and spatial information, has gained importance and become widespread for the classification of high-resolution satellite images and orthophotos. The motivation of this study is to detect the collapsed buildings and debris areas after the earthquake by using very high-resolution satellite images and orthophotos with object-oriented classification, and to assess how well remote sensing technology performed in determining the collapsed buildings. In this study, two different land surfaces were selected as homogeneous and heterogeneous case study areas. In the first step of the application, multi-resolution segmentation was applied and optimum parameters were selected to obtain the objects in each area after testing different color/shape and compactness/smoothness values. In the next step, two different classification approaches, namely "supervised" and "unsupervised", were applied and their classification performances were compared. Object-based Image Analysis (OBIA) was performed using e-Cognition software.
Audio-guided audiovisual data segmentation, indexing, and retrieval
NASA Astrophysics Data System (ADS)
Zhang, Tong; Kuo, C.-C. Jay
1998-12-01
While current approaches for video segmentation and indexing are mostly focused on visual information, audio signals may actually play a primary role in video content parsing. In this paper, we present an approach for automatic segmentation, indexing, and retrieval of audiovisual data, based on audio content analysis. The accompanying audio signal of audiovisual data is first segmented and classified into basic types, i.e., speech, music, environmental sound, and silence. This coarse-level segmentation and indexing step is based upon morphological and statistical analysis of several short-term features of the audio signals. Then, environmental sounds are classified into finer classes, such as applause, explosions, bird sounds, etc. This fine-level classification and indexing step is based upon time-frequency analysis of audio signals and the use of the hidden Markov model as the classifier. On top of this archiving scheme, an audiovisual data retrieval system is proposed. Experimental results show that the proposed approach has an accuracy rate higher than 90 percent for the coarse-level classification, and higher than 85 percent for the fine-level classification. Examples of audiovisual data segmentation and retrieval are also provided.
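The coarse-level step can be illustrated with two classic short-term features, energy and zero-crossing rate; the thresholds and the two-feature rule below are illustrative stand-ins, not the paper's feature set:

```python
def short_term_features(frame):
    """Per-frame energy and zero-crossing rate, two of the short-term
    features commonly used for coarse audio-type segmentation."""
    energy = sum(s * s for s in frame) / len(frame)
    zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (len(frame) - 1)
    return energy, zcr

def coarse_label(frame, energy_floor=1e-4, zcr_split=0.3):
    """Toy coarse classifier: silence by energy, then speech vs. music by
    zero-crossing rate (speech tends to have higher ZCR)."""
    energy, zcr = short_term_features(frame)
    if energy < energy_floor:
        return "silence"
    return "speech" if zcr > zcr_split else "music"
```

A real system, as in the abstract, would pool many such features statistically over time rather than threshold a single frame.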
Acosta-Mesa, Héctor-Gabriel; Rechy-Ramírez, Fernando; Mezura-Montes, Efrén; Cruz-Ramírez, Nicandro; Hernández Jiménez, Rodolfo
2014-06-01
In this work, we present a novel application of time series discretization using evolutionary programming for the classification of precancerous cervical lesions. The approach optimizes the number of intervals into which the length and amplitude of the time series should be compressed, preserving the important information for classification purposes. Using evolutionary programming, the search for a good discretization scheme is guided by a cost function which considers three criteria: the entropy regarding the classification, the complexity measured as the number of different strings needed to represent the complete data set, and the compression rate assessed as the length of the discrete representation. This discretization approach is evaluated using time series data based on temporal patterns observed during a classical test used in cervical cancer detection; the classification accuracy reached by our method is compared with the well-known time series discretization algorithm SAX and the dimensionality reduction method PCA. Statistical analysis of the classification accuracy shows that the discrete representation is as efficient as the complete raw representation for the present application, reducing the dimensionality of the time series length by 97%. This representation is also very competitive in terms of classification accuracy when compared with similar approaches. Copyright © 2014 Elsevier Inc. All rights reserved.
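The three-term cost function can be sketched as follows; the weights and the use of amplitude breakpoints as the discretization scheme are illustrative assumptions, not the paper's exact formulation:

```python
import math
from collections import Counter

def discretize(series, breakpoints):
    """Map each value to the index of its amplitude interval."""
    return tuple(sum(v > b for b in breakpoints) for v in series)

def cost(dataset, labels, breakpoints, w_len=0.1):
    """Cost guiding the evolutionary search: class entropy within identical
    discrete strings + complexity (fraction of distinct strings)
    + a length penalty for the representation."""
    strings = [discretize(s, breakpoints) for s in dataset]
    ent = 0.0
    for s in set(strings):
        ys = [y for t, y in zip(strings, labels) if t == s]
        n = len(ys)
        ent += (n / len(labels)) * -sum(
            (c / n) * math.log2(c / n) for c in Counter(ys).values())
    complexity = len(set(strings)) / len(strings)
    return ent + complexity + w_len * len(breakpoints)
```

An evolutionary program would mutate the breakpoint set and keep candidates that lower this cost; a scheme that separates the classes perfectly drives the entropy term to zero.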
[Classification and organization technologies in public health].
Filatov, V B; Zhiliaeva, E P; Kal'fa, Iu I
2000-01-01
The authors discuss the impact and main characteristics of organization technologies in public health and the processes of their development and evaluation. They offer an original definition of the notion "organization technologies", with approaches to their classification. A system of logical bases is offered, which can be used for classification. These bases include the level of organizational maturity and stage of development of the organization technology, its assignment to a certain level of management, type of influence and concentration of trend, mechanism of effect, functional group, and methods of development.
Classification systems for natural resource management
Kleckner, Richard L.
1981-01-01
Resource managers employ various types of resource classification systems in their management activities, such as inventory, mapping, and data analysis. Classification is the ordering or arranging of objects into groups or sets on the basis of their relationships, and as such it provides resource managers with a structure for organizing their needed information. In addition to conforming to certain logical principles, resource classifications should be flexible, widely applicable to a variety of environmental conditions, and usable with minimal training. The process of classification may be approached from the bottom up (aggregation) or the top down (subdivision), or a combination of both, depending on the purpose of the classification. Most resource classification systems in use today focus on a single resource and are used for a single, limited purpose. However, resource managers now must employ the concept of multiple use in their management activities. What they need is an integrated, ecologically based approach to resource classification which would fulfill multiple-use mandates. In an effort to achieve resource-data compatibility and data sharing among Federal agencies, an interagency agreement has been signed by five Federal agencies to coordinate and cooperate in the area of resource classification and inventory.
A fuzzy decision tree for fault classification.
Zio, Enrico; Baraldi, Piero; Popescu, Irina C
2008-02-01
In plant accident management, the control room operators are required to identify the causes of the accident, based on the different patterns of evolution that develop in the monitored process variables. This task is often quite challenging, given the large number of process parameters monitored and the intense emotional states under which it is performed. To aid the operators, various techniques of fault classification have been engineered. An important requirement for their practical application is the physical interpretability of the relationships among the process variables underpinning the fault classification. In this view, the present work propounds a fuzzy approach to fault classification, which relies on fuzzy if-then rules inferred from the clustering of available preclassified signal data, which are then organized in a logical and transparent decision tree structure. The advantages offered by the proposed approach are precisely that a transparent fault classification model is mined out of the signal data and that the underlying physical relationships among the process variables are easily interpretable as linguistic if-then rules that can be explicitly visualized in the decision tree structure. The approach is applied to a case study regarding the classification of simulated faults in the feedwater system of a boiling water reactor.
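The linguistic if-then rules can be sketched as fuzzy memberships combined with a min operator; the fault names, variables, and membership supports below are invented for illustration, not taken from the reactor case study:

```python
def triangular(x, a, b, c):
    """Triangular membership function for a linguistic term on [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def classify_fault(temp, flow):
    """Two illustrative rules of the kind a fuzzy decision tree would mine,
    e.g. IF temp is LOW AND flow is LOW THEN fault is valve_leak.
    Rule activation uses min as the fuzzy AND."""
    rules = {
        "valve_leak": min(triangular(temp, 0, 25, 50), triangular(flow, 0, 10, 20)),
        "pump_fault": min(triangular(temp, 40, 70, 100), triangular(flow, 15, 30, 45)),
    }
    return max(rules, key=rules.get), rules
```

Because each rule is a readable sentence over linguistic terms, the operator can trace why a fault class was activated, which is the interpretability argument the abstract makes.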
Metric learning for automatic sleep stage classification.
Phan, Huy; Do, Quan; Do, The-Luan; Vu, Duc-Lung
2013-01-01
We introduce in this paper a metric learning approach for automatic sleep stage classification based on single-channel EEG data. We show that by learning a global metric from training data, instead of using the default Euclidean metric, the k-nearest-neighbor classification rule outperforms state-of-the-art methods on the Sleep-EDF dataset under various classification settings. The overall accuracies for the Awake/Sleep and 4-class classification settings are 98.32% and 94.49%, respectively. Furthermore, this superior accuracy is achieved by performing classification on a low-dimensional feature space derived from the time and frequency domains, and without the need for artifact removal as a preprocessing step.
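The core idea, classifying with k-NN under a learned metric rather than the default Euclidean one, can be sketched with a diagonal feature weighting (a simplification: the paper learns a full global metric):

```python
def knn_predict(train_X, train_y, x, k=3, weights=None):
    """k-NN under a learned diagonal metric; weights=None falls back to
    the plain Euclidean metric (the baseline such methods improve on)."""
    w = weights or [1.0] * len(x)
    def dist(a):
        return sum(wi * (ai - xi) ** 2 for wi, ai, xi in zip(w, a, x))
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i]))[:k]
    votes = {}
    for i in nearest:
        votes[train_y[i]] = votes.get(train_y[i], 0) + 1
    return max(votes, key=votes.get)
```

Changing the metric can change which neighbors are "nearest", and hence the predicted sleep stage, even with identical training data.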
Knowledge-rich temporal relation identification and classification in clinical notes
D’Souza, Jennifer; Ng, Vincent
2014-01-01
Motivation: We examine the task of temporal relation classification for the clinical domain. Our approach to this task departs from existing ones in that it is (i) ‘knowledge-rich’, employing sophisticated knowledge derived from discourse relations as well as both domain-independent and domain-dependent semantic relations, and (ii) ‘hybrid’, combining the strengths of rule-based and learning-based approaches. Evaluation results on the i2b2 Clinical Temporal Relations Challenge corpus show that our approach yields a 17–24% and 8–14% relative reduction in error over a state-of-the-art learning-based baseline system when gold-standard and automatically identified temporal relations are used, respectively. Database URL: http://www.hlt.utdallas.edu/~jld082000/temporal-relations/ PMID:25414383
Classification of Marital Relationships: An Empirical Approach.
ERIC Educational Resources Information Center
Snyder, Douglas K.; Smith, Gregory T.
1986-01-01
Derives an empirically based classification system of marital relationships, employing a multidimensional self-report measure of marital interaction. Spouses' profiles on the Marital Satisfaction Inventory for samples of clinic and nonclinic couples were subjected to cluster analysis, resulting in separate five-group typologies for husbands and…
Classification of multiple sclerosis lesions using adaptive dictionary learning.
Deshpande, Hrishikesh; Maurel, Pierre; Barillot, Christian
2015-12-01
This paper presents a sparse representation and adaptive dictionary learning based method for automated classification of multiple sclerosis (MS) lesions in magnetic resonance (MR) images. Manual delineation of MS lesions is a time-consuming task, requiring neuroradiology experts to analyze huge volumes of MR data. This, in addition to the high intra- and inter-observer variability, necessitates automated MS lesion classification methods. Among the many image representation models and classification methods that can be used for such a purpose, we investigate the use of sparse modeling. In recent years, sparse representation has evolved as a tool for modeling data using a few basis elements of an over-complete dictionary and has found applications in many image processing tasks, including classification. We propose a supervised classification approach by learning dictionaries specific to the lesions and individual healthy brain tissues, which include white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF). The size of the dictionary learned for each class plays a major role in data representation, but it is an even more crucial element in the case of competitive classification. Our approach adapts the size of the dictionary for each class, depending on the complexity of the underlying data. The algorithm is validated using 52 multi-sequence MR images acquired from 13 MS patients. The results demonstrate the effectiveness of our approach in MS lesion classification. Copyright © 2015 Elsevier Ltd. All rights reserved.
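The competitive-classification step can be sketched with a drastic simplification: a 1-sparse approximation per class dictionary (real sparse coding solves a regularized least-squares problem over many atoms), assigning a voxel/patch to the class whose dictionary reconstructs it with the smallest residual:

```python
def best_residual(x, atoms):
    """Residual of the best 1-sparse approximation of x over a class
    dictionary (a stand-in for full sparse coding)."""
    def residual(a):
        # Least-squares coefficient for a single atom: c = <a, x> / <a, a>.
        num = sum(ai * xi for ai, xi in zip(a, x))
        den = sum(ai * ai for ai in a) or 1.0
        c = num / den
        return sum((xi - c * ai) ** 2 for xi, ai in zip(x, a))
    return min(residual(a) for a in atoms)

def classify(x, class_dicts):
    """Assign x to the class whose dictionary reconstructs it best."""
    return min(class_dicts, key=lambda cls: best_residual(x, class_dicts[cls]))
```

Adapting each dictionary's size per class, as the paper does, controls how expressive each class-specific reconstruction is allowed to be.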
Sidibé, Désiré; Sankar, Shrinivasan; Lemaître, Guillaume; Rastgoo, Mojdeh; Massich, Joan; Cheung, Carol Y; Tan, Gavin S W; Milea, Dan; Lamoureux, Ecosse; Wong, Tien Y; Mériaudeau, Fabrice
2017-02-01
This paper proposes a method for automatic classification of spectral domain OCT data for the identification of patients with retinal diseases such as Diabetic Macular Edema (DME). We address this issue as an anomaly detection problem and propose a method that not only allows the classification of the OCT volume, but also allows the identification of the individual diseased B-scans inside the volume. Our approach is based on modeling the appearance of normal OCT images with a Gaussian Mixture Model (GMM) and detecting abnormal OCT images as outliers. The classification of an OCT volume is based on the number of detected outliers. Experimental results with two different datasets show that the proposed method achieves a sensitivity and a specificity of 80% and 93% on the first dataset, and 100% and 80% on the second one. Moreover, the experiments show that the proposed method achieves better classification performance than other recently published works. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
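A minimal sketch of the anomaly-detection idea, simplified to a single Gaussian over a 1-D feature (the paper models full image appearance with a multi-component GMM); the likelihood floor and outlier count threshold are illustrative:

```python
import math

def fit_gaussian(samples):
    """Fit a 1-D Gaussian to features of normal B-scans: a one-component
    stand-in for the paper's Gaussian Mixture Model."""
    mu = sum(samples) / len(samples)
    var = sum((s - mu) ** 2 for s in samples) / len(samples)
    return mu, var

def log_likelihood(x, mu, var):
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)

def classify_volume(scans, mu, var, ll_floor, max_outliers=0):
    """A volume is labeled abnormal when too many of its B-scans fall
    below the likelihood floor, mirroring the outlier-count rule."""
    outliers = sum(1 for s in scans if log_likelihood(s, mu, var) < ll_floor)
    return ("abnormal", outliers) if outliers > max_outliers else ("normal", outliers)
```

The same outlier flags also identify which individual B-scans inside an abnormal volume look diseased, the second capability the abstract highlights.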
Uav-Based Crops Classification with Joint Features from Orthoimage and Dsm Data
NASA Astrophysics Data System (ADS)
Liu, B.; Shi, Y.; Duan, Y.; Wu, W.
2018-04-01
Accurate crop classification remains a challenging task due to the phenomena of the same crop exhibiting different spectra and different crops sharing the same spectrum. Recently, the UAV-based remote sensing approach has gained popularity not only for its high spatial and temporal resolution, but also for its ability to obtain spectral and spatial data at the same time. This paper focuses on how to take full advantage of spatial and spectral features to improve crop classification accuracy, based on a UAV platform equipped with a general digital camera. Texture and spatial features extracted from the RGB orthoimage and the digital surface model of the monitoring area are analysed and integrated within an SVM classification framework. Extensive experimental results indicate that the overall classification accuracy is drastically improved from 72.9 % to 94.5 % when the spatial features are combined, which verifies the feasibility and effectiveness of the proposed method.
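The value of joint features can be illustrated with a toy example where two crops share a spectrum but differ in canopy height from the DSM; a nearest-centroid rule stands in for the paper's SVM, and all feature values are invented:

```python
def joint_features(spectral, dsm_height):
    """Concatenate orthoimage spectral features with a DSM height feature."""
    return list(spectral) + [dsm_height]

def nearest_centroid(train, x):
    """Nearest-centroid classifier (a simple stand-in for the SVM)."""
    centroids = {cls: [sum(col) / len(col) for col in zip(*rows)]
                 for cls, rows in train.items()}
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda cls: dist(centroids[cls]))
```

With spectral features alone the two classes below would be indistinguishable; the appended height feature resolves the "different crops, same spectrum" ambiguity.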
Confidence level estimation in multi-target classification problems
NASA Astrophysics Data System (ADS)
Chang, Shi; Isaacs, Jason; Fu, Bo; Shin, Jaejeong; Zhu, Pingping; Ferrari, Silvia
2018-04-01
This paper presents an approach for estimating the confidence level in automatic multi-target classification performed by an imaging sensor on an unmanned vehicle. An automatic target recognition algorithm comprised of a deep convolutional neural network in series with a support vector machine classifier detects and classifies targets based on the image matrix. The joint posterior probability mass function of target class, features, and classification estimates is learned from labeled data, and recursively updated as additional images become available. Based on the learned joint probability mass function, the approach presented in this paper predicts the expected confidence level of future target classifications, prior to obtaining new images. The proposed approach is tested with a set of simulated sonar image data. The numerical results show that the estimated confidence level provides a close approximation to the actual confidence level value determined a posteriori, i.e. after the new image is obtained by the on-board sensor. Therefore, the expected confidence level function presented in this paper can be used to adaptively plan the path of the unmanned vehicle so as to optimize the expected confidence levels and ensure that all targets are classified with satisfactory confidence after the path is executed.
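A stripped-down sketch of the confidence machinery: a learned joint pmf over (true class, classification estimate) yields an expected confidence before any new image arrives, and can be updated recursively as labeled images come in. The smoothing update and all probabilities are illustrative assumptions:

```python
def expected_confidence(joint_pmf):
    """Expected confidence of future classifications: the probability,
    under the learned joint pmf P(true_class, estimate), that the
    classifier's estimate matches the true class."""
    return sum(p for (true_cls, est), p in joint_pmf.items() if true_cls == est)

def update_pmf(joint_pmf, observation, alpha=0.1):
    """Recursive update as a new labeled image arrives (simple exponential
    smoothing; the paper's recursion is over posterior mass functions)."""
    updated = {k: (1 - alpha) * v for k, v in joint_pmf.items()}
    updated[observation] = updated.get(observation, 0.0) + alpha
    return updated
```

A path planner can then steer the vehicle toward views that raise this expected confidence, as the closing sentence of the abstract suggests.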
Suh, Jong Hwan
2016-01-01
In recent years, the anonymous nature of the Internet has made it difficult to detect manipulated user reputations in social media, as well as to ensure the quality of users and their posts. To deal with this, this study designs and examines an automatic approach that adopts writing style features to estimate user reputations in social media. Under varying ways of defining Good and Bad classes of user reputations based on the collected data, it evaluates the classification performance of the state-of-the-art methods: four writing style features, i.e. lexical, syntactic, structural, and content-specific, and eight classification techniques, i.e. four base learners (C4.5, Neural Network (NN), Support Vector Machine (SVM), and Naïve Bayes (NB)) and four Random Subspace (RS) ensemble methods based on the four base learners. When South Korea's Web forum, Daum Agora, was selected as a test bed, the experimental results show that the configuration of the full feature set containing content-specific features and RS-SVM, combining RS and SVM, gives the best classification accuracy if the test bed poster reputations are segmented strictly into Good and Bad classes by the portfolio approach. Pairwise t tests on accuracy confirm two expectations coming from the literature reviews: first, the feature set adding content-specific features outperforms the others; second, ensemble learning methods are more viable than base learners. Moreover, among the four ways of defining the classes of user reputations, i.e. like, dislike, sum, and portfolio, the results show that the portfolio approach gives the highest accuracy.
EEG Subspace Analysis and Classification Using Principal Angles for Brain-Computer Interfaces
NASA Astrophysics Data System (ADS)
Ashari, Rehab Bahaaddin
Brain-Computer Interfaces (BCIs) help paralyzed people who, through loss of voluntary muscle control, have lost some or all of their ability to communicate and to control their outside environment. Most BCIs are based on the classification of multichannel electroencephalography (EEG) signals recorded from users as they respond to external stimuli or perform various mental activities. The classification process is fraught with difficulties caused by electrical noise, signal artifacts, and nonstationarity. One approach to reducing the effects of similar difficulties in other domains is the use of principal angles between subspaces, which has been applied mostly to video sequences. This dissertation studies and examines different ideas using principal angles and subspace concepts. It introduces a novel mathematical approach for comparing sets of EEG signals for use in new BCI technology. The success of the presented results shows that principal angles are also a useful approach to the classification of EEG signals that are recorded during a BCI typing application. In this application, the appearance of a subject's desired letter is detected by identifying a P300 wave within a one-second window of EEG following the flash of a letter. Smoothing the signals before using them is the only preprocessing step that was implemented in this study. The smoothing process, based on minimizing the second derivative in time, is implemented to increase the classification accuracy instead of using a bandpass filter that relies on assumptions about the frequency content of EEG. This study examines four different ways of removing outliers that are based on the principal angles and shows that the outlier removal methods did not help in the presented situations. One of the concepts that this dissertation focuses on is the effect of the number of trials on the classification accuracies.
The achievement of good classification results using a small number of trials (as few as two) should make this approach more appropriate for online BCI applications. In order to understand and test how EEG signals differ from one subject to another, different users are tested in this dissertation, some with motor impairments. Furthermore, the concept of transferring information between subjects is examined by training the approach on one subject and testing it on another, using the training subject's EEG subspaces to classify the testing subject's trials.
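For intuition, the principal angle between two one-dimensional subspaces reduces to the acute angle between their spanning vectors (the general case takes an SVD of the product of orthonormal bases); the class subspaces below are toy stand-ins for the EEG trial subspaces:

```python
import math

def principal_angle(u, v):
    """Principal angle between the 1-D subspaces span{u} and span{v}.
    The absolute value makes the angle invariant to sign flips."""
    dot = abs(sum(a * b for a, b in zip(u, v)))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return math.acos(min(1.0, dot / (nu * nv)))

def classify_trial(trial, class_subspaces):
    """Assign a trial to the class whose subspace is closest in
    principal angle (nearest-subspace rule)."""
    return min(class_subspaces,
               key=lambda cls: principal_angle(trial, class_subspaces[cls]))
```

In the dissertation's setting the subspaces are spanned by sets of EEG trials, so each comparison involves several principal angles rather than one.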
NASA Astrophysics Data System (ADS)
Sah, Shagan
An increasingly important application of remote sensing is to provide decision support during emergency response and disaster management efforts. Land cover maps constitute one such useful application product during disaster events; if generated rapidly after any disaster, such map products can contribute to the efficacy of the response effort. In light of recent nuclear incidents, e.g., after the earthquake/tsunami in Japan (2011), our research focuses on constructing rapid and accurate land cover maps of the impacted area in case of an accidental nuclear release. The methodology involves integration of results from two different approaches, namely coarse spatial resolution multi-temporal and fine spatial resolution imagery, to increase classification accuracy. Although advanced methods have been developed for classification using high spatial or temporal resolution imagery, only a limited amount of work has been done on fusion of these two remote sensing approaches. The presented methodology thus involves integration of classification results from two different remote sensing modalities in order to improve classification accuracy. The data used included RapidEye and MODIS scenes over the Nine Mile Point Nuclear Power Station in Oswego (New York, USA). The first step in the process was the construction of land cover maps from freely available, high temporal resolution, low spatial resolution MODIS imagery using a time-series approach. We used the variability in the temporal signatures among different land cover classes for classification. The time series-specific features were defined by various physical properties of a pixel, such as variation in vegetation cover and water content over time. The pixels were classified into four land cover classes - forest, urban, water, and vegetation - using Euclidean and Mahalanobis distance metrics. 
On the other hand, a high spatial resolution commercial satellite, such as RapidEye, can be tasked to capture images over the affected area in the case of a nuclear event. This imagery served as a second source of data to augment results from the time series approach. The classifications from the two approaches were integrated using an a posteriori probability-based fusion approach. This was done by establishing a relationship between the classes, obtained after classification of the two data sources. Despite the coarse spatial resolution of MODIS pixels, acceptable accuracies were obtained using time series features. The overall accuracies using the fusion-based approach were in the neighborhood of 80%, when compared with GIS data sets from New York State. This fusion thus contributed to classification accuracy refinement, with a few additional advantages, such as correction for cloud cover and providing for an approach that is robust against point-in-time seasonal anomalies, due to the inclusion of multi-temporal data. We concluded that this approach is capable of generating land cover maps of acceptable accuracy and rapid turnaround, which in turn can yield reliable estimates of crop acreage of a region. The final algorithm is part of an automated software tool, which can be used by emergency response personnel to generate a nuclear ingestion pathway information product within a few hours of data collection.
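The minimum-distance labelling step described above can be sketched directly. The class list matches the four classes in the study, but the two-dimensional signatures below (e.g. mean vegetation index and its temporal variance) are illustrative stand-ins, not the MODIS statistics actually used:

```python
import numpy as np

def mahalanobis_classify(x, class_means, class_covs):
    """Assign x to the class with the smallest Mahalanobis distance."""
    dists = []
    for mu, cov in zip(class_means, class_covs):
        d = x - mu
        dists.append(float(d @ np.linalg.inv(cov) @ d))
    return int(np.argmin(dists))

def euclidean_classify(x, class_means):
    """Assign x to the class with the nearest mean (Euclidean distance)."""
    return int(np.argmin([np.sum((x - mu) ** 2) for mu in class_means]))

# Illustrative time-series signatures per class: [mean index, temporal variance]
classes = ["forest", "urban", "water", "vegetation"]
means = [np.array([0.8, 0.05]), np.array([0.2, 0.02]),
         np.array([0.05, 0.01]), np.array([0.6, 0.20])]
covs = [np.diag([0.01, 0.001])] * 4

pixel = np.array([0.78, 0.06])   # unlabeled pixel signature
label = mahalanobis_classify(pixel, means, covs)
print(classes[label])
```

With a full covariance per class, the Mahalanobis metric down-weights directions of high within-class variability, which is why it is often preferred over the plain Euclidean distance for temporal signatures.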
A Generalized Mixture Framework for Multi-label Classification
Hong, Charmgil; Batal, Iyad; Hauskrecht, Milos
2015-01-01
We develop a novel probabilistic ensemble framework for multi-label classification that is based on the mixtures-of-experts architecture. In this framework, we combine multi-label classification models in the classifier chains family that decompose the class posterior distribution P(Y1, …, Yd|X) using a product of posterior distributions over components of the output space. Our approach captures different input–output and output–output relations that tend to change across data. As a result, we can recover a rich set of dependency relations among inputs and outputs that a single multi-label classification model cannot capture due to its modeling simplifications. We develop and present algorithms for learning the mixtures-of-experts models from data and for performing multi-label predictions on unseen data instances. Experiments on multiple benchmark datasets demonstrate that our approach achieves highly competitive results and outperforms the existing state-of-the-art multi-label classification methods. PMID:26613069
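At prediction time, the ensemble idea reduces to mixing component posteriors with gate weights, P(y|x) = Σ_k π_k(x) P_k(y|x). A minimal numerical sketch with hypothetical gate values and per-label marginals (in the actual framework each component is a learned classifier chain):

```python
import numpy as np

def mixture_posterior(gates, component_posteriors):
    """Combine per-component posteriors P_k(y|x) with gate weights pi_k
    into a single mixture posterior P(y|x) = sum_k pi_k * P_k(y|x)."""
    gates = np.asarray(gates, dtype=float)
    P = np.asarray(component_posteriors, dtype=float)  # shape (K, d)
    assert np.isclose(gates.sum(), 1.0)
    return gates @ P

# Two hypothetical component models, each giving marginals for 3 labels
P1 = [0.9, 0.2, 0.7]   # model 1: P(Y_j = 1 | x)
P2 = [0.6, 0.7, 0.1]   # model 2 captures a different output dependency
mix = mixture_posterior([0.5, 0.5], [P1, P2])
pred = (mix > 0.5).astype(int)   # thresholded multi-label prediction
print(mix, pred)
```

The mixture can represent dependency structures that no single chain ordering captures, which is the motivation stated in the abstract.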
Pyne, Matthew I.; Carlisle, Daren M.; Konrad, Christopher P.; Stein, Eric D.
2017-01-01
Regional classification of streams is an early step in the Ecological Limits of Hydrologic Alteration framework. Many stream classifications are based on an inductive approach using hydrologic data from minimally disturbed basins, but this approach may underrepresent streams from heavily disturbed basins or sparsely gaged arid regions. An alternative is a deductive approach, using watershed climate, land use, and geomorphology to classify streams, but this approach may miss important hydrological characteristics of streams. We classified all stream reaches in California using both approaches. First, we used Bayesian and hierarchical clustering to classify reaches according to watershed characteristics. Streams were clustered into seven classes according to elevation, sedimentary rock, and winter precipitation. Permutation-based analysis of variance and random forest analyses were used to determine which hydrologic variables best separate streams into their respective classes. Stream typology (i.e., the class that a stream reach is assigned to) is shaped mainly by patterns of high and mean flow behavior within the stream's landscape context. Additionally, random forest was used to determine which hydrologic variables best separate minimally disturbed reference streams from non-reference streams in each of the seven classes. In contrast to stream typology, deviation from reference conditions is more difficult to detect and is largely defined by changes in low-flow variables, average daily flow, and duration of flow. Our combined deductive/inductive approach allows us to estimate flow under minimally disturbed conditions based on the deductive analysis and compare to measured flow based on the inductive analysis in order to estimate hydrologic change.
Fusion and Gaussian mixture based classifiers for SONAR data
NASA Astrophysics Data System (ADS)
Kotari, Vikas; Chang, KC
2011-06-01
Underwater mines are inexpensive and highly effective weapons that are difficult to detect and classify. Detection and classification of underwater mines is therefore essential for the safety of naval vessels, which necessitates highly efficient classifiers and detection techniques. Current techniques primarily focus on signals from one source, yet data fusion is known to increase the accuracy of detection and classification. In this paper, we formulate a fusion-based classifier and a Gaussian mixture model (GMM) based classifier for the classification of underwater mines. The emphasis is on sound navigation and ranging (SONAR) signals due to their extensive use in current naval operations. The classifiers have been tested on real SONAR data obtained from the University of California Irvine (UCI) repository. The performance of both the GMM-based and the fusion-based classifier clearly demonstrates their superior classification accuracy over conventional single-source cases and validates our approach.
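A GMM-based classifier fits a mixture per class and assigns each signal to the class with the highest likelihood. The sketch below uses a single diagonal Gaussian per class (the one-component special case) on synthetic stand-in data, since the UCI SONAR features themselves are not reproduced here:

```python
import numpy as np

def fit_class_gaussians(X, y):
    """Fit one diagonal Gaussian per class; the single-component
    special case of a per-class GMM classifier."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), Xc.var(axis=0) + 1e-6)
    return params

def log_likelihood(x, mu, var):
    return float(-0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var))

def classify(x, params):
    """Assign x to the class whose Gaussian gives the highest likelihood."""
    return max(params, key=lambda c: log_likelihood(x, *params[c]))

# Toy stand-in for SONAR returns: two classes with separated feature means
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(1.0, 0.3, size=(50, 4)),    # "mine"-like returns
               rng.normal(0.0, 0.3, size=(50, 4))])   # "rock"-like returns
y = np.array(["mine"] * 50 + ["rock"] * 50)

params = fit_class_gaussians(X, y)
print(classify(np.full(4, 0.9), params))   # feature vector near the "mine" mean
```

A full GMM classifier would fit several components per class via EM; the decision rule (maximum class likelihood, optionally weighted by priors) stays the same.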
Australian diagnosis related groups: Drivers of complexity adjustment.
Jackson, Terri; Dimitropoulos, Vera; Madden, Richard; Gillett, Steve
2015-11-01
In undertaking a major revision to the Australian Refined Diagnosis Related Group (ARDRG) classification, we set out to contrast Australia's approach to using data on additional (not principal) diagnoses with major international approaches in splitting base or Adjacent Diagnosis Related Groups (ADRGs). Comparative policy analysis/narrative review of peer-reviewed and grey literature on international approaches to use of additional (secondary) diagnoses in the development of Australian and international DRG systems. European and US approaches to characterise complexity of inpatient care are well-documented, providing useful points of comparison with Australia's. Australia, with good data sources, has continued to refine its national DRG classification using increasingly sophisticated approaches. Hospital funders in Australia and in other systems are often under pressure from provider groups to expand classifications to reflect clinical complexity. DRG development in most healthcare systems reviewed here reflects four critical factors: these socio-political factors, the quality and depth of the coded data available to characterise the mix of cases in a healthcare system, the size of the underlying population, and the intended scope and use of the classification. Australia's relatively small national population has constrained the size of its DRG classifications, and development has been concentrated on inpatient care in public hospitals. Development of casemix classifications in health care is driven by both technical and socio-political factors. Use of additional diagnoses to adjust for patient complexity and cost needs to respond to these in each casemix application. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Siskind, Dan; Harris, Meredith; Pirkis, Jane; Whiteford, Harvey
2013-06-01
A lack of definitional clarity in supported accommodation and the absence of a widely accepted system for classifying supported accommodation models creates barriers to service planning and evaluation. We undertook a systematic review of existing supported accommodation classification systems. Using a structured system for qualitative data analysis, we reviewed the stratification features in these classification systems, identified the key elements of supported accommodation and arranged them into domains and dimensions to create a new taxonomy. The existing classification systems were mapped onto the new taxonomy to verify the domains and dimensions. Existing classification systems used either a service-level characteristic or programmatic approach. We proposed a taxonomy based around four domains: duration of tenure; patient characteristics; housing characteristics; and service characteristics. All of the domains in the taxonomy were drawn from the existing classification structures; however, none of the existing classification structures covered all of the domains in the taxonomy. Existing classification systems are regionally based, limited in scope and lack flexibility. A domains-based taxonomy can allow more accurate description of supported accommodation services, aid in identifying the service elements likely to improve outcomes for specific patient populations, and assist in service planning.
Hierarchical Higher Order Crf for the Classification of Airborne LIDAR Point Clouds in Urban Areas
NASA Astrophysics Data System (ADS)
Niemeyer, J.; Rottensteiner, F.; Soergel, U.; Heipke, C.
2016-06-01
We propose a novel hierarchical approach for the classification of airborne 3D lidar points. Spatial and semantic context is incorporated via a two-layer Conditional Random Field (CRF). The first layer operates on a point level and utilises higher order cliques. Segments are generated from the labelling obtained in this way. They are the entities of the second layer, which incorporates larger scale context. The classification result of the segments is introduced as an energy term for the next iteration of the point-based layer. This framework iterates and mutually propagates context to improve the classification results. Potentially wrong decisions can be revised at later stages. The output is a labelled point cloud as well as segments roughly corresponding to object instances. Moreover, we present two new contextual features for the segment classification: the distance and the orientation of a segment with respect to the closest road. It is shown that the classification benefits from these features. In our experiments the hierarchical framework improves the overall accuracy by 2.3% at the point-based level and by 3.0% at the segment-based level compared to a purely point-based classification.
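The two proposed contextual features can be computed from simple geometry. A sketch assuming the road is available as a polyline of 2D vertices; a production implementation would use point-to-edge distances rather than the vertex distances used here:

```python
import numpy as np

def road_context_features(segment_pts, road_pts):
    """Two contextual features for a segment: distance from its centroid to
    the closest road vertex, and the angle between the segment's main axis
    and the local road direction."""
    centroid = segment_pts.mean(axis=0)
    d = np.linalg.norm(road_pts - centroid, axis=1)
    i = int(np.argmin(d))
    # local road direction from the neighbouring polyline vertices
    j, k = min(i + 1, len(road_pts) - 1), max(i - 1, 0)
    road_dir = road_pts[j] - road_pts[k]
    # segment main axis = leading eigenvector of its point covariance
    cov = np.cov((segment_pts - centroid).T)
    w, v = np.linalg.eigh(cov)          # eigenvalues in ascending order
    seg_dir = v[:, -1]
    cosang = abs(seg_dir @ road_dir) / (np.linalg.norm(seg_dir) * np.linalg.norm(road_dir))
    angle = float(np.degrees(np.arccos(np.clip(cosang, 0.0, 1.0))))
    return float(d[i]), angle

road = np.array([[0.0, 0.0], [10.0, 0.0], [20.0, 0.0]])                 # road on the x-axis
segment = np.array([[4.0, 5.0], [6.0, 5.0], [5.0, 5.2], [5.0, 4.8]])    # x-aligned segment
dist, ang = road_context_features(segment, road)
print(round(dist, 2), round(ang, 1))
```

Both features are rotation-invariant descriptors of a segment's relation to road infrastructure, which is what makes them useful context for urban classes such as cars and facades.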
Webb-Robertson, Bobbie-Jo M.; Wiberg, Holli K.; Matzke, Melissa M.; ...
2015-04-09
In this review, we apply selected imputation strategies to label-free liquid chromatography–mass spectrometry (LC–MS) proteomics datasets to evaluate the accuracy with respect to metrics of variance and classification. We evaluate several commonly used imputation approaches for individual merits and discuss the caveats of each approach with respect to the example LC–MS proteomics data. In general, local similarity-based approaches, such as the regularized expectation maximization and least-squares adaptive algorithms, yield the best overall performances with respect to metrics of accuracy and robustness. However, no single algorithm consistently outperforms the remaining approaches, and in some cases performing classification without imputation yielded the most accurate classification. Thus, because of the complex mechanisms of missing data in proteomics, which also vary from peptide to protein, no individual method is a single solution for imputation. In summary, on the basis of the observations in this review, the goal for imputation in the field of computational proteomics should be to develop new approaches that work generically for this data type and new strategies to guide users in the selection of the best imputation for their dataset and analysis objectives.
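A minimal example of the local-similarity idea: fill each missing abundance from the most similar rows of the matrix. This is a simplified stand-in for the regularized EM and least-squares adaptive algorithms named above, not their actual implementations:

```python
import numpy as np

def knn_impute(X, k=2):
    """Impute NaNs in a peptides-by-samples matrix from the k most similar
    rows (local similarity). Column means are used as a provisional fill so
    that row distances are well defined."""
    X = X.astype(float).copy()
    filled = np.where(np.isnan(X), np.nanmean(X, axis=0), X)
    for i in range(X.shape[0]):
        miss = np.isnan(X[i])
        if not miss.any():
            continue
        d = np.linalg.norm(filled - filled[i], axis=1)
        d[i] = np.inf                       # never pick the row itself
        nbrs = np.argsort(d)[:k]
        X[i, miss] = filled[nbrs][:, miss].mean(axis=0)
    return X

M = np.array([[1.0, 2.0, 3.0],
              [1.1, np.nan, 3.1],
              [0.9, 2.1, 2.9],
              [5.0, 6.0, 7.0]])
print(knn_impute(M))   # the NaN is filled from the two most similar rows
```

Note how the dissimilar fourth row is ignored; this locality is exactly what distinguishes these methods from global approaches such as plain column-mean imputation.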
Gene Selection and Cancer Classification: A Rough Sets Based Approach
NASA Astrophysics Data System (ADS)
Sun, Lijun; Miao, Duoqian; Zhang, Hongyun
Identification of informative gene subsets responsible for discerning between available samples of gene expression data is an important task in bioinformatics. Reducts, from rough sets theory, corresponding to a minimal set of essential genes for discerning samples, are an efficient tool for gene selection. Due to the computational complexity of the existing reduct algorithms, feature ranking is usually used as a first step to narrow down the gene space, and top-ranked genes are selected. In this paper, we define a novel criterion for scoring genes, based on the expression-level difference between classes and each gene's contribution to classification, and present an algorithm for generating all possible reducts from the informative genes. The algorithm takes the whole attribute set into account and finds short reducts with a significant reduction in computational complexity. An exploration of this approach on benchmark gene expression datasets demonstrates that it is successful in selecting highly discriminative genes, and the classification accuracy is impressive.
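One plausible reading of the scoring criterion (the exact formula in the paper may differ): rank genes by their between-class expression difference scaled by within-class spread, then pass only the top-ranked genes to the reduct search.

```python
import numpy as np

def gene_scores(X, y):
    """Score each gene by between-class mean difference divided by
    within-class spread; a hypothetical stand-in for the paper's criterion."""
    a, b = X[y == 0], X[y == 1]
    sep = np.abs(a.mean(axis=0) - b.mean(axis=0))
    spread = a.std(axis=0) + b.std(axis=0) + 1e-9
    return sep / spread

rng = np.random.default_rng(2)
n = 40
y = np.array([0] * n + [1] * n)
X = rng.normal(0.0, 1.0, size=(2 * n, 5))   # 5 genes, 80 samples
X[y == 1, 0] += 3.0                         # gene 0 is the informative one
scores = gene_scores(X, y)
top = int(np.argmax(scores))
print(top, scores.round(2))
```

Such a pre-ranking step is what keeps the subsequent reduct enumeration tractable, since the discernibility computation then runs over tens of genes rather than thousands.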
Object-Based Classification and Change Detection of Hokkaido, Japan
NASA Astrophysics Data System (ADS)
Park, J. G.; Harada, I.; Kwak, Y.
2016-06-01
Topography and geology are factors that characterize the distribution of natural vegetation. Topographic contour is particularly influential on the living conditions of plants, such as soil moisture, sunlight, and windiness. Vegetation associations having similar characteristics are present in locations having similar topographic conditions unless natural disturbances such as landslides and forest fires or artificial disturbances such as deforestation and man-made plantation bring about changes in such conditions. We developed a vegetation map of Japan using an object-based segmentation approach with topographic information (elevation, slope, slope direction) that is closely related to the distribution of vegetation. The results showed that object-based classification is more effective for producing a vegetation map than pixel-based classification.
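An object-based step can be paired with any pixel classifier by voting inside segments; a sketch with invented labels and segment ids:

```python
import numpy as np

def object_based_labels(pixel_labels, segment_ids, n_classes):
    """Turn a per-pixel classification into an object-based one by majority
    vote inside each segment, a common way to combine a segmentation with
    a pixel classifier."""
    out = np.empty_like(pixel_labels)
    for s in np.unique(segment_ids):
        mask = segment_ids == s
        out[mask] = np.bincount(pixel_labels[mask], minlength=n_classes).argmax()
    return out

# 0 = forest, 1 = grassland; one stray pixel inside each segment
pixels   = np.array([0, 0, 1, 0, 1, 1, 0, 1])
segments = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(object_based_labels(pixels, segments, 2))
```

The vote removes salt-and-pepper noise from the pixel map, which is one reason object-based results tend to score higher than pixel-based ones in studies like this.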
Lesion classification using clinical and visual data fusion by multiple kernel learning
NASA Astrophysics Data System (ADS)
Kisilev, Pavel; Hashoul, Sharbell; Walach, Eugene; Tzadok, Asaf
2014-03-01
To overcome operator dependency and to increase diagnosis accuracy in breast ultrasound (US), a lot of effort has been devoted to developing computer-aided diagnosis (CAD) systems for breast cancer detection and classification. Unfortunately, the efficacy of such CAD systems is limited since they rely on correct automatic lesions detection and localization, and on robustness of features computed based on the detected areas. In this paper we propose a new approach to boost the performance of a Machine Learning based CAD system, by combining visual and clinical data from patient files. We compute a set of visual features from breast ultrasound images, and construct the textual descriptor of patients by extracting relevant keywords from patients' clinical data files. We then use the Multiple Kernel Learning (MKL) framework to train SVM based classifier to discriminate between benign and malignant cases. We investigate different types of data fusion methods, namely, early, late, and intermediate (MKL-based) fusion. Our database consists of 408 patient cases, each containing US images, textual description of complaints and symptoms filled by physicians, and confirmed diagnoses. We show experimentally that the proposed MKL-based approach is superior to other classification methods. Even though the clinical data is very sparse and noisy, its MKL-based fusion with visual features yields significant improvement of the classification accuracy, as compared to the image features only based classifier.
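Intermediate (MKL-style) fusion combines per-modality kernels before training a single SVM. A sketch with random stand-in features and hand-fixed weights; MKL proper learns the weights jointly with the classifier:

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian RBF kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fused_kernel(K_list, weights):
    """Convex combination of modality kernels, the core of intermediate
    (MKL-based) fusion; here the weights are fixed by hand."""
    w = np.asarray(weights, dtype=float)
    assert np.all(w >= 0) and np.isclose(w.sum(), 1.0)
    return sum(wi * K for wi, K in zip(w, K_list))

rng = np.random.default_rng(3)
visual  = rng.normal(size=(6, 10))   # hypothetical image features
textual = rng.normal(size=(6, 4))    # hypothetical clinical-keyword features
K = fused_kernel([rbf_kernel(visual, visual, 0.1),
                  rbf_kernel(textual, textual, 0.5)], [0.7, 0.3])
print(K.shape, np.allclose(K, K.T))
```

For comparison, early fusion would concatenate the two feature vectors before computing one kernel, and late fusion would average the decision scores of two separately trained classifiers.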
Automated classification of Acid Rock Drainage potential from Corescan drill core imagery
NASA Astrophysics Data System (ADS)
Cracknell, M. J.; Jackson, L.; Parbhakar-Fox, A.; Savinova, K.
2017-12-01
Classification of the acid forming potential of waste rock is important for managing environmental hazards associated with mining operations. Current methods for the classification of acid rock drainage (ARD) potential usually involve labour intensive and subjective assessment of drill core and/or hand specimens. Manual methods are subject to operator bias, human error and the amount of material that can be assessed within a given time frame is limited. The automated classification of ARD potential documented here is based on the ARD Index developed by Parbhakar-Fox et al. (2011). This ARD Index involves the combination of five indicators: A - sulphide content; B - sulphide alteration; C - sulphide morphology; D - primary neutraliser content; and E - sulphide mineral association. Several components of the ARD Index require accurate identification of sulphide minerals. This is achieved by classifying Corescan Red-Green-Blue true colour images into the presence or absence of sulphide minerals using supervised classification. Subsequently, sulphide classification images are processed and combined with Corescan SWIR-based mineral classifications to obtain information on sulphide content, indices representing sulphide textures (disseminated versus massive and degree of veining), and spatially associated minerals. This information is combined to calculate ARD Index indicator values that feed into the classification of ARD potential. Automated ARD potential classifications of drill core samples associated with a porphyry Cu-Au deposit are compared to manually derived classifications and those obtained by standard static geochemical testing and X-ray diffractometry analyses. Results indicate a high degree of similarity between automated and manual ARD potential classifications. Major differences between approaches are observed in sulphide and neutraliser mineral percentages, likely due to the subjective nature of manual estimates of mineral content. 
The automated approach presented here for the classification of ARD potential offers rapid, repeatable and accurate outcomes comparable to manually derived classifications. Methods for automated ARD classifications from digital drill core data represent a step-change for geoenvironmental management practices in the mining industry.
Physical Human Activity Recognition Using Wearable Sensors.
Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine
2015-12-11
This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors' placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject.
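The best-performing supervised method in this comparison, k-NN, is straightforward to sketch; the feature windows below are invented for illustration:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Plain k-NN: majority label among the k nearest training windows."""
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(d)[:k]]
    vals, counts = np.unique(nearest, return_counts=True)
    return vals[counts.argmax()]

# Toy feature windows (e.g. mean/variance of an accelerometer axis) per activity
X = np.array([[0.1, 0.1], [0.2, 0.1], [0.1, 0.2],      # "sitting"
              [1.0, 1.1], [1.1, 0.9], [0.9, 1.0]])     # "walking"
y = np.array(["sitting"] * 3 + ["walking"] * 3)
print(knn_predict(X, y, np.array([0.15, 0.12])))
```

k-NN's strength in this setting fits the data: activity classes form compact clusters in simple statistical feature spaces, at the cost of storing all training windows at prediction time.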
A statistical approach to root system classification
Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter
2013-01-01
Plant root systems have a key role in ecology and agronomy. In spite of the fast increase in root studies, there is still no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for “plant functional type” identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong, but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Rooting types emerging from the measured data were mainly distinguished into diameter/weight-dominated and density-dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity, efforts in architectural measurement techniques are essential. PMID:23914200
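The principal-component step behind such rooting types can be sketched directly; the two traits and their values below are hypothetical:

```python
import numpy as np

def principal_components(X, n=2):
    """PCA on a plants-by-traits matrix via the covariance eigendecomposition;
    the resulting scores can feed any clustering step to define rooting types."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc.T)
    w, v = np.linalg.eigh(cov)              # eigenvalues ascending
    order = np.argsort(w)[::-1][:n]         # keep the n leading components
    return Xc @ v[:, order]

# Hypothetical root traits: [mean diameter (mm), root length density (cm/cm^3)]
traits = np.array([[0.20, 5.0], [0.25, 5.5], [0.22, 5.2],   # fine, dense type
                   [0.90, 1.0], [0.85, 1.2], [0.95, 0.9]])  # coarse, sparse type
scores = principal_components(traits, n=1).ravel()
groups = (scores > 0).astype(int)
print(groups)   # the first component separates the two rooting types
```

Because the classifiers are derived from the data's own covariance structure rather than chosen a priori, the resulting types reflect the dominant trait combinations in the sample, which is the core of the data-defined procedure described above.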
NASA Astrophysics Data System (ADS)
Saran, Sameer; Sterk, Geert; Kumar, Suresh
2009-10-01
Land use/land cover is an important watershed surface characteristic that affects surface runoff and erosion. Many of the available hydrological models divide the watershed into Hydrological Response Units (HRUs), which are spatial units with expected similar hydrological behaviours. The division into HRUs requires good-quality spatial data on land use/land cover. This paper presents different approaches to attain an optimal land use/land cover map based on remote sensing imagery for a Himalayan watershed in northern India. First, digital classifications using the maximum likelihood classifier (MLC) and a decision tree classifier were applied. The results obtained from the decision tree were better and improved further after post-classification sorting. But the obtained land use/land cover map was not sufficient for the delineation of HRUs, since the agricultural land use/land cover class did not discriminate between the two major crops in the area, i.e. paddy and maize. Subsequently, digital classification of fused data (ASAR and ASTER) was attempted to map land use/land cover classes with emphasis on delineating the paddy and maize crops, but supervised classification of the fused datasets did not provide the desired accuracy or proper delineation of the two crops. Eventually, we adopted a visual classification approach on the fused data. This second step, with a detailed classification system, resulted in better classification accuracy within the 'agricultural land' class, which will be further combined with topography and soil type to derive HRUs for physically based hydrological modeling.
Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.
Chen, Shizhi; Yang, Xiaodong; Tian, Yingli
2015-09-01
A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. The learning-based classifiers achieve the state-of-the-art accuracies, but have been criticized for the computational complexity that grows linearly with the number of classes. The nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is an order of magnitude less than the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves state-of-the-art accuracies with significantly lower computation cost and memory requirement.
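The nearest-neighbor side of a hierarchical K-means tree can be sketched as a one-level tree: queries descend to the closest centroid and search only that branch, so cost grows with branch size rather than dataset size. This omits the discriminative learning component of the actual D-HKTree, and the strided centroid seeding is a naive stand-in for proper initialization:

```python
import numpy as np

def kmeans(X, k, iters=10):
    # naive strided seeding (one seed per data block); real use: k-means++
    centers = X[:: max(1, len(X) // k)][:k].copy()
    for _ in range(iters):
        assign = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    return centers, assign

class HKTreeNN:
    """Queries descend to the nearest centroid and run exact NN inside that
    branch only, giving sublinear search cost in the number of points."""
    def __init__(self, X, y, branches=4):
        self.centers, assign = kmeans(X, branches)
        self.buckets = [(X[assign == j], y[assign == j]) for j in range(branches)]

    def predict(self, x):
        j = int(np.argmin(((self.centers - x) ** 2).sum(-1)))
        Xb, yb = self.buckets[j]
        return yb[int(np.argmin(((Xb - x) ** 2).sum(-1)))]

rng = np.random.default_rng(4)
cluster_centers = np.array([[0, 0], [5, 0], [0, 5], [5, 5]], dtype=float)
X = np.vstack([c + rng.normal(0, 0.4, size=(30, 2)) for c in cluster_centers])
y = np.repeat(np.arange(4), 30)

tree = HKTreeNN(X, y, branches=4)
print(tree.predict(np.array([4.9, 5.1])))
```

A deeper tree repeats the k-means split recursively inside each branch, which is what yields the sublinear growth with the number of categories claimed in the abstract.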
Bayoumi, Ahmed B; Laviv, Yosef; Yokus, Burhan; Efe, Ibrahim E; Toktas, Zafer Orkun; Kilic, Turker; Demir, Mustafa K; Konya, Deniz; Kasper, Ekkehard M
2017-11-01
1) To provide neurosurgeons and radiologists with a new quantitative and anatomical method to describe spinal meningiomas (SM) consistently. 2) To provide a guide to the surgical approach needed and the amount of bony resection required based on the proposed classification. 3) To report the distribution of our 58 cases of SM over the different Stages and Subtypes in correlation with the surgical treatment needed for each case. 4) To briefly review the literature on the rare non-conventional surgical corridors used to resect SM. We reviewed the literature to report on previously published cohorts and classifications used to describe the location of the tumor inside the spinal canal. We also reviewed the cases published previously showing non-conventional surgical approaches to resect spinal meningiomas. We proposed a classification system composed of Staging based on the maximal cross-sectional surface area of the tumor inside the canal, Typing based on the number of quadrants occupied by the tumor, and Subtyping based on the location of the tumor bulk relative to the spinal cord. Extradural and extra-spinal growth were also covered by our classification. We then applied it retrospectively to our 58 cases. Twelve articles had been published using overlapping terms to describe spinal meningiomas, and another seven articles reported on 23 cases of anteriorly located spinal meningiomas treated with approaches other than laminectomies/laminoplasties. Four Types, 9 Subtypes and 4 Stages were described in our classification system. In our series of 58 patients, no midline anterior type was represented. Therefore, all our cases were treated by laminectomies or laminoplasties (with/without facetectomies), except one case with a paraspinal component where a costotransversectomy was needed. Spinal meningiomas can be radiologically described in a precise fashion. Selection of the surgical corridor depends mainly on the location of the tumor bulk inside the canal. Copyright © 2017 Elsevier B.V. All rights reserved.
Bisenius, Sandrine; Mueller, Karsten; Diehl-Schmid, Janine; Fassbender, Klaus; Grimmer, Timo; Jessen, Frank; Kassubek, Jan; Kornhuber, Johannes; Landwehrmeyer, Bernhard; Ludolph, Albert; Schneider, Anja; Anderl-Straub, Sarah; Stuke, Katharina; Danek, Adrian; Otto, Markus; Schroeter, Matthias L
2017-01-01
Primary progressive aphasia (PPA) encompasses three subtypes: the nonfluent/agrammatic variant, the semantic variant, and the logopenic variant, which are characterized by distinct patterns of language difficulties and regional brain atrophy. To validate the potential of structural magnetic resonance imaging data for early individual diagnosis, we used support vector machine classification on grey matter density maps obtained by voxel-based morphometry analysis to discriminate PPA subtypes (44 patients: 16 nonfluent/agrammatic variant PPA, 17 semantic variant PPA, 11 logopenic variant PPA) from 20 healthy controls (matched for sample size, age, and gender) in the cohort of the multi-center study of the German consortium for frontotemporal lobar degeneration. Here, we compared a whole-brain with a meta-analysis-based disease-specific regions-of-interest approach for support vector machine classification. We also used support vector machine classification to discriminate the three PPA subtypes from each other. Whole-brain support vector machine classification achieved very high accuracy, between 91% and 97%, for identifying specific PPA subtypes vs. healthy controls, and 78%/95% for discriminating the semantic variant from the nonfluent/agrammatic and logopenic variants, respectively. Accuracy was low (55%) only for the discrimination between the nonfluent/agrammatic and logopenic variants. Interestingly, the regions that contributed the most to the support vector machine classification of patients corresponded largely to the regions that were atrophic in these patients as revealed by group comparisons. Although the whole-brain approach also took into account regions that were not covered in the regions-of-interest approach, both approaches showed similar accuracies due to the disease specificity of the selected networks.
In conclusion, support vector machine classification of multi-center structural magnetic resonance imaging data enables prediction of PPA subtypes with very high accuracy, paving the way for its application in clinical settings.
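As a hedged illustration of the general technique (not the authors' pipeline): a linear support vector machine can separate patients from controls on voxel-wise grey matter features, and a disease-specific region-of-interest feature set can be compared against the whole-brain feature set. The data below are synthetic stand-ins; the atrophic region, effect size, and sample counts are invented for this sketch.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for voxel-based-morphometry grey matter maps:
# 20 "patients" with reduced density in an atrophic region, 20 "controls".
n_voxels = 500
atrophic = slice(0, 50)                      # hypothetical disease-specific ROI
controls = rng.normal(1.0, 0.1, (20, n_voxels))
patients = rng.normal(1.0, 0.1, (20, n_voxels))
patients[:, atrophic] -= 0.3                 # simulated atrophy

X = np.vstack([patients, controls])
y = np.array([1] * 20 + [0] * 20)

# Whole-brain vs. regions-of-interest feature sets, as in the comparison above.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
acc_whole = cross_val_score(clf, X, y, cv=5).mean()
acc_roi = cross_val_score(clf, X[:, atrophic], y, cv=5).mean()
print(f"whole-brain: {acc_whole:.2f}, ROI: {acc_roi:.2f}")
```

With a clear simulated atrophy signal, both feature sets classify essentially perfectly, mirroring the finding that a well-chosen ROI can match whole-brain performance.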
Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa; Al-Garadi, Mohammed Ali
2017-01-01
Widespread implementation of electronic databases has improved the accessibility of plaintext clinical information for supplementary use. Numerous machine learning techniques, such as supervised machine learning approaches or ontology-based approaches, have been employed to obtain useful information from plaintext clinical data. This study proposes an automatic multi-class classification system to predict accident-related causes of death from plaintext autopsy reports through expert-driven feature selection with supervised automatic text classification decision models. Accident-related autopsy reports were obtained from one of the largest hospitals in Kuala Lumpur. These reports belong to nine different accident-related causes of death. A master feature vector was prepared by extracting features from the collected autopsy reports using unigrams with lexical categorization. This master feature vector was used to detect the cause of death [according to the International Classification of Diseases, version 10 (ICD-10)] through five automated feature selection schemes, the proposed expert-driven approach, five feature subset sizes, and five machine learning classifiers. Model performance was evaluated using precisionM, recallM, F-measureM, accuracy, and area under the ROC curve. Four baselines were used to compare the results with the proposed system. Random forest and J48 decision models parameterized using expert-driven feature selection yielded the highest evaluation measures (approaching 85% to 90% for most metrics) using a feature subset size of 30. The proposed system also showed approximately 14% to 16% improvement in overall accuracy compared with the existing techniques and four baselines. The proposed system is feasible and practical to use for automatic classification of ICD-10-related causes of death from autopsy reports. The proposed system assists pathologists to accurately and rapidly determine the underlying cause of death based on autopsy findings.
Furthermore, the proposed expert-driven feature selection approach and the findings are generally applicable to other kinds of plaintext clinical reports.
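The pipeline described above (unigram features, feature selection, then a decision-model classifier) can be sketched with scikit-learn. The miniature corpus, the two cause-of-death categories, and the subset size k=10 are invented here for illustration; the study used nine ICD-10 categories, expert-driven feature selection, and a subset size of 30.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import make_pipeline

# Toy stand-in corpus: two hypothetical "cause of death" categories.
reports = [
    "blunt force trauma to the head after road traffic collision",
    "multiple rib fractures and internal bleeding from vehicle crash",
    "drowning with water in the lungs found near the river",
    "asphyxia due to submersion drowning in a swimming pool",
    "skull fracture sustained in a motorcycle road accident",
    "pulmonary edema consistent with drowning in open water",
]
labels = ["traffic", "traffic", "drowning", "drowning", "traffic", "drowning"]

# Master unigram feature vector -> automated top-k feature selection ->
# decision model, loosely mirroring the pipeline described above.
pipe = make_pipeline(
    CountVectorizer(),                      # unigram features
    SelectKBest(chi2, k=10),                # automated feature selection
    RandomForestClassifier(n_estimators=50, random_state=0),
)
pipe.fit(reports, labels)
pred = pipe.predict(["drowning in the river with water in the lungs"])[0]
print(pred)
```

In practice the expert-driven scheme would replace `SelectKBest` with a clinician-curated feature list, which is the study's central contribution.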
Noise tolerant dendritic lattice associative memories
NASA Astrophysics Data System (ADS)
Ritter, Gerhard X.; Schmalz, Mark S.; Hayden, Eric; Tucker, Marc
2011-09-01
Linear classifiers based on computation over the real numbers R (e.g., with operations of addition and multiplication), denoted by (R, +, x), have been treated extensively in the literature of pattern recognition. However, a different approach to pattern classification involves the use of addition, maximum, and minimum operations over the reals in the algebra (R, +, maximum, minimum). These pattern classifiers, based on lattice algebra, have been shown to exhibit superior information storage capacity, fast training and short convergence times, high pattern classification accuracy, and low computational cost. Such attributes are not always found, for example, in classical neural nets based on the linear inner product. In a special type of lattice associative memory (LAM), called a dendritic LAM or DLAM, it is possible to achieve noise-tolerant pattern classification by varying the design of noise or error acceptance bounds. This paper presents theory and algorithmic approaches for the computation of noise-tolerant lattice associative memories (LAMs) under a variety of input constraints. Of particular interest is the classification of nonergodic data in noise regimes with time-varying statistics. DLAMs, which are a specialization of LAMs derived from concepts of biological neural networks, have successfully been applied to pattern classification from hyperspectral remote sensing data, as well as spatial object recognition from digital imagery. The authors' recent research in the development of DLAMs is reviewed, with experimental results that show utility for a wide variety of pattern classification applications. Performance results are presented in terms of measured computational cost, noise tolerance, classification accuracy, and throughput for a variety of input data and noise levels.
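A minimal sketch of the algebraic core that DLAMs build on: a min-type lattice autoassociative memory with (max, +) recall. The three stored patterns are arbitrary examples; dendritic structure and the noise/error acceptance bounds discussed above are not modeled here.

```python
import numpy as np

def train_min_memory(X):
    """Lattice (morphological) autoassociative memory W_XX.

    X: (k, n) array of k stored patterns.
    W[i, j] = min over stored patterns of (x_i - x_j).
    """
    # Pairwise coordinate differences per pattern, then elementwise minimum.
    D = X[:, :, None] - X[:, None, :]        # shape (k, n, n)
    return D.min(axis=0)

def recall(W, x):
    """Max-plus product: (W v x)_i = max_j (W[i, j] + x_j)."""
    return (W + x[None, :]).max(axis=1)

# Store a few real-valued patterns.
X = np.array([[0.0, 1.0, 3.0],
              [2.0, 0.0, 1.0],
              [1.0, 4.0, 2.0]])
W = train_min_memory(X)

# Perfect recall of every stored (uncorrupted) pattern is a known property
# of this memory; it is also tolerant to one-sided (erosive) noise within bounds.
for x in X:
    print(recall(W, x))
```

The diagonal W[i, i] = 0 guarantees recall is never below the stored value, while the min over patterns caps it from above, which is why recall of uncorrupted patterns is exact regardless of how many patterns are stored.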
A Game-Based Approach to Learning the Idea of Chemical Elements and Their Periodic Classification
ERIC Educational Resources Information Center
Franco-Mariscal, Antonio Joaquín; Oliva-Martínez, José María; Blanco-López, Ángel; España-Ramos, Enrique
2016-01-01
In this paper, the characteristics and results of a teaching unit based on the use of educational games to learn the idea of chemical elements and their periodic classification in secondary education are analyzed. The method is aimed at Spanish students aged 15-16 and consists of 24 1-h sessions. The results obtained on implementing the teaching…
NASA Astrophysics Data System (ADS)
Gajda, Agnieszka; Wójtowicz-Nowakowska, Anna
2013-04-01
A comparison of the accuracy of pixel based and object based classifications of integrated optical and LiDAR data
Land cover maps are generally produced on the basis of high resolution imagery. Recently, LiDAR (Light Detection and Ranging) data have been brought into use in diverse applications including land cover mapping. In this study we attempted to assess the accuracy of land cover classification using both high resolution aerial imagery and LiDAR data (airborne laser scanning, ALS), testing two classification approaches: a pixel-based classification and object-oriented image analysis (OBIA). The study was conducted on three test areas (3 km2 each) in the administrative area of Kraków, Poland, along the course of the Vistula River. They represent three different dominating land cover types of the Vistula River valley. Test site 1 had semi-natural vegetation, with riparian forests and shrubs, test site 2 represented a densely built-up area, and test site 3 was an industrial site. Point clouds from ALS and orthophotomaps were both captured in November 2007. Point cloud density was on average 16 pt/m2 and contained additional information about intensity and encoded RGB values. Orthophotomaps had a spatial resolution of 10 cm. From the point clouds, two raster maps were generated: (1) intensity and (2) a normalised Digital Surface Model (nDSM), both with a spatial resolution of 50 cm. To classify the aerial data, a supervised classification approach was selected. Pixel-based classification was carried out in ERDAS Imagine software, using the orthophotomaps together with the intensity and nDSM rasters. Fifteen homogeneous training areas representing each cover class were chosen. Classified pixels were clumped to avoid the salt-and-pepper effect. Object-oriented image classification was carried out in eCognition software, which uses both the optical and ALS data. Elevation layers (intensity, first/last reflection, etc.)
were used at the segmentation stage with appropriate weights, yielding more precise and unambiguous segment (object) boundaries. As a result of the classification, five land cover classes (buildings, water, high and low vegetation, and others) were extracted. Both pixel-based image analysis and OBIA were conducted with a minimum mapping unit of 10 m2. Results were validated on the basis of manual classification at random points (80 per test area); the reference data set was manually interpreted using orthophotomaps and expert knowledge of the test site areas.
Lin, Yuan-Pin; Yang, Yi-Hsuan; Jung, Tzyy-Ping
2014-01-01
Electroencephalography (EEG)-based emotion classification during music listening has gained increasing attention due to its promise of potential applications such as musical affective brain-computer interfaces (ABCI), neuromarketing, music therapy, and implicit multimedia tagging and triggering. However, music is an ecologically valid and complex stimulus that conveys certain emotions to listeners through compositions of musical elements. Using EEG signals alone to distinguish emotions remains challenging. This study aimed to assess the applicability of a multimodal approach that leverages the EEG dynamics and acoustic characteristics of musical contents for the classification of emotional valence and arousal. To this end, this study adopted machine-learning methods to systematically elucidate the roles of the EEG and music modalities in emotion modeling. The empirical results suggested that when whole-head EEG signals were available, the inclusion of musical contents did not improve the classification performance. The obtained performance of 74~76% using the EEG modality alone was statistically comparable to that of the multimodal approach. However, if EEG dynamics were only available from a small set of electrodes (likely the case in real-life applications), the music modality would play a complementary role, raising the EEG results from around 61% to 67% in valence classification and from around 58% to 67% in arousal classification. Musical timbre appeared to replace less-discriminative EEG features and led to improvements in both valence and arousal classification, whereas musical loudness contributed specifically to the arousal classification. The present study not only provides principles for constructing an EEG-based multimodal approach, but also reveals fundamental insights into the interplay of brain activity and musical contents in emotion modeling.
Yu, Yingyan
2014-01-01
Histopathological classification occupies a pivotal position in both basic research and the clinical diagnosis and treatment of gastric cancer. Currently, there are different classification systems in basic science and clinical application. In the medical literature, different classifications are used, including the Lauren and WHO systems, which has confused many researchers. The Lauren classification was proposed half a century ago, but is still used worldwide. It has the advantages of being simple and easy to use, with prognostic significance. The WHO classification scheme is better than the Lauren classification in that it is continuously being revised as understanding of gastric cancer progresses, and it is routinely used in the clinical and pathological diagnosis of common scenarios. With the progress of genomics, transcriptomics, proteomics, and metabolomics research, molecular classification of gastric cancer has become a current hot topic. The traditional therapeutic approach based on phenotypic characteristics of gastric cancer will most likely be replaced with one based on gene variation. Gene-targeted therapy against the same molecular variation seems more reasonable than traditional chemical treatment based on the same morphological change.
Doostparast Torshizi, Abolfazl; Petzold, Linda R
2018-01-01
Data integration methods that combine data from different molecular levels such as the genome, epigenome, transcriptome, etc., have received a great deal of interest in the past few years. It has been demonstrated that the synergistic effects of different biological data types can boost learning capabilities and lead to a better understanding of the underlying interactions among molecular levels. In this paper we present a graph-based semi-supervised classification algorithm that incorporates latent biological knowledge in the form of biological pathways with gene expression and DNA methylation data. The process of graph construction from biological pathways is based on detecting condition-responsive genes, from which three sets of genes are extracted: all condition-responsive genes, high-frequency condition-responsive genes, and P-value-filtered genes. The proposed approach is applied to ovarian cancer data downloaded from The Cancer Genome Atlas. Extensive numerical experiments demonstrate superior performance of the proposed approach compared to other state-of-the-art algorithms, including the latest graph-based classification techniques. Simulation results demonstrate that integrating various data types enhances classification performance and leads to a better understanding of the interrelations between diverse omics data types. The proposed approach outperforms many of the state-of-the-art data integration algorithms.
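As a hedged sketch of graph-based semi-supervised classification in general (not the paper's pathway-derived graphs): scikit-learn's `LabelSpreading` propagates a few known labels over a k-nearest-neighbour similarity graph. The synthetic features below stand in for integrated expression/methylation data, and the 90% unlabeled fraction is an arbitrary choice.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.semi_supervised import LabelSpreading

# Synthetic stand-in for integrated gene expression / methylation features;
# only a small fraction of samples carry labels, the rest are marked -1.
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)
rng = np.random.default_rng(0)
y_semi = y.copy()
unlabeled = rng.random(len(y)) < 0.9         # hide 90% of the labels
y_semi[unlabeled] = -1

# Graph-based semi-supervised classification: labels propagate over a
# k-nearest-neighbour similarity graph built from the data.
model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(X, y_semi)
acc = (model.transduction_[unlabeled] == y[unlabeled]).mean()
print(f"accuracy on unlabeled samples: {acc:.2f}")
```

In the paper's setting the graph edges come from condition-responsive genes in biological pathways rather than from feature-space distances, which is where the domain knowledge enters.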
Mediterranean Land Use and Land Cover Classification Assessment Using High Spatial Resolution Data
NASA Astrophysics Data System (ADS)
Elhag, Mohamed; Boteva, Silvena
2016-10-01
Landscape fragmentation is notably prevalent in Mediterranean regions and imposes substantial complications for several satellite image classification methods. To some extent, high spatial resolution data were able to overcome such complications. For better classification performance in Land Use Land Cover (LULC) mapping, the current research compares different classification methods for LULC mapping, using the Sentinel-2 satellite as a source of high spatial resolution data. Both pixel-based and object-based classification algorithms were assessed: the pixel-based approach employs the Maximum Likelihood (ML), Artificial Neural Network (ANN), and Support Vector Machine (SVM) algorithms, and the object-based classification uses the Nearest Neighbour (NN) classifier. A Stratified Masking Process (SMP), which integrates a ranking process within the classes based on the spectral fluctuation of the combined training and testing sites, was implemented. An analysis of the overall and individual accuracy of the classification results of all four methods reveals that the SVM classifier was the most efficient overall, distinguishing most of the classes with the highest accuracy. NN succeeded in dealing with artificial surface classes in general, while agricultural area classes and forest and semi-natural area classes were segregated successfully with SVM. Furthermore, a comparative analysis indicates that the conventional classification method yielded better accuracy results than the SMP method overall with both classifiers used, ML and SVM.
Ground-based cloud classification by learning stable local binary patterns
NASA Astrophysics Data System (ADS)
Wang, Yu; Shi, Cunzhao; Wang, Chunheng; Xiao, Baihua
2018-07-01
Feature selection and extraction is the first step in implementing pattern classification, and the same is true for ground-based cloud classification. Histogram features based on local binary patterns (LBPs) are widely used to classify texture images. However, the conventional uniform LBP approach cannot capture all the dominant patterns in cloud texture images, thereby resulting in low classification performance. In this study, a robust feature extraction method by learning stable LBPs is proposed, based on the averaged ranks of the occurrence frequencies of all rotation-invariant patterns defined in the LBPs of cloud images. The proposed method is validated with a ground-based cloud classification database comprising five cloud types. Experimental results demonstrate that the proposed method achieves significantly higher classification accuracy than the uniform LBP, local texture patterns (LTP), dominant LBP (DLBP), completed LBP (CLBP) and salient LBP (SaLBP) methods on this cloud image database and under different noise conditions. The performance of the proposed method is comparable with that of the popular deep convolutional neural network (DCNN) method, but with less computational complexity. Furthermore, the proposed method also achieves superior performance on an independent test data set.
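The basic LBP histogram that all of these variants build on can be computed in a few lines of NumPy. This sketch implements only the plain 8-neighbour code; the uniform/rotation-invariant mappings and the learned "stable pattern" selection proposed above would be layered on top of it.

```python
import numpy as np

def lbp_codes(img):
    """Basic 8-neighbour local binary pattern codes for a 2-D array.

    Each interior pixel gets an 8-bit code: bit p is set when the p-th
    neighbour is >= the centre pixel.
    """
    c = img[1:-1, 1:-1]
    # Neighbours in a fixed circular order around the centre.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(c.shape, dtype=np.int32)
    for bit, (di, dj) in enumerate(offsets):
        nb = img[1 + di:img.shape[0] - 1 + di, 1 + dj:img.shape[1] - 1 + dj]
        code |= (nb >= c).astype(np.int32) << bit
    return code

def lbp_histogram(img):
    """256-bin normalised LBP histogram used as the texture feature."""
    h = np.bincount(lbp_codes(img).ravel(), minlength=256)
    return h / h.sum()

# A flat patch produces only the all-ones code (every neighbour >= centre).
flat = np.full((8, 8), 100, dtype=np.int32)
hist = lbp_histogram(flat)
print(hist[255])
```

A real cloud classifier would compute such histograms over image patches and feed them to any standard classifier; the paper's contribution is selecting which pattern bins are stable enough to keep.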
Morton, Lindsay M.; Linet, Martha S.; Clarke, Christina A.; Kadin, Marshall E.; Vajdic, Claire M.; Monnereau, Alain; Maynadié, Marc; Chiu, Brian C.-H.; Marcos-Gragera, Rafael; Costantini, Adele Seniori; Cerhan, James R.; Weisenburger, Dennis D.
2010-01-01
After publication of the updated World Health Organization (WHO) classification of tumors of hematopoietic and lymphoid tissues in 2008, the Pathology Working Group of the International Lymphoma Epidemiology Consortium (InterLymph) now presents an update of the hierarchical classification of lymphoid neoplasms for epidemiologic research based on the 2001 WHO classification, which we published in 2007. The updated hierarchical classification incorporates all of the major and provisional entities in the 2008 WHO classification, including newly defined entities based on age, site, certain infections, and molecular characteristics, as well as borderline categories, early and “in situ” lesions, disorders with limited capacity for clinical progression, lesions without current International Classification of Diseases for Oncology, 3rd Edition codes, and immunodeficiency-associated lymphoproliferative disorders. WHO subtypes are defined in hierarchical groupings, with newly defined groups for small B-cell lymphomas with plasmacytic differentiation and for primary cutaneous T-cell lymphomas. We suggest approaches for applying the hierarchical classification in various epidemiologic settings, including strategies for dealing with multiple coexisting lymphoma subtypes in one patient, and cases with incomplete pathologic information. The pathology materials useful for state-of-the-art epidemiology studies are also discussed. We encourage epidemiologists to adopt the updated InterLymph hierarchical classification, which incorporates the most recent WHO entities while demonstrating their relationship to older classifications. PMID:20699439
Turner, Jennifer J; Morton, Lindsay M; Linet, Martha S; Clarke, Christina A; Kadin, Marshall E; Vajdic, Claire M; Monnereau, Alain; Maynadié, Marc; Chiu, Brian C-H; Marcos-Gragera, Rafael; Costantini, Adele Seniori; Cerhan, James R; Weisenburger, Dennis D
2010-11-18
After publication of the updated World Health Organization (WHO) classification of tumors of hematopoietic and lymphoid tissues in 2008, the Pathology Working Group of the International Lymphoma Epidemiology Consortium (InterLymph) now presents an update of the hierarchical classification of lymphoid neoplasms for epidemiologic research based on the 2001 WHO classification, which we published in 2007. The updated hierarchical classification incorporates all of the major and provisional entities in the 2008 WHO classification, including newly defined entities based on age, site, certain infections, and molecular characteristics, as well as borderline categories, early and "in situ" lesions, disorders with limited capacity for clinical progression, lesions without current International Classification of Diseases for Oncology, 3rd Edition codes, and immunodeficiency-associated lymphoproliferative disorders. WHO subtypes are defined in hierarchical groupings, with newly defined groups for small B-cell lymphomas with plasmacytic differentiation and for primary cutaneous T-cell lymphomas. We suggest approaches for applying the hierarchical classification in various epidemiologic settings, including strategies for dealing with multiple coexisting lymphoma subtypes in one patient, and cases with incomplete pathologic information. The pathology materials useful for state-of-the-art epidemiology studies are also discussed. We encourage epidemiologists to adopt the updated InterLymph hierarchical classification, which incorporates the most recent WHO entities while demonstrating their relationship to older classifications.
Tensor-based classification of an auditory mobile BCI without a subject-specific calibration phase
NASA Astrophysics Data System (ADS)
Zink, Rob; Hunyadi, Borbála; Van Huffel, Sabine; De Vos, Maarten
2016-04-01
Objective. One of the major drawbacks in EEG brain-computer interfaces (BCI) is the need for subject-specific training of the classifier. By removing the need for a supervised calibration phase, new users could potentially explore a BCI faster. In this work we aim to remove this subject-specific calibration phase and allow direct classification. Approach. We explore canonical polyadic decompositions and block term decompositions of the EEG. These methods exploit structure in higher dimensional data arrays called tensors. The BCI tensors are constructed by concatenating ERP templates from other subjects to a target and non-target trial and the inherent structure guides a decomposition that allows accurate classification. We illustrate the new method on data from a three-class auditory oddball paradigm. Main results. The presented approach leads to a fast and intuitive classification with accuracies competitive with a supervised and cross-validated LDA approach. Significance. The described methods are a promising new way of classifying BCI data with a forthright link to the original P300 ERP signal over the conventional and widely used supervised approaches.
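The tensor machinery can be illustrated with a toy rank-1 canonical polyadic (CP) decomposition computed by alternating least squares. This is a simplified stand-in for the canonical polyadic and block term decompositions used above; the channels-by-time-by-trials sizes are invented, and real EEG tensors are noisy and of higher rank.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n matricization of a 3-way tensor (C-order columns)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def cp_rank1(T, n_iter=50):
    """Rank-1 CP approximation T ~= a o b o c via alternating least squares."""
    rng = np.random.default_rng(0)
    b = rng.normal(size=T.shape[1])
    c = rng.normal(size=T.shape[2])
    for _ in range(n_iter):
        # Each factor update is a closed-form least-squares solve.
        a = unfold(T, 0) @ np.kron(b, c) / ((b @ b) * (c @ c))
        b = unfold(T, 1) @ np.kron(a, c) / ((a @ a) * (c @ c))
        c = unfold(T, 2) @ np.kron(a, b) / ((a @ a) * (b @ b))
    return a, b, c

# Plant an exactly rank-1 "ERP template" tensor (channels x time x trials).
rng = np.random.default_rng(1)
a0 = rng.normal(size=8)
b0 = rng.normal(size=30)
c0 = rng.normal(size=10)
T = np.einsum("i,j,k->ijk", a0, b0, c0)

a, b, c = cp_rank1(T)
T_hat = np.einsum("i,j,k->ijk", a, b, c)
rel_err = np.linalg.norm(T - T_hat) / np.linalg.norm(T)
print(f"relative reconstruction error: {rel_err:.2e}")
```

For an exactly rank-1 tensor the ALS iteration recovers the planted factors (up to scaling) essentially in one sweep, which is the structural property the classification scheme exploits when a target trial matches the concatenated templates.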
NASA Astrophysics Data System (ADS)
Basak, Subhash C.; Mills, Denise; Hawkins, Douglas M.
2008-06-01
A hierarchical classification study was carried out based on a set of 70 chemicals, 35 of which produce allergic contact dermatitis (ACD) and 35 of which do not. This approach was implemented using a regular ridge regression computer code, followed by conversion of the regression output to binary values. The hierarchical descriptor classes used in the modeling include topostructural (TS), topochemical (TC), and quantum chemical (QC), all of which are based solely on chemical structure. The concordance, sensitivity, and specificity are reported. The model based on the TC descriptors was found to be the best, while the TS model was extremely poor.
Hydrological Climate Classification: Can We Improve on Köppen-Geiger?
NASA Astrophysics Data System (ADS)
Knoben, W.; Woods, R. A.; Freer, J. E.
2017-12-01
Classification is essential in the study of complex natural systems, yet hydrology so far has no formal way to structure the climate forcing which underlies hydrologic response. Various climate classification systems can be borrowed from other disciplines, but these are based on different organizing principles than a hydrological classification might use. From gridded global data we calculate a gridded aridity index, an aridity seasonality index and a rain-vs-snow index, which we use to cluster global locations into climate groups. We then define the membership degree of nearly 1100 catchments to each of our climate groups based on each catchment's climate and investigate the extent to which streamflow responses within each climate group are similar. We compare this climate classification approach with the often-used Köppen-Geiger classification, using statistical tests based on streamflow signature values. We find that three climate indices are sufficient to distinguish 18 different climate types world-wide. Climates tend to change gradually in space and catchments can thus belong to multiple climate groups, albeit with different degrees of membership. Streamflow responses within a climate group tend to be similar, regardless of the catchments' geographical proximity. A Wilcoxon two-sample test based on streamflow signature values for each climate group shows that the new classification can distinguish different flow regimes. The Köppen-Geiger approach uses 29 climate classes but is less able to differentiate streamflow regimes. Climate forcing exerts a strong control on typical hydrologic response, and both change gradually in space. This makes arbitrary hard boundaries in any classification scheme difficult to defend; any hydrological classification should thus acknowledge these gradual changes in forcing.
Catchment characteristics (soil or vegetation type, land use, etc.) can vary more quickly in space than climate does, which can explain streamflow differences between geographically close locations. In summary, this work shows that hydrology needs its own way to structure climate forcing, acknowledging that climates vary gradually on a global scale and explicitly including those climate aspects that drive seasonal changes in hydrologic regimes.
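A hedged sketch of the fuzzy-membership idea: cluster grid cells on the three climate indices, then express one catchment's degree of membership to each group as a normalised inverse distance to the cluster centres. The index values, the number of clusters (5 rather than the study's 18), and the membership formula are illustrative assumptions, not the paper's method.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy stand-in for the three gridded indices above:
# aridity, aridity seasonality, rain-vs-snow fraction (all scaled to 0..1).
grid = rng.random((1000, 3))

# Cluster grid cells into climate groups (18 in the study; 5 here for brevity).
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(grid)

def membership(point, centers):
    """Soft membership degrees of one catchment to each climate group,
    from inverse distance to the cluster centres (degrees sum to 1)."""
    d = np.linalg.norm(centers - point, axis=1)
    w = 1.0 / (d + 1e-9)
    return w / w.sum()

catchment = np.array([0.7, 0.2, 0.1])        # one catchment's climate indices
m = membership(catchment, km.cluster_centers_)
print(np.round(m, 3))
```

Soft memberships like these are what allow a catchment near a climate boundary to belong partly to several groups instead of being forced across a hard boundary.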
Zorman, Milan; Sánchez de la Rosa, José Luis; Dinevski, Dejan
2011-12-01
It is not often that a symbol-based machine learning approach is used for image classification and recognition. In this paper we present such an approach, which we first used on follicular lymphoma images. Lymphoma is a broad term encompassing a variety of cancers of the lymphatic system. Lymphoma is differentiated by the type of cell that multiplies and how the cancer presents itself. It is very important to get an exact diagnosis regarding lymphoma and to determine the treatments that will be most effective for the patient's condition. Our work focused on the identification of lymphomas by finding follicles in microscopy images provided by the Laboratory of Pathology at the University Hospital of Tenerife, Spain. We divided our work into two stages: in the first stage we performed image pre-processing and feature extraction, and in the second stage we used different symbolic machine learning approaches for pixel classification. Symbolic machine learning approaches are often neglected when looking for image analysis tools; although they are known for very appropriate knowledge representation, they are also claimed to lack computational power. The results we obtained are very promising and show that symbolic approaches can be successful in image analysis applications.
Developing collaborative classifiers using an expert-based model
Mountrakis, G.; Watts, R.; Luo, L.; Wang, Jingyuan
2009-01-01
This paper presents a hierarchical, multi-stage adaptive strategy for image classification. We iteratively apply various classification methods (e.g., decision trees, neural networks), identify regions of parametric and geographic space where accuracy is low, and in these regions, test and apply alternate methods repeating the process until the entire image is classified. Currently, classifiers are evaluated through human input using an expert-based system; therefore, this paper acts as the proof of concept for collaborative classifiers. Because we decompose the problem into smaller, more manageable sub-tasks, our classification exhibits increased flexibility compared to existing methods since classification methods are tailored to the idiosyncrasies of specific regions. A major benefit of our approach is its scalability and collaborative support since selected low-accuracy classifiers can be easily replaced with others without affecting classification accuracy in high accuracy areas. At each stage, we develop spatially explicit accuracy metrics that provide straightforward assessment of results by non-experts and point to areas that need algorithmic improvement or ancillary data. Our approach is demonstrated in the task of detecting impervious surface areas, an important indicator for human-induced alterations to the environment, using a 2001 Landsat scene from Las Vegas, Nevada. © 2009 American Society for Photogrammetry and Remote Sensing.
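The "replace the weak classifier locally" idea can be sketched in two stages: a first classifier labels the whole scene, and wherever its confidence is low a second method takes over. The toy 2-D data, the 0.9 confidence threshold, and the choice of tree-then-kNN are all assumptions for illustration, not the authors' expert-based evaluation.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Stage 1: a simple classifier over the whole scene (toy 2-D "image" data).
X, y = make_moons(n_samples=600, noise=0.25, random_state=0)
X_train, y_train = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]

stage1 = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)
proba = stage1.predict_proba(X_test)
confident = proba.max(axis=1) >= 0.9          # accept only confident pixels

# Stage 2: in the low-confidence region, hand off to an alternate method,
# mirroring the iterative local-replacement strategy described above.
stage2 = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

pred = np.where(confident,
                stage1.predict(X_test),
                stage2.predict(X_test))
acc_stage1 = stage1.score(X_test, y_test)
acc_combined = (pred == y_test).mean()
print(f"stage 1 alone: {acc_stage1:.2f}, collaborative: {acc_combined:.2f}")
```

Because high-confidence predictions are left untouched, swapping the stage-2 method affects only the low-accuracy region, which is the scalability property the paper emphasises.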
NASA Technical Reports Server (NTRS)
Sabol, Donald E., Jr.; Roberts, Dar A.; Adams, John B.; Smith, Milton O.
1993-01-01
An important application of remote sensing is to map and monitor changes over large areas of the land surface. This is particularly significant with the current interest in monitoring vegetation communities. Most traditional methods for mapping different types of plant communities are based upon statistical classification techniques (e.g., parallelepiped, nearest-neighbor, etc.) applied to uncalibrated multispectral data. Classes from these techniques are typically difficult to interpret (particularly for a field ecologist/botanist). Also, classes derived for one image can be very different from those derived from another image of the same area, making interpretation of observed temporal changes nearly impossible. More recently, neural networks have been applied to classification. Neural network classification, based upon spectral matching, is weak in dealing with spectral mixtures (a condition prevalent in images of natural surfaces). Another approach to mapping vegetation communities is based on spectral mixture analysis, which can provide a consistent framework for image interpretation. Roberts et al. (1990) mapped vegetation using the band residuals from a simple mixing model (the same spectral endmembers applied to all image pixels). Sabol et al. (1992b) and Roberts et al. (1992) used different methods to apply the most appropriate spectral endmembers to each image pixel, thereby allowing mapping of vegetation based upon the different endmember spectra. In this paper, we describe a new approach to classification of vegetation communities based upon the spectral fractions derived from spectral mixture analysis. This approach was applied to three 1992 AVIRIS images of Jasper Ridge, California to observe seasonal changes in surface composition.
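The fraction computation at the heart of spectral mixture analysis is a least-squares solve per pixel. The 6-band endmember spectra below are invented numbers; real applications add sum-to-one and non-negativity constraints, and per-pixel endmember selection as described above.

```python
import numpy as np

def unmix(pixel, endmembers):
    """Least-squares endmember fractions for one pixel spectrum.

    endmembers: (n_bands, n_endmembers) matrix of pure spectra.
    Returns the fractions and the band residual (RMS) used to judge model fit.
    """
    f, *_ = np.linalg.lstsq(endmembers, pixel, rcond=None)
    residual = pixel - endmembers @ f
    return f, np.sqrt(np.mean(residual ** 2))

# Hypothetical 6-band endmember spectra: green vegetation, soil, shade.
E = np.array([[0.05, 0.30, 0.02],
              [0.08, 0.35, 0.02],
              [0.04, 0.40, 0.02],
              [0.50, 0.45, 0.03],
              [0.30, 0.50, 0.03],
              [0.15, 0.55, 0.03]])

true_f = np.array([0.6, 0.3, 0.1])           # planted fractions
pixel = E @ true_f                           # noise-free mixed pixel
f, rms = unmix(pixel, E)
print(np.round(f, 3), rms)
```

Classifying on the recovered fractions (rather than raw radiance) is what gives the approach its interpretability: a pixel that is 60% vegetation remains 60% vegetation across images and seasons.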
Best Merge Region Growing with Integrated Probabilistic Classification for Hyperspectral Imagery
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Tilton, James C.
2011-01-01
A new method for spectral-spatial classification of hyperspectral images is proposed. The method is based on the integration of probabilistic classification within a hierarchical best merge region growing algorithm. For this purpose, a preliminary probabilistic support vector machine classification is performed. Then, a hierarchical step-wise optimization algorithm is applied, iteratively merging regions with the smallest Dissimilarity Criterion (DC). The main novelty of this method lies in defining a DC between regions as a function of region statistical and geometrical features along with classification probabilities. Experimental results are presented on a 200-band AVIRIS image of the Northwestern Indiana vegetation area and compared with those obtained by recently proposed spectral-spatial classification techniques. The proposed method improves classification accuracies when compared to other classification approaches.
NASA Astrophysics Data System (ADS)
Diesing, Markus; Green, Sophie L.; Stephens, David; Lark, R. Murray; Stewart, Heather A.; Dove, Dayton
2014-08-01
Marine spatial planning and conservation need underpinning with sufficiently detailed and accurate seabed substrate and habitat maps. Although multibeam echosounders enable us to map the seabed with high resolution and spatial accuracy, there is still a lack of fit-for-purpose seabed maps. This is due to the high costs involved in carrying out systematic seabed mapping programmes and the fact that the development of validated, repeatable, quantitative and objective methods of swath acoustic data interpretation is still in its infancy. We compared a wide spectrum of approaches including manual interpretation, geostatistics, object-based image analysis and machine-learning to gain further insights into the accuracy and comparability of acoustic data interpretation approaches based on multibeam echosounder data (bathymetry, backscatter and derivatives) and seabed samples, with the aim of deriving seabed substrate maps. Sample data were split into a training and validation data set to allow us to carry out an accuracy assessment. Overall thematic classification accuracy ranged from 67% to 76% and Cohen's kappa varied between 0.34 and 0.52. However, these differences were not statistically significant at the 5% level. Misclassifications were mainly associated with uncommon classes, which were rarely sampled. Map outputs were between 68% and 87% identical. To improve classification accuracy in seabed mapping, we suggest that more studies on the factors affecting classification performance, as well as comparative studies testing the performance of different approaches, need to be carried out with a view to developing guidelines for selecting an appropriate method for a given dataset. In the meantime, classification accuracy might be improved by combining different techniques into hybrid approaches and multi-method ensembles.
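For reference, the two accuracy statistics quoted above can be computed from a confusion matrix of mapped versus validation classes as follows (toy counts, not the study's data):

```python
import numpy as np

conf = np.array([[30,  4,  1],
                 [ 5, 25,  2],
                 [ 2,  3, 28]])  # rows: mapped class, cols: validation class

n = conf.sum()
p_o = np.trace(conf) / n                         # observed (overall) agreement
p_e = (conf.sum(0) * conf.sum(1)).sum() / n**2   # agreement expected by chance
kappa = (p_o - p_e) / (1 - p_e)                  # Cohen's kappa
print(round(p_o, 3), round(kappa, 3))  # → 0.83 0.745
```

Kappa discounts chance agreement, which is why a map with high overall accuracy can still show only moderate kappa when one class dominates.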
Ding, Xuemei; Bucholc, Magda; Wang, Haiying; Glass, David H; Wang, Hui; Clarke, Dave H; Bjourson, Anthony John; Dowey, Le Roy C; O'Kane, Maurice; Prasad, Girijesh; Maguire, Liam; Wong-Lin, KongFatt
2018-06-27
There is currently a lack of an efficient, objective and systemic approach towards the classification of Alzheimer's disease (AD), due to its complex etiology and pathogenesis. As AD is inherently dynamic, it is also not clear how the relationships among AD indicators vary over time. To address these issues, we propose a hybrid computational approach for AD classification and evaluate it on the heterogeneous longitudinal AIBL dataset. Specifically, using clinical dementia rating as an index of AD severity, the most important indicators (mini-mental state examination, logical memory recall, grey matter and cerebrospinal volumes from MRI and active voxels from PiB-PET brain scans, ApoE, and age) can be automatically identified from parallel data mining algorithms. In this work, Bayesian network modelling across different time points is used to identify and visualize time-varying relationships among the significant features, and importantly, in an efficient way using only coarse-grained data. Crucially, our approach suggests key data features and their appropriate combinations that are relevant for AD severity classification with high accuracy. Overall, our study provides insights into AD developments and demonstrates the potential of our approach in supporting efficient AD diagnosis.
A Feature Mining Based Approach for the Classification of Text Documents into Disjoint Classes.
ERIC Educational Resources Information Center
Nieto Sanchez, Salvador; Triantaphyllou, Evangelos; Kraft, Donald
2002-01-01
Proposes a new approach for classifying text documents into two disjoint classes. Highlights include a brief overview of document clustering; a data mining approach called the One Clause at a Time (OCAT) algorithm which is based on mathematical logic; vector space model (VSM); and comparing the OCAT to the VSM. (Author/LRW)
Influence of nuclei segmentation on breast cancer malignancy classification
NASA Astrophysics Data System (ADS)
Jelen, Lukasz; Fevens, Thomas; Krzyzak, Adam
2009-02-01
Breast cancer is one of the most deadly cancers affecting middle-aged women. Accurate diagnosis and prognosis are crucial to reduce the high death rate. Nowadays there are numerous diagnostic tools for breast cancer diagnosis. In this paper we discuss the role of nuclei segmentation from fine needle aspiration biopsy (FNA) slides and its influence on malignancy classification. Classification of malignancy plays a very important role during the diagnosis process of breast cancer. Out of all cancer diagnostic tools, FNA slides provide the most valuable information about the cancer malignancy grade, which helps to choose an appropriate treatment. This process involves assessing numerous nuclear features, and therefore precise segmentation of nuclei is very important. In this work we compare three powerful segmentation approaches and test their impact on the classification of breast cancer malignancy. The studied approaches involve level set segmentation, fuzzy c-means segmentation and textural segmentation based on the co-occurrence matrix. Segmented nuclei were used to extract nuclear features for malignancy classification. For classification purposes four different classifiers were trained and tested with the previously extracted features. The compared classifiers are the Multilayer Perceptron (MLP), Self-Organizing Maps (SOM), Principal Component-based Neural Network (PCA) and Support Vector Machines (SVM). The presented results show that level set segmentation yields the best results over the three compared approaches and leads to good feature extraction, with the lowest average error rate of 6.51% over the four classifiers. The best performance was recorded for the multilayer perceptron, with an error rate of 3.07% using fuzzy c-means segmentation.
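A minimal sketch of fuzzy c-means, one of the three segmentation approaches compared above, run here on 1-D synthetic intensities rather than FNA image pixels; the quantile-based initialisation and toy data are illustrative assumptions.

```python
import numpy as np

def fuzzy_c_means(x, c=2, m=2.0, iters=100):
    """Fuzzy c-means on a 1-D array: soft memberships u (c x n) and centers."""
    centers = np.quantile(x, np.linspace(0.1, 0.9, c))  # spread-out init
    u = np.full((c, x.size), 1.0 / c)
    for _ in range(iters):
        d = np.abs(x[None, :] - centers[:, None]) + 1e-12
        u = 1.0 / d ** (2 / (m - 1))        # inverse-distance memberships
        u /= u.sum(axis=0)                  # memberships sum to 1 per point
        um = u ** m
        centers = (um @ x) / um.sum(axis=1) # fuzzy-weighted means
    return centers, u

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.2, 0.05, 50), rng.normal(0.8, 0.05, 50)])
centers, u = fuzzy_c_means(x)
print(np.round(np.sort(centers), 2))
```

Unlike hard k-means, each pixel keeps a graded membership in every cluster, which is useful at the fuzzy nucleus/cytoplasm boundaries that make FNA segmentation hard.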
The rationale behind Pierre Duhem's natural classification.
Bhakthavatsalam, Sindhuja
2015-06-01
The central concern of this paper is the interpretation of Duhem's attitude towards physical theory. Based on his view that the classification of experimental laws yielded by theory progressively approaches a natural classification (a classification reflecting that of underlying realities), Duhem has been construed as a realist of sorts in recent literature. Here I argue that his positive attitude towards the theoretic classification of laws had rather to do with the pragmatic rationality of the physicist. Duhem's idea of natural classification was an intuitive idea in the mind of the physicist that had to be affirmed in order to justify the physicist's pursuit of theory. Copyright © 2015 Elsevier Ltd. All rights reserved.
2011-01-01
Background Tokenization is an important component of language processing, yet there is no widely accepted tokenization method for English texts, including biomedical texts. Aside from rule-based techniques, tokenization in the biomedical domain has been regarded as a classification task. Biomedical classifier-based tokenizers either split or join textual objects through classification to form tokens. The idiosyncratic nature of each biomedical tokenizer’s output complicates adoption and reuse. Furthermore, biomedical tokenizers generally lack guidance on how to apply an existing tokenizer to a new domain (subdomain). We identify and complete a novel tokenizer design pattern and suggest a systematic approach to tokenizer creation. We implement a tokenizer based on our design pattern that combines regular expressions and machine learning. Our machine learning approach differs from the previous split-join classification approaches. We evaluate our approach against three other tokenizers on the task of tokenizing biomedical text. Results Medpost and our adapted Viterbi tokenizer performed best, with 92.9% and 92.4% accuracy respectively. Conclusions Our evaluation of our design pattern and guidelines supports our claim that they are a viable approach to tokenizer construction (producing tokenizers matching leading custom-built tokenizers in a particular domain). Our evaluation also demonstrates that ambiguous tokenizations can be disambiguated through POS tagging; POS tag sequences and training data have a significant impact on proper text tokenization. PMID:21658288
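A sketch of a split-then-rejoin tokenizer in the spirit of the design pattern above: a regex pass over-splits, and a classifier decides which splits to undo. Here a hand-written rule stands in for the machine-learned join/split decision; the patterns and examples are illustrative, not from the paper.

```python
import re

SPLIT = re.compile(r"\w+|[^\w\s]")  # over-split: words and single punctuation

def should_join(left, sep, right):
    # Stand-in for the learned classifier: keep hyphenated alphanumeric
    # names (e.g. gene symbols) and decimal numbers as single tokens.
    if sep == "-" and left.isalnum() and right.isalnum():
        return True
    if sep == "." and left.isdigit() and right.isdigit():
        return True
    return False

def tokenize(text):
    toks = SPLIT.findall(text)
    out, i = [], 0
    while i < len(toks):
        if out and i + 1 < len(toks) and should_join(out[-1], toks[i], toks[i + 1]):
            out[-1] = out[-1] + toks[i] + toks[i + 1]  # undo the split
            i += 2
        else:
            out.append(toks[i])
            i += 1
    return out

print(tokenize("IL-2 levels rose 2-fold."))  # → ['IL-2', 'levels', 'rose', '2-fold', '.']
```

Swapping `should_join` for a trained classifier over (left, separator, right) features is exactly the pluggable step the design pattern isolates.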
Reformulating Constraints for Compilability and Efficiency
NASA Technical Reports Server (NTRS)
Tong, Chris; Braudaway, Wesley; Mohan, Sunil; Voigt, Kerstin
1992-01-01
KBSDE is a knowledge compiler that uses a classification-based approach to map solution constraints in a task specification onto particular search algorithm components that will be responsible for satisfying those constraints (e.g., local constraints are incorporated in generators; global constraints are incorporated in either testers or hillclimbing patchers). Associated with each type of search algorithm component is a subcompiler that specializes in mapping constraints into components of that type. Each of these subcompilers in turn uses a classification-based approach, matching a constraint passed to it against one of several schemas, and applying a compilation technique associated with that schema. While much progress has occurred in our research since we first laid out our classification-based approach [Ton91], we focus in this paper on our reformulation research. Two important reformulation issues that arise out of the choice of a schema-based approach are: (1) compilability: can a constraint that does not directly match any of a particular subcompiler's schemas be reformulated into one that does? and (2) efficiency: if the efficiency of the compiled search algorithm depends on the compiler's performance, and the compiler's performance depends on the form in which the constraint was expressed, can we find forms for constraints which compile better, or reformulate constraints whose forms can be recognized as ones that compile poorly? In this paper, we describe a set of techniques we are developing for partially addressing these issues.
A Study of Feature Combination for Vehicle Detection Based on Image Processing
2014-01-01
Video analytics play a critical role in most recent traffic monitoring and driver assistance systems. In this context, the correct detection and classification of surrounding vehicles through image analysis has been the focus of extensive research in recent years. Most of the work reported on image-based vehicle verification makes use of supervised classification approaches and resorts to techniques such as histograms of oriented gradients (HOG), principal component analysis (PCA), and Gabor filters, among others. Unfortunately, existing approaches are lacking in two respects: first, comparison between methods using a common body of work has not been addressed; second, no study of the potential for combining popular features for vehicle classification has been reported. In this study, the performance of the different techniques is first reviewed and compared using a common public database. Then, the combination capabilities of these techniques are explored and a methodology is presented for the fusion of classifiers built upon them, taking into account also the vehicle pose. The study unveils the limitations of single-feature-based classification and makes clear that fusion of classifiers is highly beneficial for vehicle verification. PMID:24672299
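A toy sketch of why late fusion of single-feature classifiers helps: three simulated classifiers (imagine HOG-, PCA- and Gabor-based) with independent errors are combined by majority vote. The error rates and independence assumption are illustrative; the study's fusion methodology is richer and also conditions on vehicle pose.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.integers(0, 2, 1000)               # ground truth: vehicle / non-vehicle

def noisy_classifier(y, error_rate):
    """Simulate a single-feature classifier that flips labels at a fixed rate."""
    flip = rng.random(y.size) < error_rate
    return np.where(flip, 1 - y, y)

# Three single-feature classifiers with different (assumed) error rates.
preds = np.stack([noisy_classifier(y, e) for e in (0.15, 0.20, 0.25)])
fused = (preds.sum(axis=0) >= 2).astype(int)   # majority vote

accs = (preds == y).mean(axis=1)
print(np.round(accs, 2), round(float((fused == y).mean()), 2))
```

With independent errors, the majority vote beats each individual classifier; correlated errors (classifiers built on overlapping features) shrink this gain, which is why the combination study above matters.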
An algorithm for the arithmetic classification of multilattices.
Indelicato, Giuliana
2013-01-01
A procedure for the construction and the classification of monoatomic multilattices in arbitrary dimension is developed. The algorithm allows one to determine the location of the points of all monoatomic multilattices with a given symmetry, or to determine whether two assigned multilattices are arithmetically equivalent. This approach is based on ideas from integral matrix theory, in particular the reduction to the Smith normal form, and can be coded to provide a classification software package.
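The reduction to Smith normal form at the heart of the approach can be illustrated with a generic textbook algorithm over the integers (this is not the paper's classification procedure, only the underlying matrix reduction): elementary row and column operations diagonalise the matrix so that each diagonal entry divides the next, and two multilattices are arithmetically equivalent only if the associated matrices share this invariant form.

```python
def smith_normal_form(M):
    """Smith normal form of an integer matrix (list of lists) via elementary
    row/column operations: diagonal d1 | d2 | ... with each entry dividing the next."""
    A = [row[:] for row in M]
    m, n = len(A), len(A[0])
    t = 0
    while t < min(m, n):
        # move a nonzero entry of the remaining submatrix to position (t, t)
        pivot = next(((i, j) for i in range(t, m) for j in range(t, n)
                      if A[i][j] != 0), None)
        if pivot is None:
            break
        i, j = pivot
        A[t], A[i] = A[i], A[t]
        for row in A:
            row[t], row[j] = row[j], row[t]
        # Euclidean-style clearing of column t and row t
        while True:
            done = True
            for i in range(t + 1, m):
                if A[i][t]:
                    q = A[i][t] // A[t][t]
                    for k in range(n):
                        A[i][k] -= q * A[t][k]
                    if A[i][t]:              # nonzero remainder: smaller pivot
                        A[t], A[i] = A[i], A[t]
                    done = False
            for j in range(t + 1, n):
                if A[t][j]:
                    q = A[t][j] // A[t][t]
                    for k in range(m):
                        A[k][j] -= q * A[k][t]
                    if A[t][j]:
                        for k in range(m):
                            A[k][t], A[k][j] = A[k][j], A[k][t]
                    done = False
            if done:
                break
        # enforce divisibility of the remaining entries by the pivot
        bad = next(((i, j) for i in range(t + 1, m) for j in range(t + 1, n)
                    if A[i][j] % A[t][t] != 0), None)
        if bad is not None:
            for k in range(n):
                A[t][k] += A[bad[0]][k]      # fold the offending row in, redo step
            continue
        if A[t][t] < 0:
            A[t] = [-v for v in A[t]]
        t += 1
    return A

print(smith_normal_form([[2, 4], [6, 8]]))  # → [[2, 0], [0, 4]]
```

The diagonal entries (here 2 and 4) are invariant under change of lattice basis, which is what makes the form usable as an arithmetic-equivalence test.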
Juan-Albarracín, Javier; Fuster-Garcia, Elies; Manjón, José V; Robles, Montserrat; Aparici, F; Martí-Bonmatí, L; García-Gómez, Juan M
2015-01-01
Automatic brain tumour segmentation has become a key component for the future of brain tumour treatment. Currently, most brain tumour segmentation approaches arise from the supervised learning standpoint, which requires a labelled training dataset from which to infer the models of the classes. The performance of these models is directly determined by the size and quality of the training corpus, whose retrieval becomes a tedious and time-consuming task. On the other hand, unsupervised approaches avoid these limitations but often do not reach results comparable to those of the supervised methods. In this sense, we propose an automated unsupervised method for brain tumour segmentation based on anatomical Magnetic Resonance (MR) images. Four unsupervised classification algorithms, grouped by their structured or non-structured condition, were evaluated within our pipeline. As non-structured algorithms, we evaluated K-means, Fuzzy K-means and the Gaussian Mixture Model (GMM), whereas as a structured classification algorithm we evaluated the Gaussian Hidden Markov Random Field (GHMRF). An automated postprocess based on a statistical approach supported by tissue probability maps is proposed to automatically identify the tumour classes after the segmentations. We evaluated our brain tumour segmentation method with the public BRAin Tumor Segmentation (BRATS) 2013 Test and Leaderboard datasets. Our approach based on the GMM model improves on the results obtained by most of the supervised methods evaluated with the Leaderboard set and reaches second position in the ranking. Our variant based on the GHMRF achieves first position in the Test ranking of the unsupervised approaches and seventh position in the general Test ranking, which confirms the method as a viable alternative for brain tumour segmentation.
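A numpy-only sketch of the simplest of the four evaluated algorithms (K-means), applied here to 1-D intensities of a synthetic "MR slice" with three tissue-like modes. The paper's pipeline additionally uses GMM/GHMRF models and a tissue-probability postprocess; everything below is illustrative.

```python
import numpy as np

def kmeans(x, k, iters=50):
    """Unsupervised 1-D K-means: assign to nearest center, recompute means."""
    centers = np.quantile(x, np.linspace(0.1, 0.9, k))  # spread-out init
    labels = np.zeros(x.size, dtype=int)
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean()
    return labels, centers

# Three "tissue" intensity modes plus noise.
rng = np.random.default_rng(42)
x = np.concatenate([rng.normal(m, 0.05, 200) for m in (0.2, 0.5, 0.8)])
labels, centers = kmeans(x, k=3)
print(np.round(np.sort(centers), 2))
```

The unsupervised appeal is visible even in this sketch: no labelled training corpus is needed, only the number of classes, and the recovered cluster centers align with the underlying tissue intensities.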
Belgiu, Mariana; Drǎguţ, Lucian
2014-10-01
Although multiresolution segmentation (MRS) is a powerful technique for dealing with very high resolution imagery, some of the image objects that it generates do not match the geometries of the target objects, which reduces the classification accuracy. MRS can, however, be guided to produce results that approach the desired object geometry using either supervised or unsupervised approaches. Although some studies have suggested that a supervised approach is preferable, there has been no comparative evaluation of these two approaches. Therefore, in this study, we have compared supervised and unsupervised approaches to MRS. One supervised and two unsupervised segmentation methods were tested on three areas using QuickBird and WorldView-2 satellite imagery. The results were assessed using both segmentation evaluation methods and an accuracy assessment of the resulting building classifications. Thus, differences in the geometries of the image objects and in the potential to achieve satisfactory thematic accuracies were evaluated. The two approaches yielded remarkably similar classification results, with overall accuracies ranging from 82% to 86%. The performance of one of the unsupervised methods was unexpectedly similar to that of the supervised method; they identified almost identical scale parameters as being optimal for segmenting buildings, resulting in very similar geometries for the resulting image objects. The second unsupervised method produced very different image objects from the supervised method, but their classification accuracies were still very similar. The latter result was unexpected because, contrary to previously published findings, it suggests a high degree of independence between the segmentation results and classification accuracy. The results of this study have two important implications. 
The first is that object-based image analysis can be automated without sacrificing classification accuracy, and the second is that the previously accepted idea that classification is dependent on segmentation is challenged by our unexpected results, casting doubt on the value of pursuing 'optimal segmentation'. Our results rather suggest that as long as under-segmentation remains at acceptable levels, imperfections in segmentation can be ruled out, so that a high level of classification accuracy can still be achieved.
Statistical methods and neural network approaches for classification of data from multiple sources
NASA Technical Reports Server (NTRS)
Benediktsson, Jon Atli; Swain, Philip H.
1990-01-01
Statistical methods for classification of data from multiple data sources are investigated and compared to neural network models. A general problem with using conventional multivariate statistical approaches to classify data of multiple types is that a multivariate distribution cannot be assumed for the classes in the data sources. Another common problem with statistical classification methods is that the data sources are not equally reliable. This means that the data sources need to be weighted according to their reliability, but most statistical classification methods have no mechanism for this. This research focuses on statistical methods which can overcome these problems: a method of statistical multisource analysis and consensus theory. Reliability measures for weighting the data sources in these methods are suggested and investigated. Secondly, this research focuses on neural network models. The neural networks are distribution-free, since no prior knowledge of the statistical distribution of the data is needed. This is an obvious advantage over most statistical classification methods. The neural networks also automatically take care of the problem of how much weight each data source should have. On the other hand, their training process is iterative and can take a very long time. Methods to speed up the training procedure are introduced and investigated. Experimental results of classification using both neural network models and statistical methods are given, and the approaches are compared based on these results.
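The reliability-weighting idea can be sketched as a linear opinion pool: each data source emits class probabilities, and sources are weighted by a reliability factor before combining. The source names, probabilities and weights below are made up for illustration; they are not the reliability measures proposed in the work.

```python
import numpy as np

p_optical = np.array([0.7, 0.2, 0.1])   # class probabilities from source 1
p_radar   = np.array([0.4, 0.5, 0.1])   # source 2, considered less reliable
p_terrain = np.array([0.3, 0.3, 0.4])   # source 3, ancillary data

sources = np.stack([p_optical, p_radar, p_terrain])
weights = np.array([0.5, 0.3, 0.2])     # reliability weights, summing to 1

consensus = weights @ sources            # linear opinion pool
print(np.round(consensus, 2), int(np.argmax(consensus)))  # → [0.53 0.31 0.16] 0
```

Down-weighting an unreliable source simply shrinks its influence on the pooled probabilities, which is precisely the mechanism conventional single-source classifiers lack.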
NASA Astrophysics Data System (ADS)
Chung, C.; Nagol, J. R.; Tao, X.; Anand, A.; Dempewolf, J.
2015-12-01
Increasing agricultural production while at the same time preserving the environment has become a challenging task. There is a need for new approaches to the use of multi-scale and multi-source remote sensing data, as well as ground-based measurements, for mapping and monitoring crop and ecosystem state to support decision making by governmental and non-governmental organizations for sustainable agricultural development. High resolution sub-meter imagery plays an important role in such an integrative framework of landscape monitoring. It helps link the ground-based data to more easily available coarser resolution data, facilitating calibration and validation of derived remote sensing products. Here we present a hierarchical Object Based Image Analysis (OBIA) approach to classify sub-meter imagery. The primary reason for choosing OBIA is to accommodate pixel sizes smaller than the object or class of interest. Especially in the non-homogeneous savannah regions of Tanzania this is an important concern, and the traditional pixel-based spectral signature approach often fails. Ortho-rectified, calibrated, pan-sharpened 0.5 meter resolution data acquired from DigitalGlobe's WorldView-2 satellite sensor were used for this purpose. Multi-scale hierarchical segmentation was performed using a multi-resolution segmentation approach to facilitate the use of texture, neighborhood context, and the relationship between super- and sub-objects for training and classification. eCognition, a commonly used OBIA software program, was used for this purpose. Both decision tree and random forest approaches to classification were tested. The kappa index of agreement for both algorithms surpassed 85%. The results demonstrate that hierarchical OBIA can effectively and accurately discriminate classes even at the LCCS-3 legend level.
A face and palmprint recognition approach based on discriminant DCT feature extraction.
Jing, Xiao-Yuan; Zhang, David
2004-12-01
In the field of image processing and recognition, the discrete cosine transform (DCT) and linear discrimination are two widely used techniques. Based on them, we present a new face and palmprint recognition approach in this paper. It first uses a two-dimensional separability judgment to select the DCT frequency bands with favorable linear separability. Then, from the selected bands, it extracts the linear discriminative features by an improved Fisherface method and performs the classification by the nearest neighbor classifier. We analyze in detail the theoretical advantages of our approach in feature extraction. The experiments on face databases and a palmprint database demonstrate that, compared to the state-of-the-art linear discrimination methods, our approach obtains better classification performance. It can significantly improve the recognition rates for face and palmprint data and effectively reduce the dimension of the feature space.
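A sketch of the DCT front end: build an orthonormal DCT-II matrix, transform an "image", and keep a band of coefficients as features. The simple low-frequency crop below stands in for the paper's two-dimensional separability judgment, and the 8x8 toy image is invented.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k samples cos(pi*(2i+1)*k/(2n))."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2 / n)
    C[0] /= np.sqrt(2)            # DC row scaling for orthonormality
    return C

n = 8
C = dct_matrix(n)
img = np.outer(np.linspace(0, 1, n), np.linspace(0, 1, n))  # toy 8x8 image
coeffs = C @ img @ C.T                                      # 2-D DCT
features = coeffs[:3, :3].ravel()                           # low-frequency band

# C is orthonormal, so the transform is lossless: C.T @ coeffs @ C recovers img.
print(np.allclose(C.T @ coeffs @ C, img), features.size)  # → True 9
```

Because the transform is orthonormal, discarding high-frequency bands is a controlled, energy-preserving way to reduce dimensionality before the linear discriminant step.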
Chanel, Guillaume; Pichon, Swann; Conty, Laurence; Berthoz, Sylvie; Chevallier, Coralie; Grèzes, Julie
2015-01-01
Multivariate pattern analysis (MVPA) has been applied successfully to task-based and resting-based fMRI recordings to investigate which neural markers distinguish individuals with autistic spectrum disorders (ASD) from controls. While most studies have focused on brain connectivity during resting state episodes and region-of-interest (ROI) approaches, a wealth of task-based fMRI datasets have been acquired in these populations in the last decade. This calls for techniques that can leverage information not only from a single dataset, but from several existing datasets that might share some common features and biomarkers. We propose a fully data-driven (voxel-based) approach that we apply to two different fMRI experiments with social stimuli (faces and bodies). The method, based on Support Vector Machines (SVMs) and Recursive Feature Elimination (RFE), is first trained for each experiment independently and each output is then combined to obtain a final classification output. Second, this RFE output is used to determine which voxels are most often selected for classification to generate maps of significant discriminative activity. Finally, to further explore the clinical validity of the approach, we correlate phenotypic information with obtained classifier scores. The results reveal good classification accuracy (range between 69% and 92.3%). Moreover, we were able to identify discriminative activity patterns pertaining to the social brain without relying on a priori ROI definitions. Finally, social motivation was the only dimension which correlated with classifier scores, suggesting that it is the main dimension captured by the classifiers. Altogether, we believe that the present RFE method proves to be efficient and may help identify relevant biomarkers by taking advantage of acquired task-based fMRI datasets in psychiatric populations. PMID:26793434
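A minimal recursive feature elimination sketch: fit a linear model, drop the feature with the smallest weight magnitude, and repeat. A least-squares fit stands in here for the linear SVM used in the study, and the synthetic "voxel" data are made up so that only features 2 and 7 carry signal.

```python
import numpy as np

def rfe(X, y, n_keep):
    """Recursively eliminate the feature with the smallest |weight|."""
    keep = list(range(X.shape[1]))
    while len(keep) > n_keep:
        w, *_ = np.linalg.lstsq(X[:, keep], y, rcond=None)
        keep.pop(int(np.argmin(np.abs(w))))
    return keep

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                          # 200 samples, 10 "voxels"
y = np.sign(X[:, 2] + 0.5 * X[:, 7] + 0.1 * rng.normal(size=200))

print(sorted(rfe(X, y, n_keep=2)))  # → [2, 7]
```

Counting how often each feature survives elimination across datasets is the mechanism the study uses to build its discriminative-activity maps.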
nRC: non-coding RNA Classifier based on structural features.
Fiannaca, Antonino; La Rosa, Massimo; La Paglia, Laura; Rizzo, Riccardo; Urso, Alfonso
2017-01-01
Non-coding RNAs (ncRNAs) are small non-coding sequences involved in gene expression regulation of many biological processes and diseases. The recent discovery of a large set of different ncRNAs with biologically relevant roles has opened the way to developing methods able to discriminate between the different ncRNA classes. Moreover, the lack of knowledge about the complete mechanisms of regulative processes, together with the development of high-throughput technologies, has required the help of bioinformatics tools in providing biologists and clinicians with a deeper comprehension of the functional roles of ncRNAs. In this work, we introduce a new ncRNA classification tool, nRC (non-coding RNA Classifier). Our approach is based on feature extraction from the ncRNA secondary structure together with a supervised classification algorithm implementing a deep learning architecture based on convolutional neural networks. We tested our approach on the classification of 13 different ncRNA classes. We obtained classification scores using the most common statistical measures; in particular, we reach an accuracy and sensitivity score of about 74%. The proposed method outperforms other similar classification methods based on secondary structure features and machine learning algorithms, including the RNAcon tool that, to date, is the reference classifier. The nRC tool is freely available as a docker image at https://hub.docker.com/r/tblab/nrc/. The source code of the nRC tool is also available at https://github.com/IcarPA-TBlab/nrc.
Das, Dev Kumar; Ghosh, Madhumala; Pal, Mallika; Maiti, Asok K; Chakraborty, Chandan
2013-02-01
The aim of this paper is to address the development of computer-assisted malaria parasite characterization and classification using a machine learning approach based on light microscopic images of peripheral blood smears. In doing this, microscopic image acquisition from stained slides, illumination correction and noise reduction, erythrocyte segmentation, feature extraction, feature selection and finally classification of different stages of malaria (Plasmodium vivax and Plasmodium falciparum) have been investigated. The erythrocytes are segmented using marker-controlled watershed transformation, and subsequently a total of ninety-six features describing the shape-size and texture of erythrocytes are extracted with respect to parasitemia-infected versus non-infected cells. Ninety-four features are found to be statistically significant in discriminating the six classes. Here a feature selection-cum-classification scheme has been devised by combining the F-statistic with statistical learning techniques, i.e., Bayesian learning and the support vector machine (SVM), in order to provide higher classification accuracy using the best set of discriminating features. Results show that the Bayesian approach provides the highest accuracy, i.e., 84%, for malaria classification by selecting the 19 most significant features, while SVM provides its highest accuracy, i.e., 83.5%, with the 9 most significant features. Finally, the performance of these two classifiers under the feature selection framework has been compared for malaria parasite classification. Copyright © 2012 Elsevier Ltd. All rights reserved.
Drunk driving detection based on classification of multivariate time series.
Li, Zhenlong; Jin, Xue; Zhao, Xiaohua
2015-09-01
This paper addresses the problem of detecting drunk driving based on classification of multivariate time series. First, driving performance measures were collected from a test in a driving simulator located in the Traffic Research Center, Beijing University of Technology. Lateral position and steering angle were used to detect drunk driving. Second, multivariate time series analysis was performed to extract the features. A piecewise linear representation was used to represent the multivariate time series, and a bottom-up algorithm was then employed to segment them. The slope and time interval of each segment were extracted as the features for classification. Third, a support vector machine classifier was used to classify the driver's state into two classes (normal or drunk) according to the extracted features. The proposed approach achieved an accuracy of 80.0%. Drunk driving detection based on the analysis of multivariate time series is feasible and effective, and the approach has implications for drunk driving detection practice. Copyright © 2015 Elsevier Ltd and National Safety Council. All rights reserved.
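A bottom-up piecewise linear segmentation sketch on a single synthetic series (the study applies this to multivariate lateral-position and steering-angle series): start from fine segments and greedily merge the adjacent pair with the lowest merge cost until every remaining merge would exceed a cost threshold. The data and threshold are illustrative.

```python
import numpy as np

def fit_cost(t, x):
    """Residual sum of squares of a straight-line fit over one segment."""
    A = np.vstack([t, np.ones_like(t)]).T
    _, res, *_ = np.linalg.lstsq(A, x, rcond=None)
    return float(res[0]) if res.size else 0.0

def bottom_up(x, max_cost):
    t = np.arange(x.size, dtype=float)
    # initial fine segmentation: pairs of points, as half-open (start, end) spans
    segs = [(i, min(i + 2, x.size)) for i in range(0, x.size, 2)]
    while len(segs) > 1:
        costs = [fit_cost(t[a:c], x[a:c])
                 for (a, _), (_, c) in zip(segs[:-1], segs[1:])]
        k = int(np.argmin(costs))
        if costs[k] > max_cost:        # cheapest merge too costly: stop
            break
        segs[k] = (segs[k][0], segs[k + 1][1])
        del segs[k + 1]
    return segs

# Two linear regimes: flat, then rising — expect exactly two segments.
x = np.concatenate([np.zeros(20), np.arange(20.0)])
print(bottom_up(x, max_cost=0.5))  # → [(0, 20), (20, 40)]
```

The slope and duration of each recovered segment are then the kind of features fed to the SVM in the approach above.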
An information-based network approach for protein classification
Wan, Xiaogeng; Zhao, Xin; Yau, Stephen S. T.
2017-01-01
Protein classification is one of the critical problems in bioinformatics. Early studies used geometric distances and phylogenetic trees to classify proteins; these methods use binary trees to represent protein classification. In this paper, we propose a new protein classification method, whereby theories of information and networks are used to classify the multivariate relationships of proteins. In this study, the protein universe is modeled as an undirected network, where proteins are classified according to their connections. Our method is unsupervised, multivariate, and alignment-free. It can be applied to the classification of both protein sequences and structures. Nine examples are used to demonstrate the efficiency of our new method. PMID:28350835
Kazaryan, Airazat M.; Røsok, Bård I.; Edwin, Bjørn
2013-01-01
Background. Morbidity is a cornerstone of assessing surgical treatment; nevertheless, surgeons have not reached broad consensus on this problem. Methods and Findings. Clavien, Dindo, and Strasberg with coauthors (1992, 2004, 2009, and 2010) made significant efforts toward the standardization of surgical morbidity (the Clavien-Dindo-Strasberg classification; last revision, the Accordion classification). However, this classification includes only postoperative complications and has two principal shortcomings: disregard of intraoperative events and confusing terminology. Postoperative events have a major impact on patient well-being. However, intraoperative events should also be recorded and reported even if they do not evidently affect the patient's postoperative well-being. The term surgical complication as applied in the Clavien-Dindo-Strasberg classification may be regarded as an incident resulting in a complication caused by technical failure of surgery, in contrast to the so-called medical complications. Therefore, the term surgical complication contributes to misinterpretation of perioperative morbidity. The term perioperative adverse events, comprising both intraoperative unfavourable incidents and postoperative complications, could be regarded as a better alternative. In 2005, Satava suggested a simple grading to evaluate intraoperative surgical errors. Based on that approach, we have elaborated a 3-grade classification of intraoperative incidents so that it can be used to grade intraoperative events of any type of surgery. Refinements have been made to the Accordion classification of postoperative complications. Interpretation. 
The proposed systematization of perioperative adverse events utilizing the combined application of two appraisal tools, that is, the elaborated classification of intraoperative incidents on the basis of the Satava approach to surgical error evaluation together with the modified Accordion classification of postoperative complication, appears to be an effective tool for comprehensive assessment of surgical outcomes. This concept was validated in regard to various surgical procedures. Broad implementation of this approach will promote the development of surgical science and practice. PMID:23762627
Inayat-Hussain, Salmaan H; Fukumura, Masao; Muiz Aziz, A; Jin, Chai Meng; Jin, Low Wei; Garcia-Milian, Rolando; Vasiliou, Vasilis; Deziel, Nicole C
2018-08-01
Recent trends have witnessed the global growth of unconventional oil and gas (UOG) production. Epidemiologic studies have suggested associations between proximity to UOG operations and increased adverse birth outcomes and cancer, though specific potential etiologic agents have not yet been identified. To perform effective risk assessment of chemicals used in UOG production, the first step of hazard identification, followed by prioritization specifically for reproductive toxicity, carcinogenicity and mutagenicity, is crucial in an evidence-based risk assessment approach. To date, there is no single hazard classification list based on the United Nations Globally Harmonized System (GHS); instead, countries apply the GHS standards to generate their own chemical hazard classification lists. A current challenge for chemical prioritization, particularly for a multi-national industry, is inconsistent hazard classification, which may result in misjudgment of the potential public health risks. We present a novel approach for hazard identification followed by prioritization of reproductive toxicants found in UOG operations using publicly available regulatory databases. GHS classification for reproductive toxicity of 157 UOG-related chemicals identified as potential reproductive or developmental toxicants in a previous publication was assessed using eleven governmental regulatory agency databases. If there was discordance in classifications across agencies, the most stringent classification was assigned. Chemicals in the category of known or presumed human reproductive toxicants were further evaluated for carcinogenicity and germ cell mutagenicity based on government classifications. A scoring system was utilized to assign numerical values for reproductive health, cancer and germ cell mutation hazard endpoints. 
Using a Cytoscape analysis, both qualitative and quantitative results were presented visually to readily identify high priority UOG chemicals with evidence of multiple adverse effects. We observed substantial inconsistencies in classification among the 11 databases. By adopting the most stringent classification within and across countries, 43 chemicals were classified as known or presumed human reproductive toxicants (GHS Category 1), while 31 chemicals were classified as suspected human reproductive toxicants (GHS Category 2). The 43 reproductive toxicants were further subjected to analysis for carcinogenic and mutagenic properties. Calculated hazard scores and Cytoscape visualization yielded several high priority chemicals including potassium dichromate, cadmium, benzene and ethylene oxide. Our findings reveal diverging GHS classification outcomes for UOG chemicals across regulatory agencies. Adoption of the most stringent classification with application of hazard scores provides a useful approach to prioritize reproductive toxicants in UOG and other industries for exposure assessments and selection of safer alternatives. Copyright © 2018 Elsevier Ltd. All rights reserved.
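The "most stringent classification" rule and the endpoint scoring described above can be sketched as follows. The category ordering and the numeric weights here are illustrative assumptions, not the values used in the study:

```python
# GHS reproductive-toxicity categories, ordered from most to least
# stringent. The stringency ranks and score weights are hypothetical.
STRINGENCY = {"Category 1A": 4, "Category 1B": 3,
              "Category 2": 2, "Not classified": 1}

def most_stringent(classifications):
    """Pick the most stringent GHS category among discordant agency calls."""
    return max(classifications, key=lambda c: STRINGENCY[c])

def hazard_score(repro_cat, carcinogen, mutagen):
    """Toy numeric score combining the three hazard endpoints."""
    score = STRINGENCY[repro_cat]
    score += 2 if carcinogen else 0
    score += 2 if mutagen else 0
    return score

# Example: three agencies disagree on one chemical
agency_calls = ["Category 2", "Category 1B", "Not classified"]
cat = most_stringent(agency_calls)
score = hazard_score(cat, carcinogen=True, mutagen=False)
```

Chemicals can then be ranked by `score` to surface high-priority candidates, mirroring the prioritization step before the Cytoscape visualization.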
Prohászka, Zoltán
2008-07-06
Haemolytic uremic syndrome and thrombotic thrombocytopenic purpura are overlapping clinical entities based on historical classification. Recent progress in unravelling the pathomechanisms of these diseases has resulted in the creation of a molecular etiology-based classification. Understanding of some causative relationships has yielded detailed diagnostic approaches, novel therapeutic options and thorough prognostic stratification of patients. Although haemolytic uremic syndrome and thrombotic thrombocytopenic purpura are rare diseases with poor prognosis, a precise molecular etiology-based diagnosis can properly direct the therapy of affected patients. The current review focuses on the theoretical background and detailed description of the available diagnostic possibilities, and some practical information necessary for the interpretation of their results.
A review of classification algorithms for EEG-based brain–computer interfaces: a 10 year update
NASA Astrophysics Data System (ADS)
Lotte, F.; Bougrain, L.; Cichocki, A.; Clerc, M.; Congedo, M.; Rakotomamonjy, A.; Yger, F.
2018-06-01
Objective. Most current electroencephalography (EEG)-based brain–computer interfaces (BCIs) are based on machine learning algorithms. There is a large diversity of classifier types used in this field, as described in our 2007 review paper. Now, approximately ten years after that review was published, many new algorithms have been developed and tested to classify EEG signals in BCIs. The time is therefore ripe for an updated review of EEG classification algorithms for BCIs. Approach. We surveyed the BCI and machine learning literature from 2007 to 2017 to identify the new classification approaches that have been investigated to design BCIs. We synthesize these studies in order to present such algorithms, to report how they were used for BCIs and what the outcomes were, and to identify their pros and cons. Main results. We found that the recently designed classification algorithms for EEG-based BCIs can be divided into four main categories: adaptive classifiers, matrix and tensor classifiers, transfer learning and deep learning, plus a few other miscellaneous classifiers. Among these, adaptive classifiers were demonstrated to be generally superior to static ones, even with unsupervised adaptation. Transfer learning can also prove useful, although its benefits remain unpredictable. Riemannian geometry-based methods have reached state-of-the-art performance on multiple BCI problems and deserve to be explored more thoroughly, along with tensor-based methods. Shrinkage linear discriminant analysis and random forests also appear particularly useful for small training sample settings. On the other hand, deep learning methods have not yet shown convincing improvement over state-of-the-art BCI methods. Significance. This paper provides a comprehensive overview of the modern classification algorithms used in EEG-based BCIs, and presents the principles of these methods and guidelines on when and how to use them. 
It also identifies a number of challenges to further advance EEG classification in BCI.
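As a small illustration of why shrinkage linear discriminant analysis suits small training samples, the sketch below implements a minimal two-class LDA whose pooled covariance is shrunk toward a scaled identity, which keeps the matrix well-conditioned when trials are scarce. The shrinkage strength `lam` and the synthetic data are assumptions for the sketch, not values from the review:

```python
import numpy as np

def shrinkage_lda_fit(X, y, lam=0.1):
    """Two-class LDA; pooled covariance shrunk toward a scaled identity."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    Xc = np.vstack([X0 - mu0, X1 - mu1])
    cov = Xc.T @ Xc / (len(X) - 2)               # pooled covariance
    d = X.shape[1]
    cov = (1 - lam) * cov + lam * (np.trace(cov) / d) * np.eye(d)
    w = np.linalg.solve(cov, mu1 - mu0)          # discriminant direction
    b = -w @ (mu0 + mu1) / 2                     # threshold at the midpoint
    return w, b

def lda_predict(X, w, b):
    return (X @ w + b > 0).astype(int)

# Small-sample demo: 15 trials per class, 10 features
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (15, 10)), rng.normal(2, 1, (15, 10))])
y = np.array([0] * 15 + [1] * 15)
w, b = shrinkage_lda_fit(X, y, lam=0.2)
acc = (lda_predict(X, w, b) == y).mean()
```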
NASA Technical Reports Server (NTRS)
Joyce, A. T.
1974-01-01
Significant progress has been made in the classification of surface conditions (land uses) with computer-implemented techniques based on the use of ERTS digital data and pattern recognition software. The supervised technique presently used at the NASA Earth Resources Laboratory is based on maximum likelihood ratioing with a digital table look-up approach to classification. After classification, colors are assigned to the various surface conditions (land uses) classified, and the color-coded classification is film recorded on either positive or negative 9 1/2 in. film at the scale desired. Prints of the film strips are then mosaicked and photographed to produce a land use map in the format desired. Computer extraction of statistical information is performed to show the extent of each surface condition (land use) within any given land unit that can be identified in the image. Evaluations of the product indicate that classification accuracy is well within the limits for use by land resource managers and administrators. Classifications performed with digital data acquired during different seasons indicate that the combination of two or more classifications offers even better accuracy.
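The maximum likelihood decision rule described above can be sketched per class with a Gaussian model; the table look-up optimization would precompute this decision for every quantized pixel value instead of evaluating it per pixel. The class statistics and synthetic spectra below are illustrative assumptions:

```python
import numpy as np

def fit_ml_classifier(X, y):
    """Per-class mean and covariance for maximum likelihood classification."""
    return {int(c): (X[y == c].mean(axis=0),
                     np.cov(X[y == c], rowvar=False)) for c in np.unique(y)}

def ml_classify(x, stats):
    """Assign a pixel vector to the class with the highest log-likelihood."""
    best, best_ll = None, -np.inf
    for c, (mu, cov) in stats.items():
        diff = x - mu
        _, logdet = np.linalg.slogdet(cov)
        ll = -0.5 * (logdet + diff @ np.linalg.solve(cov, diff))
        if ll > best_ll:
            best, best_ll = c, ll
    return best

# Demo: two spectral classes in 4 bands
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.2, 0.05, (50, 4)),
               rng.normal(0.6, 0.05, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
stats = fit_ml_classifier(X, y)
```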
NASA Astrophysics Data System (ADS)
Pilarska, M.
2018-05-01
Airborne laser scanning (ALS) is a well-established and widely used technology, one of whose main advantages is fast and accurate data acquisition. ALS has developed continuously in recent years; one of the latest achievements is multispectral ALS, which acquires data simultaneously at more than one laser wavelength. In this article the results of dual-wavelength ALS data classification are presented. The data were acquired with a RIEGL VQ-1560i sensor, which is equipped with two laser scanners operating at different wavelengths: 532 nm and 1064 nm. Two classification approaches are presented in the article: one based on geometric relationships between points, and one relying mostly on the radiometric properties of the registered objects. The overall accuracy of the geometric classification was 86 %, whereas for the radiometric classification it was 81 %. As a result, it can be assumed that the radiometric features provided by multispectral ALS have the potential to be used successfully in ALS point cloud classification.
Hierarchical Gene Selection and Genetic Fuzzy System for Cancer Microarray Data Classification
Nguyen, Thanh; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid
2015-01-01
This paper introduces a novel approach to gene selection based on a substantial modification of analytic hierarchy process (AHP). The modified AHP systematically integrates outcomes of individual filter methods to select the most informative genes for microarray classification. Five individual ranking methods including t-test, entropy, receiver operating characteristic (ROC) curve, Wilcoxon and signal to noise ratio are employed to rank genes. These ranked genes are then considered as inputs for the modified AHP. Additionally, a method that uses fuzzy standard additive model (FSAM) for cancer classification based on genes selected by AHP is also proposed in this paper. Traditional FSAM learning is a hybrid process comprising unsupervised structure learning and supervised parameter tuning. Genetic algorithm (GA) is incorporated in-between unsupervised and supervised training to optimize the number of fuzzy rules. The integration of GA enables FSAM to deal with the high-dimensional-low-sample nature of microarray data and thus enhance the efficiency of the classification. Experiments are carried out on numerous microarray datasets. Results demonstrate the performance dominance of the AHP-based gene selection against the single ranking methods. Furthermore, the combination of AHP-FSAM shows a great accuracy in microarray data classification compared to various competing classifiers. The proposed approach therefore is useful for medical practitioners and clinicians as a decision support system that can be implemented in the real medical practice. PMID:25823003
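The idea of integrating several filter rankings can be illustrated with a much simpler rank-averaging sketch; the paper's modified AHP builds pairwise comparison matrices instead, so this is a deliberate simplification, and the gene names are hypothetical:

```python
# Hypothetical gene names; real inputs would be the ranked lists produced
# by the t-test, entropy, ROC, Wilcoxon and signal-to-noise filters.

def aggregate_ranks(rankings):
    """rankings: dict method -> list of genes, best first.
    Returns genes sorted by their mean rank position across methods."""
    genes = next(iter(rankings.values()))
    mean_rank = {g: sum(r.index(g) for r in rankings.values()) / len(rankings)
                 for g in genes}
    return sorted(genes, key=mean_rank.get)

rankings = {"ttest":   ["g1", "g2", "g3"],
            "entropy": ["g2", "g1", "g3"],
            "roc":     ["g2", "g3", "g1"]}
top_genes = aggregate_ranks(rankings)   # best gene first
```

The top-ranked genes would then feed the downstream classifier (FSAM in the paper).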
Sertel, O.; Kong, J.; Shimada, H.; Catalyurek, U.V.; Saltz, J.H.; Gurcan, M.N.
2009-01-01
We are developing a computer-aided prognosis system for neuroblastoma (NB), a cancer of the nervous system and one of the most malignant tumors affecting children. Histopathological examination is an important stage for further treatment planning in routine clinical diagnosis of NB. According to the International Neuroblastoma Pathology Classification (the Shimada system), NB patients are classified into favorable and unfavorable histology based on the tissue morphology. In this study, we propose an image analysis system that operates on digitized H&E stained whole-slide NB tissue samples and classifies each slide as either stroma-rich or stroma-poor based on the degree of Schwannian stromal development. Our statistical framework performs the classification based on texture features extracted using co-occurrence statistics and local binary patterns. Due to the high resolution of digitized whole-slide images, we propose a multi-resolution approach that mimics the evaluation of a pathologist, such that the image analysis starts from the lowest resolution and switches to higher resolutions when necessary. We employ an offline feature selection step, which determines the most discriminative features at each resolution level during training. A modified k-nearest neighbor classifier is used to determine the confidence level of the classification and to make the decision at a particular resolution level. The proposed approach was independently tested on 43 whole-slide samples and provided an overall classification accuracy of 88.4%. PMID:20161324
NASA Astrophysics Data System (ADS)
Liu, Tao; Im, Jungho; Quackenbush, Lindi J.
2015-12-01
This study provides a novel approach to individual tree crown delineation (ITCD) using airborne Light Detection and Ranging (LiDAR) data in dense natural forests, using two main steps: crown boundary refinement based on a proposed Fishing Net Dragging (FiND) method, and segment merging based on boundary classification. FiND starts with approximate tree crown boundaries derived using a traditional watershed method with Gaussian filtering and refines these boundaries using an algorithm that mimics how a fisherman drags a fishing net. Random forest machine learning is then used to classify boundary segments into two classes: boundaries between trees, and boundaries between branches that belong to a single tree. Three groups of LiDAR-derived features were used in the classification: two from the pseudo waveform generated along the crown boundaries, and one from a canopy height model (CHM). The proposed ITCD approach was tested using LiDAR data collected over a mountainous region in the Adirondack Park, NY, USA. Overall accuracy of boundary classification was 82.4%. Features derived from the CHM were generally more important in the classification than the features extracted from the pseudo waveform. A comprehensive accuracy assessment scheme for ITCD was also introduced by considering both the area of crown overlap and crown centroids. Accuracy assessment using this new scheme shows the proposed ITCD approach achieved overall accuracies of 74% and 78% for deciduous and mixed forests, respectively.
Das, Arpita; Bhattacharya, Mahua
2011-01-01
In the present work, the authors have developed a treatment planning system implementing genetic-algorithm-based neuro-fuzzy approaches for accurate analysis of the shape and margin of tumor masses appearing in the breast in digital mammograms. A complicated structure invites the problems of overlearning and misclassification. In the proposed methodology, a genetic algorithm (GA) has been used to search for effective input feature vectors, combined with an adaptive neuro-fuzzy model for final classification of the different boundaries of tumor masses. The study involves 200 digitized mammograms from the MIAS and other databases and has shown a correct classification rate of 86%.
Fuzzy-Rough Nearest Neighbour Classification
NASA Astrophysics Data System (ADS)
Jensen, Richard; Cornelis, Chris
A new fuzzy-rough nearest neighbour (FRNN) classification algorithm is presented in this paper, as an alternative to Sarkar's fuzzy-rough ownership function (FRNN-O) approach. In contrast to the latter, our method uses the nearest neighbours to construct lower and upper approximations of decision classes, and classifies test instances based on their membership in these approximations. In the experimental analysis, we evaluate our approach both with classical fuzzy-rough approximations (based on an implicator and a t-norm) and with the recently introduced vaguely quantified rough sets. Preliminary results are very good, and in general FRNN outperforms FRNN-O as well as the traditional fuzzy nearest neighbour (FNN) algorithm.
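A minimal sketch of the FRNN idea with classical fuzzy-rough approximations, assuming the Kleene-Dienes implicator max(1-a, b), the minimum t-norm, crisp class memberships, and a simple distance-based similarity; the test instance is assigned to the class maximizing the average of its lower and upper approximation memberships:

```python
import numpy as np

def frnn_classify(X_train, y_train, x, k=3):
    """Classify x by the average of lower/upper approximation memberships."""
    dist = np.linalg.norm(X_train - x, axis=1)
    sim = 1.0 / (1.0 + dist)                 # similarity in (0, 1]
    nn = np.argsort(dist)[:k]                # k nearest neighbours
    best_class, best_val = None, -1.0
    for c in np.unique(y_train):
        member = (y_train[nn] == c).astype(float)          # crisp membership
        lower = np.min(np.maximum(1.0 - sim[nn], member))  # implicator
        upper = np.max(np.minimum(sim[nn], member))        # t-norm
        val = (lower + upper) / 2.0
        if val > best_val:
            best_class, best_val = c, val
    return best_class

# Toy data: two well-separated clusters
X_train = np.array([[0.0, 0.0], [0.2, 0.0], [0.0, 0.2],
                    [5.0, 5.0], [5.2, 5.0], [5.0, 5.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])
label = frnn_classify(X_train, y_train, np.array([0.1, 0.1]))
```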
Kawabata, Yohei; Wada, Koichi; Nakatani, Manabu; Yamada, Shizuo; Onoue, Satomi
2011-11-25
Poor oral bioavailability arising from poor aqueous solubility makes drug research and development more difficult. Various approaches have been developed with a focus on enhancing the solubility, dissolution rate, and oral bioavailability of poorly water-soluble drugs. To complete development work within a limited amount of time, the establishment of a suitable formulation strategy is a key consideration for the pharmaceutical development of poorly water-soluble drugs. In this article, viable formulation options are reviewed on the basis of the biopharmaceutics classification system of drug substances. The article describes the basic approaches for poorly water-soluble drugs, such as crystal modification, micronization, amorphization, self-emulsification, cyclodextrin complexation, and pH modification. Literature-based examples of the formulation options for poorly water-soluble compounds and their practical application to marketed products are also provided. Classification of drug candidates based on their biopharmaceutical properties can provide an indication of the difficulty of drug development work. A better understanding of the physicochemical and biopharmaceutical properties of drug substances and the limitations of each delivery option should lead to efficient formulation development for poorly water-soluble drugs. Copyright © 2011 Elsevier B.V. All rights reserved.
Random whole metagenomic sequencing for forensic discrimination of soils.
Khodakova, Anastasia S; Smith, Renee J; Burgoyne, Leigh; Abarno, Damien; Linacre, Adrian
2014-01-01
Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations.
Abnormality detection of mammograms by discriminative dictionary learning on DSIFT descriptors.
Tavakoli, Nasrin; Karimi, Maryam; Nejati, Mansour; Karimi, Nader; Reza Soroushmehr, S M; Samavi, Shadrokh; Najarian, Kayvan
2017-07-01
Detection and classification of breast lesions using mammographic images is one of the most difficult tasks in medical image processing. A number of learning and non-learning methods have been proposed for detecting and classifying these lesions. However, the accuracy of detection/classification still needs improvement. In this paper we propose a powerful classification method based on sparse learning to diagnose breast cancer in mammograms. For this purpose, a supervised discriminative dictionary learning approach is applied to dense scale invariant feature transform (DSIFT) features. A linear classifier is also learned simultaneously with the dictionary, which can effectively classify the sparse representations. Our experimental results show the superior performance of our method compared to existing approaches.
Learning From Short Text Streams With Topic Drifts.
Li, Peipei; He, Lu; Wang, Haiyan; Hu, Xuegang; Zhang, Yuhong; Li, Lei; Wu, Xindong
2017-09-18
Short text streams such as search snippets and micro blogs have become popular on the Web with the emergence of social media. Unlike traditional normal text streams, these data present the characteristics of short length, weak signal, high volume, high velocity, topic drift, etc. Short text stream classification is hence a very challenging and significant task. However, this challenge has received little attention from the research community. Therefore, a new feature extension approach is proposed for short text stream classification with the help of a large-scale semantic network obtained from a Web corpus. It is built on an incremental ensemble classification model for efficiency. First, more semantic contexts based on the senses of terms in short texts are introduced to make up for the data sparsity using the open semantic network, in which all terms are disambiguated by their semantics to reduce the impact of noise. Second, a concept cluster-based topic drift detection method is proposed to effectively track hidden topic drifts. Finally, extensive studies demonstrate that, compared to several well-known concept drift detection methods for data streams, our approach can detect topic drifts effectively, and that it handles short text streams effectively while maintaining efficiency compared to several state-of-the-art short text classification approaches.
NASA Astrophysics Data System (ADS)
Beitlerová, Hana; Hieke, Falk; Žížala, Daniel; Kapička, Jiří; Keiser, Andreas; Schmidt, Jürgen; Schindewolf, Marcus
2017-04-01
Process-based erosion modelling is a developing and adequate tool to assess, simulate and understand the complex mechanisms of soil loss due to surface runoff. While the current state of available models includes powerful approaches, a major drawback is their complex parametrization. Soil texture is a major input parameter for the physically based soil loss and deposition model EROSION 3D. However, as the model was developed in Germany, it depends on the German soil classification. To exploit data generated during a massive nationwide soil survey campaign carried out in the 1960s across the entire Czech Republic, a transfer from the Czech to the German, or at least to an international (e.g. WRB), system is mandatory. During the survey, the internal differentiation of grain sizes was realized in a two-fraction approach, separating texture solely into particles above and below 0.01 mm rather than into clayey, silty and sandy textures. Consequently, the Czech system applies a classification of seven different textures based on the respective percentages of large and small particles, while in Germany 31 groups are essential. The approach followed for matching Czech soil survey data to the German system focusses on semi-logarithmic interpolation of the cumulative soil texture curve and, additionally, on a regression equation based on a recent database of 128 soil pits. Furthermore, for each of the seven Czech texture classes a group of typically suitable classes of the German system was derived. A GIS-based spatial analysis was carried out to test the approaches for interpolating soil texture. First results show promising matches and pave the way to a Czech model application of EROSION 3D.
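Semi-logarithmic interpolation of a cumulative texture curve can be sketched as below: intermediate fractions (e.g. clay, silt) are read off by linearly interpolating the cumulative "% finer" values against log10 of the grain diameter. The boundary diameters and curve values are illustrative, not survey data:

```python
import numpy as np

def cumulative_fraction(diam_mm, known_diams, known_cum):
    """Cumulative % finer at a given diameter, interpolated on log10(d)."""
    return float(np.interp(np.log10(diam_mm),
                           np.log10(known_diams), known_cum))

# Hypothetical cumulative-curve points: (diameter in mm, % finer)
diams = np.array([0.001, 0.01, 0.05, 2.0])
cum = np.array([5.0, 30.0, 60.0, 100.0])

clay = cumulative_fraction(0.002, diams, cum)            # % finer than 2 um
silt = cumulative_fraction(0.063, diams, cum) - clay     # 2 um - 63 um band
sand = 100.0 - clay - silt                               # 63 um - 2 mm
```

The resulting clay/silt/sand percentages can then be matched against the boundaries of the 31 German texture groups.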
A General Architecture for Intelligent Tutoring of Diagnostic Classification Problem Solving
Crowley, Rebecca S.; Medvedeva, Olga
2003-01-01
We report on a general architecture for creating knowledge-based medical training systems to teach diagnostic classification problem solving. The approach is informed by our previous work describing the development of expertise in classification problem solving in Pathology. The architecture envelops the traditional Intelligent Tutoring System design within the Unified Problem-solving Method description Language (UPML) architecture, supporting component modularity and reuse. Based on the domain ontology, domain task ontology and case data, the abstract problem-solving methods of the expert model create a dynamic solution graph. Student interaction with the solution graph is filtered through an instructional layer, which is created by a second set of abstract problem-solving methods and pedagogic ontologies, in response to the current state of the student model. We outline the advantages and limitations of this general approach, and describe its implementation in SlideTutor, a developing Intelligent Tutoring System in Dermatopathology. PMID:14728159
Tuberculosis disease diagnosis using artificial immune recognition system.
Shamshirband, Shahaboddin; Hessam, Somayeh; Javidnia, Hossein; Amiribesheli, Mohsen; Vahdat, Shaghayegh; Petković, Dalibor; Gani, Abdullah; Kiah, Miss Laiha Mat
2014-01-01
Conventional methods for diagnosing tuberculosis (TB) carry a high risk of misdiagnosis. This study is aimed at diagnosing TB using hybrid machine learning approaches. Patient epicrisis reports obtained from the Pasteur Laboratory in the north of Iran were used. All 175 samples have twenty features. The features are classified by incorporating a fuzzy logic controller and an artificial immune recognition system. The features are normalized through a fuzzy rule-based labeling system. The labeled features are categorized into normal and tuberculosis classes using the Artificial Immune Recognition Algorithm. Overall, the highest classification accuracy was reached for a learning rate (α) value of 0.8. The artificial immune recognition system (AIRS) classification approach using fuzzy logic also yielded better diagnosis results in terms of detection accuracy compared to other empirical methods. Classification accuracy was 99.14%, sensitivity 87.00%, and specificity 86.12%.
3D Land Cover Classification Based on Multispectral LIDAR Point Clouds
NASA Astrophysics Data System (ADS)
Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong
2016-06-01
A multispectral lidar system can emit simultaneous laser pulses at different wavelengths. The reflected multispectral energy is captured through a receiver of the sensor, and the return signal, together with the position and orientation information of the sensor, is recorded. These recorded data are processed with GNSS/IMU data in further post-processing, forming high-density multispectral 3D point clouds. As the first commercial multispectral airborne lidar sensor, the Optech Titan system is capable of collecting point cloud data in all three channels: at 532 nm (visible, green), at 1064 nm (near infrared, NIR) and at 1550 nm (intermediate infrared, IR). It has become a new source of data for 3D land cover classification. The paper presents an Object Based Image Analysis (OBIA) approach that uses only multispectral lidar point cloud datasets for 3D land cover classification. The approach consists of three steps. Firstly, multispectral intensity images are segmented into image objects on the basis of multi-resolution segmentation integrating different scale parameters. Secondly, intensity objects are classified into nine categories using customized classification indexes and a combination of the multispectral reflectance with the vertical distribution of object features. Finally, accuracy assessment is conducted by comparing random reference sample points from Google imagery tiles with the classification results. The classification results show high overall accuracy for most of the land cover types. An overall accuracy of over 90% is achieved using multispectral lidar point clouds for 3D land cover classification.
Large-scale classification of traffic signs under real-world conditions
NASA Astrophysics Data System (ADS)
Hazelhoff, Lykele; Creusen, Ivo; van de Wouw, Dennis; de With, Peter H. N.
2012-02-01
Traffic sign inventories are important to governmental agencies as they facilitate evaluation of traffic sign locations and are beneficial for road and sign maintenance. These inventories can be created (semi-)automatically based on street-level panoramic images. In these images, object detection is employed to detect the signs in each image, followed by a classification stage to retrieve the specific sign type. Classification of traffic signs is a complicated matter: sign types are very similar with only minor differences within the sign, a high number of different signs is involved, and multiple distortions occur, including variations in capturing conditions, occlusions, viewpoints and sign deformations. Therefore, we propose a method for robust classification of traffic signs, based on the Bag of Words approach for generic object classification. We extend the approach with a flexible, modular codebook to model the specific features of each sign type independently, in order to emphasize the inter-sign differences rather than the parts common to all sign types. Additionally, this allows us to model and label the false detections that are present. Furthermore, analysis of the classification output identifies unreliable results. This classification system has been extensively tested on three different sign classes, covering 60 different sign types in total. These three data sets contain the sign detection results on street-level panoramic images, extracted from a country-wide database. The introduction of the modular codebook shows a significant improvement for all three sets, where the system is able to classify about 98% of the reliable results correctly.
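The quantization-and-pooling core of the Bag of Words representation can be sketched as follows; the codebook here is a fixed toy array rather than a learned, per-sign-type modular codebook as in the paper:

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Assign each local descriptor to its nearest codeword and return
    the L1-normalized histogram of assignments (the BoW vector)."""
    d = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    assign = d.argmin(axis=1)
    hist = np.bincount(assign, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

codebook = np.array([[0.0, 0.0], [10.0, 10.0]])              # toy codewords
descriptors = np.array([[0.1, 0.0], [9.9, 10.0], [10.0, 10.0]])
h = bow_histogram(descriptors, codebook)
```

The resulting histogram is the feature vector that a downstream classifier compares against each sign type's model.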
Lauber, Chris; Gorbalenya, Alexander E
2012-04-01
Virus taxonomy has received little attention from the research community despite its broad relevance. In an accompanying paper (C. Lauber and A. E. Gorbalenya, J. Virol. 86:3890-3904, 2012), we have introduced a quantitative approach to hierarchically classify viruses of a family using pairwise evolutionary distances (PEDs) as a measure of genetic divergence. When applied to the six most conserved proteins of the Picornaviridae, it clustered 1,234 genome sequences in groups at three hierarchical levels (to which we refer as the "GENETIC classification"). In this study, we compare the GENETIC classification with the expert-based picornavirus taxonomy and outline differences in the underlying frameworks regarding the relation of virus groups and genetic diversity that represent, respectively, the structure and content of a classification. To facilitate the analysis, we introduce two novel diagrams. The first connects the genetic diversity of taxa to both the PED distribution and the phylogeny of picornaviruses. The second depicts a classification and the accommodated genetic diversity in a standardized manner. Generally, we found striking agreement between the two classifications on species and genus taxa. A few disagreements concern the species Human rhinovirus A and Human rhinovirus C and the genus Aphthovirus, which were split in the GENETIC classification. Furthermore, we propose a new supergenus level and universal, level-specific PED thresholds, not reached yet by many taxa. Since the species threshold is approached mostly by taxa with large sampling sizes and those infecting multiple hosts, it may represent an upper limit on divergence, beyond which homologous recombination in the six most conserved genes between two picornaviruses might not give viable progeny.
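Threshold-based grouping on pairwise evolutionary distances (PEDs), the core of the GENETIC classification, can be sketched as connected components over a union-find structure: sequences are linked whenever their PED falls below the level-specific threshold. The distances and the threshold below are illustrative, not picornavirus values:

```python
def cluster_by_threshold(names, ped, threshold):
    """Union-find: link every pair with PED below the threshold and
    return the resulting connected components (candidate taxa)."""
    parent = {n: n for n in names}
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a
    for (a, b), dist in ped.items():
        if dist < threshold:
            parent[find(a)] = find(b)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())

ped = {("A", "B"): 0.10, ("A", "C"): 0.90, ("B", "C"): 0.80}
taxa = cluster_by_threshold(["A", "B", "C"], ped, threshold=0.30)
```

Applying the same rule with successively larger thresholds yields the nested species/genus/supergenus levels discussed above.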
Ma, Yue; Tuskan, Gerald A.
2018-01-01
The existence of complete genome sequences makes it important to develop different approaches for classification of large-scale data sets and to make extraction of biological insights easier. Here, we propose an approach for classification of complete proteomes/protein sets based on protein distributions over some basic attributes. We demonstrate the usefulness of this approach by determining protein distributions in terms of two attributes: protein length (L) and protein intrinsic disorder content (ID). The protein distributions based on L and ID are surveyed for representative proteome organisms and protein sets from the three domains of life. The two-dimensional maps (designated as fingerprints here) of the protein distribution densities in the LD space defined by ln(L) and ID are then constructed. The fingerprints for different organisms and protein sets are found to be distinct from one another, and they can therefore be used for comparative studies. As a test case, phylogenetic trees have been constructed from the protein distribution densities in the fingerprints of proteomes of organisms without performing any protein sequence comparison or alignment. The phylogenetic trees generated are biologically meaningful, demonstrating that the protein distributions in the LD space may serve as unique phylogenetic signals of the organisms at the proteome level. PMID:29686995
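A minimal sketch of the fingerprint idea: bin each protein's (ln L, ID) pair into a small grid, normalize to a density map, and compare maps by a simple distance. The bin counts, ranges, and the L1 comparison are illustrative assumptions, not the paper's exact construction.

```python
import math

def ld_fingerprint(proteins, l_bins=4, id_bins=4, l_range=(3.0, 9.0)):
    """2-D density map over (ln(length), disorder content) for a protein set.
    `proteins` is a list of (length, id_fraction) pairs, ID in [0, 1]."""
    grid = [[0.0] * id_bins for _ in range(l_bins)]
    for length, idf in proteins:
        li = (math.log(length) - l_range[0]) / (l_range[1] - l_range[0])
        li = min(max(int(li * l_bins), 0), l_bins - 1)
        ii = min(int(idf * id_bins), id_bins - 1)
        grid[li][ii] += 1.0
    total = sum(map(sum, grid))
    return [[c / total for c in row] for row in grid]

def fingerprint_distance(fa, fb):
    """L1 distance between two fingerprints, usable for tree building."""
    return sum(abs(a - b) for ra, rb in zip(fa, fb) for a, b in zip(ra, rb))

# Invented toy "proteomes" of (length, disorder fraction) pairs:
proteome_a = [(300, 0.1), (450, 0.2), (1200, 0.6)]
proteome_b = [(310, 0.1), (500, 0.25), (1100, 0.55)]
print(fingerprint_distance(ld_fingerprint(proteome_a), ld_fingerprint(proteome_b)))
```

A distance matrix of such fingerprints could then feed any standard tree-building routine, which is the alignment-free aspect the abstract emphasizes.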
Detection of distorted frames in retinal video-sequences via machine learning
NASA Astrophysics Data System (ADS)
Kolar, Radim; Liberdova, Ivana; Odstrcilik, Jan; Hracho, Michal; Tornow, Ralf P.
2017-07-01
This paper describes detection of distorted frames in retinal sequences based on a set of global features extracted from each frame. The feature vector is subsequently used in a classification step, in which three types of classifiers are tested. The best classification accuracy of 96% was achieved with the support vector machine approach.
Discovery of User-Oriented Class Associations for Enriching Library Classification Schemes.
ERIC Educational Resources Information Center
Pu, Hsiao-Tieh
2002-01-01
Presents a user-based approach to exploring the possibility of adding user-oriented class associations to hierarchical library classification schemes. Classes not grouped in the same subject hierarchies yet relevant to users' knowledge are obtained by analyzing a log book of a university library's circulation records, using collaborative filtering…
USDA-ARS?s Scientific Manuscript database
In this paper, we propose approaches to improve the pixel-based support vector machine (SVM) classification for urban land use and land cover (LULC) mapping from airborne hyperspectral imagery with high spatial resolution. Class spatial neighborhood relationship is used to correct the misclassified ...
Zhang, Jianhua; Li, Sunan; Wang, Rubin
2017-01-01
In this paper, we deal with the Mental Workload (MWL) classification problem based on measured physiological data. First, we discussed the optimal depth (i.e., the number of hidden layers) and parameter optimization algorithms for Convolutional Neural Networks (CNN). The base CNNs designed were tested according to five classification performance indices, namely Accuracy, Precision, F-measure, G-mean, and required training time. Then we developed an Ensemble Convolutional Neural Network (ECNN) to enhance the accuracy and robustness of the individual CNN model. For the ECNN design, three model aggregation approaches (weighted averaging, majority voting and stacking) were examined, and a resampling strategy was used to enhance the diversity of the individual CNN models. The results of the MWL classification performance comparison indicated that the proposed ECNN framework can effectively improve MWL classification performance compared with traditional machine learning methods, and features entirely automatic feature extraction and MWL classification.
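The two simpler aggregation rules mentioned, weighted averaging and majority voting, can be sketched as follows; the member predictions and weights are invented for illustration.

```python
from collections import Counter

def majority_vote(labels):
    """Pick the most common predicted label across ensemble members."""
    return Counter(labels).most_common(1)[0][0]

def weighted_average(prob_rows, weights):
    """Combine per-model class-probability vectors using model weights."""
    total = sum(weights)
    n_classes = len(prob_rows[0])
    return [
        sum(w * p[k] for w, p in zip(weights, prob_rows)) / total
        for k in range(n_classes)
    ]

# Three hypothetical CNN members scoring classes [low, high] workload:
probs = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
print(majority_vote(["high", "high", "low"]))     # -> high
avg = weighted_average(probs, weights=[1.0, 2.0, 1.0])
print(avg)  # blended class probabilities
```

Stacking, the third approach, would instead train a second-level model on these member outputs rather than combining them with a fixed rule.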
3D shape representation with spatial probabilistic distribution of intrinsic shape keypoints
NASA Astrophysics Data System (ADS)
Ghorpade, Vijaya K.; Checchin, Paul; Malaterre, Laurent; Trassoudaine, Laurent
2017-12-01
The accelerated advancement in modeling, digitizing, and visualizing techniques for 3D shapes has led to an increasing amount of 3D models creation and usage, thanks to the 3D sensors which are readily available and easy to utilize. As a result, determining the similarity between 3D shapes has become consequential and is a fundamental task in shape-based recognition, retrieval, clustering, and classification. Several decades of research in Content-Based Information Retrieval (CBIR) has resulted in diverse techniques for 2D and 3D shape or object classification/retrieval and many benchmark data sets. In this article, a novel technique for 3D shape representation and object classification is proposed, based on analysis of the spatial, geometric distributions of 3D keypoints. These distributions capture the intrinsic geometric structure of 3D objects. The result of the approach is a probability distribution function (PDF) produced from the spatial disposition of 3D keypoints, keypoints which are stable on the object surface and invariant to pose changes. Each class/instance of an object can be uniquely represented by a PDF. This shape representation is robust yet conceptually simple, easy to implement, and fast to compute. Both Euclidean and topological spaces on the object's surface are considered in building the PDFs. Topology-based geodesic distances between keypoints exploit the non-planar surface properties of the object. The performance of the novel shape signature is tested via object classification accuracy. The classification efficacy of the new shape analysis method is evaluated on a new dataset acquired with a Time-of-Flight camera, and a comparative evaluation against state-of-the-art methods is performed on a standard benchmark dataset. Experimental results demonstrate superior classification performance of the new approach on the RGB-D dataset and depth data.
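A toy sketch of a pairwise-distance PDF signature, a Euclidean-only simplification of the paper's approach (the bin count and range are assumptions): because inter-keypoint distances are unchanged by rotation and translation, the resulting histogram is pose-invariant.

```python
import math
from itertools import combinations

def shape_signature(keypoints, n_bins=8, max_dist=2.0):
    """Normalized histogram (a discrete PDF) of pairwise Euclidean distances
    between 3-D keypoints, used as a simple pose-invariant shape descriptor."""
    hist = [0.0] * n_bins
    pairs = list(combinations(keypoints, 2))
    for p, q in pairs:
        d = math.dist(p, q)
        hist[min(int(d / max_dist * n_bins), n_bins - 1)] += 1.0
    return [h / len(pairs) for h in hist]

# Unit-cube corners as stand-in "keypoints": 12 edges (length 1),
# 12 face diagonals (sqrt 2), 4 space diagonals (sqrt 3) = 28 pairs.
cube_corners = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
sig = shape_signature(cube_corners)
print(sig)
```

Two shapes can then be compared by any histogram distance; the paper additionally uses geodesic (surface) distances, which this Euclidean sketch omits.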
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ren, Huiying; Hou, Zhangshuan; Huang, Maoyi
The Community Land Model (CLM) represents physical, chemical, and biological processes of the terrestrial ecosystems that interact with climate across a range of spatial and temporal scales. As CLM includes numerous sub-models and associated parameters, the high-dimensional parameter space presents a formidable challenge for quantifying uncertainty and improving Earth system predictions needed to assess environmental changes and risks. This study aims to evaluate the potential of transferring hydrologic model parameters in CLM through sensitivity analyses and classification across watersheds from the Model Parameter Estimation Experiment (MOPEX) in the United States. The sensitivity of CLM-simulated water and energy fluxes to hydrological parameters across 431 MOPEX basins is first examined using an efficient stochastic sampling-based sensitivity analysis approach. Linear, interaction, and high-order nonlinear impacts are all identified via statistical tests and stepwise backward-removal parameter screening. The basins are then classified according to their parameter sensitivity patterns (internal attributes), as well as their hydrologic indices/attributes (external hydrologic factors) separately, using a principal component analysis (PCA) and expectation-maximization (EM) based clustering approach. Similarities and differences among the parameter sensitivity-based classification system (S-Class), the hydrologic indices-based classification (H-Class), and the Köppen climate classification system (K-Class) are discussed. Within each S-Class with similar parameter sensitivity characteristics, similar inversion modeling setups can be used for parameter calibration, and the parameters and their contribution or significance to water and energy cycling may also be more transferable. This classification study provides guidance on identifiable parameters and on parameterization and inverse model design for CLM, but the methodology is applicable to other models.
Inverting parameters at representative sites belonging to the same class can significantly reduce parameter calibration efforts.
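A minimal sketch of the PCA-plus-clustering step: project per-basin sensitivity vectors onto leading principal components, then cluster. The sensitivity vectors are invented, and a tiny deterministic k-means is used here as a simple stand-in for the EM clustering described in the abstract.

```python
import numpy as np

def pca(X, k=2):
    """Project rows of X onto the first k principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T

def kmeans(X, k=2, iters=20):
    """Tiny k-means (stand-in for EM clustering); deterministic init
    from evenly spaced rows for reproducibility."""
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].astype(float)
    for _ in range(iters):
        # Assign each row to its nearest center, then recompute centers.
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Hypothetical per-basin parameter-sensitivity vectors (two clear groups):
X = np.array([[0.9, 0.1, 0.0], [0.8, 0.2, 0.1],
              [0.1, 0.9, 0.8], [0.0, 0.8, 0.9]])
labels = kmeans(pca(X, k=2), k=2)
print(labels)  # cluster index per basin
```

Each resulting cluster would correspond to an S-Class, within which a shared inversion setup can be reused.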
A New Experiment on Bengali Character Recognition
NASA Astrophysics Data System (ADS)
Barman, Sumana; Bhattacharyya, Debnath; Jeon, Seung-Whan; Kim, Tai-Hoon; Kim, Haeng-Kon
This paper presents a method to use a view-based approach in a Bangla Optical Character Recognition (OCR) system, providing a reduced data set to the ANN classification engine rather than the traditional OCR methods. It describes how Bangla characters are processed, trained, and then recognized with the use of a backpropagation artificial neural network. This is the first published account of using a segmentation-free optical character recognition system for Bangla with a view-based approach. The methodology presented here assumes that the OCR pre-processor has presented the input images to the classification engine described here. The size and the font face used to render the characters are also significant in both training and classification. The images are first converted into greyscale and then to binary images; these images are then scaled to fit a pre-determined area with a fixed but significant number of pixels. The feature vectors are then formed by extracting the characteristic points, which in this case is simply a series of 0s and 1s of fixed length. Finally, an artificial neural network is chosen for the training and classification process.
Application of LogitBoost Classifier for Traceability Using SNP Chip Data
Kim, Kwondo; Seo, Minseok; Kang, Hyunsung; Cho, Seoae; Kim, Heebal; Seo, Kang-Seok
2015-01-01
Consumer attention to food safety has increased rapidly due to animal-related diseases; therefore, it is important to identify their places of origin (POO) for safety purposes. However, only a few studies have addressed this issue and focused on machine learning-based approaches. In the present study, classification analyses were performed using a customized SNP chip for POO prediction. To accomplish this, 4,122 pigs originating from 104 farms were genotyped using the SNP chip. Several factors were considered to establish the best prediction model based on these data. We also assessed the applicability of the suggested model using a kinship coefficient-filtering approach. Our results showed that the LogitBoost-based prediction model outperformed other classifiers in terms of classification performance under most conditions. Specifically, a greater level of accuracy was observed when a higher kinship-based cutoff was employed. These results demonstrated the applicability of a machine learning-based approach using SNP chip data for practical traceability. PMID:26436917
Karayannis, Nicholas V; Jull, Gwendolen A; Hodges, Paul W
2012-02-20
Several classification schemes, each with its own philosophy and categorizing method, subgroup low back pain (LBP) patients with the intent to guide treatment. Physiotherapy-derived schemes usually have a movement impairment focus, but the extent to which other biological, psychological, and social factors of pain are encompassed requires exploration. Furthermore, within the prevailing 'biological' domain, the overlap of subgrouping strategies within the orthopaedic examination remains unexplored. The aim of this study was to review and clarify, through a developer/expert survey, the theoretical basis and content of physical movement classification schemes, determine their relative reliability and similarities/differences, and consider the extent of incorporation of the bio-psycho-social framework within the schemes. A database search for relevant articles related to LBP and subgrouping or classification was conducted. Five dominant movement-based schemes were identified: Mechanical Diagnosis and Treatment (MDT), Treatment Based Classification (TBC), Pathoanatomic Based Classification (PBC), Movement System Impairment Classification (MSI), and O'Sullivan Classification System (OCS) schemes. Data were extracted and a survey sent to the classification scheme developers/experts to clarify operational criteria, reliability, decision-making, and converging/diverging elements between schemes. Survey results were integrated into the review and approval obtained for accuracy. Considerable diversity exists between schemes in how movement informs subgrouping and in the consideration of broader neurosensory, cognitive, emotional, and behavioural dimensions of LBP. Despite differences in assessment philosophy, a common element lies in their objective to identify a movement pattern related to a pain reduction strategy.
Two dominant movement paradigms emerge: (i) loading strategies (MDT, TBC, PBC) aimed at eliciting a phenomenon of centralisation of symptoms; and (ii) modified movement strategies (MSI, OCS) targeted towards documenting the movement impairments associated with the pain state. Schemes vary on: the extent to which loading strategies are pursued; the assessment of movement dysfunction; and advocated treatment approaches. A biomechanical assessment predominates in the majority of schemes (MDT, PBC, MSI); certain psychosocial aspects (fear-avoidance) are considered in the TBC scheme; and certain neurophysiologic (central versus peripherally mediated pain states) and psychosocial (cognitive and behavioural) aspects are considered in the OCS scheme.
NASA Astrophysics Data System (ADS)
Gautam, Nitin
The main objectives of this thesis are to develop a robust statistical method for the classification of ocean precipitation based on physical properties to which the SSM/I is sensitive and to examine how these properties vary globally and seasonally. A two-step approach is adopted for the classification of oceanic precipitation classes from multispectral SSM/I data: (1) we subjectively define precipitation classes using a priori information about the precipitating system and its possible distinct signature on SSM/I data, such as scattering by ice particles aloft in the precipitating cloud, emission by liquid rain water below the freezing level, and the difference of polarization at 19 GHz (an indirect measure of optical depth); (2) we then develop an objective classification scheme which is found to reproduce the subjective classification with high accuracy. This hybrid strategy allows us to use the characteristics of the data to define and encode classes and helps retain the physical interpretation of classes. Classification methods based on k-nearest neighbor and neural network are developed to objectively classify six precipitation classes. It is found that the classification method based on neural network yields high accuracy for all precipitation classes. An inversion method based on a minimum variance approach was used to retrieve gross microphysical properties of these precipitation classes, such as column-integrated liquid water path, column-integrated ice water path, and column-integrated rain water path. This classification method is then applied to 2 years (1991-92) of SSM/I data to examine and document the seasonal and global distribution of precipitation frequency corresponding to each of these objectively defined six classes. The characteristics of the distribution are found to be consistent with the assumptions used in defining these six precipitation classes and also with well-known climatological patterns of precipitation regions.
The seasonal and global distribution of these six classes is also compared with the earlier results obtained from Comprehensive Ocean Atmosphere Data Sets (COADS). It is found that the gross pattern of the distributions obtained from SSM/I and COADS data match remarkably well with each other.
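A minimal k-nearest-neighbor classifier of the kind described above; the features, labels, and values are invented for illustration.

```python
import math
from collections import Counter

def knn_classify(x, train, k=3):
    """Classify feature vector x by majority label among its k nearest
    training samples; `train` is a list of (features, label) pairs."""
    neighbors = sorted(train, key=lambda s: math.dist(x, s[0]))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# Toy brightness-temperature-like features for two precipitation classes:
train = [((250.0, 0.8), "convective"), ((245.0, 0.9), "convective"),
         ((280.0, 0.1), "stratiform"), ((275.0, 0.2), "stratiform"),
         ((278.0, 0.15), "stratiform")]
print(knn_classify((252.0, 0.7), train, k=3))  # -> convective
```

In practice the features would be normalized to comparable scales before the distance computation, since raw channel values can dominate the metric.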
Cheese Classification, Characterization, and Categorization: A Global Perspective.
Almena-Aliste, Montserrat; Mietton, Bernard
2014-02-01
Cheese is one of the most fascinating, complex, and diverse foods enjoyed today. Three elements constitute the cheese ecosystem: ripening agents, consisting of enzymes and microorganisms; the composition of the fresh cheese; and the environmental conditions during aging. These factors determine and define not only the sensory quality of the final cheese product but also the vast diversity of cheeses produced worldwide. How we define and categorize cheese is a complicated matter. There are various approaches to cheese classification, and a global approach for classification and characterization is needed. We review current cheese classification schemes and the limitations inherent in each of the schemes described. While some classification schemes are based on microbiological criteria, others rely on descriptions of the technologies used for cheese production. The goal of this review is to present an overview of comprehensive and practical integrative classification models in order to better describe cheese diversity and the fundamental differences within cheeses, as well as to connect fundamental technological, microbiological, chemical, and sensory characteristics to contribute to an overall characterization of the main families of cheese, including the expanding world of American artisanal cheeses.
Classification of Sporting Activities Using Smartphone Accelerometers
Mitchell, Edmond; Monaghan, David; O'Connor, Noel E.
2013-01-01
In this paper we present a framework that allows for the automatic identification of sporting activities using commonly available smartphones. We extract discriminative informational features from smartphone accelerometers using the Discrete Wavelet Transform (DWT). Despite the poor quality of their accelerometers, smartphones were used as capture devices due to their prevalence in today's society. Successful classification on this basis potentially makes the technology accessible to both elite and non-elite athletes. Extracted features are used to train different categories of classifiers. No one classifier family has a reportable direct advantage in activity classification problems to date; thus we examine classifiers from each of the most widely used classifier families. We investigate three classification approaches: a commonly used SVM-based approach, an optimized classification model and a fusion of classifiers. We also investigate the effect of changing several of the DWT input parameters, including mother wavelets, window lengths and DWT decomposition levels. During the course of this work we created a challenging sports activity analysis dataset, comprised of soccer and field-hockey activities. The average maximum F-measure accuracy of 87% was achieved using a fusion of classifiers, which was 6% better than a single classifier model and 23% better than a standard SVM approach. PMID:23604031
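A sketch of DWT-based feature extraction from an accelerometer window, using a hand-rolled single-level Haar transform applied recursively and per-level energy features; the windowing and feature choice are illustrative assumptions, not the paper's exact pipeline.

```python
import math

def haar_dwt(signal):
    """Single-level Haar DWT: returns (approximation, detail) coefficients.
    Signal length must be even."""
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail

def dwt_features(signal, levels=2):
    """Per-level energy features from a multi-level Haar DWT."""
    feats = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        feats.append(sum(d * d for d in detail))  # detail-band energy
    feats.append(sum(a * a for a in approx))      # final approximation energy
    return feats

# Invented accelerometer window (one axis, alternating impact pattern):
accel_window = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
print(dwt_features(accel_window, levels=2))
```

The orthonormal Haar transform conserves energy, so the feature vector partitions the window's total energy across frequency bands; these band energies would then be fed to the classifiers.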
E.H. Helmer; T.A. Kennaway; D.H. Pedreros; M.L. Clark; H. Marcano-Vega; L.L. Tieszen; S.R. Schill; C.M.S. Carrington
2008-01-01
Satellite image-based mapping of tropical forests is vital to conservation planning. Standard methods for automated image classification, however, limit classification detail in complex tropical landscapes. In this study, we test an approach to Landsat image interpretation on four islands of the Lesser Antilles, including Grenada and St. Kitts, Nevis and St. Eustatius...
Bahadure, Nilesh Bhaskarrao; Ray, Arun Kumar; Thethi, Har Pal
2018-01-17
The detection of a brain tumor and its classification from modern imaging modalities is a primary concern, but it is time-consuming and tedious work when performed by radiologists or clinical supervisors. The accuracy of detection and classification of tumor stages performed by radiologists depends on their experience alone, so computer-aided technology is very important to aid the accuracy of diagnosis. In this study, to improve the performance of tumor detection, we investigated a comparative approach of different segmentation techniques and selected the best one by comparing their segmentation scores. Further, to improve the classification accuracy, a genetic algorithm is employed for the automatic classification of the tumor stage. The decision of the classification stage is supported by extracting relevant features and area calculation. The experimental results of the proposed technique are evaluated and validated for performance and quality analysis on magnetic resonance brain images, based on segmentation score, accuracy, sensitivity, specificity, and dice similarity index coefficient. The experimental results achieved 92.03% accuracy, 91.42% specificity, 92.36% sensitivity, and an average segmentation score between 0.82 and 0.93, demonstrating the effectiveness of the proposed technique for identifying normal and abnormal tissues from brain MR images. The experimental results also obtained an average dice similarity index coefficient of 93.79%, which indicates better overlap between the automatically extracted tumor regions and the tumor regions manually extracted by radiologists.
Nishio, Shin-Ya; Usami, Shin-Ichi
2017-03-01
Recent advances in next-generation sequencing (NGS) have given rise to new challenges due to the difficulties in variant pathogenicity interpretation and large dataset management, including many kinds of public population databases as well as public or commercial disease-specific databases. Here, we report a new database development tool, named the "Clinical NGS Database," for improving the clinical NGS workflow through the unified management of variant information and clinical information. This database software offers two approaches to variant pathogenicity classification. The first is a phenotype similarity-based approach. This database allows the easy comparison of the detailed phenotype of each patient with the average phenotype of the same gene mutation at the variant or gene level. It is also possible to browse patients with the same gene mutation quickly. The other is a statistical approach to variant pathogenicity classification based on the use of the odds ratio for comparisons between case and control for each inheritance mode (families with apparently autosomal dominant inheritance vs. control, and families with apparently autosomal recessive inheritance vs. control). A number of case studies are also presented to illustrate the utility of this database. © 2016 The Authors. Human Mutation published by Wiley Periodicals, Inc.
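The odds-ratio comparison can be sketched as follows; the counts and the zero-cell correction are illustrative assumptions, not the database's exact procedure.

```python
def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds ratio of carrying the variant in cases vs. controls, with a
    Haldane correction of 0.5 applied when any cell is zero."""
    cells = [case_alt, case_ref, ctrl_alt, ctrl_ref]
    if 0 in cells:
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    return (a * d) / (b * c)

# Hypothetical allele counts: variant in 12 of 40 case alleles,
# and in 2 of 500 control alleles.
or_value = odds_ratio(12, 28, 2, 498)
print(round(or_value, 1))  # a large OR suggests case enrichment
```

An odds ratio far above 1 in the case group (computed separately per inheritance mode, as the abstract describes) supports pathogenicity; a confidence interval would normally accompany it before any clinical interpretation.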
Selection-Fusion Approach for Classification of Datasets with Missing Values
Ghannad-Rezaie, Mostafa; Soltanian-Zadeh, Hamid; Ying, Hao; Dong, Ming
2010-01-01
This paper proposes a new approach based on missing value pattern discovery for classifying incomplete data. This approach is particularly designed for classification of datasets with a small number of samples and a high percentage of missing values, where available missing value treatment approaches do not usually work well. Based on the pattern of the missing values, the proposed approach finds subsets of samples for which most of the features are available and trains a classifier for each subset. Then, it combines the outputs of the classifiers. Subset selection is translated into a clustering problem, allowing derivation of a mathematical framework for it. A trade-off is established between the computational complexity (number of subsets) and the accuracy of the overall classifier. To deal with this trade-off, a numerical criterion is proposed for the prediction of the overall performance. The proposed method is applied to seven datasets from the popular University of California, Irvine data mining archive and an epilepsy dataset from Henry Ford Hospital, Detroit, Michigan (a total of eight datasets). Experimental results show that the classification accuracy of the proposed method is superior to those of the widely used multiple imputation method and four other methods. They also show that the level of superiority depends on the pattern and percentage of missing values. PMID:20212921
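The first step, grouping samples by their missing-value pattern so that each subset can train its own classifier on the observed features, can be sketched as follows (the data are invented, and real subset selection in the paper is a clustering problem rather than this exact grouping):

```python
from collections import defaultdict

def group_by_missing_pattern(samples):
    """Group samples by which features are present (None marks a missing
    value). Each group can then train a classifier on its observed
    features only, and the classifiers' outputs are later combined."""
    groups = defaultdict(list)
    for features, label in samples:
        pattern = tuple(v is not None for v in features)
        groups[pattern].append((features, label))
    return dict(groups)

# Hypothetical samples: (feature tuple, class label), None = missing.
samples = [
    ((1.0, None, 3.0), "a"),
    ((0.5, None, 2.5), "b"),
    ((1.0, 2.0, None), "a"),
]
groups = group_by_missing_pattern(samples)
for pattern, members in sorted(groups.items()):
    print(pattern, len(members))
```

Fewer, larger groups reduce the number of classifiers to train, which is the complexity/accuracy trade-off the abstract describes.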
Siegrist, Johannes; Fekete, Christine
2016-06-13
Theory-based approaches provide explanations of the impact of components of the International Classification of Functioning, Disability and Health (ICF) classification on outcomes such as health and wellbeing. Here, one such approach is proposed, focusing on social participation and its association with wellbeing. In addition to elaborating a theoretical approach, a narrative review of research on labour market participation of persons with a severe disability, spinal cord injury, is conducted to illustrate the utility of the proposed approach. Availability and good quality of productive activities, in particular paid work, are expected to improve wellbeing by strengthening favourable experiences of personal control and social recognition. As these opportunities are restricted among persons with disabilities, conditions that enable full social participation need to be strengthened. Research identified several such conditions at the individual (e.g. coping, social support, educational skills) and the contextual socio-political level (e.g. quality of care, medical and vocational rehabilitation), although their potential for improving wellbeing has not yet been sufficiently explored. In conclusion, supplementing the established ICF classification with theory-based approaches may advance explanations of the adverse effects of reduced functioning on wellbeing in disability. This new knowledge can guide the development of interventions to improve participation in general and social productivity in particular.
Pérez-Castillo, Yunierkis; Lazar, Cosmin; Taminau, Jonatan; Froeyen, Mathy; Cabrera-Pérez, Miguel Ángel; Nowé, Ann
2012-09-24
Computer-aided drug design has become an important component of the drug discovery process. Despite the advances in this field, there is not a unique modeling approach that can be successfully applied to solve the whole range of problems faced during QSAR modeling. Feature selection and ensemble modeling are active areas of research in ligand-based drug design. Here we introduce the GA(M)E-QSAR algorithm that combines the search and optimization capabilities of Genetic Algorithms with the simplicity of the Adaboost ensemble-based classification algorithm to solve binary classification problems. We also explore the usefulness of Meta-Ensembles trained with Adaboost and Voting schemes to further improve the accuracy, generalization, and robustness of the optimal Adaboost Single Ensemble derived from the Genetic Algorithm optimization. We evaluated the performance of our algorithm using five data sets from the literature and found that it is capable of yielding similar or better classification results to what has been reported for these data sets with a higher enrichment of active compounds relative to the whole actives subset when only the most active chemicals are considered. More important, we compared our methodology with state of the art feature selection and classification approaches and found that it can provide highly accurate, robust, and generalizable models. In the case of the Adaboost Ensembles derived from the Genetic Algorithm search, the final models are quite simple since they consist of a weighted sum of the output of single feature classifiers. Furthermore, the Adaboost scores can be used as ranking criterion to prioritize chemicals for synthesis and biological evaluation after virtual screening experiments.
Wang, Zhengxia; Zhu, Xiaofeng; Adeli, Ehsan; Zhu, Yingying; Nie, Feiping; Munsell, Brent
2018-01-01
Graph-based transductive learning (GTL) is a powerful machine learning technique that is used when sufficient training data is not available. In particular, conventional GTL approaches first construct a fixed inter-subject relation graph that is based on similarities in voxel intensity values in the feature domain, which can then be used to propagate the known phenotype data (i.e., clinical scores and labels) from the training data to the testing data in the label domain. However, this type of graph is exclusively learned in the feature domain, and primarily due to outliers in the observed features, may not be optimal for label propagation in the label domain. To address this limitation, a progressive GTL (pGTL) method is proposed that gradually finds an intrinsic data representation that more accurately aligns imaging features with the phenotype data. In general, optimal feature-to-phenotype alignment is achieved using an iterative approach that: (1) refines inter-subject relationships observed in the feature domain by using the learned intrinsic data representation in the label domain, (2) updates the intrinsic data representation from the refined inter-subject relationships, and (3) verifies the intrinsic data representation on the training data to guarantee an optimal classification when applied to testing data. Additionally, the iterative approach is extended to multi-modal imaging data to further improve pGTL classification accuracy. Using Alzheimer’s disease and Parkinson’s disease study data, the classification accuracy of the proposed pGTL method is compared to several state-of-the-art classification methods, and the results show pGTL can more accurately identify subjects, even at different progression stages, in these two study data sets. PMID:28551556
NASA Astrophysics Data System (ADS)
Alom, Md. Zahangir; Awwal, Abdul A. S.; Lowe-Webb, Roger; Taha, Tarek M.
2017-08-01
Deep-learning methods are gaining popularity because of their state-of-the-art performance in image classification tasks. In this paper, we explore classification of laser-beam images from the National Ignition Facility (NIF) using a novel deep-learning approach. NIF is the world's largest, most energetic laser. It has nearly 40,000 optics that precisely guide, reflect, amplify, and focus 192 laser beams onto a fusion target. NIF utilizes four petawatt lasers called the Advanced Radiographic Capability (ARC) to produce backlighting X-ray illumination to capture implosion dynamics of NIF experiments with picosecond temporal resolution. In the current operational configuration, four independent short-pulse ARC beams are created and combined in a split-beam configuration in each of two NIF apertures at the entry of the pre-amplifier. The subaperture beams then propagate through the NIF beampath up to the ARC compressor. Each ARC beamlet is separately compressed with a dedicated set of four gratings and recombined as sub-apertures for transport to the parabola vessel, where the beams are focused using parabolic mirrors and pointed to the target. Small angular errors in the compressor gratings can cause the sub-aperture beams to diverge from one another and prevent accurate alignment through the transport section between the compressor and parabolic mirrors. This is an off-normal condition that must be detected and corrected. The goal of the off-normal check is to determine whether the ARC beamlets are sufficiently overlapped into a merged single spot or diverged into two distinct spots. Thus, the objective of the current work is three-fold: developing a simple algorithm to perform off-normal classification, exploring the use of Convolutional Neural Network (CNN) for the same task, and understanding the inter-relationship of the two approaches.
The CNN recognition results are compared with other machine-learning approaches, such as Deep Neural Network (DNN) and Support Vector Machine (SVM). The experimental results show around 96% classification accuracy using CNN; the CNN approach also provides comparable recognition results compared to the present feature-based off-normal detection. The feature-based solution was developed to capture the expertise of a human expert in classifying the images. The misclassified results are further studied to explain the differences and discover any discrepancies or inconsistencies in current classification.
Training echo state networks for rotation-invariant bone marrow cell classification.
Kainz, Philipp; Burgsteiner, Harald; Asslaber, Martin; Ahammer, Helmut
2017-01-01
The main principle of diagnostic pathology is the reliable interpretation of individual cells in the context of the tissue architecture. A confident examination of bone marrow specimens, in particular, depends on a valid classification of myeloid cells. In this work, we propose a novel rotation-invariant learning scheme for multi-class echo state networks (ESNs), which achieves very high performance in automated bone marrow cell classification. By representing static images as temporal sequences of rotations, we show how ESNs robustly recognize cells at arbitrary rotations by taking advantage of their short-term memory capacity. The performance of our approach is compared to a classification random forest that learns rotation-invariance in a conventional way by exhaustively training on multiple rotations of individual samples. The methods were evaluated on a human bone marrow image database consisting of granulopoietic and erythropoietic cells in different maturation stages. Our ESN approach to cell classification does not rely on segmentation of cells or manual feature extraction and can therefore be applied directly to image data.
Zhao, Jiangsan; Bodner, Gernot; Rewald, Boris
2016-01-01
Phenotyping local crop cultivars is becoming more and more important, as they are an important genetic source for breeding – especially in regard to inherent root system architectures. Machine learning algorithms are promising tools to assist in the analysis of complex data sets; novel approaches are needed to apply them to root phenotyping data of mature plants. A greenhouse experiment was conducted in large, sand-filled columns to differentiate 16 European Pisum sativum cultivars based on 36 manually derived root traits. By combining random forest and support vector machine models, machine learning algorithms were successfully used for unbiased identification of the most distinguishing root traits and subsequent pairwise cultivar differentiation. Up to 86% of pea cultivar pairs could be distinguished based on the top five most important root traits (Timp5); Timp5 differed widely between cultivar pairs. Selecting top important root traits (Timp) provided significantly improved classification compared to using all available traits or randomly selected trait sets. The most frequent Timp of mature pea cultivars was the total surface area of lateral roots originating from tap root segments at 0–5 cm depth. The high classification rate implies that culturing did not lead to a major loss of variability in root system architecture in the studied pea cultivars. Our results illustrate the potential of machine learning approaches for unbiased (root) trait selection and cultivar classification based on rather small, complex phenotypic data sets derived from pot experiments. Powerful statistical approaches are essential to make use of the increasing amount of (root) phenotyping information, integrating the complex trait sets describing crop cultivars. PMID:27999587
Groenendyk, Derek G.; Ferré, Ty P.A.; Thorp, Kelly R.; Rice, Amy K.
2015-01-01
Soils lie at the interface between the atmosphere and the subsurface and are a key component that controls ecosystem services, food production, and many other processes at the Earth's surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resources Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control.
The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization of landscape function, suggest that hydrologic-process-based classifications should be incorporated into environmental process models and can be used to define application-specific maps of hydrologic function. PMID:26121466
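The clustering step described above can be illustrated with a plain k-means over made-up two-dimensional "response" vectors; the farthest-first seeding and toy data are assumptions for the sketch, not the study's HYDRUS-1D outputs.

```python
def kmeans(points, k, n_iter=20):
    """Plain k-means over equal-length feature vectors (think simulated
    drainage/infiltration responses per soil texture). Centers are seeded
    with a deterministic farthest-first traversal."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    assign = [0] * len(points)
    for _ in range(n_iter):
        # assignment step: each point joins its nearest center
        assign = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        # update step: each center moves to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

# Two well-separated response groups (fast vs. slow drainage, schematically).
pts = [[0.9, 0.1], [1.0, 0.2], [0.95, 0.15], [0.1, 0.9], [0.2, 1.0], [0.15, 0.95]]
labels = kmeans(pts, 2)
```

The study's contribution is not the algorithm itself but the choice of clustering on modeled hydrologic responses rather than on texture.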
Promoter Sequences Prediction Using Relational Association Rule Mining
Czibula, Gabriela; Bocicor, Maria-Iuliana; Czibula, Istvan Gergely
2012-01-01
In this paper we approach, from a computational perspective, the problem of promoter sequence prediction, an important problem within the field of bioinformatics. As the conditions for a DNA sequence to function as a promoter are not known, machine-learning-based classification models are still being developed to approach the problem of promoter identification in DNA. We propose a classification model based on relational association rule mining. Relational association rules are a particular type of association rules that describe numerical orderings between attributes that commonly occur over a data set. Our classifier is based on the discovery of relational association rules for predicting whether or not a DNA sequence contains a promoter region. An experimental evaluation of the proposed model and a comparison with similar existing approaches are provided. The obtained results show that our classifier outperforms existing techniques for identifying promoter sequences, confirming the potential of our proposal. PMID:22563233
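A simplified version of the rule-discovery idea (ordering relations between attributes that hold across most records) can be sketched as follows; the minimum-support threshold and the toy numeric encoding are illustrative assumptions, not the paper's mining algorithm.

```python
def ordering_rules(records, min_support=0.9):
    """Find relational rules of the form 'attribute i < attribute j' that
    hold in at least min_support of the records (a simplified stand-in for
    the numerical-ordering rules the abstract describes)."""
    n_attr = len(records[0])
    rules = []
    for i in range(n_attr):
        for j in range(n_attr):
            if i == j:
                continue
            support = sum(r[i] < r[j] for r in records) / len(records)
            if support >= min_support:
                rules.append((i, j, support))
    return rules

# Toy numeric encoding of sequences: attribute 0 is always the smallest.
data = [[0.1, 0.5, 0.9], [0.2, 0.7, 0.6], [0.05, 0.4, 0.8], [0.3, 0.9, 0.5]]
rules = ordering_rules(data, min_support=1.0)
```

A classifier of this kind mines one rule set per class (promoter vs. non-promoter) and assigns a new sequence to the class whose rules it satisfies best.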
Thermogram breast cancer prediction approach based on Neutrosophic sets and fuzzy c-means algorithm.
Gaber, Tarek; Ismail, Gehad; Anter, Ahmed; Soliman, Mona; Ali, Mona; Semary, Noura; Hassanien, Aboul Ella; Snasel, Vaclav
2015-08-01
Early detection of breast cancer greatly improves women's chances of survival. In this paper, a CAD system that classifies breast cancer thermograms as normal or abnormal is proposed. This approach consists of two main phases: automatic segmentation and classification. For the former phase, an improved segmentation approach based on both Neutrosophic sets (NS) and an optimized Fast Fuzzy c-means (F-FCM) algorithm was proposed. A post-segmentation process was also suggested to segment the breast parenchyma (i.e. the ROI) from thermogram images. For classification, different kernel functions of the Support Vector Machine (SVM) were used to classify the breast parenchyma into normal or abnormal cases. Using a benchmark database, the proposed CAD system was evaluated based on precision, recall, and accuracy, as well as by comparison with related work. The experimental results showed that our system is a very promising step toward automatic diagnosis of breast cancer using thermograms, as the accuracy reached 100%.
Support-vector-based emergent self-organising approach for emotional understanding
NASA Astrophysics Data System (ADS)
Nguwi, Yok-Yen; Cho, Siu-Yeung
2010-12-01
This study discusses the computational analysis of general emotion understanding from a questionnaire methodology. The questionnaire method approaches the subject by investigating the real experiences that accompanied the emotions, whereas laboratory approaches are generally associated with exaggerated elements. We adopted a connectionist model called the support-vector-based emergent self-organising map (SVESOM) to analyse emotion profiling from the questionnaire method. The SVESOM first identifies the important variables by giving discriminative features a high ranking. The classifier then performs the classification based on the selected features. Experimental results show that the top-ranked features are in line with the work of Scherer and Wallbott [(1994), 'Evidence for Universality and Cultural Variation of Differential Emotion Response Patterning', Journal of Personality and Social Psychology, 66, 310-328], which approached the emotions physiologically. While the performance measures show that using the full feature set for classification can degrade performance, the selected features provide superior results in terms of accuracy and generalisation.
Classification of Birds and Bats Using Flight Tracks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cullinan, Valerie I.; Matzner, Shari; Duberstein, Corey A.
Classification of birds and bats that use areas targeted for offshore wind farm development and the inference of their behavior is essential to evaluating the potential effects of development. The current approach to assessing the number and distribution of birds at sea involves transect surveys using trained individuals in boats or airplanes or using high-resolution imagery. These approaches are costly and have safety concerns. Based on a limited annotated library extracted from a single-camera thermal video, we provide a framework for building models that classify birds and bats and their associated behaviors. As an example, we developed a discriminant model for theoretical flight paths and applied it to data (N = 64 tracks) extracted from 5-min video clips. The agreement between model- and observer-classified path types was initially only 41%, but it increased to 73% when small-scale jitter was censored and path types were combined. Classification of 46 tracks of bats, swallows, gulls, and terns on average was 82% accurate, based on a jackknife cross-validation. Model classification of bats and terns (N = 4 and 2, respectively) was 94% and 91% correct, respectively; however, the variance associated with the tracks from these targets is poorly estimated. Model classification of gulls and swallows (N ≥ 18) was on average 73% and 85% correct, respectively. The models developed here should be considered preliminary because they are based on a small data set both in terms of the numbers of species and the identified flight tracks. Future classification models would be greatly improved by including a measure of distance between the camera and the target.
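The jackknife (leave-one-out) validation reported above can be sketched with a nearest-centroid stand-in for the discriminant model; the track features below (speed, straightness) are hypothetical, not the study's variables.

```python
def nearest_centroid_predict(train, test_point):
    """Classify test_point by the nearest class centroid of the training set.
    train: list of (features, label) pairs."""
    classes = {}
    for feats, lab in train:
        classes.setdefault(lab, []).append(feats)
    best, best_d = None, float("inf")
    for lab, members in classes.items():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        d2 = sum((t - c) ** 2 for t, c in zip(test_point, centroid))
        if d2 < best_d:
            best, best_d = lab, d2
    return best

def jackknife_accuracy(data):
    """Leave-one-out ('jackknife') cross-validation over (features, label) pairs."""
    hits = 0
    for i, (feats, lab) in enumerate(data):
        train = data[:i] + data[i + 1:]   # hold out one track at a time
        hits += nearest_centroid_predict(train, feats) == lab
    return hits / len(data)

# Hypothetical track features, e.g. (mean speed, path straightness).
tracks = [([1.0, 0.9], "bird"), ([1.1, 0.8], "bird"), ([0.9, 0.85], "bird"),
          ([3.0, 0.2], "bat"), ([3.2, 0.3], "bat"), ([2.9, 0.25], "bat")]
acc = jackknife_accuracy(tracks)
```

The jackknife is attractive here precisely because the annotated library is small: every track is used for both training and testing without ever testing on a track it trained on.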
NASA Astrophysics Data System (ADS)
Chang Chien, Kuang-Che; Fetita, Catalin; Brillet, Pierre-Yves; Prêteux, Françoise; Chang, Ruey-Feng
2009-02-01
Multi-detector computed tomography (MDCT) has high accuracy and specificity in volumetrically capturing serial images of the lung. It increases the capability of computerized classification of lung tissue in medical research. This paper proposes a three-dimensional (3D) automated approach based on mathematical morphology and fuzzy logic for quantifying and classifying interstitial lung diseases (ILDs) and emphysema. The proposed methodology is composed of several stages: (1) an image multi-resolution decomposition scheme based on a 3D morphological filter is used to detect and analyze the different density patterns of the lung texture. Then, (2) for each pattern in the multi-resolution decomposition, six features are computed, for which fuzzy membership functions define a probability of association with a pathology class. Finally, (3) for each pathology class, the probabilities are combined according to the weight assigned to each membership function, and two threshold values are used to decide the final class of the pattern. The proposed approach was tested on 10 MDCT cases and the classification accuracy was: emphysema: 95%, fibrosis/honeycombing: 84%, and ground glass: 97%.
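Stage (3), combining per-feature fuzzy memberships with weights and a two-threshold decision, can be sketched as follows; the weights and thresholds are illustrative, not the paper's calibrated values.

```python
def classify_pattern(memberships, weights, t_low=0.3, t_high=0.7):
    """Weighted combination of the six per-feature fuzzy memberships for one
    pathology class, followed by a two-threshold decision: above t_high the
    class is assigned, below t_low it is rejected, in between it stays
    undecided. Weights and thresholds here are illustrative assumptions."""
    score = sum(w * m for w, m in zip(weights, memberships)) / sum(weights)
    if score >= t_high:
        return "assigned"
    if score <= t_low:
        return "rejected"
    return "undecided"

w = [2, 1, 1, 1, 1, 1]   # six features; the first weighted most heavily
r1 = classify_pattern([0.9, 0.8, 0.7, 0.9, 0.8, 0.9], w)   # strong memberships
r2 = classify_pattern([0.1, 0.2, 0.1, 0.0, 0.2, 0.1], w)   # weak memberships
```

The "undecided" band between the two thresholds is what lets the method defer ambiguous patterns rather than force a class.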
Weber, Marc; Teeling, Hanno; Huang, Sixing; Waldmann, Jost; Kassabgy, Mariette; Fuchs, Bernhard M; Klindworth, Anna; Klockow, Christine; Wichels, Antje; Gerdts, Gunnar; Amann, Rudolf; Glöckner, Frank Oliver
2011-05-01
Next-generation sequencing (NGS) technologies have enabled the application of broad-scale sequencing in microbial biodiversity and metagenome studies. Biodiversity is usually targeted by classifying 16S ribosomal RNA genes, while metagenomic approaches target metabolic genes. However, the two approaches remain isolated as long as the taxonomic and functional information cannot be interrelated. Techniques like self-organizing maps (SOMs) have been applied to cluster metagenomes into taxon-specific bins in order to link biodiversity with functions, but have not yet been applied to broad-scale NGS-based metagenomics. Here, we provide a novel implementation, demonstrate its potential and practicability, and provide a web-based service for public usage. Evaluation with published data sets mimicking habitats of varying complexity resulted in classification specificities and sensitivities of close to 100% down to above 90% from phylum to genus level for assemblies exceeding 8 kb for low- and medium-complexity data. When applied to five real-world metagenomes of medium complexity from direct pyrosequencing of marine subsurface waters, classifications of assemblies above 2.5 kb were in good agreement with fluorescence in situ hybridizations, indicating that biodiversity was mostly retained within the metagenomes, and confirming high classification specificities. This was validated by two protein-based classification (PBC) methods. SOMs were able to retrieve the relevant taxa down to the genus level, while surpassing PBCs in resolution. In order to make the approach accessible to a broad audience, we implemented a feature-rich web-based SOM application named TaxSOM, which is freely available at http://www.megx.net/toolbox/taxsom. TaxSOM can classify reads or assemblies exceeding 2.5 kb with high accuracy and thus assists in linking biodiversity and functions in metagenome studies, which is a precondition for studying microbial ecology in a holistic fashion.
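A minimal one-dimensional SOM of the kind used for taxon-specific binning can be sketched in pure Python; the deterministic initialisation, the learning schedule, and the two-dimensional "composition signatures" are simplifying assumptions for the sketch, not TaxSOM's settings.

```python
def train_som(data, n_units=4, n_iter=200):
    """Train a 1-D self-organising map on fixed-length feature vectors
    (think oligonucleotide frequencies of genome fragments). Deterministic
    diagonal initialisation and a linearly decaying schedule, chosen for
    the sketch rather than taken from TaxSOM."""
    dim = len(data[0])
    units = [[(u + 1) / (n_units + 1)] * dim for u in range(n_units)]
    for t in range(n_iter):
        lr = 0.5 * (1 - t / n_iter)                 # decaying learning rate
        radius = max(1.0 - t / n_iter, 0.01)        # shrinking neighbourhood
        x = data[t % len(data)]                     # cycle through fragments
        bmu = min(range(n_units),
                  key=lambda u: sum((x[d] - units[u][d]) ** 2 for d in range(dim)))
        for u in range(n_units):
            if abs(u - bmu) <= radius:              # update winner and neighbours
                for d in range(dim):
                    units[u][d] += lr * (x[d] - units[u][d])
    return units

def best_unit(units, x):
    return min(range(len(units)),
               key=lambda u: sum((x[d] - units[u][d]) ** 2 for d in range(len(x))))

# Two artificial 'taxa' with opposite composition signatures.
taxon_a = [[0.9, 0.1], [0.85, 0.15], [0.95, 0.05]]
taxon_b = [[0.1, 0.9], [0.15, 0.85], [0.05, 0.95]]
units = train_som(taxon_a + taxon_b)
```

After training, fragments from the two artificial taxa map to different map units, which is the binning behaviour the abstract exploits.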
NASA Astrophysics Data System (ADS)
Praskievicz, S. J.; Luo, C.
2017-12-01
Classification of rivers is useful for a variety of purposes, such as generating and testing hypotheses about watershed controls on hydrology, predicting hydrologic variables for ungaged rivers, and setting goals for river management. In this research, we present a bottom-up (based on machine learning) river classification designed to investigate the underlying physical processes governing rivers' hydrologic regimes. The classification was developed for the entire state of Alabama, based on 248 United States Geological Survey (USGS) stream gages that met criteria for length and completeness of records. Five dimensionless hydrologic signatures were derived for each gage: slope of the flow duration curve (indicator of flow variability), baseflow index (ratio of baseflow to average streamflow), rising limb density (number of rising limbs per unit time), runoff ratio (ratio of long-term average streamflow to long-term average precipitation), and streamflow elasticity (sensitivity of streamflow to precipitation). We used a Bayesian clustering algorithm to classify the gages, based on the five hydrologic signatures, into distinct hydrologic regimes. We then used classification and regression trees (CART) to predict each gaged river's membership in different hydrologic regimes based on climatic and watershed variables. Using existing geospatial data, we applied the CART analysis to classify ungaged streams in Alabama, with the National Hydrography Dataset Plus (NHDPlus) catchment (average area 3 km2) as the unit of classification. The results of the classification can be used for meeting management and conservation objectives in Alabama, such as developing statewide standards for environmental instream flows. Such hydrologic classification approaches are promising for contributing to process-based understanding of river systems.
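Three of the five hydrologic signatures listed above have direct arithmetic definitions that can be sketched; the rolling-minimum baseflow separation below is a crude stand-in for the digital filters used in practice, and the toy series are assumptions.

```python
def runoff_ratio(streamflow, precip):
    """Ratio of long-term average streamflow to long-term average precipitation."""
    return (sum(streamflow) / len(streamflow)) / (sum(precip) / len(precip))

def rising_limb_density(q):
    """Number of rising limbs (maximal runs of increasing flow) per time step."""
    limbs, rising = 0, False
    for a, b in zip(q, q[1:]):
        if b > a and not rising:     # a new rising limb starts here
            limbs, rising = limbs + 1, True
        elif b <= a:
            rising = False
    return limbs / len(q)

def baseflow_index(q, window=3):
    """Crude baseflow index: a rolling minimum as the baseflow proxy,
    divided by total flow (real studies use calibrated digital filters)."""
    base = [min(q[max(0, i - window):i + 1]) for i in range(len(q))]
    return sum(base) / sum(q)

hydrograph = [1, 2, 4, 2, 1, 3, 1]        # two storm peaks
rld = rising_limb_density(hydrograph)      # 2 limbs over 7 steps
rr = runoff_ratio([2, 2], [4, 4])          # half of precipitation runs off
bfi = baseflow_index([5, 5, 5, 5])         # constant flow is all baseflow
```

Because all five signatures are dimensionless, gages with very different absolute discharges can be clustered together when their regimes behave alike.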
Medical and Endoscopic Management of Gastric Varices
Al-Osaimi, Abdullah M. S.; Caldwell, Stephen H.
2011-01-01
In the past 20 years, our understanding of the pathophysiology and management options among patients with gastric varices (GV) has changed significantly. GV are the most common cause of upper gastrointestinal bleeding in patients with portal hypertension after esophageal varices (EV) and generally have more severe bleeding than EV. In the United States, the majority of GV patients have underlying portal hypertension rather than splenic vein thrombosis. The widely used classifications are the Sarin Endoscopic Classification and the Japanese Vascular Classifications. The former is based on the endoscopic appearance and location of the varices, while the Japanese classification is based on the underlying vascular anatomy. In this article, the authors address the current concepts of classification, epidemiology, pathophysiology, and emerging management options of gastric varices. They describe the stepwise approach to patients with gastric varices, including the different available modalities, and the pearls, pitfalls, and stop-gap measures useful in managing patients with gastric variceal bleed. PMID:22942544
Single-accelerometer-based daily physical activity classification.
Long, Xi; Yin, Bin; Aarts, Ronald M
2009-01-01
In this study, a single tri-axial accelerometer placed on the waist was used to record the acceleration data for human physical activity classification. The data collection involved 24 subjects performing daily real-life activities in a naturalistic environment without researchers' intervention. For the purpose of assessing customers' daily energy expenditure, walking, running, cycling, driving, and sports were chosen as target activities for classification. This study compared a Bayesian classification with that of a Decision Tree based approach. A Bayes classifier has the advantage to be more extensible, requiring little effort in classifier retraining and software update upon further expansion or modification of the target activities. Principal components analysis was applied to remove the correlation among features and to reduce the feature vector dimension. Experiments using leave-one-subject-out and 10-fold cross validation protocols revealed a classification accuracy of approximately 80%, which was comparable with that obtained by a Decision Tree classifier.
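The Bayes-classifier side of the comparison can be illustrated with a minimal Gaussian naive Bayes; the two "activities" and their feature values are invented for the sketch, not the study's accelerometer features.

```python
import math

class GaussianNaiveBayes:
    """Minimal Gaussian naive Bayes: per-class, per-feature mean and variance.
    A sketch of the kind of Bayes classifier the abstract compares with a
    decision tree; the feature values below are made up."""
    def fit(self, X, y):
        self.stats = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            means = [sum(col) / len(rows) for col in zip(*rows)]
            var = [max(sum((v - m) ** 2 for v in col) / len(rows), 1e-9)
                   for col, m in zip(zip(*rows), means)]
            self.stats[label] = (means, var, len(rows) / len(X))
        return self

    def predict(self, x):
        def log_like(label):
            means, var, prior = self.stats[label]
            return math.log(prior) + sum(
                -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
                for v, m, s in zip(x, means, var))
        return max(self.stats, key=log_like)

X = [[0.2, 1.0], [0.3, 1.1], [0.25, 0.9],   # 'walking'-like features
     [2.0, 3.0], [2.2, 3.1], [1.9, 2.9]]    # 'running'-like features
y = ["walking"] * 3 + ["running"] * 3
clf = GaussianNaiveBayes().fit(X, y)
```

Retraining such a model for a new activity only means estimating one more set of per-class statistics, which is the extensibility advantage the abstract cites over a decision tree.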
NASA Astrophysics Data System (ADS)
Craig, Paul; Kennedy, Jessie
2008-01-01
An increasingly common approach taken by taxonomists to define the relationships between taxa in alternative hierarchical classifications is to use a set-based notation that states the relationship between two taxa from alternative classifications. Textual recording of these relationships is cumbersome and difficult for taxonomists to manage. While text-based GUI tools are beginning to appear that ease the process, these have several limitations. Interactive visual tools offer greater potential to allow taxonomists to explore the taxa in these hierarchies and specify such relationships. This paper describes the Concept Relationship Editor, an interactive visualisation tool designed to support the assertion of relationships between taxonomic classifications. The tool operates using an interactive space-filling adjacency layout that allows users to expand multiple lists of taxa with common parents so they can explore and assert relationships between two classifications.
Salari, Nader; Shohaimi, Shamarina; Najafi, Farid; Nallappan, Meenakshii; Karishnarajah, Isthrinayagy
2014-01-01
Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered the most common and effective methods for classification problems in numerous studies. In the present study, the results of the implementation of a novel hybrid feature selection-classification model using the above-mentioned methods are presented. The purpose is to benefit from the synergies obtained from combining these technologies for the development of classification models. Such a combination creates an opportunity to invest in the strengths of each algorithm, and is an approach to make up for their deficiencies. To develop the proposed model, with the aim of obtaining the best array of features, feature ranking techniques such as Fisher's discriminant ratio and class separability criteria were first used to prioritize features. Second, the obtained results, which included arrays of the top-ranked features, were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on the optimum arrays of features selected by the genetic algorithms. The performance of the proposed model was compared with thirteen well-known classification models based on seven datasets. Furthermore, statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance results of the proposed model were benchmarked against the best ones reported as the state-of-the-art classifiers in terms of classification accuracy for the same data sets.
The substantial findings of the comprehensive comparative study revealed that the performance of the proposed model in terms of classification accuracy is desirable, promising, and competitive with the existing state-of-the-art classification models. PMID:25419659
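The first step of the hybrid model, ranking features by Fisher's discriminant ratio before seeding the genetic algorithm, can be sketched as follows; the two-class toy data are an assumption for the sketch.

```python
def fisher_ratio(values_a, values_b):
    """Fisher's discriminant ratio for one feature across two classes:
    (mean_a - mean_b)^2 / (var_a + var_b)."""
    def mean(v):
        return sum(v) / len(v)
    def var(v):
        m = mean(v)
        return sum((x - m) ** 2 for x in v) / len(v)
    return ((mean(values_a) - mean(values_b)) ** 2
            / (var(values_a) + var(values_b) + 1e-12))

def rank_features(X_a, X_b):
    """Rank feature indices by decreasing FDR; the top of this ranking is
    what would seed the genetic algorithm's initial population."""
    scores = [fisher_ratio([r[d] for r in X_a], [r[d] for r in X_b])
              for d in range(len(X_a[0]))]
    return sorted(range(len(scores)), key=lambda d: -scores[d])

# Feature 0 separates the classes; feature 1 is noise.
class_a = [[0.0, 5.0], [0.1, 4.0], [0.2, 6.0]]
class_b = [[1.0, 5.5], [1.1, 4.5], [0.9, 5.0]]
order = rank_features(class_a, class_b)
```

Seeding the genetic algorithm with such rankings, rather than random feature subsets, is what gives the hybrid its head start over either technique alone.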
Classification of EEG Signals Based on Pattern Recognition Approach.
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a "pattern recognition" approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet-based features, such as multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy, were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). The proposed method was validated on a high-density (128-channel) EEG dataset covering two conditions: (1) EEG signals recorded during complex cognitive tasks using Raven's Advanced Progressive Matrices (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via the SVM classifier for approximation coefficients (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detail coefficients (D5), derived from the 3.90-7.81 Hz sub-band, were 98.57% and 98.39% for SVM and KNN, respectively. Accuracy rates for the MLP and NB classifiers were comparable at 97.11-89.63% and 91.60-81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was applied to a public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performance using machine learning classifiers compared to extant quantitative feature extraction methods. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy. PMID:29209190
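The relative-wavelet-energy feature can be sketched with a Haar decomposition; the paper does not restate its wavelet family here, so Haar and the three-level depth are assumptions for the sketch.

```python
def haar_step(signal):
    """One level of the Haar wavelet transform: returns (approximation,
    detail) coefficients. Assumes an even-length input."""
    approx = [(signal[i] + signal[i + 1]) / 2 ** 0.5
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 ** 0.5
              for i in range(0, len(signal), 2)]
    return approx, detail

def relative_wavelet_energy(signal, levels=3):
    """Energy of each detail band plus the final approximation band,
    normalised so the relative energies sum to 1."""
    energies, current = [], signal
    for _ in range(levels):
        current, detail = haar_step(current)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in current))
    total = sum(energies)
    return [e / total for e in energies]

# A slow square wave: its energy concentrates in the coarsest detail band.
rwe = relative_wavelet_energy([1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0])
```

Because the relative energies always sum to one, they describe how a signal's power is distributed across frequency sub-bands independently of its amplitude, which is what makes them usable features after normalisation.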
Recursive heuristic classification
NASA Technical Reports Server (NTRS)
Wilkins, David C.
1994-01-01
The author will describe a new problem-solving approach called recursive heuristic classification, whereby a subproblem of heuristic classification is itself formulated and solved by heuristic classification. This allows the construction of more knowledge-intensive classification programs in a way that yields a clean organization. Further, standard knowledge acquisition and learning techniques for heuristic classification can be used to create, refine, and maintain the knowledge base associated with the recursively called classification expert system. The method of recursive heuristic classification was used in the Minerva blackboard shell for heuristic classification. Minerva recursively calls itself every problem-solving cycle to solve the important blackboard scheduler task, which involves assigning a desirability rating to alternative problem-solving actions. Knowing these ratings is critical to the use of an expert system as a component of a critiquing or apprenticeship tutoring system. One innovation of this research is a method called dynamic heuristic classification, which allows selection among dynamically generated classification categories instead of requiring them to be enumerated in advance.
Fillingim, Roger B; Bruehl, Stephen; Dworkin, Robert H; Dworkin, Samuel F; Loeser, John D; Turk, Dennis C; Widerstrom-Noga, Eva; Arnold, Lesley; Bennett, Robert; Edwards, Robert R; Freeman, Roy; Gewandter, Jennifer; Hertz, Sharon; Hochberg, Marc; Krane, Elliot; Mantyh, Patrick W; Markman, John; Neogi, Tuhina; Ohrbach, Richard; Paice, Judith A; Porreca, Frank; Rappaport, Bob A; Smith, Shannon M; Smith, Thomas J; Sullivan, Mark D; Verne, G Nicholas; Wasan, Ajay D; Wesselmann, Ursula
2014-03-01
Current approaches to classification of chronic pain conditions suffer from the absence of a systematically implemented and evidence-based taxonomy. Moreover, existing diagnostic approaches typically fail to incorporate available knowledge regarding the biopsychosocial mechanisms contributing to pain conditions. To address these gaps, the Analgesic, Anesthetic, and Addiction Clinical Trial Translations Innovations Opportunities and Networks (ACTTION) public-private partnership with the U.S. Food and Drug Administration and the American Pain Society (APS) have joined together to develop an evidence-based chronic pain classification system called the ACTTION-APS Pain Taxonomy. This paper describes the outcome of an ACTTION-APS consensus meeting, at which experts agreed on a structure for this new taxonomy of chronic pain conditions. Several major issues around which discussion revolved are presented and summarized, and the structure of the taxonomy is presented. ACTTION-APS Pain Taxonomy will include the following dimensions: 1) core diagnostic criteria; 2) common features; 3) common medical comorbidities; 4) neurobiological, psychosocial, and functional consequences; and 5) putative neurobiological and psychosocial mechanisms, risk factors, and protective factors. In coming months, expert working groups will apply this taxonomy to clusters of chronic pain conditions, thereby developing a set of diagnostic criteria that have been consistently and systematically implemented across nearly all common chronic pain conditions. It is anticipated that the availability of this evidence-based and mechanistic approach to pain classification will be of substantial benefit to chronic pain research and treatment. 
The Molecular Pathology of Myelodysplastic Syndrome.
Haferlach, Torsten
2018-05-23
The diagnosis and classification of myelodysplastic syndromes (MDS) are based on cytomorphology and cytogenetics (WHO classification). Prognosis is best defined by the Revised International Prognostic Scoring System (IPSS-R). In recent years, an increasing number of molecular aberrations have been discovered. They are already included in the classification (e.g., SF3B1) and, more importantly, have emerged as valuable markers for better classification, particularly for defining risk groups. Mutations in genes such as SF3B1 and IDH1/2 have already had an impact on targeted treatment approaches in MDS. © 2018 S. Karger AG, Basel.
Multi-dimensionality and variability in folk classification of stingless bees (Apidae: Meliponini).
Zamudio, Fernando; Hilgert, Norma I
2015-05-23
Not long ago, Eugene Hunn suggested combining cognitive, linguistic, ecological and evolutionary theories in order to account for the dynamic character of ethnoecology in the study of folk classification systems. In doing so, he intended to question a certain assumed homogeneity in models of folk classification and to deepen the analysis and interpretation of variability in folk classifications. This paper studies how a rural, culturally mixed population of the Atlantic Forest of Misiones (Argentina) classifies honey-producing stingless bees along the linguistic, cognitive and ecological dimensions of folk classification. We also analyze the socio-ecological meaning of binomialization in naming and the meaning of the general local variability in the naming of stingless bees. We used three different approaches: the classical approach developed by Brent Berlin, which relies heavily on linguistic criteria; the approach developed by Eleanor Rosch, which relies on psychological (cognitive) principles of categorization; and finally an approach that captures the ecological dimension of folk classification in local narratives. For the second approach, we developed ways of measuring the degree of prototypicality based on a total of 107 comparisons of the type "X is similar to Y" identified in personal narratives. Various logical and grouping strategies coexist and were identified as graded lateral linkage, hierarchical, and functional. Similarity judgments among folk taxa resulted in an implicit logic of classification graded according to the taxa's prototypicality. While there is high agreement on naming stingless bees with monomial names, a considerable number of underrepresented binomial names, as well as a lack of names, were observed. Two possible explanations for the reported local naming variability are presented. We support the multidimensionality of folk classification systems.
This confirms the specificity of local classification systems, but it also reflects the use of grouping strategies and mechanisms commonly observed in other cultural groups, such as similarity judgments between more or less prototypical organisms. We also support the idea that alternative naming results from a process of fragmentation of knowledge or incomplete transmission of knowledge. These processes rest on the fact that culturally based knowledge, on the one hand, and biological knowledge of nature, on the other, can be acquired through different learning pathways.
Farhan, Saima; Fahiem, Muhammad Abuzar; Tauseef, Huma
2014-01-01
Structural brain imaging is playing a vital role in the identification of changes that occur in the brain in association with Alzheimer's disease (AD). This paper proposes an automated image-processing approach for the identification of AD from MRI of the brain. The proposed approach is novel in the sense that it achieves higher specificity/accuracy values despite using a smaller feature set than existing approaches. Moreover, the proposed approach is capable of identifying AD patients in early stages. The selected dataset consists of 85 age- and gender-matched individuals from the OASIS database. The features selected are the volumes of GM, WM, and CSF and the size of the hippocampus. Three different classification models (SVM, MLP, and J48) are used for the identification of patients and controls. In addition, an ensemble of classifiers, based on majority voting, is adopted to overcome the error caused by an independent base classifier. A ten-fold cross-validation strategy is applied for the evaluation of our scheme. Moreover, to evaluate the performance of the proposed approach, individual features and combinations of features are fed to the individual classifiers and the ensemble-based classifier. Using the size of the left hippocampus as the feature, the accuracy achieved with the ensemble of classifiers is 93.75%, with 100% specificity and 87.5% sensitivity.
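The majority-voting ensemble described above can be sketched in a few lines; the labels below are hypothetical stand-ins for SVM/MLP/J48 outputs, not data from the study:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label predictions by majority vote.

    predictions: list of equal-length lists, one per base classifier.
    Returns one fused label per sample (ties broken by first-seen label).
    """
    fused = []
    for sample_preds in zip(*predictions):
        fused.append(Counter(sample_preds).most_common(1)[0][0])
    return fused

# Hypothetical outputs of three base classifiers (e.g. SVM, MLP, J48)
# on four subjects: 1 = AD patient, 0 = control.
svm = [1, 0, 1, 0]
mlp = [1, 1, 1, 0]
j48 = [0, 0, 1, 0]
print(majority_vote([svm, mlp, j48]))  # → [1, 0, 1, 0]
```

The fused decision corrects the single dissenting classifier on each sample, which is exactly the error-masking effect the ensemble is adopted for.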
Towards quantitative classification of folded proteins in terms of elementary functions.
Hu, Shuangwei; Krokhotin, Andrei; Niemi, Antti J; Peng, Xubiao
2011-04-01
A comparative classification scheme provides a good basis for several approaches to understand proteins, including prediction of relations between their structure and biological function. But it remains a challenge to combine a classification scheme that describes a protein starting from its well-organized secondary structures and often involves direct human involvement, with an atomary-level physics-based approach where a protein is fundamentally nothing more than an ensemble of mutually interacting carbon, hydrogen, oxygen, and nitrogen atoms. In order to bridge these two complementary approaches to proteins, conceptually novel tools need to be introduced. Here we explain how an approach toward geometric characterization of entire folded proteins can be based on a single explicit elementary function that is familiar from nonlinear physical systems where it is known as the kink soliton. Our approach enables the conversion of hierarchical structural information into a quantitative form that allows for a folded protein to be characterized in terms of a small number of global parameters that are in principle computable from atomary-level considerations. As an example we describe in detail how the native fold of the myoglobin 1M6C emerges from a combination of kink solitons with a very high atomary-level accuracy. We also verify that our approach describes longer loops and loops connecting α helices with β strands, with the same overall accuracy. ©2011 American Physical Society
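For orientation, the elementary function the authors build on is the kink soliton; its prototypical profile in the simplest setting is the hyperbolic-tangent kink (shown here generically for illustration; the paper's actual ansatz for the backbone geometry may be parametrized differently):

```latex
\phi(x) \;=\; \phi_0 \tanh\bigl(m\,(x - x_0)\bigr)
```

Here $x_0$ locates the kink center and $m^{-1}$ sets its width; in the protein context such kinks are combined so that loops between regular secondary-structure elements correspond to kinks.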
Identification of sea ice types in spaceborne synthetic aperture radar data
NASA Technical Reports Server (NTRS)
Kwok, Ronald; Rignot, Eric; Holt, Benjamin; Onstott, R.
1992-01-01
This study presents an approach for identification of sea ice types in spaceborne SAR image data. The unsupervised classification approach involves cluster analysis for segmentation of the image data followed by cluster labeling based on previously defined look-up tables containing the expected backscatter signatures of different ice types measured by a land-based scatterometer. Extensive scatterometer observations and experience accumulated in field campaigns during the last 10 yr were used to construct these look-up tables. The classification approach, its expected performance, the dependence of this performance on radar system performance, and expected ice scattering characteristics are discussed. Results using both aircraft and simulated ERS-1 SAR data are presented and compared to limited field ice property measurements and coincident passive microwave imagery. The importance of an integrated postlaunch program for the validation and improvement of this approach is discussed.
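The cluster-labeling step reduces to a nearest-signature lookup against the scatterometer-derived tables; the ice types and dB values below are invented for illustration, not the paper's actual look-up tables:

```python
def label_clusters(cluster_means_db, lookup):
    """Label each backscatter cluster with the ice type whose expected
    signature (from a scatterometer-derived look-up table) is nearest."""
    labels = []
    for mean in cluster_means_db:
        labels.append(min(lookup, key=lambda t: abs(lookup[t] - mean)))
    return labels

# Hypothetical backscatter look-up table (dB) and cluster means
lookup = {"open water": -20.0, "first-year ice": -12.0, "multiyear ice": -8.0}
print(label_clusters([-19.0, -11.5, -7.5], lookup))
# → ['open water', 'first-year ice', 'multiyear ice']
```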
Mixing geometric and radiometric features for change classification
NASA Astrophysics Data System (ADS)
Fournier, Alexandre; Descombes, Xavier; Zerubia, Josiane
2008-02-01
Most basic change detection algorithms use a pixel-based approach. Whereas such an approach is quite well suited to monitoring large-area changes (such as urban growth) in low-resolution images, an object-based approach seems more relevant when change detection is specifically aimed at targets such as small buildings and vehicles. In this paper, we present an approach that mixes radiometric and geometric features to qualify the changed zones. The goal is to establish links (appearance, disappearance, substitution ...) between the detected changes and the underlying objects. We proceed by first clustering the change map (containing each pixel's bitemporal radiometry) into different classes using the entropy-kmeans algorithm. Assuming that most man-made objects have a polygonal shape, a polygonal approximation algorithm is then used to characterize the resulting zone shapes, allowing us to refine the initial rough classification by integrating the polygon orientations into the state space. Tests are currently conducted on Quickbird data.
Kainz, Philipp; Pfeiffer, Michael; Urschler, Martin
2017-01-01
Segmentation of histopathology sections is a necessary preprocessing step for digital pathology. Due to the large variability of biological tissue, machine learning techniques have shown superior performance over conventional image processing methods. Here we present our deep neural network-based approach for segmentation and classification of glands in tissue of benign and malignant colorectal cancer, which was developed to participate in the GlaS@MICCAI2015 colon gland segmentation challenge. We use two distinct deep convolutional neural networks (CNN) for pixel-wise classification of Hematoxylin-Eosin stained images. While the first classifier separates glands from background, the second classifier identifies gland-separating structures. In a subsequent step, a figure-ground segmentation based on weighted total variation produces the final segmentation result by regularizing the CNN predictions. We present both quantitative and qualitative segmentation results on the recently released and publicly available Warwick-QU colon adenocarcinoma dataset associated with the GlaS@MICCAI2015 challenge and compare our approach to the simultaneously developed other approaches that participated in the same challenge. On two test sets, we demonstrate our segmentation performance and show that we achieve a tissue classification accuracy of 98% and 95%, making use of the inherent capability of our system to distinguish between benign and malignant tissue. Our results show that deep learning approaches can yield highly accurate and reproducible results for biomedical image analysis, with the potential to significantly improve the quality and speed of medical diagnoses.
Visual attention based bag-of-words model for image classification
NASA Astrophysics Data System (ADS)
Wang, Qiwei; Wan, Shouhong; Yue, Lihua; Wang, Che
2014-04-01
Bag-of-words is a classical method for image classification. The core problems are how to count the frequencies of the visual words and which visual words to select. In this paper, we propose a visual attention based bag-of-words model (VABOW model) for the image classification task. The VABOW model uses a visual attention method to generate a saliency map, and uses the saliency map as a weighting matrix to guide the computation of visual word frequencies. In addition, the VABOW model combines shape, color and texture cues and uses L1-regularized logistic regression to select the most relevant and most efficient features. We compare our approach with the traditional bag-of-words based method on two datasets, and the results show that our VABOW model outperforms the state-of-the-art methods for image classification.
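The saliency-weighted word counting at the heart of the VABOW model can be sketched as follows; the vocabulary size, word assignments, and saliency values are invented for illustration:

```python
def weighted_bow_histogram(word_ids, saliency, vocab_size):
    """Build a bag-of-words histogram in which each visual word's count
    is weighted by the saliency of the pixel/patch it came from, then
    normalized to sum to 1 (if nonzero)."""
    hist = [0.0] * vocab_size
    for w, s in zip(word_ids, saliency):
        hist[w] += s
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Hypothetical: five local descriptors quantized to a 3-word vocabulary,
# with saliency-map weights in [0, 1].
words = [0, 1, 1, 2, 0]
sal = [0.2, 0.9, 0.7, 0.1, 0.1]
print(weighted_bow_histogram(words, sal, 3))
```

Words landing on salient regions dominate the histogram, whereas a plain bag-of-words would weight every occurrence equally.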
Symbolic rule-based classification of lung cancer stages from free-text pathology reports.
Nguyen, Anthony N; Lawley, Michael J; Hansen, David P; Bowman, Rayleen V; Clarke, Belinda E; Duhig, Edwina E; Colquist, Shoni
2010-01-01
To classify automatically lung tumor-node-metastases (TNM) cancer stages from free-text pathology reports using symbolic rule-based classification. By exploiting report substructure and the symbolic manipulation of systematized nomenclature of medicine-clinical terms (SNOMED CT) concepts in reports, statements in free text can be evaluated for relevance against factors relating to the staging guidelines. Post-coordinated SNOMED CT expressions based on templates were defined and populated by concepts in reports, and tested for subsumption by staging factors. The subsumption results were used to build logic according to the staging guidelines to calculate the TNM stage. The accuracy measure and confusion matrices were used to evaluate the TNM stages classified by the symbolic rule-based system. The system was evaluated against a database of multidisciplinary team staging decisions and a machine learning-based text classification system using support vector machines. Overall accuracy on a corpus of pathology reports for 718 lung cancer patients against a database of pathological TNM staging decisions were 72%, 78%, and 94% for T, N, and M staging, respectively. The system's performance was also comparable to support vector machine classification approaches. A system to classify lung TNM stages from free-text pathology reports was developed, and it was verified that the symbolic rule-based approach using SNOMED CT can be used for the extraction of key lung cancer characteristics from free-text reports. Future work will investigate the applicability of using the proposed methodology for extracting other cancer characteristics and types.
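In spirit, the staging logic evaluates guideline factors as explicit rules. The toy T-stage fragment below uses illustrative thresholds only; the real system tests post-coordinated SNOMED CT expressions for subsumption against staging factors, and the actual guideline criteria are considerably more involved:

```python
def tnm_stage_t(tumor_size_cm, invades_main_bronchus):
    """Toy rule-based T-staging fragment: guideline-style factors
    evaluated as explicit logic. Thresholds are illustrative, not the
    actual TNM staging guideline."""
    if tumor_size_cm <= 3 and not invades_main_bronchus:
        return "T1"
    if tumor_size_cm <= 5:
        return "T2"
    if tumor_size_cm <= 7:
        return "T3"
    return "T4"

print(tnm_stage_t(2.5, False), tnm_stage_t(6.0, False))  # → T1 T3
```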
NASA Astrophysics Data System (ADS)
Hernandez-Contreras, D.; Peregrina-Barreto, H.; Rangel-Magdaleno, J.; Ramirez-Cortes, J.; Renero-Carrillo, F.
2015-11-01
This paper presents a novel approach to characterizing and identifying patterns of temperature in thermographic images of the human foot sole, in support of early diagnosis and follow-up of diabetic patients. Composed feature vectors based on the 3D morphological pattern spectrum (pecstrum) and relative position allow the system to quantitatively characterize and discriminate between non-diabetic (control) and diabetic (DM) groups. Non-linear classification using neural networks is used for that purpose. A classification rate of 94.33% on average was obtained with the composed feature extraction process proposed in this paper. Performance evaluation and the obtained results are presented.
NASA Astrophysics Data System (ADS)
Malatesta, Luca; Attorre, Fabio; Altobelli, Alfredo; Adeeb, Ahmed; De Sanctis, Michele; Taleb, Nadim M.; Scholte, Paul T.; Vitale, Marcello
2013-01-01
Socotra Island (Yemen), a global biodiversity hotspot, is characterized by high geomorphological and biological diversity. In this study, we present a high-resolution vegetation map of the island based on combining vegetation analysis and classification with remote sensing. Two different image classification approaches were tested to assess the most accurate one in mapping the vegetation mosaic of Socotra. Spectral signatures of the vegetation classes were obtained through a Gaussian mixture distribution model, and a sequential maximum a posteriori (SMAP) classification was applied to account for the heterogeneity and the complex spatial pattern of the arid vegetation. This approach was compared to the traditional maximum likelihood (ML) classification. Satellite data were represented by a RapidEye image with 5 m pixel resolution and five spectral bands. Classified vegetation relevés were used to obtain the training and evaluation sets for the main plant communities. Postclassification sorting was performed to adjust the classification through various rule-based operations. Twenty-eight classes were mapped, and SMAP, with an accuracy of 87%, proved to be more effective than ML (accuracy: 66%). The resulting map will represent an important instrument for the elaboration of conservation strategies and the sustainable use of natural resources in the island.
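As a rough illustration of the maximum-likelihood baseline that SMAP is compared against, a single-band ML classifier under per-class Gaussians might look like the following; the class names and statistics are made up, and the actual classification is multispectral:

```python
import math

def ml_classify(pixel, class_stats):
    """Maximum-likelihood assignment of a single-band pixel value under
    per-class univariate Gaussians given as {class: (mean, std)}."""
    def log_lik(x, mu, sigma):
        # Log of the Gaussian density, dropping the constant term.
        return -math.log(sigma) - (x - mu) ** 2 / (2 * sigma ** 2)
    return max(class_stats,
               key=lambda cls: log_lik(pixel, *class_stats[cls]))

# Hypothetical band statistics for two vegetation classes
stats = {"shrubland": (60.0, 10.0), "woodland": (120.0, 15.0)}
print(ml_classify(70.0, stats))  # → shrubland
```

SMAP differs by also modelling spatial context across scales, which is why it copes better with the heterogeneous vegetation mosaic.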
Juan-Albarracín, Javier; Fuster-Garcia, Elies; Manjón, José V.; Robles, Montserrat; Aparici, F.; Martí-Bonmatí, L.; García-Gómez, Juan M.
2015-01-01
Automatic brain tumour segmentation has become a key component of the future of brain tumour treatment. Currently, most brain tumour segmentation approaches arise from the supervised learning standpoint, which requires a labelled training dataset from which to infer the models of the classes. The performance of these models is directly determined by the size and quality of the training corpus, whose retrieval becomes a tedious and time-consuming task. On the other hand, unsupervised approaches avoid these limitations but often do not reach results comparable to those of supervised methods. In this sense, we propose an automated unsupervised method for brain tumour segmentation based on anatomical Magnetic Resonance (MR) images. Four unsupervised classification algorithms, grouped by their structured or non-structured condition, were evaluated within our pipeline. As non-structured algorithms we evaluated K-means, Fuzzy K-means and the Gaussian Mixture Model (GMM), whereas as a structured classification algorithm we evaluated the Gaussian Hidden Markov Random Field (GHMRF). An automated postprocess based on a statistical approach, supported by tissue probability maps, is proposed to automatically identify the tumour classes after segmentation. We evaluated our brain tumour segmentation method with the public BRAin Tumor Segmentation (BRATS) 2013 Test and Leaderboard datasets. Our approach based on the GMM model improves on the results obtained by most of the supervised methods evaluated with the Leaderboard set and reaches second position in the ranking. Our variant based on the GHMRF achieves first position in the Test ranking of the unsupervised approaches and seventh position in the general Test ranking, which confirms the method as a viable alternative for brain tumour segmentation. PMID:25978453
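A minimal stand-in for the non-structured clustering step (plain 1-D k-means here, rather than the GMM the paper actually favours) could look like this; the intensity values are invented, and at least k samples are assumed:

```python
def kmeans_1d(values, k, iters=20):
    """Minimal Lloyd's k-means on scalar intensities, as a stand-in for
    the non-structured clustering step. Assumes len(values) >= k."""
    # Spread the initial centroids across the sorted value range.
    centroids = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)
        # Empty clusters keep their previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

# Hypothetical MR intensities drawn from two tissue classes
intensities = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]
print(sorted(kmeans_1d(intensities, 2)))
```

The real pipeline clusters multichannel MR feature vectors and then identifies the tumour classes among the resulting clusters using tissue probability maps.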
Stygoregions – a promising approach to a bioregional classification of groundwater systems
Stein, Heide; Griebler, Christian; Berkhoff, Sven; Matzke, Dirk; Fuchs, Andreas; Hahn, Hans Jürgen
2012-01-01
Linked to diverse biological processes, groundwater ecosystems deliver essential services to mankind, the most important of which is the provision of drinking water. In contrast to surface waters, ecological aspects of groundwater systems are ignored by current European Union and national legislation. Groundwater management and protection measures refer exclusively to its good physicochemical and quantitative status. Current initiatives to develop ecologically sound integrative assessment schemes that take groundwater fauna into account depend on an initial classification of subsurface bioregions. In a large-scale survey, the regional and biogeographical distribution patterns of groundwater-dwelling invertebrates were examined for many parts of Germany. Following an exploratory approach, our results underline that the distribution patterns of invertebrates in groundwater are not in accordance with any existing bioregional classification system established for surface habitats. In consequence, we propose to develop a new classification scheme for groundwater ecosystems based on stygoregions. PMID:22993698
A taxonomy-based approach to shed light on the babel of mathematical analogies for rice simulation
USDA-ARS?s Scientific Manuscript database
For most biophysical domains, different models are available, and the extent to which differences in their structures correspond to differences in outputs has never been quantified. We use a taxonomy-based approach to address the question with thirteen rice models. Classification keys and binary attributes for ...
Sharma, Harshita; Zerbe, Norman; Klempert, Iris; Hellwich, Olaf; Hufnagl, Peter
2017-11-01
Deep learning using convolutional neural networks is an actively emerging field in histological image analysis. This study explores deep learning methods for computer-aided classification in H&E stained histopathological whole slide images of gastric carcinoma. An introductory convolutional neural network architecture is proposed for two computerized applications, namely, cancer classification based on immunohistochemical response and necrosis detection based on the existence of tumor necrosis in the tissue. Classification performance of the developed deep learning approach is quantitatively compared with traditional image analysis methods in digital histopathology requiring prior computation of handcrafted features, such as statistical measures using gray level co-occurrence matrix, Gabor filter-bank responses, LBP histograms, gray histograms, HSV histograms and RGB histograms, followed by random forest machine learning. Additionally, the widely known AlexNet deep convolutional framework is comparatively analyzed for the corresponding classification problems. The proposed convolutional neural network architecture reports favorable results, with an overall classification accuracy of 0.6990 for cancer classification and 0.8144 for necrosis detection. Copyright © 2017 Elsevier Ltd. All rights reserved.
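One of the handcrafted baselines mentioned, a gray-level co-occurrence matrix (GLCM) texture feature, can be sketched as follows; the toy 4-level image and the single contrast statistic are illustrative, not the study's actual feature set or parameters:

```python
def glcm_contrast(image, dx=1, dy=0, levels=4):
    """Contrast statistic from a gray-level co-occurrence matrix for
    pixel pairs at offset (dx, dy). Gray levels must be < levels."""
    h, w = len(image), len(image[0])
    glcm = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(h - dy):
        for x in range(w - dx):
            glcm[image[y][x]][image[y + dy][x + dx]] += 1
            total += 1
    # Contrast: expected squared gray-level difference of co-occurring pairs.
    return sum((i - j) ** 2 * glcm[i][j] / total
               for i in range(levels) for j in range(levels))

img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [2, 2, 3, 3],
       [2, 2, 3, 3]]
print(glcm_contrast(img))
```

Statistics like this, computed over many offsets and pooled with other descriptors, form the feature vectors fed to the random forest baseline that the CNNs are compared against.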
NASA Astrophysics Data System (ADS)
Wei, Hongqiang; Zhou, Guiyun; Zhou, Junjie
2018-04-01
The classification of leaf and wood points is an essential preprocessing step for extracting inventory measurements and canopy characterizations of trees from terrestrial laser scanning (TLS) data. The geometry-based approach is one of the most widely used classification methods. In geometry-based methods, it is common practice to extract salient features at a single scale before the features are used for classification. It remains unclear how the scale(s) used affect classification accuracy and efficiency. To assess the scale effect on classification accuracy and efficiency, we extracted single-scale and multi-scale salient features from the point clouds of two oak trees of different sizes and classified the points as leaf or wood. Our experimental results show that the balanced accuracy of the multi-scale method is higher than the average balanced accuracy of the single-scale method by about 10% for both trees. The average speed-up ratio of the single-scale classifiers over the multi-scale classifier for each tree is higher than 30.
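A common geometry-based salient feature is derived from the eigenvalues of the covariance matrix of a point's neighbourhood at a given scale. A 2-D "linearity" sketch is given below (the paper works with 3-D point clouds and several such features; this simplified variant is for illustration only):

```python
import math

def linearity_2d(points):
    """Eigenvalue-based 'linearity' feature of a 2-D neighbourhood:
    (l1 - l2) / l1, where l1 >= l2 are eigenvalues of the covariance
    matrix. Near 1 suggests a linear (branch-like) structure, near 0 a
    scattered (leaf-like) one."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Closed-form eigenvalues of the 2x2 covariance matrix.
    tr, det = sxx + syy, sxx * syy - sxy * sxy
    root = math.sqrt(max(tr * tr / 4 - det, 0.0))
    l1, l2 = tr / 2 + root, tr / 2 - root
    return (l1 - l2) / l1 if l1 > 0 else 0.0

line = [(float(i), 0.0) for i in range(10)]          # points on a line
blob = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]  # scattered points
print(linearity_2d(line), linearity_2d(blob))
```

Evaluating such features at one neighbourhood radius gives the single-scale variant; concatenating them over several radii gives the multi-scale variant whose scale effect the paper quantifies.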
Comparison of Classifier Architectures for Online Neural Spike Sorting.
Saeed, Maryam; Khan, Amir Ali; Kamboh, Awais Mehmood
2017-04-01
High-density intracranial recordings from micro-electrode arrays need to undergo spike sorting in order to associate the recorded neuronal spikes with particular neurons. This involves spike detection, feature extraction, and classification. To reduce data transmission and power requirements, on-chip real-time processing is becoming very popular. However, classifiers in on-chip spike sorters require substantial computational resources, making scalability a great challenge. In this review paper, we analyze several popular classifiers and propose five new hardware architectures that use off-chip training with on-chip classification. These include support vector classification, fuzzy C-means classification, self-organizing maps classification, moving-centroid K-means classification, and cosine distance classification. The performance of these architectures is analyzed in terms of accuracy and resource requirements. We establish that the neural-network-based self-organizing maps classifier offers the most viable solution. A spike sorter based on the self-organizing maps classifier requires only 7.83% of the computational resources of the best-reported spike sorter, hierarchical adaptive means, while offering a 3% better accuracy at 7 dB SNR.
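The cosine-distance architecture, the simplest of the five, reduces to a nearest-centroid rule under cosine distance; a sketch with invented feature centroids (the on-chip designs implement this in fixed-point hardware, not Python):

```python
import math

def cosine_classify(spike, centroids):
    """Assign a spike feature vector to the class whose (off-chip
    trained) centroid has the smallest cosine distance, i.e.
    1 - cosine similarity. Assumes nonzero vectors."""
    def cos_dist(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return 1.0 - dot / (na * nb)
    return min(range(len(centroids)),
               key=lambda i: cos_dist(spike, centroids[i]))

# Hypothetical 3-D feature centroids for two neurons
centroids = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
print(cosine_classify([0.9, 0.1, 0.0], centroids))  # prints 0
```

Because only dot products and norms are needed at run time, this style of classifier maps cheaply onto on-chip arithmetic, which is the point of the off-chip-training / on-chip-classification split.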
Alwanni, Hisham; Baslan, Yara; Alnuman, Nasim; Daoud, Mohammad I.
2017-01-01
This paper presents an EEG-based brain-computer interface system for classifying eleven motor imagery (MI) tasks within the same hand. The proposed system utilizes the Choi-Williams time-frequency distribution (CWD) to construct a time-frequency representation (TFR) of the EEG signals. The constructed TFR is used to extract five categories of time-frequency features (TFFs). The TFFs are processed using a hierarchical classification model to identify the MI task encapsulated within the EEG signals. To evaluate the performance of the proposed approach, EEG data were recorded for eighteen intact subjects and four amputated subjects while they imagined performing each of the eleven hand MI tasks. Two performance evaluation analyses, namely channel- and TFF-based analyses, are conducted to identify the best subset of EEG channels and the best TFF category, respectively, that enable the highest classification accuracy between the MI tasks. In each evaluation analysis, the hierarchical classification model is trained using two training procedures, namely subject-dependent and subject-independent procedures. These two training procedures quantify the capability of the proposed approach to capture both intra- and inter-personal variations in the EEG signals for different MI tasks within the same hand. The results demonstrate the efficacy of the approach for classifying MI tasks within the same hand. In particular, the classification accuracies obtained for the intact and amputated subjects are as high as 88.8% and 90.2%, respectively, for the subject-dependent training procedure, and 80.8% and 87.8%, respectively, for the subject-independent training procedure. These results suggest the feasibility of applying the proposed approach to the control of dexterous prosthetic hands, which can be of great benefit for individuals suffering from hand amputations. PMID:28832513
Unsupervised classification of operator workload from brain signals
NASA Astrophysics Data System (ADS)
Schultze-Kraft, Matthias; Dähne, Sven; Gugler, Manfred; Curio, Gabriel; Blankertz, Benjamin
2016-06-01
Objective. In this study we aimed for the classification of operator workload as it is expected in many real-life workplace environments. We explored brain-signal based workload predictors that differ with respect to the level of label information required for training, including entirely unsupervised approaches. Approach. Subjects executed a task on a touch screen that required continuous effort of visual and motor processing with alternating difficulty. We first employed classical approaches for workload state classification that operate on the sensor space of EEG and compared those to the performance of three state-of-the-art spatial filtering methods: common spatial patterns (CSPs) analysis, which requires binary label information; source power co-modulation (SPoC) analysis, which uses the subjects' error rate as a target function; and canonical SPoC (cSPoC) analysis, which solely makes use of cross-frequency power correlations induced by different states of workload and thus represents an unsupervised approach. Finally, we investigated the effects of fusing brain signals and peripheral physiological measures (PPMs) and examined the added value for improving classification performance. Main results. Mean classification accuracies of 94%, 92% and 82% were achieved with CSP, SPoC, and cSPoC, respectively. These methods outperformed the approaches that did not use spatial filtering and they extracted physiologically plausible components. The performance of the unsupervised cSPoC is significantly increased by augmenting it with PPM features. Significance. Our analyses ensured that the signal sources used for classification were of cortical origin and not contaminated with artifacts. Our findings show that workload states can be successfully differentiated from brain signals, even when less and less information from the experimental paradigm is used, thus paving the way for real-world applications in which label information may be noisy or entirely unavailable.
Evaluation of air quality zone classification methods based on ambient air concentration exposure.
Freeman, Brian; McBean, Ed; Gharabaghi, Bahram; Thé, Jesse
2017-05-01
Air quality zones are used by regulatory authorities to implement ambient air standards in order to protect human health. Air quality measurements at discrete air monitoring stations are critical tools for determining whether an air quality zone complies with local air quality standards or is noncompliant. This study presents a novel approach for evaluating air quality zone classification methods by breaking the concentration distribution of a pollutant measured at an air monitoring station into compliance and exceedance probability density functions (PDFs) and then using Monte Carlo analysis with the central limit theorem to estimate long-term exposure. The purpose of this paper is to compare the risk associated with selecting one ambient air classification approach over another by testing the possible exposure an individual living within a zone may face. The chronic daily intake (CDI) is utilized to compare pollutant exposures over the classification duration of 3 years between two classification methods. Historical data collected from air monitoring stations in Kuwait are used to build representative models of 1-hr NO2 and 8-hr O3 within a zone that meets the compliance requirements of each method. The first method, the "3 Strike" method, is a conservative winner-take-all approach common to most compliance classification methods, while the second, the 99% Rule method, allows for more robust analyses and incorporates long-term trends. A Monte Carlo analysis is used to model the CDI for each pollutant and each method, both with a single station in the zone and with multiple stations. The model assumes that the zone is already in compliance with air quality standards over the 3 years under the different classification methodologies.
The model shows that while the CDI of the two methods differs by only 2.7% over the exposure period in the single-station case, the large number of samples taken over the period increases the sensitivity of the statistical tests, causing the null hypothesis to be rejected. Local air quality managers can use either methodology to classify the compliance of an air zone, but must accept that the 99% Rule method may allow exposures that are statistically significantly higher than those under the 3 Strike method. A novel method using the central limit theorem and Monte Carlo analysis is used to directly compare air standard compliance classification methods by estimating the chronic daily intake of pollutants. This method allows air quality managers to rapidly see how individual classification methods may impact individual population groups, as well as to evaluate different pollutants based on dosage and exposure when complete health impacts are not known.
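The Monte Carlo CDI estimate can be sketched as below. The intake-rate and body-weight values and the observations are illustrative placeholders, not the study's inputs, and the standard CDI form C × IR / BW is assumed:

```python
import random

def monte_carlo_cdi(sample_concentrations, intake_rate, body_weight,
                    n_days=1000, seed=0):
    """Monte Carlo sketch of a chronic daily intake (CDI) estimate:
    mean over simulated days of C * IR / BW, with daily concentration C
    drawn with replacement from the station's measured distribution."""
    rng = random.Random(seed)
    daily_doses = []
    for _ in range(n_days):
        c = rng.choice(sample_concentrations)
        daily_doses.append(c * intake_rate / body_weight)
    return sum(daily_doses) / n_days

# Hypothetical 1-hr NO2 observations (ug/m3) and generic adult
# inhalation assumptions (20 m3/day intake, 70 kg body weight).
obs = [20.0, 35.0, 28.0, 40.0, 22.0, 31.0]
print(monte_carlo_cdi(obs, intake_rate=20.0, body_weight=70.0))
```

Running this once per classification method, with the concentration pool restricted to distributions each method would accept as compliant, yields the paired CDI estimates the paper compares.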
Ecological type classification for California: the Forest Service approach
Barbara H. Allen
1987-01-01
National legislation has mandated the development and use of an ecological data base to improve resource decision making, while State and Federal agencies have agreed to cooperate in standardizing resource classification and inventory data. In the Pacific Southwest Region, which includes nearly 20 million acres (8.3 million ha) in California, the Forest Service, U.S....
Application of the Covalent Bond Classification Method for the Teaching of Inorganic Chemistry
ERIC Educational Resources Information Center
Green, Malcolm L. H.; Parkin, Gerard
2014-01-01
The Covalent Bond Classification (CBC) method provides a means to classify covalent molecules according to the number and types of bonds that surround an atom of interest. This approach is based on an elementary molecular orbital analysis of the bonding involving the central atom (M), with the various interactions being classified according to the…
ERIC Educational Resources Information Center
Kranzler, John H.; Floyd, Randy G.; Benson, Nicholas; Zaboski, Brian; Thibodaux, Lia
2016-01-01
The Cross-Battery Assessment (XBA) approach to identifying a specific learning disorder (SLD) is based on the postulate that deficits in cognitive abilities in the presence of otherwise average general intelligence are causally related to academic achievement weaknesses. To examine this postulate, we conducted a classification agreement analysis…
A support vector machine approach for classification of welding defects from ultrasonic signals
NASA Astrophysics Data System (ADS)
Chen, Yuan; Ma, Hong-Wei; Zhang, Guang-Ming
2014-07-01
Defect classification is an important issue in ultrasonic non-destructive evaluation. A layered multi-class support vector machine (LMSVM) classification system, which combines multiple SVM classifiers through a layered architecture, is proposed in this paper. The proposed LMSVM classification system is applied to the classification of welding defects from ultrasonic test signals. The measured ultrasonic defect echo signals are first decomposed into wavelet coefficients by the wavelet packet transform. The energy of the wavelet coefficients at different frequency channels is used to construct the feature vectors. The bees algorithm (BA) is then used for feature selection and SVM parameter optimisation for the LMSVM classification system. The BA-based feature selection optimises the energy feature vectors. The optimised feature vectors are input to the LMSVM classification system for training and testing. Experimental results of classifying welding defects demonstrate that the proposed technique is highly robust, precise and reliable for ultrasonic defect classification.
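The feature-extraction step (wavelet packet decomposition followed by per-band energies) can be sketched with a Haar filter bank. This is an illustrative stand-in for the paper's wavelet packet transform, not its exact wavelet filters:

```python
def haar_step(signal):
    """One Haar analysis step: approximation and detail halves."""
    a = [(signal[2 * i] + signal[2 * i + 1]) / 2 ** 0.5
         for i in range(len(signal) // 2)]
    d = [(signal[2 * i] - signal[2 * i + 1]) / 2 ** 0.5
         for i in range(len(signal) // 2)]
    return a, d

def wavelet_packet_energies(signal, depth):
    """Relative energy in each frequency band of a Haar wavelet
    packet tree of the given depth (signal length must be a
    multiple of 2**depth). The result is an SVM feature vector."""
    nodes = [list(signal)]
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            a, d = haar_step(node)
            next_nodes.extend([a, d])
        nodes = next_nodes
    energies = [sum(x * x for x in node) for node in nodes]
    total = sum(energies) or 1.0
    return [e / total for e in energies]  # normalized, sums to 1
```

A constant echo signal, for instance, concentrates all its energy in the lowest (all-approximation) band.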
Pixel-based flood mapping from SAR imagery: a comparison of approaches
NASA Astrophysics Data System (ADS)
Landuyt, Lisa; Van Wesemael, Alexandra; Van Coillie, Frieke M. B.; Verhoest, Niko E. C.
2017-04-01
Due to their all-weather, day and night capabilities, SAR sensors have been shown to be particularly suitable for flood mapping applications. Thus, they can provide spatially distributed flood extent data which are valuable for calibrating, validating and updating flood inundation models. These models are an invaluable tool for water managers, to take appropriate measures in times of high water levels. Image analysis approaches to delineate flood extent on SAR imagery are numerous. They can be classified into two categories, i.e. pixel-based and object-based approaches. Pixel-based approaches, e.g. thresholding, are abundant and in general computationally inexpensive. However, large discrepancies between these techniques exist and often subjective user intervention is needed. Object-based approaches require more processing but allow for the integration of additional object characteristics, like contextual information and object geometry, and thus have significant potential to provide an improved classification result. As a benchmark, a selection of pixel-based techniques is applied to an ERS-2 SAR image of the 2006 flood event of the River Dee, United Kingdom. This selection comprises Otsu thresholding, Kittler & Illingworth thresholding, the Fine To Coarse segmentation algorithm and active contour modelling. The different classification results are evaluated and compared by means of several accuracy measures, including binary performance measures.
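Of the benchmarked techniques, Otsu thresholding is the simplest to reproduce: it picks the gray level that maximizes between-class variance, splitting water from non-water backscatter. A minimal version, operating on a flat list of integer gray values rather than a full SAR scene:

```python
def otsu_threshold(pixels, levels=256):
    """Otsu's method: choose the gray level that maximizes the
    between-class variance of the foreground/background split.
    `pixels` are integer gray values in [0, levels)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_b = sum_b = 0.0
    for t in range(levels):
        w_b += hist[t]               # background weight (counts)
        if w_b == 0:
            continue
        w_f = total - w_b            # foreground weight
        if w_f == 0:
            break
        sum_b += t * hist[t]
        mu_b = sum_b / w_b
        mu_f = (sum_all - sum_b) / w_f
        var_between = w_b * w_f * (mu_b - mu_f) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t  # pixels <= best_t form one class
```

On a strongly bimodal histogram, such as dark open water against bright land, the returned threshold falls between the two modes.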
NASA Astrophysics Data System (ADS)
Hamedianfar, Alireza; Shafri, Helmi Zulhaidi Mohd
2016-04-01
This paper integrates decision tree-based data mining (DM) and object-based image analysis (OBIA) to provide a transferable model for the detailed characterization of urban land-cover classes using WorldView-2 (WV-2) satellite images. Many articles have been published on OBIA in recent years based on DM for different applications. However, less attention has been paid to the generation of a transferable model for characterizing detailed urban land cover features. Three subsets of WV-2 images were used in this paper to generate transferable OBIA rule-sets. Many features were explored by using a DM algorithm, which created the classification rules as a decision tree (DT) structure from the first study area. The developed DT algorithm was applied to object-based classifications in the first study area. After this process, we validated the capability and transferability of the classification rules on the second and third subsets. Detailed ground truth samples were collected to assess the classification results. The first, second, and third study areas achieved 88%, 85%, and 85% overall accuracies, respectively. Results from the investigation indicate that DM was an efficient method to provide optimal and transferable classification rules for OBIA, which accelerates the rule-set creation stage in the OBIA classification domain.
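A transferable OBIA rule-set is essentially a decision-tree function applied unchanged to objects from new image subsets. A toy sketch with hypothetical feature names and thresholds (not the rules actually mined from the WV-2 data):

```python
def classify_object(obj):
    """Apply a DT-style rule-set to one image object.
    Feature names (ndvi, brightness, elongation) and thresholds
    are illustrative assumptions, not the study's mined rules."""
    if obj["ndvi"] > 0.3:
        return "vegetation"
    if obj["brightness"] > 180:
        # bright objects: compact ones read as bare soil,
        # elongated ones as roads
        return "bare soil" if obj["elongation"] < 2.0 else "road"
    return "roof" if obj["elongation"] < 2.0 else "shadow"
```

Transferability then amounts to calling the same function on segmented objects from the second and third subsets and comparing against their ground truth.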
Sánchez-González, Alain; García-Zapirain, Begoña; Maestro Saiz, Iratxe; Yurrebaso Santamaría, Izaskun
2015-01-01
Periodic activity in electroencephalography (PA-EEG) appears as a series of repetitive wave patterns that may arise in different cerebral regions and can be due to many different pathologies. Diagnosis based on PA-EEG is an arduous task for experts in clinical neurophysiology and relies mainly on other clinical features of the patient. Given this diagnostic difficulty, it is also very complicated to establish the prognosis of patients who present PA-EEG. The goal of this paper is to propose a method capable of determining patient prognosis based on characteristics of the PA-EEG activity. The approach, based on a parallel classification architecture and a majority vote system, has proven successful by obtaining a success rate of 81.94% in the classification of patient prognosis in our database.
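The majority-vote combination of the parallel classifiers can be sketched as follows. This is a generic combiner; the tie-breaking rule (first-listed classifier wins) is an assumption, not specified by the paper:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine parallel classifier outputs by majority vote.
    Ties are broken in favor of the earliest-listed classifier."""
    counts = Counter(predictions)
    top = max(counts.values())
    for p in predictions:          # first prediction with the top count wins
        if counts[p] == top:
            return p
```

Each parallel classifier emits a prognosis label for the same patient, and the combiner returns the most frequent one.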
NASA Astrophysics Data System (ADS)
Chellasamy, Menaka; Ferré, Ty Paul Andrew; Greve, Mogens Humlekrog
2016-07-01
Beginning in 2015, Danish farmers are obliged to meet specific crop diversification rules based on total land area and number of crops cultivated to be eligible for new greening subsidies. Hence, there is a need for the Danish government to extend its subsidy control system to verify farmers' declarations to warrant greening payments under the new crop diversification rules. Remote Sensing (RS) technology has been used since 1992 to control farmers' subsidies in Denmark. However, a proper RS-based approach is yet to be finalised to validate new crop diversity requirements designed for assessing compliance under the recent subsidy scheme (2014-2020). This study uses an ensemble classification approach (proposed by the authors in previous studies) for validating the crop diversity requirements of the new rules. The approach uses a neural network ensemble classification system with bi-temporal (spring and early summer) WorldView-2 imagery (WV2) and includes the following steps: (1) automatic computation of pixel-based prediction probabilities using multiple neural networks; (2) quantification of the classification uncertainty using Endorsement Theory (ET); (3) discrimination of crop pixels and validation of the crop diversification rules at farm level; and (4) identification of farmers who are violating the requirements for greening subsidies. The prediction probabilities are computed by a neural network ensemble supplied with training samples selected automatically using farmer-declared parcels (field vectors containing crop information and the field boundary of each crop). Crop discrimination is performed by considering a set of conclusions derived from individual neural networks based on ET. Verification of the diversification rules is performed by incorporating pixel-based classification uncertainty or confidence intervals with the class labels at the farmer level.
The proposed approach was tested with WV2 imagery acquired in 2011 for a study area in Vennebjerg, Denmark, containing 132 farmers, 1258 fields, and 18 crops. The classification results obtained show an overall accuracy of 90.2%. The RS-based results suggest that 36 farmers did not follow the crop diversification rules that would qualify for the greening subsidies. When compared to the farmers' reported crop mixes, irrespective of the rule, the RS results indicate that false crop declarations were made by 8 farmers, covering 15 fields. If the farmers' reports had been submitted for the new greening subsidies, 3 farmers would have made a false claim, while the remaining 5 farmers met the required crop proportions despite submitting a false crop code, owing to their small holding sizes. The RS results would have supported 96 farmers for greening subsidy claims, with no instances of suggesting a greening subsidy for a holding that the farmer did not report as meeting the required conditions. These results suggest that the proposed RS-based method shows great promise for validating the new greening subsidies in Denmark.
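The farm-level rule check in step (3) can be illustrated with the commonly cited EU greening thresholds. Treat the exact area bands and percentages below as assumptions; the study applies Denmark's implementation of the rules:

```python
def diversification_ok(crop_areas_ha):
    """Simplified EU-style crop diversification check for one farm,
    given a mapping crop -> cultivated area in hectares.
    Thresholds follow the commonly cited 2015 greening rules and
    are illustrative."""
    total = sum(crop_areas_ha.values())
    n_crops = len(crop_areas_ha)
    shares = sorted((a / total for a in crop_areas_ha.values()),
                    reverse=True)
    if total <= 10:
        return True                        # small holdings are exempt
    if total <= 30:
        # at least 2 crops, main crop at most 75% of arable area
        return n_crops >= 2 and shares[0] <= 0.75
    # over 30 ha: at least 3 crops, main crop <= 75%,
    # two main crops together <= 95%
    return (n_crops >= 3 and shares[0] <= 0.75
            and shares[0] + shares[1] <= 0.95)
```

In the RS pipeline, `crop_areas_ha` would come from the classified crop map aggregated per farmer, so the same check runs on both declared and remotely sensed crop mixes.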
High-order distance-based multiview stochastic learning in image classification.
Yu, Jun; Rui, Yong; Tang, Yuan Yan; Tao, Dacheng
2014-12-01
How do we find all images in a larger set of images that have a specific content? Or estimate the position of a specific object relative to the camera? Image classification methods, like the support vector machine (supervised) and transductive support vector machine (semi-supervised), are invaluable tools for the applications of content-based image retrieval, pose estimation, and optical character recognition. However, these methods can only handle images represented by a single feature. In many cases, different features (or multiview data) can be obtained, and how to efficiently utilize them is a challenge. The traditional approach of concatenating features from different views into a single long vector is inappropriate, because each view has its own statistical properties and physical interpretation. In this paper, we propose a high-order distance-based multiview stochastic learning (HD-MSL) method for image classification. HD-MSL effectively combines varied features into a unified representation and integrates the labeling information based on a probabilistic framework. In comparison with the existing strategies, our approach adopts the high-order distance obtained from the hypergraph to replace pairwise distance in estimating the probability matrix of data distribution. In addition, the proposed approach can automatically learn a combination coefficient for each view, which plays an important role in utilizing the complementary information of multiview data. An alternating optimization is designed to solve the objective functions of HD-MSL and obtain the view-combination coefficients and classification scores simultaneously. Experiments on two real-world datasets demonstrate the effectiveness of HD-MSL in image classification.
Building confidence and credibility into CAD with belief decision trees
NASA Astrophysics Data System (ADS)
Affenit, Rachael N.; Barns, Erik R.; Furst, Jacob D.; Rasin, Alexander; Raicu, Daniela S.
2017-03-01
Creating classifiers for computer-aided diagnosis in the absence of ground truth is a challenging problem. Using experts' opinions as reference truth is difficult because the variability in the experts' interpretations introduces uncertainty in the labeled diagnostic data. This uncertainty translates into noise, which can significantly affect the performance of any classifier on test data. To address this problem, we propose a new label set weighting approach to combine the experts' interpretations and their variability, as well as a selective iterative classification (SIC) approach that is based on conformal prediction. Using the NIH/NCI Lung Image Database Consortium (LIDC) dataset, in which four radiologists interpreted the lung nodule characteristics, including the degree of malignancy, we illustrate the benefits of the proposed approach. Our results show that the proposed 2-label-weighted approach significantly outperforms the accuracy of the original 5-label and 2-label-unweighted classification approaches by 39.9% and 7.6%, respectively. We also found that the weighted 2-label models produce root mean square error (RMSE) distributions whose skewness is higher by 1.05 and 0.61 for non-SIC and SIC, respectively. When each approach was combined with selective iterative classification, this further improved the accuracy of classification for the 2-label-weighted approach by 7.5% over the original, and improved the skewness of the 5-label and 2-label-unweighted approaches by 0.22 and 0.44, respectively.
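One way to picture a weighted 2-label reduction of the four radiologists' 1-5 malignancy ratings is to weight each rating by its distance from the neutral midpoint. The scheme below is a hypothetical illustration of that idea, not the paper's exact label-set weighting:

```python
def weighted_binary_label(ratings, neutral=3):
    """Collapse several experts' 1-5 malignancy ratings into a
    binary label, weighting each rating quadratically by its
    distance from the neutral midpoint (illustrative scheme)."""
    # a confident "5" counts more than a hesitant "4"
    score = sum((r - neutral) * abs(r - neutral) for r in ratings)
    if score > 0:
        return "malignant"
    if score < 0:
        return "benign"
    return "uncertain"   # experts balanced around the midpoint
```

Cases that come out "uncertain" are natural candidates for the selective iterative classification step, which defers low-confidence predictions.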
Model selection for anomaly detection
NASA Astrophysics Data System (ADS)
Burnaev, E.; Erofeev, P.; Smolyakov, D.
2015-12-01
Anomaly detection based on one-class classification algorithms is broadly used in many applied domains like image processing (e.g. detecting whether a patient is "cancerous" or "healthy" from a mammography image), network intrusion detection, etc. The performance of an anomaly detection algorithm crucially depends on the kernel used to measure similarity in a feature space. The standard approaches to kernel selection used in two-class classification problems (e.g. cross-validation) cannot be used directly due to the specific nature of the data (the absence of data from a second, abnormal class). In this paper we generalize several kernel selection methods from the binary-class case to the case of one-class classification and perform an extensive comparison of these approaches using both synthetic and real-world data.
Li, Pengfei; Jiang, Yongying; Xiang, Jiawei
2014-01-01
To deal with the difficulty of obtaining a large number of fault samples under practical conditions in mechanical fault diagnosis, a hybrid method that combines wavelet packet decomposition and support vector classification (SVC) is proposed. The wavelet packet is employed to decompose the vibration signal to obtain the energy ratio in each frequency band. Taking the energy ratios as feature vectors, the pattern recognition results are obtained by the SVC. The rolling bearing and gear fault diagnostic results on a typical experimental platform show that the present approach is robust to noise, has higher classification accuracy, and thus provides a better way to diagnose mechanical faults when fault samples are scarce. PMID:24688361
A classification tree based modeling approach for segment related crashes on multilane highways.
Pande, Anurag; Abdel-Aty, Mohamed; Das, Abhishek
2010-10-01
This study presents a classification tree based alternative to crash frequency analysis for analyzing crashes on mid-block segments of multilane arterials. The traditional approach of modeling counts of crashes that occur over a period of time works well for intersection crashes, where each intersection itself provides a well-defined unit over which to aggregate the crash data. However, in the case of mid-block segments, the crash frequency based approach requires segmentation of the arterial corridor into segments of arbitrary lengths. In this study we have used random samples of time, day of week, and location (i.e., milepost) combinations and compared them with the sample of crashes from the same arterial corridor. For crash and non-crash cases, geometric design/roadside and traffic characteristics were derived based on their milepost locations. The variables used in the analysis are non-event specific and therefore more relevant for roadway safety feature improvement programs. The first classification tree model compares all crashes with the non-crash data; then four groups of crashes (rear-end, lane-change related, pedestrian, and single-vehicle/off-road crashes) are separately compared to the non-crash cases. The classification tree models provide a list of significant variables as well as a measure for separating crash from non-crash cases. ADT along with time of day/day of week are significantly related to all crash types, with different groups of crashes being more likely to occur at different times. From the classification performance of different models it was apparent that using non-event specific information may not be suitable for single-vehicle/off-road crashes. The study provides the safety analysis community an additional tool to assess safety without having to aggregate the corridor crash data over arbitrary segment lengths. Copyright © 2010. Published by Elsevier Ltd.
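At the heart of a classification tree model is a search for the split that best separates crash from non-crash cases. A minimal Gini-based version for a single continuous variable (e.g. ADT), shown as a generic sketch rather than the study's CART implementation:

```python
def gini(labels):
    """Gini impurity of a label multiset."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(values, labels):
    """Threshold on one variable that most reduces Gini impurity
    between crash and non-crash cases; returns (threshold, gain)."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    parent = gini(labels)
    best_t, best_gain = None, 0.0
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue                       # no split between equal values
        t = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [lab for v, lab in pairs[:i]]
        right = [lab for v, lab in pairs[i:]]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        if parent - child > best_gain:
            best_gain, best_t = parent - child, t
    return best_t, best_gain
```

Applying this recursively over all candidate variables yields the tree and, as a by-product, the list of significant variables the study reports.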
NASA Astrophysics Data System (ADS)
Marwaha, Richa; Kumar, Anil; Kumar, Arumugam Senthil
2015-01-01
Our primary objective was to explore a classification algorithm for thermal hyperspectral data. Minimum noise fraction is applied to thermal hyperspectral data and eight pixel-based classifiers, i.e., constrained energy minimization, matched filter, spectral angle mapper (SAM), adaptive coherence estimator, orthogonal subspace projection, mixture-tuned matched filter, target-constrained interference-minimized filter, and mixture-tuned target-constrained interference-minimized filter, are tested. The long-wave infrared (LWIR) has not yet been exploited for classification purposes. The LWIR data contain emissivity and temperature information about an object. A highest overall accuracy of 90.99% was obtained using the SAM algorithm for the combination of thermal data with a colored digital photograph. Similarly, an object-oriented approach is applied to thermal data. The image is segmented into meaningful objects based on properties such as geometry and length: pixels are grouped into objects using a watershed algorithm, and a supervised classification algorithm, i.e., a support vector machine (SVM), is then applied. The best algorithm in the pixel-based category is the SAM technique. SVM is useful for thermal data, providing a high accuracy of 80.00% at a scale value of 83 and a merge value of 90, whereas for the combination of thermal data with a colored digital photograph, SVM gives the highest accuracy of 85.71% at a scale value of 82 and a merge value of 90.
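The best pixel-based classifier found here, SAM, assigns each pixel to the reference spectrum with the smallest spectral angle. A minimal sketch; the library spectra below are illustrative, not measured emissivity curves:

```python
import math

def spectral_angle(pixel, reference):
    """Spectral angle (radians) between a pixel spectrum and a
    reference spectrum; smaller angles mean a closer match.
    Insensitive to overall illumination scaling."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    cos = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos)

def sam_classify(pixel, library):
    """Assign the class whose reference spectrum makes the
    smallest spectral angle with the pixel."""
    return min(library, key=lambda cls: spectral_angle(pixel, library[cls]))
```

Because the angle ignores magnitude, a pixel that is a scaled copy of a reference spectrum matches it exactly (angle 0).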
Change classification in SAR time series: a functional approach
NASA Astrophysics Data System (ADS)
Boldt, Markus; Thiele, Antje; Schulz, Karsten; Hinz, Stefan
2017-10-01
Change detection represents a broad field of research in SAR remote sensing, consisting of many different approaches. Besides the simple recognition of change areas, the analysis of the type, category or class of the change areas is at least as important for creating a comprehensive result. Conventional strategies for change classification are based on supervised or unsupervised landuse / landcover classifications. The main drawback of such approaches is that the quality of the classification result directly depends on the selection of training and reference data. Additionally, supervised processing methods require an experienced operator who capably selects the training samples. This training step is not necessary when using unsupervised strategies, but nevertheless meaningful reference data must be available for identifying the resulting classes. Consequently, an experienced operator is indispensable. In this study, an innovative concept for the classification of changes in SAR time series data is proposed. In view of the drawbacks of the traditional strategies given above, it works without any training data. Moreover, the method can be applied by an operator who does not yet have detailed knowledge of the scene; this knowledge is provided by the algorithm. The final step of the procedure, whose main aspect is the iterative optimization of an initial class scheme with respect to the categorized change objects, is the assignment of these objects to the finally resulting classes. This assignment step is the subject of this paper.
Wang, Xue; Bi, Dao-wei; Ding, Liang; Wang, Sheng
2007-01-01
The recent availability of low cost and miniaturized hardware has allowed wireless sensor networks (WSNs) to retrieve audio and video data in real world applications, which has fostered the development of wireless multimedia sensor networks (WMSNs). Resource constraints and challenging multimedia data volumes make the development of efficient algorithms for in-network processing of multimedia content imperative. This paper proposes solving problems in the domain of WMSNs from the perspective of multi-agent systems. The multi-agent framework enables flexible network configuration and efficient collaborative in-network processing. The focus is placed on target classification in WMSNs where audio information is retrieved by microphones. To deal with the uncertainties related to audio information retrieval, the statistical approaches of power spectral density estimates, principal component analysis and Gaussian process classification are employed. A multi-agent negotiation mechanism is specially developed to efficiently utilize limited resources and simultaneously enhance classification accuracy and reliability. The negotiation is composed of two phases, where an auction-based approach is first exploited to allocate the classification task among the agents and then individual agent decisions are combined by the committee decision mechanism. Simulation experiments with real-world data are conducted and the results show that the proposed statistical approaches and negotiation mechanism not only reduce memory and computation requirements in WMSNs but also significantly enhance classification accuracy and reliability. PMID:28903223
Ensemble of classifiers for confidence-rated classification of NDE signal
NASA Astrophysics Data System (ADS)
Banerjee, Portia; Safdarnejad, Seyed; Udpa, Lalita; Udpa, Satish
2016-02-01
Ensembles of classifiers, in general, aim to improve classification accuracy by combining results from multiple weak hypotheses into a single strong classifier through weighted majority voting. Improved versions of ensembles of classifiers generate self-rated confidence scores which estimate the reliability of each of their predictions, and boost the classifier using these confidence-rated predictions. However, such a confidence metric is based only on the rate of correct classification. In existing works, although ensembles of classifiers have been widely used in computational intelligence, the effect of all factors of unreliability on the confidence of classification is largely overlooked. In NDE, classification results are affected by the inherent ambiguity of classification, non-discriminative features, inadequate training samples and measurement noise. In this paper, we extend the existing ensemble classification by maximizing the confidence of every classification decision in addition to minimizing the classification error. Initial results of the approach on data from eddy current inspection show improvement in the classification performance of defect and non-defect indications.
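The confidence-rated weighted vote being extended here can be illustrated with the standard AdaBoost-style weighting, where each weak classifier's weight grows as its error rate shrinks. This is a generic sketch of the baseline, not the paper's NDE-specific confidence metric:

```python
import math

def boosted_vote(predictions, errors):
    """Confidence-rated weighted vote. Each weak classifier's
    +/-1 prediction is weighted by alpha = 0.5*ln((1-err)/err),
    as in AdaBoost; the sign of the weighted sum is the decision
    and its magnitude serves as a confidence score."""
    score = 0.0
    for h, err in zip(predictions, errors):
        err = min(max(err, 1e-9), 1 - 1e-9)   # guard degenerate error rates
        alpha = 0.5 * math.log((1 - err) / err)
        score += alpha * h
    label = 1 if score >= 0 else -1
    return label, abs(score)
```

A low-confidence decision (small `abs(score)`) flags an indication, e.g. a borderline eddy current signal, whose classification should be treated as unreliable.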
Zhang, Chi; Zhang, Ge; Chen, Ke-ji; Lu, Ai-ping
2016-04-01
The development of an effective classification method for human health conditions is essential for precise diagnosis and delivery of tailored therapy to individuals. The contemporary disease classification system has properties that limit its information content and usability. Chinese medicine pattern classification has been incorporated with disease classification, and this integrated classification method became more precise because of the increased understanding of the molecular mechanisms. However, we are still facing the complexity of diseases and patterns in the classification of health conditions. With continuing advances in omics methodologies and instrumentation, we are proposing a new classification approach: molecular module classification, which applies molecular modules to classifying human health status. The initiative would precisely define health status, provide accurate diagnoses, optimize therapeutics and improve new drug discovery strategies. Therefore, there would be no current disease diagnosis and no disease pattern classification; in the future, a new medicine based on this classification, molecular module medicine, could redefine health statuses and reshape clinical practice.
Alam, Daniel; Ali, Yaseen; Klem, Christopher; Coventry, Daniel
2016-11-01
Orbito-malar reconstruction after oncological resection represents one of the most challenging facial reconstructive procedures. Until the last few decades, rehabilitation was typically prosthesis-based, with a limited role for surgery. The advent of microsurgical techniques allowed large-volume tissue reconstitution from a distant donor site, revolutionizing the potential approaches to these defects. The authors report a novel surgery-based algorithm and a classification scheme for complete midface reconstruction with a foundation in the Gillies principles of like-to-like reconstruction and with a significant role for computer-aided virtual planning. With this approach, the authors have been able to achieve significantly better patient outcomes. Copyright © 2016 Elsevier Inc. All rights reserved.
Tensor-based classification of an auditory mobile BCI without a subject-specific calibration phase.
Zink, Rob; Hunyadi, Borbála; Huffel, Sabine Van; Vos, Maarten De
2016-04-01
One of the major drawbacks in EEG brain-computer interfaces (BCI) is the need for subject-specific training of the classifier. By removing the need for a supervised calibration phase, new users could potentially explore a BCI faster. In this work we aim to remove this subject-specific calibration phase and allow direct classification. We explore canonical polyadic decompositions and block term decompositions of the EEG. These methods exploit structure in higher dimensional data arrays called tensors. The BCI tensors are constructed by concatenating ERP templates from other subjects to a target and non-target trial, and the inherent structure guides a decomposition that allows accurate classification. We illustrate the new method on data from a three-class auditory oddball paradigm. The presented approach leads to a fast and intuitive classification with accuracies competitive with a supervised and cross-validated LDA approach. The described methods are a promising new way of classifying BCI data, with a more direct link to the original P300 ERP signal than the conventional and widely used supervised approaches.
A comparison of rule-based and machine learning approaches for classifying patient portal messages.
Cronin, Robert M; Fabbri, Daniel; Denny, Joshua C; Rosenbloom, S Trent; Jackson, Gretchen Purcell
2017-09-01
Secure messaging through patient portals is an increasingly popular way that consumers interact with healthcare providers. The increasing burden of secure messaging can affect clinic staffing and workflows. Manual management of portal messages is costly and time consuming. Automated classification of portal messages could potentially expedite message triage and delivery of care. We developed automated patient portal message classifiers with rule-based and machine learning techniques using bag of words and natural language processing (NLP) approaches. To evaluate classifier performance, we used a gold standard of 3253 portal messages manually categorized using a taxonomy of communication types (i.e., main categories of informational, medical, logistical, social, and other communications, and subcategories including prescriptions, appointments, problems, tests, follow-up, contact information, and acknowledgement). We evaluated our classifiers' accuracies in identifying individual communication types within portal messages with the area under the receiver operating characteristic curve (AUC). Portal messages often contain more than one type of communication. To predict all communication types within single messages, we used the Jaccard Index. We extracted the variables of importance for the random forest classifiers. The best performing approaches to classification for the major communication types were: logistic regression for medical communications (AUC: 0.899); basic (rule-based) for informational communications (AUC: 0.842); and random forests for social communications and logistical communications (AUCs: 0.875 and 0.925, respectively). The best performing classifier for individual communication subtypes was random forests for Logistical-Contact Information (AUC: 0.963).
The Jaccard Indices by approach were: basic classifier, Jaccard Index: 0.674; Naïve Bayes, Jaccard Index: 0.799; random forests, Jaccard Index: 0.859; and logistic regression, Jaccard Index: 0.861. For medical communications, the most predictive variables were NLP concepts (e.g., Temporal_Concept, which maps to 'morning', 'evening' and Idea_or_Concept which maps to 'appointment' and 'refill'). For logistical communications, the most predictive variables contained similar numbers of NLP variables and words (e.g., Telephone mapping to 'phone', 'insurance'). For social and informational communications, the most predictive variables were words (e.g., social: 'thanks', 'much', informational: 'question', 'mean'). This study applies automated classification methods to the content of patient portal messages and evaluates the application of NLP techniques on consumer communications in patient portal messages. We demonstrated that random forest and logistic regression approaches accurately classified the content of portal messages, although the best approach to classification varied by communication type. Words were the most predictive variables for classification of most communication types, although NLP variables were most predictive for medical communication types. As adoption of patient portals increases, automated techniques could assist in understanding and managing growing volumes of messages. Further work is needed to improve classification performance to potentially support message triage and answering. Copyright © 2017 Elsevier B.V. All rights reserved.
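The multi-label evaluation above uses the Jaccard Index, which for one message compares the predicted set of communication types with the annotated set:

```python
def jaccard(predicted, actual):
    """Jaccard index between predicted and actual sets of
    communication types for one message: |intersection| / |union|.
    Two empty sets are treated as a perfect match."""
    p, a = set(predicted), set(actual)
    if not p and not a:
        return 1.0
    return len(p & a) / len(p | a)
```

Averaging this score over all messages gives the per-approach indices the study reports (e.g. 0.859 for random forests).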
Guo, Yang; Liu, Shuhui; Li, Zhanhuai; Shang, Xuequn
2018-04-11
The classification of cancer subtypes is of great importance to cancer disease diagnosis and therapy. Many supervised learning approaches have been applied to cancer subtype classification in the past few years, especially deep learning based approaches. Recently, the deep forest model has been proposed as an alternative to deep neural networks that learns hyper-representations by using cascaded ensembles of decision trees. It has been shown that the deep forest model has competitive or even better performance than deep neural networks to some extent. However, the standard deep forest model may face overfitting and ensemble diversity challenges when dealing with small sample sizes and high-dimensional biology data. In this paper, we propose a deep learning model, so-called BCDForest, to address cancer subtype classification on small-scale biology datasets, which can be viewed as a modification of the standard deep forest model. The BCDForest is distinguished from the standard deep forest model by the following two main contributions: First, a multi-class-grained scanning method is proposed to train multiple binary classifiers to encourage diversity of the ensemble. Meanwhile, the fitting quality of each classifier is considered in representation learning. Second, we propose a boosting strategy to emphasize more important features in cascade forests, thereby propagating the benefits of discriminative features among cascade layers to improve the classification performance. Systematic comparison experiments on both microarray and RNA-Seq gene expression datasets demonstrate that our method consistently outperforms the state-of-the-art methods in application of cancer subtype classification. The multi-class-grained scanning and boosting strategy in our model provide an effective solution to ease the overfitting challenge and improve the robustness of the deep forest model working on small-scale data.
Our model provides a useful approach to the classification of cancer subtypes by using deep learning on high-dimensional and small-scale biology data.
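The cascade idea that BCDForest modifies can be sketched generically: each layer's forests emit class-probability vectors that are concatenated to the raw features before the next layer. The sketch below shows one such layer with scikit-learn on synthetic data; it illustrates the generic deep forest cascade only, not the paper's multi-class-grained scanning or boosting additions.

```python
# Minimal sketch of one cascade-forest layer (the generic "deep forest"
# idea the paper builds on): out-of-fold class probabilities from two
# forests are appended to the raw features for the next layer.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, n_classes=3, random_state=0)

def cascade_layer(X, y):
    """Return augmented features: raw X plus out-of-fold class probabilities."""
    forests = [RandomForestClassifier(n_estimators=50, random_state=0),
               ExtraTreesClassifier(n_estimators=50, random_state=0)]
    probas = [cross_val_predict(f, X, y, cv=3, method="predict_proba")
              for f in forests]
    return np.hstack([X] + probas)

X_aug = cascade_layer(X, y)   # 20 raw + 2 forests x 3 classes = 26 columns
```

Stacking several such layers, each trained on the previous layer's augmented output, yields the cascade structure.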
Torrens, Francisco; Castellano, Gloria
2014-06-05
Pesticide residues in wine were analyzed by liquid chromatography-tandem mass spectrometry. Retentions are modelled by structure-property relationships. Bioplastic evolution is an evolutionary perspective conjugating the effect of acquired characters with the principles of evolutionary indeterminacy, morphological determination and natural selection; its application to the design of a co-ordination index barely improves correlations. Fractal dimensions and the partition coefficient differentiate the pesticides. The classification algorithms are based on information entropy and its production. The pesticides allow a structural classification by nonplanarity and by the number of O, S, N and Cl atoms and cycles; different behaviours depend on the number of cycles. The novelty of the approach is that the structural parameters are related to retentions. When the procedures are applied to moderate-sized sets, an excessive number of results appears, compatible with the data suffering a combinatorial explosion; however, the equipartition conjecture selects the criterion resulting from classification between hierarchical trees. Information entropy permits classifying the compounds in agreement with principal component analyses. The periodic classification shows that pesticides in the same group present similar properties, and those also in the same period show maximum resemblance. The advantage of the classification is that it predicts retentions for molecules not included in the categorization. The classification extends to phenyl/sulphonylureas, and the application will be to predict their retentions.
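The quantity underlying the information-entropy classification algorithms mentioned above is the Shannon entropy of a discrete distribution. A minimal sketch, using hypothetical structural class labels rather than the paper's pesticide descriptors:

```python
# Shannon entropy of an empirical label distribution, the quantity at
# the core of information-entropy classification. The class labels
# below are invented for illustration only.
import math
from collections import Counter

def shannon_entropy(symbols):
    """H = -sum p_i * log2(p_i) over empirical symbol frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical structural classes assigned by nonplanarity and
# heteroatom/cycle counts:
labels = ["A", "A", "B", "B", "C", "C", "C", "C"]
H = shannon_entropy(labels)   # p = (1/4, 1/4, 1/2) -> H = 1.5 bits
```

Lower entropy indicates a more homogeneous grouping; comparing entropies of candidate partitions is one way such algorithms rank classifications.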
NASA Astrophysics Data System (ADS)
Miao, Minmin; Zeng, Hong; Wang, Aimin; Zhao, Fengkui; Liu, Feixiang
2017-09-01
Electroencephalogram (EEG)-based motor imagery (MI) brain-computer interfaces (BCIs) have shown their effectiveness for the control of rehabilitation devices designed for large body parts of patients with neurologic impairments. To validate the feasibility of using EEG to decode the MI of a single index finger and of constructing a BCI-enhanced finger rehabilitation system, we collected EEG data during right-hand index finger MI and rest state for five healthy subjects and proposed a pattern recognition approach for classifying these two mental states. First, Fisher's linear discriminant criteria and power spectral density analysis were used to analyze the event-related desynchronization patterns. Second, both band power and approximate entropy were extracted as features. Third, aiming to eliminate abnormal samples in the dictionary and improve the classification performance of the conventional sparse representation-based classification (SRC) method, we proposed a novel dictionary-cleaned sparse representation-based classification (DCSRC) method for final classification. The experimental results show that the proposed DCSRC method gives better classification accuracies than SRC, with an average classification accuracy of 81.32% obtained across the five subjects. It is thus demonstrated that single right-hand index finger MI can be decoded from the sensorimotor rhythms, and that the feature patterns of index finger MI and rest state can be well recognized for robotic exoskeleton initiation.
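The baseline that DCSRC modifies, plain sparse representation-based classification, codes a test sample as a sparse combination of training columns and assigns the class whose atoms give the smallest reconstruction residual. A minimal sketch on synthetic vectors (not EEG features), using an L1 solver as the sparse coder:

```python
# Sketch of plain sparse representation-based classification (SRC),
# the baseline the DCSRC method cleans: a test sample is sparsely
# coded over the training dictionary, then assigned to the class with
# the smallest class-restricted reconstruction residual.
# Dictionary and sample are synthetic, for illustration only.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
# Dictionary: 10 training samples per class, 20-dim features.
D0 = rng.normal(0, 0.1, (20, 10)); D0[:5] += 1.0     # class 0 pattern
D1 = rng.normal(0, 0.1, (20, 10)); D1[10:15] += 1.0  # class 1 pattern
D = np.hstack([D0, D1])
labels = np.array([0] * 10 + [1] * 10)

def src_predict(D, labels, y, alpha=0.01):
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
    coder.fit(D, y)                          # sparse code x: y ~ D @ x
    x = coder.coef_
    residuals = []
    for c in np.unique(labels):
        xc = np.where(labels == c, x, 0.0)   # keep only class-c coefficients
        residuals.append(np.linalg.norm(y - D @ xc))
    return int(np.argmin(residuals))

test_sample = D0.mean(axis=1) + rng.normal(0, 0.05, 20)
pred = src_predict(D, labels, test_sample)
```

The "dictionary cleaning" of the paper would additionally remove abnormal training columns from D before coding.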
Classification of EEG abnormalities in partial epilepsy with simultaneous EEG-fMRI recordings.
Pedreira, C; Vaudano, A E; Thornton, R C; Chaudhary, U J; Vulliemoz, S; Laufs, H; Rodionov, R; Carmichael, D W; Lhatoo, S D; Guye, M; Quian Quiroga, R; Lemieux, L
2014-10-01
Scalp EEG recordings and the classification of interictal epileptiform discharges (IED) in patients with epilepsy provide valuable information about the epileptogenic network, particularly by defining the boundaries of the "irritative zone" (IZ), and hence are helpful during pre-surgical evaluation of patients with severe refractory epilepsies. The current detection and classification of epileptiform signals essentially rely on expert observers. This is a very time-consuming procedure, which also leads to inter-observer variability. Here, we propose a novel approach to automatically classify epileptic activity and show how this method provides critical and reliable information on IZ localization beyond that provided by previous approaches. We applied Wave_clus, an automatic spike sorting algorithm, to the classification of IED visually identified from pre-surgical simultaneous electroencephalogram-functional magnetic resonance imaging (EEG-fMRI) recordings in 8 patients affected by refractory partial epilepsy, candidates for surgery. For each patient, two fMRI analyses were performed: one based on the visual classification and one based on the algorithmic sorting. This novel approach successfully identified a total of 29 IED classes (compared to 26 for visual identification). The general concordance between methods was good, providing a full match of EEG patterns in 2 cases, additional EEG information in 2 other cases and, in general, covering EEG patterns of the same areas as the expert classification in 7 of the 8 cases. Most notably, evaluation of the method with EEG-fMRI data analysis showed hemodynamic maps related to the majority of IED classes, representing improved performance over the visual IED classification-based analysis (72% versus 50%).
Furthermore, the IED-related BOLD changes revealed by using the algorithm were localized within the presumed IZ for a larger number of IED classes (9) and in a greater number of patients than the expert classification (7 and 5, respectively). In only one case did the new algorithm result in fewer classes and activation areas. We propose that the use of automated spike sorting algorithms to classify IED provides an efficient tool for mapping IED-related fMRI changes and increases the clinical value of EEG-fMRI for the pre-surgical assessment of patients with severe epilepsy. Copyright © 2014 Elsevier Inc. All rights reserved.
Empirical Analysis and Automated Classification of Security Bug Reports
NASA Technical Reports Server (NTRS)
Tyo, Jacob P.
2016-01-01
With the ever-expanding amount of sensitive data being placed into computer systems, the need for effective cybersecurity is of utmost importance. However, there is a shortage of detailed empirical studies of security vulnerabilities from which cybersecurity metrics and best practices could be determined. This thesis has two main research goals: (1) to explore the distribution and characteristics of security vulnerabilities based on the information provided in bug tracking systems and (2) to develop data analytics approaches for automatic classification of bug reports as security or non-security related. This work is based on using three NASA datasets as case studies. The empirical analysis showed that the majority of software vulnerabilities belong to only a small number of types. Addressing these types of vulnerabilities will consequently lead to cost-efficient improvement of software security. Since this analysis requires labeling of each bug report in the bug tracking system, we explored using machine learning to automate the classification of each bug report as security related or not (two-class classification), as well as of each security-related bug report as a specific security type (multiclass classification). In addition to using supervised machine learning algorithms, a novel unsupervised machine learning approach is proposed. An accuracy of 92%, recall of 96%, precision of 92%, probability of false alarm of 4%, F-score of 81%, and G-score of 90% were the best results achieved during two-class classification. Furthermore, an accuracy of 80%, recall of 80%, precision of 94%, and F-score of 85% were the best results achieved during multiclass classification.
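A common baseline for the two-class task described above is to vectorize the report text and fit a supervised text classifier. The sketch below uses TF-IDF features with multinomial Naive Bayes on invented toy reports; the thesis' actual features, algorithms, and NASA datasets are not reproduced here.

```python
# Hedged sketch of the two-class task: flag a bug report as security
# related or not from its text. Reports and the chosen model are
# illustrative, not those used in the thesis.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

reports = [
    "buffer overflow when parsing oversized packet header",
    "privilege escalation via unchecked user credentials",
    "sql injection in telemetry query endpoint",
    "button misaligned on settings page",
    "typo in help text of export dialog",
    "crash when window resized during startup animation",
]
labels = [1, 1, 1, 0, 0, 0]   # 1 = security related

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(reports, labels)

pred = model.predict(["possible overflow in packet parsing code"])[0]
```

Metrics such as recall and probability of false alarm, as reported in the thesis, would then be computed on a held-out labelled set.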
Cognitive approaches for patterns analysis and security applications
NASA Astrophysics Data System (ADS)
Ogiela, Marek R.; Ogiela, Lidia
2017-08-01
This paper presents new opportunities for developing innovative solutions for semantic pattern classification and visual cryptography, based on cognitive and bio-inspired approaches. Such techniques can be used to evaluate the meaning of analyzed patterns or encrypted information, and allow that meaning to be incorporated into the classification task or encryption process. They also allow crypto-biometric solutions to be used to extend personalized cryptography methodologies based on visual pattern analysis. In particular, the application of cognitive information systems for the semantic analysis of different patterns will be presented, and a novel application of such systems for visual secret sharing will be described. Visual shares for divided information can be created based on a threshold procedure, which may depend on personal abilities to recognize image details visible in the divided images.
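The secret-sharing idea mentioned above can be illustrated by its simplest instance, a 2-of-2 scheme for a binary image: each share alone is uniformly random noise, and only combining both shares recovers the secret. This is a simplification of the paper's visual-share procedure, shown here with XOR on a toy image:

```python
# Minimal 2-of-2 secret sharing of a binary image: share1 is pure
# noise, share2 is noise XOR secret, so either share alone reveals
# nothing and XOR-ing both recovers the secret exactly.
# The "image" is a random toy pattern for illustration.
import numpy as np

rng = np.random.default_rng(42)
secret = (rng.random((8, 8)) < 0.3).astype(np.uint8)       # toy binary image

share1 = rng.integers(0, 2, secret.shape, dtype=np.uint8)  # uniform noise
share2 = share1 ^ secret                                   # noise ^ secret

recovered = share1 ^ share2
```

Classical visual cryptography replaces the XOR with pixel-expansion patterns so that physically overlaying printed shares performs the reconstruction; the threshold variants mentioned in the abstract generalize this to k-of-n schemes.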
Improving Hospital-Wide Early Resource Allocation through Machine Learning.
Gartner, Daniel; Padman, Rema
2015-01-01
The objective of this paper is to evaluate the extent to which early determination of diagnosis-related groups (DRGs) can be used for better allocation of scarce hospital resources. When elective patients seek admission, the true DRG, currently determined only at discharge, is unknown. We approach the problem of early DRG determination in three stages: (1) test how much a Naïve Bayes classifier can improve classification accuracy as compared to a hospital's current approach; (2) develop a statistical program that makes admission and scheduling decisions based on the patients' clinical pathways and scarce hospital resources; and (3) feed the DRG as classified by the Naïve Bayes classifier and the hospital's baseline approach into the model (which we evaluate in simulation). Our results reveal that the DRG grouper performs poorly in classifying the DRG correctly before admission, while the Naïve Bayes approach substantially improves the classification task. The results from connecting the classification method with the mathematical program also reveal that resource allocation decisions can be more effective and efficient with the hybrid approach.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Webb-Robertson, Bobbie-Jo M.; Wiberg, Holli K.; Matzke, Melissa M.
In this review, we apply selected imputation strategies to label-free liquid chromatography–mass spectrometry (LC–MS) proteomics datasets to evaluate their accuracy with respect to metrics of variance and classification. We evaluate several commonly used imputation approaches on their individual merits and discuss the caveats of each approach with respect to the example LC–MS proteomics data. In general, local similarity-based approaches, such as the regularized expectation maximization and least-squares adaptive algorithms, yield the best overall performance with respect to metrics of accuracy and robustness. However, no single algorithm consistently outperforms the remaining approaches, and in some cases, performing classification without imputation yielded the most accurate classification. Thus, because of the complex mechanisms of missing data in proteomics, which also vary from peptide to protein, no individual method is a single solution for imputation. In summary, on the basis of the observations in this review, the goal for imputation in the field of computational proteomics should be to develop new approaches that work generically for this data type and new strategies to guide users in the selection of the best imputation for their dataset and analysis objectives.
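The evaluation pattern described above, imputing missing values and then scoring downstream classification, can be sketched generically. The example below compares mean and k-nearest-neighbour imputation on synthetic data with 20% missingness; it illustrates the workflow only, not the review's LC–MS data or its conclusions about specific algorithms.

```python
# Illustrative comparison of two common imputation strategies ahead of
# classification, on synthetic data with values deleted at random.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer, KNNImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
rng = np.random.default_rng(0)
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.2] = np.nan    # 20% of values missing

scores = {}
for name, imputer in [("mean", SimpleImputer(strategy="mean")),
                      ("knn", KNNImputer(n_neighbors=5))]:
    pipe = make_pipeline(imputer, RandomForestClassifier(random_state=0))
    scores[name] = cross_val_score(pipe, X_missing, y, cv=3).mean()
```

Keeping the imputer inside the cross-validated pipeline matters: imputing before splitting would leak test-fold statistics into training.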
Attribute-based Decision Graphs: A framework for multiclass data classification.
Bertini, João Roberto; Nicoletti, Maria do Carmo; Zhao, Liang
2017-01-01
Graph-based algorithms have been successfully applied in machine learning and data mining tasks. A simple but widely used approach to building graphs from vector-based data is to consider each data instance as a vertex and connect pairs of instances using a similarity measure. Although this abstraction presents some advantages, such as representing the arbitrary shape of the original data, it also has drawbacks: for example, it depends on the choice of a pre-defined distance metric and is biased by the local information among data instances. Aiming to explore alternative ways of building graphs from data, this paper proposes an algorithm for constructing a new type of graph, called the Attribute-based Decision Graph (AbDG). Given a vector-based data set, an AbDG is built by partitioning each data attribute range into disjoint intervals and representing each interval as a vertex. Edges are then established between vertices from different attributes according to a pre-defined pattern. Classification is performed through a matching process between the attribute values of the new instance and the AbDG. Moreover, the AbDG provides an inner mechanism for handling missing attribute values, which contributes to expanding its applicability. Results of classification tasks have shown that AbDG is a competitive approach when compared to well-known multiclass algorithms. The main contribution of the proposed framework is the combination of the advantages of attribute-based and graph-based techniques to perform robust pattern-matching data classification, while permitting the analysis of the input data considering only a subset of its attributes. Copyright © 2016 Elsevier Ltd. All rights reserved.
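A heavily simplified sketch of the interval-vertex idea: cut each attribute range into equal-width intervals (vertices), weight edges between intervals of different attributes by how often they co-occur in each class, and score a new instance by the summed weights of its own interval pairs. This illustrates only the matching principle, not the authors' full AbDG construction; the data, binning, and scoring rule below are invented for illustration.

```python
# Simplified attribute-interval graph classifier in the spirit of AbDG:
# vertices are attribute intervals, edge weights are per-class
# co-occurrence counts, and prediction picks the class whose graph
# best matches the instance's interval pairs.
import numpy as np
from collections import defaultdict

def to_intervals(X, n_bins=3):
    """Map each attribute value to an equal-width interval index."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return np.clip(((X - lo) / (hi - lo + 1e-12) * n_bins).astype(int),
                   0, n_bins - 1)

def fit(X, y, n_bins=3):
    B = to_intervals(X, n_bins)
    weights = defaultdict(float)   # (class, attr_i, bin_i, attr_j, bin_j) -> count
    for row, c in zip(B, y):
        for i in range(len(row)):
            for j in range(i + 1, len(row)):
                weights[(c, i, row[i], j, row[j])] += 1.0
    return weights

def predict(weights, x_bins, classes):
    def score(c):
        return sum(weights[(c, i, x_bins[i], j, x_bins[j])]
                   for i in range(len(x_bins))
                   for j in range(i + 1, len(x_bins)))
    return max(classes, key=score)

# Toy data: class 0 lives near the origin, class 1 near (1, 1).
X = np.array([[0.1, 0.2], [0.2, 0.1], [0.15, 0.15],
              [0.9, 0.8], [0.8, 0.9], [0.85, 0.95]])
y = [0, 0, 0, 1, 1, 1]
W = fit(X, y)
pred = predict(W, to_intervals(X)[0], classes=[0, 1])
```

Missing attribute values could be skipped in the scoring loop, which hints at the inner missing-value mechanism the abstract mentions.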
NASA Astrophysics Data System (ADS)
Zhang, C.; Pan, X.; Zhang, S. Q.; Li, H. P.; Atkinson, P. M.
2017-09-01
Recent advances in remote sensing have witnessed a great number of very high resolution (VHR) images acquired at sub-metre spatial resolution. These VHR remotely sensed data pose enormous challenges for effective processing, analysis and classification due to their high spatial complexity and heterogeneity. Although many computer-aided classification methods based on machine learning approaches have been developed over the past decades, most are oriented toward pixel-level spectral differentiation, e.g. the Multi-Layer Perceptron (MLP), and are unable to exploit the abundant spatial detail within VHR images. This paper introduces a rough set model as a general framework to objectively characterize the uncertainty in CNN classification results, and to further partition them into correct and incorrect regions on the map. The correctly classified regions of the CNN were trusted and maintained, whereas the misclassified areas were reclassified using a decision tree with both CNN and MLP. The effectiveness of the proposed rough set decision tree based MLP-CNN was tested using an urban area in Bournemouth, United Kingdom. The MLP-CNN, capturing well the complementarity between CNN and MLP through the rough set based decision tree, achieved the best classification performance both visually and numerically. This research therefore paves the way toward fully automatic and effective VHR image classification.
NASA Astrophysics Data System (ADS)
Baccar, D.; Söffker, D.
2017-11-01
Acoustic Emission (AE) is a suitable method for monitoring the health of composite structures in real time. However, AE-based failure mode identification and classification remain difficult to apply because AE waves are generally released simultaneously from all AE-emitting damage sources. Hence, the use of advanced signal processing techniques in combination with pattern recognition approaches is required. In this paper, AE signals generated from laminated carbon fiber reinforced polymer (CFRP) subjected to an indentation test are examined and analyzed. A new pattern recognition approach is developed, involving a number of processing steps that can be implemented in real time. Unlike common classification approaches, here only CWT coefficients are extracted as relevant features. First, the Continuous Wavelet Transform (CWT) is applied to the AE signals. Then, dimensionality reduction using Principal Component Analysis (PCA) is carried out on the coefficient matrices. The PCA-based feature distribution is analyzed using Kernel Density Estimation (KDE), allowing the determination of a specific pattern for each fault-specific AE signal. Moreover, the waveform and frequency content of the AE signals are examined in depth and compared with fundamental assumptions reported in this field. A correlation between the identified patterns and failure modes is achieved. The introduced method improves damage classification and can be used as a non-destructive evaluation tool.
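The first two processing steps named above, CWT followed by PCA on the coefficient matrix, can be sketched in plain NumPy. The example below uses Ricker wavelets and a synthetic burst-like signal; the KDE pattern-analysis step, the wavelet family, and the real AE data are all outside this sketch.

```python
# Sketch of CWT + PCA feature extraction: convolve a signal with
# scaled Ricker wavelets to get a scale-by-time coefficient matrix,
# then reduce it with PCA via SVD. Signal and wavelet choice are
# illustrative assumptions, not the paper's pipeline.
import numpy as np

def ricker(points, a):
    """Ricker ("Mexican hat") wavelet of width parameter a."""
    t = np.arange(points) - (points - 1) / 2.0
    A = 2.0 / (np.sqrt(3 * a) * np.pi ** 0.25)
    return A * (1 - (t / a) ** 2) * np.exp(-(t ** 2) / (2 * a ** 2))

def cwt(signal, scales):
    """One row of coefficients per scale, via same-length convolution."""
    return np.array([np.convolve(signal, ricker(min(10 * s, len(signal)), s),
                                 mode="same") for s in scales])

# Synthetic AE-like burst: a decaying oscillation buried in noise.
rng = np.random.default_rng(0)
t = np.arange(256)
signal = rng.normal(0, 0.1, 256)
signal[100:150] += np.sin(0.6 * t[:50]) * np.exp(-t[:50] / 20)

coeffs = cwt(signal, scales=np.arange(1, 16))    # 15 x 256 coefficient matrix

# PCA by SVD on the mean-centred coefficient matrix: keep 2 components.
C = coeffs - coeffs.mean(axis=0)
U, S, Vt = np.linalg.svd(C, full_matrices=False)
features = C @ Vt[:2].T                          # 15 x 2 reduced features
```

In the paper, the reduced features of many AE events would then feed the KDE step to form fault-specific patterns.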
Knauer, Uwe; Matros, Andrea; Petrovic, Tijana; Zanker, Timothy; Scott, Eileen S; Seiffert, Udo
2017-01-01
Hyperspectral imaging is an emerging means of assessing plant vitality, stress parameters, nutrition status, and diseases. Extraction of target values from the high-dimensional datasets either relies on pixel-wise processing of the full spectral information, appropriate selection of individual bands, or calculation of spectral indices. Limitations of such approaches are reduced classification accuracy, reduced robustness due to spatial variation of the spectral information across the surface of the objects measured as well as a loss of information intrinsic to band selection and use of spectral indices. In this paper we present an improved spatial-spectral segmentation approach for the analysis of hyperspectral imaging data and its application for the prediction of powdery mildew infection levels (disease severity) of intact Chardonnay grape bunches shortly before veraison. Instead of calculating texture features (spatial features) for the huge number of spectral bands independently, dimensionality reduction by means of Linear Discriminant Analysis (LDA) was applied first to derive a few descriptive image bands. Subsequent classification was based on modified Random Forest classifiers and selective extraction of texture parameters from the integral image representation of the image bands generated. Dimensionality reduction, integral images, and the selective feature extraction led to improved classification accuracies of up to [Formula: see text] for detached berries used as a reference sample (training dataset). Our approach was validated by predicting infection levels for a sample of 30 intact bunches. Classification accuracy improved with the number of decision trees of the Random Forest classifier. These results corresponded with qPCR results. An accuracy of 0.87 was achieved in classification of healthy, infected, and severely diseased bunches. 
However, discrimination between visually healthy and infected bunches proved challenging for a few samples, perhaps due to colonized berries or sparse mycelia hidden within the bunch, or to airborne conidia on the berries, that were detected by qPCR. An advanced approach to hyperspectral image classification based on combined spatial and spectral image features, potentially applicable to many available hyperspectral sensor technologies, has been developed and validated to improve the detection of powdery mildew infection levels of Chardonnay grape bunches. The spatial-spectral approach especially improved the detection of light infection levels compared with pixel-wise spectral data analysis. This approach is expected to improve the speed and accuracy of disease detection once thresholds for the fungal biomass detected by hyperspectral imaging are established; it can also facilitate monitoring in plant phenotyping of grapevine and additional crops.
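The integral-image representation mentioned above is what makes selective extraction of texture features cheap: after a single cumulative-sum pass, the sum over any rectangular window costs four array lookups. A minimal sketch:

```python
# Integral-image trick: O(1) window sums after one cumulative sum,
# which is what makes selective texture-feature extraction from many
# image bands fast. The toy image is for illustration.
import numpy as np

def integral_image(img):
    """Zero-padded cumulative sum so window sums need no edge cases."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def window_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in four lookups from the integral image."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
s = window_sum(ii, 1, 1, 3, 3)     # sum of the central 2x2 block
```

Mean or variance texture features over arbitrary windows follow directly from such window sums (using a second integral image of squared values for the variance).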
Knowledge-based approaches to the maintenance of a large controlled medical terminology.
Cimino, J J; Clayton, P D; Hripcsak, G; Johnson, S B
1994-01-01
OBJECTIVE: Develop a knowledge-based representation for a controlled terminology of clinical information to facilitate creation, maintenance, and use of the terminology. DESIGN: The Medical Entities Dictionary (MED) is a semantic network, based on the Unified Medical Language System (UMLS), with a directed acyclic graph to represent multiple hierarchies. Terms from four hospital systems (laboratory, electrocardiography, medical records coding, and pharmacy) were added as nodes in the network. Additional knowledge about terms, added as semantic links, was used to assist in integration, harmonization, and automated classification of disparate terminologies. RESULTS: The MED contains 32,767 terms and is in active clinical use. Automated classification was successfully applied to terms for laboratory specimens, laboratory tests, and medications. One benefit of the approach has been the automated inclusion of medications into multiple pharmacologic and allergenic classes that were not present in the pharmacy system. Another benefit has been the reduction of maintenance efforts by 90%. CONCLUSION: The MED is a hybrid of terminology and knowledge. It provides domain coverage, synonymy, consistency of views, explicit relationships, and multiple classification while preventing redundancy, ambiguity (homonymy) and misclassification. PMID:7719786
Guo, Hao-Bo; Ma, Yue; Tuskan, Gerald A.; ...
2018-01-01
The existence of complete genome sequences makes it important to develop different approaches for classification of large-scale data sets and to make extraction of biological insights easier. Here, we propose an approach for classification of complete proteomes/protein sets based on protein distributions over some basic attributes. We demonstrate the usefulness of this approach by determining protein distributions in terms of two attributes: protein length (L) and protein intrinsic disorder content (ID). The protein distributions based on L and ID are surveyed for representative proteome organisms and protein sets from the three domains of life. Two-dimensional maps (designated here as fingerprints) are then constructed from the protein distribution densities in the LD space defined by ln(L) and ID. The fingerprints for different organisms and protein sets are found to be distinct from each other, and they can therefore be used for comparative studies. As a test case, phylogenetic trees have been constructed from the protein distribution densities in the fingerprints of the proteomes of organisms without performing any protein sequence comparison or alignment. The phylogenetic trees generated are biologically meaningful, demonstrating that the protein distributions in the LD space may serve as unique phylogenetic signals of the organisms at the proteome level.
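The fingerprint construction described above is, in essence, a 2-D density histogram over ln(L) and ID. A minimal sketch with randomly generated stand-in values (not a real proteome), plus one way fingerprints of two organisms could be compared:

```python
# Sketch of the LD-space "fingerprint": a 2-D density over ln(length)
# and disorder content. Lengths and ID fractions below are random
# placeholders, not data from any organism.
import numpy as np

rng = np.random.default_rng(1)
lengths = rng.integers(50, 2000, 500)    # residues per protein (toy)
disorder = rng.random(500)               # disordered fraction per protein (toy)

fingerprint, _, _ = np.histogram2d(np.log(lengths), disorder,
                                   bins=20, density=True)

def fingerprint_distance(f1, f2):
    """One simple choice: Euclidean distance between flattened densities."""
    return float(np.linalg.norm(f1.ravel() - f2.ravel()))
```

A matrix of such pairwise distances between organisms is the kind of input from which alignment-free phylogenetic trees can be built.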
DOE Office of Scientific and Technical Information (OSTI.GOV)
The SoRReL papers: Recent publications of the Software Reuse Repository Lab
NASA Technical Reports Server (NTRS)
Eichmann, David A. (Editor)
1992-01-01
Presented here, in their entirety, are some of the papers recently published by the SoRReL. Some typical titles are as follows: Design of a Lattice-Based Faceted Classification System; A Hybrid Approach to Software Reuse Repository Retrieval; Selecting Reusable Components Using Algebraic Specifications; Neural Network-Based Retrieval from Reuse Repositories; and A Neural Net-Based Approach to Software Metrics.
Transfer Kernel Common Spatial Patterns for Motor Imagery Brain-Computer Interface Classification.
Dai, Mengxi; Zheng, Dezhi; Liu, Shucong; Zhang, Pengju
2018-01-01
Motor-imagery-based brain-computer interfaces (BCIs) commonly use the common spatial pattern (CSP) as a preprocessing step before classification. Because the CSP method is a supervised algorithm, a large amount of time-consuming training data is needed to build the model. To address this issue, one promising approach is transfer learning, which generalizes a learning model so that it can extract discriminative information from other subjects for the target classification task. To this end, we propose a transfer kernel CSP (TKCSP) approach that learns a domain-invariant kernel by directly matching the distributions of source subjects and target subjects. Dataset IVa of BCI Competition III is used to demonstrate the validity of the proposed method. In the experiment, we compare the classification performance of TKCSP against CSP, CSP for subject-to-subject transfer (CSP SJ-to-SJ), regularized CSP (RCSP), stationary subspace CSP (ssCSP), multitask CSP (mtCSP), and the combined mtCSP and ssCSP (ss + mtCSP) method. The results indicate that TKCSP achieves a superior mean classification accuracy of 81.14%, especially when the source subjects have fewer training samples. Comprehensive experimental evidence on the dataset verifies the effectiveness and efficiency of the proposed TKCSP approach over several state-of-the-art methods.
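The baseline CSP step that TKCSP extends can be sketched directly: spatial filters come from jointly diagonalizing the two class covariance matrices, so that the first filter maximizes variance for one class while minimizing it for the other. The sketch below uses synthetic trials and omits the transfer-kernel and distribution-matching parts entirely.

```python
# Sketch of standard CSP (the preprocessing step TKCSP builds on):
# whiten the composite covariance, diagonalize one class covariance in
# the whitened space, and use log-variance of the filtered signal as a
# feature. Trial data are synthetic.
import numpy as np

def csp_filters(trials_a, trials_b):
    """trials_*: (n_trials, n_channels, n_samples). Returns filter matrix W
    whose first rows maximize variance for class A and minimize it for B."""
    def avg_cov(trials):
        covs = [t @ t.T / np.trace(t @ t.T) for t in trials]
        return np.mean(covs, axis=0)
    Ca, Cb = avg_cov(trials_a), avg_cov(trials_b)
    d, U = np.linalg.eigh(Ca + Cb)
    P = np.diag(d ** -0.5) @ U.T            # whitening transform
    dd, V = np.linalg.eigh(P @ Ca @ P.T)
    order = np.argsort(dd)[::-1]            # largest class-A variance first
    return V[:, order].T @ P

rng = np.random.default_rng(0)
# Class A has high variance on channel 0, class B on channel 1.
a = rng.normal(0, 1, (30, 4, 100)); a[:, 0] *= 5
b = rng.normal(0, 1, (30, 4, 100)); b[:, 1] *= 5
W = csp_filters(a, b)

feat = lambda trial: np.log(np.var(W[0] @ trial))   # first-component feature
```

The log-variance features of a few top and bottom CSP components are what the downstream classifier would receive.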
Zemp, Roland; Tanadini, Matteo; Plüss, Stefan; Schnüriger, Karin; Singh, Navrag B; Taylor, William R; Lorenzetti, Silvio
2016-01-01
Occupational musculoskeletal disorders, particularly chronic low back pain (LBP), are ubiquitous due to prolonged static sitting or nonergonomic sitting positions. Therefore, the aim of this study was to develop an instrumented chair with force and acceleration sensors to determine the accuracy of automatically identifying the user's sitting position by applying five different machine learning methods (Support Vector Machines, Multinomial Regression, Boosting, Neural Networks, and Random Forest). Forty-one subjects were requested to sit four times in seven different prescribed sitting positions (total 1148 samples). Sixteen force sensor values and the backrest angle were used as the explanatory variables (features) for the classification. The different classification methods were compared by means of a Leave-One-Out cross-validation approach. The best performance was achieved using the Random Forest classification algorithm, producing a mean classification accuracy of 90.9% for subjects with which the algorithm was not familiar. The classification accuracy varied between 81% and 98% for the seven different sitting positions. The present study showed the possibility of accurately classifying different sitting positions by means of the introduced instrumented office chair combined with machine learning analyses. The use of such novel approaches for the accurate assessment of chair usage could offer insights into the relationships between sitting position, sitting behaviour, and the occurrence of musculoskeletal disorders.
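The evaluation scheme used above, scoring a classifier on subjects it has never seen, corresponds to leave-one-subject-out cross-validation. A sketch with Random Forest on synthetic stand-ins for the 16 force sensors plus backrest angle (the study's real chair data are not reproduced):

```python
# Sketch of leave-one-subject-out evaluation of a Random Forest, the
# pattern behind "accuracy for subjects the algorithm was not familiar
# with". Sensor values are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
n_subjects, reps = 10, 8
groups = np.repeat(np.arange(n_subjects), reps)   # subject id per sample
y = np.tile([0, 1], n_subjects * reps // 2)       # two sitting positions
# 17 features: 16 force sensors + backrest angle; class shifts feature 0.
X = rng.normal(0, 1, (n_subjects * reps, 17))
X[:, 0] += 3 * y

scores = cross_val_score(RandomForestClassifier(random_state=0), X, y,
                         groups=groups, cv=LeaveOneGroupOut())
mean_acc = scores.mean()
```

Each fold holds out every sample of one subject, so the reported mean accuracy reflects generalization to unfamiliar users rather than to unseen samples of known users.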
Lauber, Chris
2012-01-01
Virus taxonomy has received little attention from the research community despite its broad relevance. In an accompanying paper (C. Lauber and A. E. Gorbalenya, J. Virol. 86:3890–3904, 2012), we have introduced a quantitative approach to hierarchically classify viruses of a family using pairwise evolutionary distances (PEDs) as a measure of genetic divergence. When applied to the six most conserved proteins of the Picornaviridae, it clustered 1,234 genome sequences in groups at three hierarchical levels (to which we refer as the “GENETIC classification”). In this study, we compare the GENETIC classification with the expert-based picornavirus taxonomy and outline differences in the underlying frameworks regarding the relation of virus groups and genetic diversity that represent, respectively, the structure and content of a classification. To facilitate the analysis, we introduce two novel diagrams. The first connects the genetic diversity of taxa to both the PED distribution and the phylogeny of picornaviruses. The second depicts a classification and the accommodated genetic diversity in a standardized manner. Generally, we found striking agreement between the two classifications on species and genus taxa. A few disagreements concern the species Human rhinovirus A and Human rhinovirus C and the genus Aphthovirus, which were split in the GENETIC classification. Furthermore, we propose a new supergenus level and universal, level-specific PED thresholds, not reached yet by many taxa. Since the species threshold is approached mostly by taxa with large sampling sizes and those infecting multiple hosts, it may represent an upper limit on divergence, beyond which homologous recombination in the six most conserved genes between two picornaviruses might not give viable progeny. PMID:22278238
A Two-Layer Method for Sedentary Behaviors Classification Using Smartphone and Bluetooth Beacons.
Cerón, Jesús D; López, Diego M; Hofmann, Christian
2017-01-01
Among the factors that shape the health of populations, a person's lifestyle is the most important one. This work focuses on the characterization and prevention of sedentary lifestyles. A sedentary behavior is defined as "any waking behavior characterized by an energy expenditure of 1.5 METs (Metabolic Equivalents) or less while in a sitting or reclining posture". The objective was to propose a method for sedentary behavior classification using a smartphone and Bluetooth beacons, considering different types of classification models: personal, hybrid, or impersonal. Following the CRISP-DM methodology, a method based on a two-layer approach for the classification of sedentary behaviors is proposed. Using data collected from a smartphone's accelerometer, gyroscope, and barometer, the first layer classifies between performing a sedentary behavior and not. The second layer classifies the specific sedentary activity performed using only the smartphone's accelerometer and barometer data, but adds indoor location data obtained from Bluetooth Low Energy (BLE) beacons. To improve the precision of the classification, both layers implemented the Random Forest algorithm and the personal model. This study presents the first available method for the automatic classification of specific sedentary behaviors. The layered classification approach has the potential to reduce the processing, memory, and energy demands of the mobile devices and wearables used.
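The two-layer scheme can be sketched as two chained classifiers: layer 1 decides sedentary vs. not from motion features, and layer 2, trained only on sedentary samples, names the specific behavior using additional indoor-location features. Everything below (features, class structure, the separating signal) is a synthetic stand-in for the accelerometer, barometer, and BLE-beacon inputs.

```python
# Sketch of the two-layer classification: layer 1 gates on motion
# features; layer 2 adds location features to name the sedentary
# behavior. Data are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 300
motion = rng.normal(0, 1, (n, 4))            # accel/gyro/barometer features
location = rng.normal(0, 1, (n, 2))          # BLE-beacon location features
sedentary = rng.random(n) < 0.5
behavior = rng.integers(0, 3, n)             # e.g. desk / couch / dining
motion[:, 0] += np.where(sedentary, -2, 2)   # low movement when sedentary
location[:, 0] += behavior                   # places differ by behavior

layer1 = RandomForestClassifier(random_state=0).fit(motion, sedentary)
idx = sedentary                              # layer 2 sees sedentary rows only
layer2 = RandomForestClassifier(random_state=0).fit(
    np.hstack([motion[idx], location[idx]]), behavior[idx])

def classify(m, loc):
    if not layer1.predict(m.reshape(1, -1))[0]:
        return "not sedentary"
    return int(layer2.predict(np.hstack([m, loc]).reshape(1, -1))[0])
```

The gating structure is what saves resources: the location features (and the beacon radio) are only consulted when layer 1 has already flagged a sedentary bout.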
A tri-fold hybrid classification approach for diagnostics with unexampled faulty states
NASA Astrophysics Data System (ADS)
Tamilselvan, Prasanna; Wang, Pingfeng
2015-01-01
System health diagnostics provides diversified benefits such as improved safety, improved reliability, and reduced costs for the operation and maintenance of engineered systems. Successful health diagnostics requires knowledge of system failures. However, with increasing system complexity, it is extraordinarily difficult to test a system so thoroughly that all potential faulty states are realized and studied at the product testing stage. Thus, real-time health diagnostics requires automatic detection of unexampled system faulty states from sensory data to avoid sudden catastrophic system failures. This paper presents a tri-fold hybrid classification (THC) approach for structural health diagnosis with unexampled health states (UHS), which comprises preliminary UHS identification using a new thresholded Mahalanobis distance (TMD) classifier, UHS diagnostics using a two-class support vector machine (SVM) classifier, and exampled health state diagnostics using a multi-class SVM classifier. The proposed THC approach, which takes advantage of both TMD- and SVM-based classification techniques, is able to identify and isolate unexampled faulty states by interactively detecting the deviation of sensory data from the exampled health states and forming new ones autonomously. The proposed THC approach is further extended to a generic framework for health diagnostics problems with unexampled faulty states and demonstrated with health diagnostics case studies for power transformers and rolling bearings.
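The preliminary UHS-identification step can be illustrated with a thresholded Mahalanobis distance check: a sample whose distance to every known health state exceeds the threshold is flagged as unexampled. The states, covariances, and threshold below are toy assumptions, and the SVM stages of the THC approach are omitted.

```python
import numpy as np

# Sketch of a thresholded Mahalanobis-distance (TMD) novelty check.
# states maps a health-state name to its (mean, covariance) estimate.

def mahalanobis(x, mean, cov_inv):
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))

def tmd_classify(x, states, threshold=3.0):
    """Return the nearest known state, or 'UHS' if all distances exceed the threshold."""
    best_name, best_d = None, np.inf
    for name, (mean, cov) in states.items():
        d = mahalanobis(x, mean, np.linalg.inv(cov))
        if d < best_d:
            best_name, best_d = name, d
    return best_name if best_d <= threshold else "UHS"

states = {
    "healthy": (np.zeros(2), np.eye(2)),
    "fault_A": (np.array([5.0, 0.0]), np.eye(2)),
}
print(tmd_classify(np.array([0.5, 0.2]), states))    # healthy
print(tmd_classify(np.array([20.0, 20.0]), states))  # UHS
```

In the full approach, samples landing in "UHS" would accumulate into a new class that is then diagnosed by the two-class SVM stage.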
NASA Technical Reports Server (NTRS)
Chen, D. W.; Sengupta, S. K.; Welch, R. M.
1989-01-01
This paper compares the results of cloud-field classification derived from two simplified vector approaches, the Sum and Difference Histogram (SADH) and the Gray Level Difference Vector (GLDV), with the results produced by the Gray Level Co-occurrence Matrix (GLCM) approach described by Welch et al. (1988). It is shown that the SADH method produces accuracies equivalent to those obtained using the GLCM method, while the GLDV method fails to resolve error clusters. Compared to the GLCM method, the SADH method leads to a 31 percent saving in run time and a 50 percent saving in storage requirements, while the GLDV approach leads to a 40 percent saving in run time and an 87 percent saving in storage requirements.
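The SADH idea (after Unser) can be sketched as follows: for a fixed pixel displacement, histogram the sums and differences of gray-level pairs, then derive GLCM-style statistics from the two one-dimensional histograms instead of the full two-dimensional co-occurrence matrix. The displacement, gray-level count, and chosen features below are illustrative.

```python
import numpy as np

# Sum and Difference Histogram (SADH) texture features: the 1-D histograms of
# pixel-pair sums and differences replace the 2-D co-occurrence matrix, which
# is where the run-time and storage savings come from.

def sadh_features(img, dr=0, dc=1, levels=8):
    a = img[: img.shape[0] - dr, : img.shape[1] - dc]
    b = img[dr:, dc:]
    sums = (a + b).ravel()                       # values in 0 .. 2*(levels-1)
    diffs = (a - b).ravel()                      # values in -(levels-1) .. levels-1
    hs = np.bincount(sums, minlength=2 * levels - 1) / sums.size
    hd = np.bincount(diffs + levels - 1, minlength=2 * levels - 1) / diffs.size
    i = np.arange(2 * levels - 1)
    mean = float((i * hs).sum()) / 2.0                      # pair mean gray level
    contrast = float(((i - (levels - 1)) ** 2 * hd).sum())  # GLCM-style contrast
    return mean, contrast

img = np.array([[0, 1, 0, 1],
                [0, 1, 0, 1]])
mean, contrast = sadh_features(img, levels=2)
print(mean, contrast)  # 0.5 1.0
```

For this alternating binary texture, every horizontal pair sums to 1 (mean gray level 0.5) and differs by 1 (contrast 1.0), matching what the full GLCM statistics would give.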
Bush Encroachment Mapping for Africa - Multi-Scale Analysis with Remote Sensing and GIS
NASA Astrophysics Data System (ADS)
Graw, V. A. M.; Oldenburg, C.; Dubovyk, O.
2015-12-01
Bush encroachment is a global problem that particularly affects the savanna ecosystems of Africa. Livestock is directly affected by decreasing grasslands and the inedible invasive species that define the process of bush encroachment. For many small-scale farmers in developing countries, livestock represents a form of insurance in times of crop failure or drought. Bush encroachment is also a problem for crop production. Studies on the mapping of bush encroachment have so far focused on small scales using high-resolution data and rarely provide information beyond the national level. A process chain was therefore developed using a multi-scale approach to detect bush encroachment for the whole of Africa. The bush encroachment map is calibrated with ground truth data provided by experts in Southern, Eastern, and Western Africa. By up-scaling location-specific information across different levels of remote sensing imagery (30 m with Landsat images and 250 m with MODIS data), a map is created showing potential and actual areas of bush encroachment on the African continent, thereby providing an innovative approach to mapping bush encroachment at the regional scale. A classification approach links GPS-based location data from experts to the respective pixels in the remote sensing imagery. Supervised classification is used, with actual bush encroachment observations serving as the training samples for the up-scaling. The classification technique is based on Random Forests and regression trees, a machine learning classification approach. Working at multiple scales and with the help of field data, an innovative approach is presented showing areas affected by bush encroachment on the African continent. This information can help to prevent further grassland decrease and identify the regions where land management strategies are most important to sustain livestock keeping and thereby secure livelihoods in rural areas.
NASA Astrophysics Data System (ADS)
Besic, Nikola; Ventura, Jordi Figueras i.; Grazioli, Jacopo; Gabella, Marco; Germann, Urs; Berne, Alexis
2016-09-01
Polarimetric radar-based hydrometeor classification is the procedure of identifying different types of hydrometeors by exploiting polarimetric radar observations. The main drawback of the existing supervised classification methods, mostly based on fuzzy logic, is a significant dependency on the presumed electromagnetic behaviour of different hydrometeor types. Namely, the results of the classification largely rely upon the quality of scattering simulations. The unsupervised approach, in turn, lacks the constraints related to hydrometeor microphysics. The idea of the proposed method is to compensate for these drawbacks by combining the two approaches in such a way that microphysical hypotheses can, to a degree, adjust the content of the classes obtained statistically from the observations. This is done by means of an iterative approach, performed offline, which, in a statistical framework, examines clustered representative polarimetric observations by comparing them to the presumed polarimetric properties of each hydrometeor class. Aside from comparing, the routine alters the content of clusters by encouraging further statistical clustering in case of non-identification. By merging all identified clusters, the multi-dimensional polarimetric signatures of various hydrometeor types are obtained for each of the studied representative datasets, i.e. for each radar system of interest. These are depicted by sets of centroids which are then employed in the operational labelling of different hydrometeors. The method has been applied to three C-band datasets, each acquired by a different operational radar from the MeteoSwiss Rad4Alp network, as well as to two X-band datasets acquired by two research mobile radars. The results are discussed through a comparative analysis, which includes a corresponding supervised and unsupervised approach, emphasising the operational potential of the proposed method.
A Neuro-Fuzzy Approach in the Classification of Students' Academic Performance
Do, Quang Hung; Chen, Jeng-Fung
2013-01-01
Classifying student academic performance with high accuracy facilitates admission decisions and enhances the educational services offered by institutions. The purpose of this paper is to present a neuro-fuzzy approach for classifying students into different groups. The neuro-fuzzy classifier used previous exam results and other related factors as input variables and labeled students based on their expected academic performance. The results showed that the proposed approach achieved high accuracy. The results were also compared with those obtained from other well-known classification approaches, including support vector machine, Naive Bayes, neural network, and decision tree approaches. The comparative analysis indicated that the neuro-fuzzy approach performed better than the others. It is expected that this work may be used to support student admission procedures and to strengthen the services of educational institutions. PMID:24302928
A topological approach for protein classification
Cang, Zixuan; Mu, Lin; Wu, Kedi; ...
2015-11-04
Protein function and dynamics are closely related to protein sequence and structure. However, the prediction of protein function and dynamics from sequence and structure remains a fundamental challenge in molecular biology. Protein classification, which is typically done by measuring the similarity between proteins based on sequence or physical information, serves as a crucial step toward the understanding of protein function and dynamics.
Supervised classification of continental shelf sediment off western Donegal, Ireland
NASA Astrophysics Data System (ADS)
Monteys, X.; Craven, K.; McCarron, S. G.
2017-12-01
Managing human impacts on marine ecosystems requires natural regions to be identified and mapped over a range of hierarchically nested scales. In recent years (2000-present), the Irish National Seabed Survey (INSS) and the Integrated Mapping for the Sustainable Development of Ireland's Marine Resources programme (INFOMAR), collaborations between Geological Survey Ireland and the Marine Institute, have provided unprecedented quantities of high-quality data on Ireland's offshore territories. The increasing availability of large, detailed digital representations of these environments requires the application of objective and quantitative analyses. This study presents results of a new approach to seafloor sediment mapping based on an integrated analysis of INFOMAR multibeam bathymetric data (including the derivatives of slope and relative position), backscatter data (including derivatives of angular response analysis), and sediment ground-truthing over the continental shelf west of Donegal. It applies a Geographic Object-Based Image Analysis software package to provide a supervised classification of the surface sediment. This approach can provide a statistically robust, high-resolution classification of the seafloor. Initial results display a differentiation of sediment classes and a reduction in artefacts compared with previously applied methodologies. These results indicate a methodology that could be used for physical habitat mapping and classification of marine environments.
Advances in the molecular genetics of gliomas - implications for classification and therapy.
Reifenberger, Guido; Wirsching, Hans-Georg; Knobbe-Thomsen, Christiane B; Weller, Michael
2017-07-01
Genome-wide molecular-profiling studies have revealed the characteristic genetic alterations and epigenetic profiles associated with different types of gliomas. These molecular characteristics can be used to refine glioma classification, to improve prediction of patient outcomes, and to guide individualized treatment. Thus, the WHO Classification of Tumours of the Central Nervous System was revised in 2016 to incorporate molecular biomarkers - together with classic histological features - in an integrated diagnosis, in order to define distinct glioma entities as precisely as possible. This paradigm shift is markedly changing how glioma is diagnosed, and has important implications for future clinical trials and patient management in daily practice. Herein, we highlight the developments in our understanding of the molecular genetics of gliomas, and review the current landscape of clinically relevant molecular biomarkers for use in classification of the disease subtypes. Novel approaches to the genetic characterization of gliomas based on large-scale DNA-methylation profiling and next-generation sequencing are also discussed. In addition, we illustrate how advances in the molecular genetics of gliomas can promote the development and clinical translation of novel pathogenesis-based therapeutic approaches, thereby paving the way towards precision medicine in neuro-oncology.
Analyzing Sub-Classifications of Glaucoma via SOM Based Clustering of Optic Nerve Images.
Yan, Sanjun; Abidi, Syed Sibte Raza; Artes, Paul Habib
2005-01-01
We present a data mining framework to cluster optic nerve images obtained by Confocal Scanning Laser Tomography (CSLT) in normal subjects and patients with glaucoma. We use self-organizing maps and expectation maximization methods to partition the data into clusters that provide insights into potential sub-classification of glaucoma based on morphological features. We conclude that our approach provides a first step towards a better understanding of morphological features in optic nerve images obtained from glaucoma patients and healthy controls.
Gross, Douglas P; Zhang, Jing; Steenstra, Ivan; Barnsley, Susan; Haws, Calvin; Amell, Tyler; McIntosh, Greg; Cooper, Juliette; Zaiane, Osmar
2013-12-01
To develop a classification algorithm and accompanying computer-based clinical decision support tool to help categorize injured workers toward optimal rehabilitation interventions based on unique worker characteristics. Population-based historical cohort design. Data were extracted from a Canadian provincial workers' compensation database on all claimants undergoing work assessment between December 2009 and January 2011. Data were available on: (1) numerous personal, clinical, occupational, and social variables; (2) type of rehabilitation undertaken; and (3) outcomes following rehabilitation (receiving time loss benefits or undergoing repeat programs). Machine learning, concerned with the design of algorithms to discriminate between classes based on empirical data, was the foundation of our approach to build a classification system with multiple independent and dependent variables. The population included 8,611 unique claimants. Subjects were predominantly employed (85 %) males (64 %) with diagnoses of sprain/strain (44 %). Baseline clinician classification accuracy was high (ROC = 0.86) for selecting programs that lead to successful return-to-work. Classification performance for machine learning techniques outperformed the clinician baseline classification (ROC = 0.94). The final classifiers were multifactorial and included the variables: injury duration, occupation, job attachment status, work status, modified work availability, pain intensity rating, self-rated occupational disability, and 9 items from the SF-36 Health Survey. The use of machine learning classification techniques appears to have resulted in classification performance better than clinician decision-making. The final algorithm has been integrated into a computer-based clinical decision support tool that requires additional validation in a clinical sample.
Pham, Tuyen Danh; Lee, Dong Eun; Park, Kang Ryoung
2017-07-08
Automatic recognition of banknotes is applied in payment facilities, such as automated teller machines (ATMs) and banknote counters. Besides the popular approaches that focus on studying methods applied to various individual types of currencies, there have been studies on the simultaneous classification of banknotes from multiple countries. However, these methods were evaluated with limited numbers of banknote images, national currencies, and denominations. To address this issue, we propose a multi-national banknote classification method based on visible-light banknote images captured by a one-dimensional line sensor and classified by a convolutional neural network (CNN) considering the size information of each denomination. Experiments conducted on the combined banknote image database of six countries with 62 denominations gave a classification accuracy of 100%, and the results show that our proposed algorithm outperforms previous methods. PMID:28698466
Hyperspectral imaging with wavelet transform for classification of colon tissue biopsy samples
NASA Astrophysics Data System (ADS)
Masood, Khalid
2008-08-01
Automatic classification of medical images is part of our computerised medical imaging programme to support pathologists in their diagnosis. Hyperspectral data has found applications in medical imagery, and its usage in biopsy analysis of medical images is increasing significantly. In this paper, we present a histopathological analysis for the classification of colon biopsy samples into benign and malignant classes. The proposed study is based on a comparison between 3D spectral/spatial analysis and 2D spatial analysis. Textural features in the wavelet domain are used in both approaches for the classification of colon biopsy samples. Experimental results indicate that incorporating wavelet textural features using a support vector machine in 2D spatial analysis achieves the best classification accuracy.
Computer assisted optical biopsy for colorectal polyps
NASA Astrophysics Data System (ADS)
Navarro-Avila, Fernando J.; Saint-Hill-Febles, Yadira; Renner, Janis; Klare, Peter; von Delius, Stefan; Navab, Nassir; Mateus, Diana
2017-03-01
We propose a method for computer-assisted optical biopsy for colorectal polyps, with the final goal of assisting the medical expert during colonoscopy. In particular, we target the problem of automatic classification of polyp images into two classes: adenomatous vs. non-adenomatous. Our approach is based on recent advancements in convolutional neural networks (CNNs) for image representation. In the paper, we describe and compare four methodologies to address the binary classification task: a baseline with classical features and a Random Forest classifier, two methods based on features obtained from a pre-trained network, and finally, the end-to-end training of a CNN. With the pre-trained network, we show the feasibility of transferring a feature extraction mechanism trained on millions of natural images to the task of classifying adenomatous polyps. We then demonstrate further performance improvements when training the CNN for our specific classification task. In our study, 776 polyp images were acquired and histologically analyzed after polyp resection. We report a performance increase of the CNN-based approaches with respect to both the conventional engineered features and a state-of-the-art method based on videos and 3D shape features.
Feature selection gait-based gender classification under different circumstances
NASA Astrophysics Data System (ADS)
Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah
2014-05-01
This paper proposes a gender classification based on human gait features and investigates two variations in addition to the normal gait sequence: clothing (wearing coats) and carrying a bag. The feature vectors in the proposed system are constructed after applying the wavelet transform. Three different sets of features are proposed in this method. The first is spatio-temporal distances, which capture the distances between different parts of the human body (such as feet, knees, hands, shoulders, and overall height) during one gait cycle. The second and third feature sets are constructed from the approximation and non-approximation (detail) coefficients of the human body, respectively. To extract these two sets of features, we divided the human body into upper and lower parts based on the golden ratio proportion. In this paper, we adopt a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced based on the Fisher score as a feature selection method to optimize its discriminating significance. Finally, k-Nearest Neighbor is applied as the classification method. Experimental results demonstrate that our approach provides a more realistic scenario and comparatively better performance than existing approaches.
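The Fisher-score ranking followed by k-NN classification described above can be sketched as follows. The gait feature construction itself is omitted; the toy data (one discriminative feature, one noise feature) are purely illustrative.

```python
import numpy as np

# Fisher score per feature (between-class scatter over within-class scatter),
# used to rank features before a plain k-nearest-neighbor classifier.

def fisher_scores(X, y):
    classes = np.unique(y)
    mu = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        num += Xc.shape[0] * (Xc.mean(axis=0) - mu) ** 2
        den += Xc.shape[0] * Xc.var(axis=0)
    return num / (den + 1e-12)

def knn_predict(X_train, y_train, x, k=3):
    d = np.linalg.norm(X_train - x, axis=1)
    votes = y_train[np.argsort(d)[:k]]
    vals, counts = np.unique(votes, return_counts=True)
    return vals[np.argmax(counts)]

rng = np.random.default_rng(0)
# feature 0 separates the classes; feature 1 is pure noise
X = np.vstack([rng.normal(0, 0.1, (20, 1)), rng.normal(1, 0.1, (20, 1))])
X = np.hstack([X, rng.normal(0, 1, (40, 1))])
y = np.array([0] * 20 + [1] * 20)

scores = fisher_scores(X, y)
keep = np.argsort(scores)[::-1][:1]   # keep the top-ranked feature
print(int(keep[0]))                   # 0 (the discriminative feature wins)
print(int(knn_predict(X[:, keep], y, np.array([0.95]))))
```

The Fisher score rewards features whose class means are far apart relative to their within-class spread, which is exactly the discriminating significance the paper optimizes for.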
Simple-random-sampling-based multiclass text classification algorithm.
Liu, Wuying; Wang, Lin; Yi, Mianzhu
2014-01-01
Multiclass text classification (MTC) is a challenging problem, and MTC algorithms can be used in many applications. In the era of big data, the space-time overhead of these algorithms is a major concern. Through an investigation of the token frequency distribution in a Chinese web document collection, this paper reexamines the power law and proposes a simple-random-sampling-based MTC (SRSMTC) algorithm. Supported by a token-level memory to store labeled documents, the SRSMTC algorithm uses a text retrieval approach to solve text classification problems. Experimental results on the TanCorp data set show that the SRSMTC algorithm can achieve state-of-the-art performance at greatly reduced space-time requirements.
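The simple-random-sampling idea can be illustrated as below: classify a document by retrieving only a random sample of the labeled collection and voting by token overlap, trading a little accuracy for a large reduction in space-time cost. The sampling rate, voting rule, and toy corpus are assumptions for illustration, not details from the paper.

```python
import numpy as np

# Illustrative simple-random-sampling text classifier: score each label by the
# token overlap between the query document and a random sample of labeled
# documents, then return the highest-scoring label.

def classify_doc(doc, corpus, labels, sample_rate=0.5, rng=None):
    if rng is None:
        rng = np.random.default_rng(42)
    n = max(1, int(len(corpus) * sample_rate))
    idx = rng.choice(len(corpus), size=n, replace=False)  # simple random sample
    tokens = set(doc.split())
    scores = {}
    for i in idx:
        overlap = len(tokens & set(corpus[i].split()))    # retrieval-style score
        scores[labels[i]] = scores.get(labels[i], 0) + overlap
    return max(scores, key=scores.get)

corpus = ["stock market shares trading", "market prices rise",
          "football match goal", "team wins match"]
labels = ["finance", "finance", "sports", "sports"]
print(classify_doc("shares and market trading", corpus, labels, sample_rate=1.0))
```

Lowering `sample_rate` shrinks both the memory touched and the work per query, which is the space-time trade-off the abstract emphasizes.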
Afanasyev, Pavel; Seer-Linnemayr, Charlotte; Ravelli, Raimond B G; Matadeen, Rishi; De Carlo, Sacha; Alewijnse, Bart; Portugal, Rodrigo V; Pannu, Navraj S; Schatz, Michael; van Heel, Marin
2017-09-01
Single-particle cryogenic electron microscopy (cryo-EM) can now yield near-atomic resolution structures of biological complexes. However, the reference-based alignment algorithms commonly used in cryo-EM suffer from reference bias, limiting their applicability (also known as the 'Einstein from random noise' problem). Low-dose cryo-EM therefore requires robust and objective approaches to reveal the structural information contained in the extremely noisy data, especially when dealing with small structures. A reference-free pipeline is presented for obtaining near-atomic resolution three-dimensional reconstructions from heterogeneous ('four-dimensional') cryo-EM data sets. The methodologies integrated in this pipeline include a posteriori camera correction, movie-based full-data-set contrast transfer function determination, movie-alignment algorithms, (Fourier-space) multivariate statistical data compression and unsupervised classification, 'random-startup' three-dimensional reconstructions, four-dimensional structural refinements and Fourier shell correlation criteria for evaluating anisotropic resolution. The procedures exclusively use information emerging from the data set itself, without external 'starting models'. Euler-angle assignments are performed by angular reconstitution rather than by the inherently slower projection-matching approaches. The comprehensive 'ABC-4D' pipeline is based on the two-dimensional reference-free 'alignment by classification' (ABC) approach, where similar images in similar orientations are grouped by unsupervised classification. Some fundamental differences between X-ray crystallography versus single-particle cryo-EM data collection and data processing are discussed. The structure of the giant haemoglobin from Lumbricus terrestris at a global resolution of ∼3.8 Å is presented as an example of the use of the ABC-4D procedure.
Fusion of shallow and deep features for classification of high-resolution remote sensing images
NASA Astrophysics Data System (ADS)
Gao, Lang; Tian, Tian; Sun, Xiao; Li, Hang
2018-02-01
Effective spectral and spatial pixel description plays a significant role in the classification of high-resolution remote sensing images. Current approaches to pixel-based feature extraction are of two main kinds: one includes the widely used principal component analysis (PCA) and gray level co-occurrence matrix (GLCM) as representatives of shallow spectral and shape features, and the other refers to deep learning-based methods, which employ deep neural networks and have substantially improved classification accuracy. However, the former traditional features are insufficient to depict the complex distributions of high-resolution images, while the deep features demand plenty of samples to train the network; otherwise, overfitting easily occurs if only limited samples are involved in the training. In view of the above, we propose a GLCM-based convolutional neural network (CNN) approach to extract features and implement classification for high-resolution remote sensing images. The employment of GLCM is able to represent the original images while eliminating redundant information and undesired noise. Meanwhile, taking shallow features as the input of the deep network contributes to better guidance and interpretability. In consideration of the number of samples, strategies such as L2 regularization and dropout are used to prevent overfitting. A fine-tuning strategy is also used in our study to reduce training time and further enhance the generalization performance of the network. Experiments with popular data sets such as the PaviaU data validate that our proposed method leads to a performance improvement compared to the individual approaches involved.
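The shallow GLCM descriptor that the method feeds into the CNN can be sketched as follows (the CNN part is omitted). The displacement, gray-level count, and derived statistic are illustrative choices.

```python
import numpy as np

# Gray-level co-occurrence matrix (GLCM) for one displacement: count how often
# gray-level pairs co-occur, then normalize to a joint probability table from
# which texture statistics such as contrast are derived.

def glcm(img, levels, dr=0, dc=1, symmetric=True):
    a = img[: img.shape[0] - dr, : img.shape[1] - dc].ravel()
    b = img[dr:, dc:].ravel()
    m = np.zeros((levels, levels))
    np.add.at(m, (a, b), 1)          # accumulate pair counts
    if symmetric:
        m = m + m.T                  # make the pairing order-independent
    return m / m.sum()               # normalize to co-occurrence probabilities

img = np.array([[0, 0, 1],
                [0, 1, 1]])
p = glcm(img, levels=2)
contrast = sum((i - j) ** 2 * p[i, j] for i in range(2) for j in range(2))
print(round(contrast, 3))  # 0.5
```

For this tiny binary image, half of the horizontal pairs mix the two gray levels, so the contrast statistic comes out to 0.5; stacks of such statistics over several displacements form the shallow feature maps.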
NASA Astrophysics Data System (ADS)
Albert, L.; Rottensteiner, F.; Heipke, C.
2015-08-01
Land cover and land use exhibit strong contextual dependencies. We propose a novel approach for the simultaneous classification of land cover and land use, where semantic and spatial context is considered. The image sites for land cover and land use classification form a hierarchy consisting of two layers: a land cover layer and a land use layer. We apply Conditional Random Fields (CRF) at both layers. The layers differ with respect to the image entities corresponding to the nodes, the employed features and the classes to be distinguished. In the land cover layer, the nodes represent super-pixels; in the land use layer, the nodes correspond to objects from a geospatial database. Both CRFs model spatial dependencies between neighbouring image sites. The complex semantic relations between land cover and land use are integrated in the classification process by using contextual features. We propose a new iterative inference procedure for the simultaneous classification of land cover and land use, in which the two classification tasks mutually influence each other. This helps to improve the classification accuracy for certain classes. The main idea of this approach is that semantic context helps to refine the class predictions, which, in turn, leads to more expressive context information. Thus, potentially wrong decisions can be reversed at later stages. The approach is designed for input data based on aerial images. Experiments are carried out on a test site to evaluate the performance of the proposed method. We show the effectiveness of the iterative inference procedure and demonstrate that a smaller size of the super-pixels has a positive influence on the classification result.
[Object-oriented aquatic vegetation extraction approach based on visible vegetation indices].
Jing, Ran; Deng, Lei; Zhao, Wen Ji; Gong, Zhao Ning
2016-05-01
Using the estimation of scale parameters (ESP) image segmentation tool to determine the ideal image segmentation scale, the optimal segmented image was created with the multi-scale segmentation method. Based on the visible vegetation indices derived from mini-UAV imaging data, we chose a set of optimal vegetation indices from a series of visible vegetation indices and built a decision tree rule. A membership function was used to automatically classify the study area, and an aquatic vegetation map was generated. The results showed that the overall accuracy of image classification using supervised classification was 53.7%, while the overall accuracy of object-oriented image analysis (OBIA) was 91.7%. Compared with the pixel-based supervised classification method, the OBIA method significantly improved the image classification result and further increased the accuracy of extracting aquatic vegetation. The Kappa value of supervised classification was 0.4, and the Kappa value of the OBIA-based classification was 0.9. The experimental results demonstrated that extracting aquatic vegetation using visible vegetation indices derived from mini-UAV data and the OBIA method developed in this study is feasible and could be applied in other physically similar areas.
Correlation-based pattern recognition for implantable defibrillators.
Wilkins, J.
1996-01-01
An estimated 300,000 Americans die each year from cardiac arrhythmias. Historically, drug therapy or surgery were the only treatment options available to patients suffering from arrhythmias. Recently, implantable arrhythmia management devices have been developed. These devices allow abnormal cardiac rhythms to be sensed and corrected in vivo. Proper arrhythmia classification is critical to selecting the appropriate therapeutic intervention. The classification problem is made more challenging by the power/computation constraints imposed by the short battery life of implantable devices. Current devices utilize heart-rate-based classification algorithms. Although easy to implement, rate-based approaches have unacceptably high error rates in distinguishing supraventricular tachycardia (SVT) from ventricular tachycardia (VT). Conventional morphology assessment techniques used in ECG analysis often require too much computation to be practical for implantable devices. In this paper, a computationally efficient arrhythmia classification architecture using correlation-based morphology assessment is presented. The architecture classifies individual heart beats by assessing the similarity between an incoming cardiac signal vector and a series of prestored class templates. A series of these beat classifications is then used to make an overall rhythm assessment. The system makes use of several new results in the field of pattern recognition. The resulting system achieved excellent accuracy in discriminating SVT and VT. PMID:8947674
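The per-beat template matching described above can be sketched with normalized cross-correlation. The synthetic "templates" and the correlation threshold below are illustrative assumptions; a real device would use patient-specific prestored templates and a rhythm-level vote over many beats.

```python
import numpy as np

# Correlation-based beat classification: label an incoming beat with the class
# of its best-matching prestored template, or 'unclassified' if no template
# correlates strongly enough.

def ncc(a, b):
    """Normalized cross-correlation of two equal-length vectors."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def classify_beat(beat, templates, min_corr=0.8):
    scores = {name: ncc(beat, t) for name, t in templates.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_corr else "unclassified"

t = np.linspace(0, 1, 50)
templates = {
    "normal": np.exp(-((t - 0.5) ** 2) / 0.005),  # narrow complex (toy)
    "vt":     np.exp(-((t - 0.5) ** 2) / 0.05),   # wide, slower complex (toy)
}
noisy_normal = templates["normal"] + 0.05 * np.sin(20 * t)
print(classify_beat(noisy_normal, templates))  # normal
```

A dot product against a handful of templates is cheap enough for the power budget of an implantable device, which is the architecture's main selling point over conventional morphology analysis.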
Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor.
Xu, Chang; Wang, Yingguan; Bao, Xinghe; Li, Fengrong
2018-05-24
This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is performed using a single anisotropic magnetoresistive sensor, with vehicles classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time-domain and frequency-domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. The three algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains: the original vehicle magnetic dataset collected is imbalanced, which may lead to inaccurate classification results. With this in mind, we apply the Synthetic Minority Over-sampling Technique (SMOTE), which can further boost the performance of the classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.
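The core of SMOTE can be sketched in a few lines: each synthetic minority sample is an interpolation between a minority sample and one of its k nearest minority neighbors. This is a minimal numpy-only illustration; library implementations such as imbalanced-learn's `SMOTE` add more machinery (per-class sampling strategies, borderline variants, and so on).

```python
import numpy as np

# Minimal SMOTE-style over-sampling: synthesize new minority-class points by
# interpolating between existing minority points and their nearest neighbors.

def smote(X_min, n_new, k=3, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(d)[1 : k + 1]   # skip the point itself
        j = rng.choice(neighbors)
        lam = rng.random()                     # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(synthetic)

X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_new = smote(X_minority, n_new=4)
print(X_new.shape)  # (4, 2)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies rather than being arbitrary noise, which is why oversampling this way tends to help classifiers like KNN on imbalanced data.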
NASA Astrophysics Data System (ADS)
Managò, Stefano; Valente, Carmen; Mirabelli, Peppino; Circolo, Diego; Basile, Filomena; Corda, Daniela; de Luca, Anna Chiara
2016-04-01
Acute lymphoblastic leukemia type B (B-ALL) is a neoplastic disorder that shows high mortality rates due to immature lymphocyte B-cell proliferation. B-ALL diagnosis requires identification and classification of the leukemia cells. Here, we demonstrate the use of Raman spectroscopy to discriminate normal lymphocytic B-cells from three different B-leukemia transformed cell lines (i.e., RS4;11, REH, MN60 cells) based on their biochemical features. In combination with immunofluorescence and Western blotting, we show that these Raman markers reflect the relative changes in the potential biological markers from cell surface antigens, cytoplasmic proteins, and DNA content and correlate with the lymphoblastic B-cell maturation/differentiation stages. Our study demonstrates the potential of this technique for classification of B-leukemia cells into the different differentiation/maturation stages, as well as for the identification of key biochemical changes under chemotherapeutic treatments. Finally, preliminary results from clinical samples indicate high consistency of, and potential applications for, this Raman spectroscopy approach.
Brain-computer interfacing under distraction: an evaluation study
NASA Astrophysics Data System (ADS)
Brandl, Stephanie; Frølich, Laura; Höhne, Johannes; Müller, Klaus-Robert; Samek, Wojciech
2016-10-01
Objective. While motor-imagery based brain-computer interfaces (BCIs) have been studied over many years by now, most of these studies have taken place in controlled lab settings. Bringing BCI technology into everyday life is still one of the main challenges in this field of research. Approach. This paper systematically investigates BCI performance under 6 types of distractions that mimic out-of-lab environments. Main results. We report results of 16 participants and show that the performance of the standard common spatial patterns (CSP) + regularized linear discriminant analysis classification pipeline drops significantly in this ‘simulated’ out-of-lab setting. We then investigate three methods for improving the performance: (1) artifact removal, (2) ensemble classification, and (3) a 2-step classification approach. While artifact removal does not enhance the BCI performance significantly, both ensemble classification and the 2-step classification combined with CSP significantly improve the performance compared to the standard procedure. Significance. Systematically analyzing out-of-lab scenarios is crucial when bringing BCI into everyday life. Algorithms must be adapted to overcome nonstationary environments in order to tackle real-world challenges.
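The CSP stage of the standard pipeline above amounts to a generalized eigendecomposition of the two class-conditional channel covariance matrices. A sketch under illustrative assumptions (toy trials, synthetic variance structure); a full pipeline would follow this with log-variance features and regularized LDA:

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_a, trials_b, n_filters=3):
    """Common spatial patterns: spatial filters maximizing variance for one
    class while minimizing it for the other.

    trials_*: arrays of shape (n_trials, n_channels, n_samples).
    Returns a (2*n_filters, n_channels) filter matrix.
    """
    def mean_cov(trials):
        return np.mean([np.cov(t) for t in trials], axis=0)
    Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
    # generalized eigenproblem: Ca w = lambda (Ca + Cb) w
    vals, vecs = eigh(Ca, Ca + Cb)
    order = np.argsort(vals)  # small eigenvalues favour class B, large favour A
    picks = np.concatenate([order[:n_filters], order[-n_filters:]])
    return vecs[:, picks].T

rng = np.random.default_rng(0)
# toy trials: class A has high variance on channel 0, class B on channel 1
A = rng.normal(size=(30, 4, 100)) * np.array([3, 1, 1, 1])[:, None]
B = rng.normal(size=(30, 4, 100)) * np.array([1, 3, 1, 1])[:, None]
W = csp_filters(A, B, n_filters=1)
```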
Argumentation Based Joint Learning: A Novel Ensemble Learning Approach
Xu, Junyi; Yao, Li; Li, Le
2015-01-01
Recently, ensemble learning methods have been widely used to improve classification performance in machine learning. In this paper, we present a novel ensemble learning method: argumentation based multi-agent joint learning (AMAJL), which integrates ideas from multi-agent argumentation, ensemble learning, and association rule mining. In AMAJL, argumentation technology is introduced as an ensemble strategy to integrate multiple base classifiers and generate a high-performance ensemble classifier. We design an argumentation framework named Arena as a communication platform for knowledge integration. Through argumentation based joint learning, high-quality individual knowledge can be extracted, and thus a refined global knowledge base can be generated and used independently for classification. We perform numerous experiments on multiple public datasets using AMAJL and other benchmark methods. The results demonstrate that our method can effectively extract high-quality knowledge for the ensemble classifier and improve classification performance. PMID:25966359
NASA Astrophysics Data System (ADS)
Gonulalan, Cansu
In recent years, there has been an increasing demand for applications to monitor land-use targets using remote sensing images. Advances in remote sensing satellites give rise to research in this area. Many applications, ranging from urban growth planning to homeland security, have already used algorithms for automated object recognition from remote sensing imagery. However, these still suffer from problems such as low detection accuracy and algorithms that only work for a specific area. In this thesis, we focus on an automatic approach to classify and detect building footprints, road networks, and vegetation areas. The automatic interpretation of visual data is a comprehensive task in the computer vision field. Machine learning approaches improve the capability of classification in an intelligent way. We propose a method that achieves high accuracy in detection and classification. Multi-class classification is developed for detecting multiple objects. We present an AdaBoost-based approach along with the supervised learning algorithm. The combination of AdaBoost with "Attentional Cascade" is adopted from Viola and Jones [1]. This combination decreases the computation time and enables real-time applications. For the feature extraction step, our contribution is to combine Haar-like features that include corner, rectangle and Gabor. Among all features, AdaBoost selects only critical features and generates an extremely efficient cascade-structured classifier. Finally, we present and evaluate our experimental results. The overall system is tested and high detection performance is achieved. The precision rate of the final multi-class classifier is over 98%.
Elman RNN based classification of proteins sequences on account of their mutual information.
Mishra, Pooja; Nath Pandey, Paras
2012-10-21
In the present work we have employed a method of estimating residue correlation within protein sequences, using the mutual information (MI) of adjacent residues, based on structural and solvent accessibility properties of amino acids. The long-range correlation between nonadjacent residues is improved by constructing a mutual information vector (MIV) for a single protein sequence; in this way, each protein sequence is associated with its corresponding MIVs. These MIVs are given to an Elman RNN to obtain the classification of protein sequences. The modeling power of the MIV was shown to be significantly better, giving a new approach towards alignment-free classification of protein sequences. We also conclude that MIVs based on sequence structural and solvent-accessibility properties are better predictors. Copyright © 2012 Elsevier Ltd. All rights reserved.
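The adjacent-residue mutual information underlying each MIV entry can be estimated from empirical symbol frequencies. This sketch operates on a plain character sequence and omits the structural and solvent-accessibility encodings the paper uses:

```python
import numpy as np
from collections import Counter

def adjacent_mi(seq):
    """Mutual information (in bits) between adjacent positions of a sequence,
    estimated from empirical pair and single-symbol frequencies."""
    pairs = Counter(zip(seq, seq[1:]))
    singles = Counter(seq)
    n_pairs = len(seq) - 1
    n = len(seq)
    mi = 0.0
    for (a, b), c in pairs.items():
        p_ab = c / n_pairs
        p_a, p_b = singles[a] / n, singles[b] / n
        mi += p_ab * np.log2(p_ab / (p_a * p_b))
    return mi

# a strictly alternating sequence has high adjacent-residue MI;
# a more scrambled one has much less
mi_regular = adjacent_mi("ABABABABABABABAB")
mi_mixed = adjacent_mi("AABBABBABAABABBA")
```

A MIV would collect such values across several residue-property encodings of the same sequence, yielding the fixed-length vector fed to the Elman RNN.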
Characterization and classification of South American land cover types using satellite data
NASA Technical Reports Server (NTRS)
Townshend, J. R. G.; Justice, C. O.; Kalb, V.
1987-01-01
Various methods are compared for carrying out land cover classifications of South America using multitemporal Advanced Very High Resolution Radiometer data. Fifty-two images of the normalized difference vegetation index (NDVI) from a 1-year period are used to generate multitemporal data sets. Three main approaches to land cover classification are considered, namely the use of the principal components transformed images, the use of a characteristic curves procedure based on NDVI values plotted against time, and finally application of the maximum likelihood rule to multitemporal data sets. Comparison of results from training sites indicates that the last approach yields the most accurate results. Despite the reliance on training site figures for performance assessment, the results are nevertheless extremely encouraging, with accuracies for several cover types exceeding 90 per cent.
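The maximum likelihood rule applied to the multitemporal NDVI data can be sketched as follows; the per-class mean vectors and covariances are hypothetical stand-ins for statistics estimated from training sites:

```python
import numpy as np

def ml_classify(x, class_stats):
    """Gaussian maximum likelihood rule: assign a pixel's feature vector x to
    the class with the highest multivariate normal log-likelihood."""
    best, best_ll = None, -np.inf
    for cls, (mu, cov) in class_stats.items():
        diff = x - mu
        _, logdet = np.linalg.slogdet(cov)
        # log-likelihood up to a constant shared by all classes
        ll = -0.5 * (logdet + diff @ np.linalg.solve(cov, diff))
        if ll > best_ll:
            best, best_ll = cls, ll
    return best

# hypothetical per-class NDVI time-course statistics (mean vector, covariance)
stats = {
    "forest":    (np.array([0.7, 0.75, 0.8]), np.eye(3) * 0.01),
    "grassland": (np.array([0.3, 0.6, 0.2]),  np.eye(3) * 0.02),
}
label = ml_classify(np.array([0.68, 0.7, 0.82]), stats)
```

In the study's setting each feature vector would hold a full year of NDVI composites per pixel rather than the three values shown here.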
Delavarian, Mona; Towhidkhah, Farzad; Gharibzadeh, Shahriar; Dibajnia, Parvin
2011-07-12
Automatic classification of different behavioral disorders with many similarities (e.g. in symptoms) will help psychiatrists to concentrate on the correct disorder and its treatment as soon as possible, avoid wasting time on diagnosis, and increase diagnostic accuracy. In this study, we tried to differentiate and classify (diagnose) 306 children with many similar symptoms and different behavioral disorders, such as ADHD, depression, anxiety, comorbid depression and anxiety, and conduct disorder, with high accuracy. Classification was based on the symptoms and their severity. After examining 16 different available classifiers using "Prtools", we propose the nearest mean classifier as the most accurate, with 96.92% accuracy in this research. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
SoFoCles: feature filtering for microarray classification based on gene ontology.
Papachristoudis, Georgios; Diplaris, Sotiris; Mitkas, Pericles A
2010-02-01
Marker gene selection has been an important research topic in the classification analysis of gene expression data. Current methods try to reduce the "curse of dimensionality" by using statistical intra-feature set calculations, or classifiers that are based on the given dataset. In this paper, we present SoFoCles, an interactive tool that enables semantic feature filtering in microarray classification problems with the use of external, well-defined knowledge retrieved from the Gene Ontology. The notion of semantic similarity is used to derive genes that are involved in the same biological path during the microarray experiment, by enriching a feature set that has been initially produced with legacy methods. Among its other functionalities, SoFoCles offers a large repository of semantic similarity methods that are used in order to derive feature sets and marker genes. The structure and functionality of the tool are discussed in detail, as well as its ability to improve classification accuracy. Through experimental evaluation, SoFoCles is shown to outperform other classification schemes in terms of classification accuracy in two real datasets using different semantic similarity computation approaches.
Development of Vision Based Multiview Gait Recognition System with MMUGait Database
Ng, Hu; Tan, Wooi-Haw; Tong, Hau-Lee
2014-01-01
This paper describes the acquisition setup and development of a new gait database, MMUGait. This database consists of 82 subjects walking under normal conditions and 19 subjects walking with 11 covariate factors, which were captured under two views. This paper also proposes a multiview model-based gait recognition system with a joint detection approach that performs well under different walking trajectories and covariate factors, including self-occluded or externally occluded silhouettes. In the proposed system, the process begins by enhancing the human silhouette to remove artifacts. Next, the width and height of the body are obtained. Subsequently, the joint angular trajectories are determined once the body joints are automatically detected. Lastly, the crotch height and step size of the walking subject are determined. The extracted features are smoothed by a Gaussian filter to eliminate the effect of outliers. The extracted features are normalized with linear scaling, which is followed by feature selection prior to the classification process. The classification experiments carried out on the MMUGait database were benchmarked against the SOTON Small DB from the University of Southampton. Results showed a correct classification rate above 90% for all the databases. The proposed approach is found to outperform other approaches on SOTON Small DB in most cases. PMID:25143972
Evaluating the Visualization of What a Deep Neural Network Has Learned.
Samek, Wojciech; Binder, Alexander; Montavon, Gregoire; Lapuschkin, Sebastian; Muller, Klaus-Robert
Deep neural networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multilayer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision, given a new unseen data sample. Recently, several approaches have been proposed enabling one to understand and interpret the reasoning embodied in a DNN for a single test image. These methods quantify the "importance" of individual pixels with respect to the classification decision and allow a visualization in terms of a heatmap in pixel/input space. While the usefulness of heatmaps can be judged subjectively by a human, an objective quality measure is missing. In this paper, we present a general methodology based on region perturbation for evaluating ordered collections of pixels such as heatmaps. We compare heatmaps computed by three different methods on the SUN397, ILSVRC2012, and MIT Places data sets. Our main result is that the recently proposed layer-wise relevance propagation algorithm qualitatively and quantitatively provides a better explanation of what made a DNN arrive at a particular classification decision than the sensitivity-based approach or the deconvolution method. We provide theoretical arguments to explain this result and discuss its practical implications. Finally, we investigate the use of heatmaps for unsupervised assessment of the neural network performance.
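The region perturbation methodology can be sketched as follows: image regions are destroyed in order of decreasing heatmap relevance while the classifier score is tracked, and a faster score drop indicates a heatmap that ranked truly important regions first. The toy image, heatmap, and scoring function below are illustrative; the paper perturbs real network inputs:

```python
import numpy as np

def perturbation_curve(image, heatmap, score_fn, patch=4, steps=10, rng=None):
    """Destroy patches in decreasing heatmap relevance and record the
    classifier score after each perturbation step."""
    rng = np.random.default_rng(rng)
    h, w = image.shape
    ph, pw = h // patch, w // patch
    # relevance of each non-overlapping patch = sum of heatmap values inside it
    rel = heatmap[:ph * patch, :pw * patch].reshape(ph, patch, pw, patch).sum((1, 3))
    order = np.dstack(np.unravel_index(np.argsort(rel, axis=None)[::-1], rel.shape))[0]
    img = image.copy()
    scores = [score_fn(img)]
    for (i, j) in order[:steps]:
        # replace the most relevant remaining patch with random noise
        img[i*patch:(i+1)*patch, j*patch:(j+1)*patch] = rng.normal(size=(patch, patch))
        scores.append(score_fn(img))
    return np.array(scores)

# toy "classifier": score = mean of the top-left quadrant of the image
score = lambda im: im[:8, :8].mean()
img = np.ones((16, 16))
hm = np.zeros((16, 16)); hm[:8, :8] = 1.0  # heatmap points at that quadrant
curve = perturbation_curve(img, hm, score, patch=4, steps=4, rng=0)
```

The area over this perturbation curve is one way to summarize heatmap quality as a single number.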
Epileptic seizure detection in EEG signal using machine learning techniques.
Jaiswal, Abeg Kumar; Banka, Haider
2018-03-01
Epilepsy is a well-known nervous system disorder characterized by seizures. Electroencephalograms (EEGs), which capture brain neural activity, can detect epilepsy. Traditional methods for analyzing an EEG signal for epileptic seizure detection are time-consuming. Recently, several automated seizure detection frameworks using machine learning techniques have been proposed to replace these traditional methods. The two basic steps involved in machine learning are feature extraction and classification. Feature extraction reduces the input pattern space by keeping informative features, and the classifier assigns the appropriate class label. In this paper, we propose two effective approaches involving subpattern-based PCA (SpPCA) and cross-subpattern correlation-based PCA (SubXPCA) with a Support Vector Machine (SVM) for automated seizure detection in EEG signals. Feature extraction was performed using SpPCA and SubXPCA. Both techniques explore the subpattern correlation of EEG signals, which helps in the decision-making process. SVM is used for classification of seizure and non-seizure EEG signals. The SVM was trained with a radial basis kernel. All the experiments have been carried out on the benchmark epilepsy EEG dataset. The entire dataset consists of 500 EEG signals recorded under different scenarios. Seven different experimental cases for classification have been conducted. The classification accuracy was evaluated using tenfold cross-validation. The classification results of the proposed approaches have been compared with those of some existing techniques proposed in the literature to establish the claim.
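The paper's SpPCA and SubXPCA variants exploit subpattern structure; as a simplified stand-in, plain PCA feeding an RBF-kernel SVM scored by cross-validation looks like this. The surrogate amplitude-based "seizure" data is purely illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# surrogate epochs: "seizure" signals have larger amplitude than "non-seizure"
X = np.vstack([rng.normal(0, 1, (100, 64)), rng.normal(0, 3, (100, 64))])
y = np.array([0] * 100 + [1] * 100)

# PCA feature reduction followed by an RBF-kernel SVM, tenfold cross-validated
clf = make_pipeline(PCA(n_components=10), SVC(kernel="rbf"))
acc = cross_val_score(clf, X, y, cv=10).mean()
```

SpPCA/SubXPCA would replace the single PCA step with per-subpattern (and cross-subpattern) projections before classification.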
Deep learning for EEG-Based preference classification
NASA Astrophysics Data System (ADS)
Teo, Jason; Hou, Chew Lin; Mountstephens, James
2017-10-01
Electroencephalogram (EEG)-based emotion classification is rapidly becoming one of the most intensely studied areas of brain-computer interfacing (BCI). The ability to passively identify yet accurately correlate brainwaves with our immediate emotions opens up truly meaningful and previously unattainable human-computer interactions such as in forensic neuroscience, rehabilitative medicine, affective entertainment and neuro-marketing. One particularly useful yet rarely explored area of EEG-based emotion classification is preference recognition [1], which is simply the detection of like versus dislike. Within the limited investigations into preference classification, all reported studies were based on musically-induced stimuli except for a single study which used 2D images. The main objective of this study is to apply deep learning, which has been shown to produce state-of-the-art results in diverse hard problems such as computer vision, natural language processing and audio recognition, to 3D object preference classification over a larger group of test subjects. A cohort of 16 users was shown 60 bracelet-like objects as rotating visual stimuli on a computer display while their preferences and EEGs were recorded. After training a variety of machine learning approaches, which included deep neural networks, we then attempted to classify the users' preferences for the 3D visual stimuli based on their EEGs. Here, we show that deep learning outperforms a variety of other machine learning classifiers for this EEG-based preference classification task, particularly on a highly challenging dataset with large inter- and intra-subject variability.
NASA Astrophysics Data System (ADS)
Liu, Sijia; Sa, Ruhan; Maguire, Orla; Minderman, Hans; Chaudhary, Vipin
2015-03-01
Cytogenetic abnormalities are important diagnostic and prognostic criteria for acute myeloid leukemia (AML). A flow cytometry-based imaging approach for FISH in suspension (FISH-IS) was established that enables the automated analysis of a several-log-magnitude higher number of cells compared to microscopy-based approaches. Rotational positioning of cells can occur, leading to discordance in spot counts. To address counting errors caused by overlapping spots, a Gaussian Mixture Model (GMM) based classification method is proposed in this study. The Akaike information criterion (AIC) and Bayesian information criterion (BIC) of the GMM are used as global image features of this classification method. Using a Random Forest classifier, the results show that the proposed method is able to detect closely overlapping spots which cannot be separated by existing image-segmentation-based spot detection methods. The experimental results show that the proposed method yields a significant improvement in spot counting accuracy.
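Resolving overlapping spots by fitting Gaussian mixtures and comparing information criteria can be sketched with scikit-learn. The synthetic two-spot coordinates are illustrative, and this sketch stops at model selection (the paper additionally feeds AIC/BIC values to a Random Forest):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# toy FISH spot coordinates: two closely overlapping fluorescent spots
spots = np.vstack([
    rng.normal([0.0, 0.0], 0.3, (150, 2)),
    rng.normal([1.0, 0.2], 0.3, (150, 2)),
])

# fit GMMs with 1..4 components and pick the count minimizing BIC
bics = []
for k in range(1, 5):
    gmm = GaussianMixture(n_components=k, random_state=0).fit(spots)
    bics.append(gmm.bic(spots))
n_spots = int(np.argmin(bics)) + 1
```

Because BIC penalizes extra components, the fit recovers the true spot count even when a segmentation-based detector would merge the two spots into one blob.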
Wang, Yin; Li, Rudong; Zhou, Yuhua; Ling, Zongxin; Guo, Xiaokui; Xie, Lu; Liu, Lei
2016-01-01
Text data of 16S rRNA are informative for classifications of microbiota-associated diseases. However, the raw text data need to be systematically processed so that features for classification can be defined/extracted; moreover, the high-dimension feature spaces generated by the text data also pose an additional difficulty. Here we present a Phylogenetic Tree-Based Motif Finding algorithm (PMF) to analyze 16S rRNA text data. By integrating phylogenetic rules and other statistical indexes for classification, we can effectively reduce the dimension of the large feature spaces generated by the text datasets. Using the retrieved motifs in combination with common classification methods, we can discriminate different samples of both pneumonia and dental caries better than other existing methods. We extend the phylogenetic approaches to perform supervised learning on microbiota text data to discriminate the pathological states for pneumonia and dental caries. The results have shown that PMF may enhance the efficiency and reliability in analyzing high-dimension text data.
van Leth, Frank; den Heijer, Casper; Beerepoot, Mariëlle; Stobberingh, Ellen; Geerlings, Suzanne; Schultsz, Constance
2017-04-01
Increasing antimicrobial resistance (AMR) requires rapid surveillance tools, such as Lot Quality Assurance Sampling (LQAS). LQAS classifies AMR as high or low based on set parameters. We compared classifications with the underlying true AMR prevalence using data on 1335 Escherichia coli isolates from surveys of community-acquired urinary tract infection in women, by assessing operating curves, sensitivity and specificity. The sensitivity and specificity of any set of LQAS parameters were above 99% and between 79 and 90%, respectively. Operating curves showed high concordance of the LQAS classification with true AMR prevalence estimates. LQAS-based AMR surveillance is a feasible approach that provides timely and locally relevant estimates, as well as the necessary information to formulate and evaluate guidelines for empirical treatment.
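An LQAS design classifies prevalence as "high" when the number of resistant isolates in a fixed sample exceeds a decision threshold, so its operating curve is a binomial tail probability. The sample size and threshold below are hypothetical, not those of the study:

```python
from math import comb

def p_classify_high(n, d, p):
    """Probability that LQAS classifies resistance as 'high': the chance of
    observing more than d resistant isolates in a sample of n, given true
    prevalence p (upper binomial tail)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(d + 1, n + 1))

# hypothetical design: n=25 isolates, decision threshold d=7
for p in (0.1, 0.2, 0.3, 0.5):
    print(f"true prevalence {p:.0%}: P(classified high) = {p_classify_high(25, 7, p):.3f}")
```

Plotting this probability against the true prevalence gives the operating curve used to judge how sharply a given (n, d) pair separates low from high resistance settings.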
A land classification protocol for pollinator ecology research: An urbanization case study.
Samuelson, Ash E; Leadbeater, Ellouise
2018-06-01
Land-use change is one of the most important drivers of widespread declines in pollinator populations. Comprehensive quantitative methods for land classification are critical to understanding these effects, but co-option of existing human-focussed land classifications is often inappropriate for pollinator research. Here, we present a flexible GIS-based land classification protocol for pollinator research using a bottom-up approach driven by reference to pollinator ecology, with urbanization as a case study. Our multistep method involves manually generating land cover maps at multiple biologically relevant radii surrounding study sites using GIS, with a focus on identifying land cover types that have a specific relevance to pollinators. This is followed by a three-step refinement process using statistical tools: (i) definition of land-use categories, (ii) principal components analysis on the categories, and (iii) cluster analysis to generate a categorical land-use variable for use in subsequent analysis. Model selection is then used to determine the appropriate spatial scale for analysis. We demonstrate an application of our protocol using a case study of 38 sites across a gradient of urbanization in South-East England. In our case study, the land classification generated a categorical land-use variable at each of four radii based on the clustering of sites with different degrees of urbanization, open land, and flower-rich habitat. Studies of land-use effects on pollinators have historically employed a wide array of land classification techniques from descriptive and qualitative to complex and quantitative. We suggest that land-use studies in pollinator ecology should broadly adopt GIS-based multistep land classification techniques to enable robust analysis and aid comparative research. 
Our protocol offers a customizable approach that combines specific relevance to pollinator research with the potential for application to a wide range of ecological questions, including agroecological studies of pest control.
Johnson, Nathan T; Dhroso, Andi; Hughes, Katelyn J; Korkin, Dmitry
2018-06-25
The extent to which genes are expressed in the cell can be simplistically defined as a function of one or more factors of the environment, lifestyle, and genetics. RNA sequencing (RNA-Seq) is becoming a prevalent approach to quantify gene expression and is expected to yield better insights into a number of biological and biomedical questions than DNA microarrays. Most importantly, RNA-Seq allows expression to be quantified at both the gene and alternative splicing isoform levels. However, leveraging RNA-Seq data requires the development of new data mining and analytics methods. Supervised machine learning methods are commonly used approaches for biological data analysis and have recently gained attention for their applications to RNA-Seq data. In this work, we assess the utility of supervised learning methods trained on RNA-Seq data for a diverse range of biological classification tasks. We hypothesize that the isoform-level expression data is more informative for biological classification tasks than the gene-level expression data. Our large-scale assessment is done through utilizing multiple datasets, organisms, lab groups, and RNA-Seq analysis pipelines. Overall, we performed and assessed 61 biological classification problems that leverage three independent RNA-Seq datasets and include over 2,000 samples that come from multiple organisms, lab groups, and RNA-Seq analyses. These 61 problems include predictions of the tissue type, sex, or age of the sample, healthy or cancerous phenotypes, and the pathological tumor stage for the samples from the cancerous tissue. For each classification problem, the performance of three normalization techniques and six machine learning classifiers was explored. We find that for every single classification problem, the isoform-based classifiers outperform or are comparable with gene expression based methods.
The top-performing supervised learning techniques reached a near perfect classification accuracy, demonstrating the utility of supervised learning for RNA-Seq based data analysis. Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Particle Swarm Optimization approach to defect detection in armour ceramics.
Kesharaju, Manasa; Nagarajah, Romesh
2017-03-01
In this research, various extracted features were used in the development of an automated ultrasonic-sensor-based inspection system that enables defect classification in each ceramic component prior to despatch to the field. Classification is an important task, and the large number of irrelevant or redundant features commonly introduced into a dataset reduces a classifier's performance. Feature selection aims to reduce the dimensionality of the dataset while improving the performance of a classification system. In the context of a multi-criteria optimization problem (i.e. minimizing the classification error rate while reducing the number of features) such as the one discussed in this research, the literature suggests that evolutionary algorithms offer good results. Moreover, Particle Swarm Optimization (PSO) has not been explored in the field of classification of high-frequency ultrasonic signals. Hence, a binary-coded Particle Swarm Optimization (BPSO) technique is investigated for feature subset selection and for minimizing the classification error rate. In the proposed method, the population data is used as input to an Artificial Neural Network (ANN) based classification system to obtain the error rate, as the ANN serves as the evaluator of the PSO fitness function. Copyright © 2016. Published by Elsevier B.V.
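A binary-coded PSO wrapper for feature selection can be sketched as below. For brevity the ANN evaluator is swapped for a cheap 1-NN cross-validation fitness (an assumption, not the paper's setup); velocities stay real-valued and positions are re-binarized through a sigmoid at each step:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# synthetic data: first 3 of 10 features are informative, the rest are noise
n = 200
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, 10))
X[:, :3] += y[:, None] * 2.0

def fitness(bits):
    """Error rate of a 1-NN classifier on the selected feature subset,
    plus a small penalty per selected feature."""
    mask = bits.astype(bool)
    if not mask.any():
        return 1.0
    acc = cross_val_score(KNeighborsClassifier(1), X[:, mask], y, cv=3).mean()
    return (1 - acc) + 0.01 * mask.sum()

# binary PSO: real velocities, positions binarized through a sigmoid
n_particles, dim, iters = 12, 10, 15
vel = rng.normal(size=(n_particles, dim))
pos = (rng.random((n_particles, dim)) < 0.5).astype(float)
pbest, pbest_f = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[np.argmin(pbest_f)].copy()
for _ in range(iters):
    r1, r2 = rng.random((2, n_particles, dim))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = (rng.random((n_particles, dim)) < 1 / (1 + np.exp(-vel))).astype(float)
    f = np.array([fitness(p) for p in pos])
    improved = f < pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], f[improved]
    gbest = pbest[np.argmin(pbest_f)].copy()
```

The per-feature penalty encodes the second objective (fewer features) alongside the error rate, in the spirit of the multi-criteria formulation above.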
NASA Astrophysics Data System (ADS)
Fujita, Yusuke; Mitani, Yoshihiro; Hamamoto, Yoshihiko; Segawa, Makoto; Terai, Shuji; Sakaida, Isao
2017-03-01
Ultrasound imaging is a popular and non-invasive tool used in the diagnosis of liver disease. Cirrhosis is a chronic liver disease that can advance to liver cancer. Early detection and appropriate treatment are crucial to prevent liver cancer. However, ultrasound image analysis is very challenging because of the low signal-to-noise ratio of ultrasound images. To achieve high classification performance, the selection of training regions of interest (ROIs), which directly affects classification accuracy, is very important. The purpose of our study is high-accuracy cirrhosis detection using liver ultrasound images. In our previous works, training ROI selection by MILBoost and multiple-ROI classification based on the product rule were proposed to achieve high classification performance. In this article, we propose a self-training method to select training ROIs effectively. Evaluation experiments were performed to assess the effect of self-training, using both manually and automatically selected ROIs. Experimental results show that self-training on manually selected ROIs achieved higher classification performance than the other approaches, including our conventional methods. Manual ROI definition and sample selection are important for improving classification accuracy in cirrhosis detection using ultrasound images.
Multi-class SVM model for fMRI-based classification and grading of liver fibrosis
NASA Astrophysics Data System (ADS)
Freiman, M.; Sela, Y.; Edrei, Y.; Pappo, O.; Joskowicz, L.; Abramovitch, R.
2010-03-01
We present a novel non-invasive automatic method for the classification and grading of liver fibrosis from fMRI maps based on hepatic hemodynamic changes. This method automatically creates a model for liver fibrosis grading based on training datasets. Our supervised learning method evaluates hepatic hemodynamics from an anatomical MRI image and three T2*-W fMRI signal intensity time-course scans acquired during the breathing of air, air-carbon dioxide, and carbogen. It constructs a statistical model of liver fibrosis from these fMRI scans using a binary-based one-against-all multi-class Support Vector Machine (SVM) classifier. We evaluated the resulting classification model with the leave-one-out technique and compared it to both full multi-class SVM and K-Nearest Neighbor (KNN) classifications. Our experimental study analyzed 57 slice sets from 13 mice, and yielded a 98.2% separation accuracy between healthy and low-grade fibrotic subjects, and an overall accuracy of 84.2% for fibrosis grading. These results are better than the existing image-based methods, which can only discriminate between healthy and high-grade fibrosis subjects. With appropriate extensions, our method may be used for non-invasive classification and progression monitoring of liver fibrosis in human patients instead of more invasive approaches, such as biopsy or contrast-enhanced imaging.
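The one-against-all multi-class SVM with leave-one-out evaluation can be sketched with scikit-learn; the surrogate three-grade Gaussian clusters below stand in for the fMRI-derived hemodynamic features:

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
# surrogate fibrosis grades 0..2, each a Gaussian cluster in feature space
X = np.vstack([rng.normal(g * 2.0, 0.7, (20, 5)) for g in range(3)])
y = np.repeat([0, 1, 2], 20)

# one-against-all SVM, evaluated with the leave-one-out technique
clf = OneVsRestClassifier(SVC(kernel="linear"))
acc = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
```

The one-vs-rest wrapper trains one binary SVM per fibrosis grade and picks the grade whose decision function responds most strongly, which is the decomposition the abstract describes.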
Peng, Xiang; King, Irwin
2008-01-01
The Biased Minimax Probability Machine (BMPM) constructs a classifier for imbalanced learning tasks. It provides a worst-case bound on the probability of misclassification of future data points based on reliable estimates of the means and covariance matrices of the classes from the training data samples, and achieves promising performance. In this paper, we develop a novel yet critical extension training algorithm for BMPM that is based on Second-Order Cone Programming (SOCP). Moreover, we apply the biased classification model to medical diagnosis problems to demonstrate its usefulness. By removing some crucial assumptions in the original solution to this model, we make the new method more accurate and robust. We outline the theoretical derivation of the biased classification model and reformulate it into an SOCP problem, which can be solved efficiently with a global optimum guarantee. We evaluate our proposed SOCP-based BMPM (BMPMSOCP) scheme in comparison with traditional solutions on medical diagnosis tasks, where the objective is to improve the sensitivity (the accuracy of the more important class, say "ill" samples) rather than the overall classification accuracy. Empirical results have shown that our method is more effective and robust in handling imbalanced classification problems than traditional classification approaches and the original Fractional Programming-based BMPM (BMPMFP).
Land cover classification of Landsat 8 satellite data based on Fuzzy Logic approach
NASA Astrophysics Data System (ADS)
Taufik, Afirah; Sakinah Syed Ahmad, Sharifah
2016-06-01
The aim of this paper is to propose a method to classify the land covers of a satellite image based on a fuzzy rule-based system approach. The study uses bands in Landsat 8 and other indices, such as the Normalized Difference Water Index (NDWI), Normalized Difference Built-up Index (NDBI) and Normalized Difference Vegetation Index (NDVI), as input for the fuzzy inference system. The three selected indices represent our three main classes, called water, built-up land, and vegetation. The combination of the original multispectral bands and selected indices provides more information about the image. The parameter selection of the fuzzy membership functions is performed using a supervised method known as ANFIS (Adaptive Neuro-Fuzzy Inference System) training. The fuzzy system is tested on the classification of a land cover image covering the Klang Valley area. The results showed that the fuzzy system approach is effective and can be explored and implemented for other areas of Landsat data.
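The three index inputs can be computed from the band reflectances with the standard normalized-difference formulas. In the sketch below, a crisp argmax decision deliberately stands in for the paper's ANFIS-tuned fuzzy rules, and the band values in the usage comments are illustrative, not taken from the study.

```python
import numpy as np

def spectral_indices(green, red, nir, swir):
    """Standard normalized-difference indices from Landsat 8 band reflectances."""
    ndvi = (nir - red) / (nir + red)      # responds to vegetation
    ndwi = (green - nir) / (green + nir)  # responds to open water
    ndbi = (swir - nir) / (swir + nir)    # responds to built-up land
    return ndwi, ndbi, ndvi

def classify_pixel(green, red, nir, swir):
    """Assign the class whose index responds most strongly.
    A crisp argmax stands in here for the ANFIS-tuned fuzzy inference."""
    ndwi, ndbi, ndvi = spectral_indices(green, red, nir, swir)
    labels = ["water", "built-up", "vegetation"]
    return labels[int(np.argmax([ndwi, ndbi, ndvi]))]
```

For example, a pixel with high NIR relative to red reflectance yields a high NDVI and is labelled vegetation.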
NASA Astrophysics Data System (ADS)
Hoffmeister, Dirk; Kramm, Tanja; Curdt, Constanze; Maleki, Sedigheh; Khormali, Farhad; Kehl, Martin
2016-04-01
The Iranian loess plateau is covered by loess deposits up to 70 m thick. Tectonic uplift triggered deep erosion and valley incision into the loess and underlying marine deposits. Soil development strongly relates to the aspect of these incised slopes: on northern slopes vegetation protects the soil surface against erosion and facilitates the formation and preservation of a Cambisol, whereas on south-facing slopes soils were probably eroded and weakly developed Entisols formed. While the whole area is intensively stocked with sheep and goats, rain-fed cropping of winter wheat is practiced on the valley floors. For most of the year, the soil surface is unprotected against rainfall, which is one of the factors promoting soil erosion and serious flooding. However, little information is available on soil distribution, plant cover and the geomorphological evolution of the plateau, as well as on potentials and problems in land use. Thus, digital landform and soil mapping is needed. As a prerequisite for digital landform and soil mapping, four different landform classification methods were compared and evaluated. These geomorphometric classifications were run at two different scales. For the whole area, ASTER GDEM and SRTM datasets (30 m pixel resolution) were used. In addition, two high-resolution digital elevation models were derived from Pléiades satellite stereo-imagery (< 1 m pixel resolution, 10 by 10 km). The high-resolution information of this dataset was aggregated to datasets at 5 and 10 m scale. The applied classification methods are the Geomorphons approach, an object-based image approach, the topographic position index and a mainly slope-based approach. The accuracy of the classification was checked against a location-related image dataset obtained in a field survey (n ~ 150) in September 2015. The accuracy of the DEMs was compared to measured DGPS trenches and map-based elevation data.
The overall derived accuracy of the landform classification based on the high-resolution DEM is approximately 70% at a resolution of 5 m and >58% at a resolution of 10 m. For the 30 m resolution datasets, the achieved accuracy is approximately 40%, as several small-scale features are not recognizable at this resolution. Thus, for an accurate differentiation between the important landform types, high-resolution datasets are necessary for this strongly shaped area. One major problem of this approach is that each method derives different classes with various class annotations. The results of this evaluation will be taken into account in the derivation of landform and soil maps.
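One of the compared geomorphometric classifiers, the topographic position index, has a particularly compact core: each cell's elevation minus the mean elevation of its neighborhood. A minimal sketch follows; the neighborhood size and the inclusion of the center cell in the mean are assumptions, since TPI definitions vary.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def tpi(dem, size=3):
    """Topographic position index on a gridded DEM:
    cell elevation minus the mean elevation of its size x size neighborhood.
    Positive values indicate ridges/peaks, negative values valleys."""
    neighborhood_mean = uniform_filter(dem.astype(float), size=size)
    return dem - neighborhood_mean
```

Thresholding the TPI (often jointly with slope) then yields landform classes such as ridge, slope and valley; the threshold choice depends on the scale of interest, which is why the study compares 5, 10 and 30 m resolutions.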
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cavanaugh, J.E.; McQuarrie, A.D.; Shumway, R.H.
Conventional methods for discriminating between earthquakes and explosions at regional distances have concentrated on extracting specific features such as amplitude and spectral ratios from the waveforms of the P and S phases. We consider here an optimum nonparametric classification procedure derived from the classical approach to discriminating between two Gaussian processes with unequal spectra. Two robust variations based on the minimum discrimination information statistic and Renyi's entropy are also considered. We compare the optimum classification procedure with various amplitude and spectral ratio discriminants and show that its performance is superior when applied to a small population of 8 land-based earthquakes and 8 mining explosions recorded in Scandinavia. Several parametric characterizations of the notion of complexity based on modeling earthquakes and explosions as autoregressive or modulated autoregressive processes are also proposed and their performance compared with the nonparametric and feature extraction approaches.
NASA Astrophysics Data System (ADS)
Ceballos, G. A.; Hernández, L. F.
2015-04-01
Objective. The classical ERP-based speller, or P300 speller, is one of the most commonly used paradigms in the field of Brain Computer Interfaces (BCI). Several alterations to the visual stimulus presentation system have been developed to avoid unfavorable effects elicited by adjacent stimuli. However, there has been little, if any, regard for the useful information about the spatial location of target symbols that is contained in responses to adjacent stimuli. This paper aims to demonstrate that combining the classification of non-target adjacent stimuli with standard classification (target versus non-target) significantly improves classical ERP-based speller efficiency. Approach. Four SWLDA classifiers were trained and combined with the standard classifier: the lower row, upper row, right column and left column classifiers. This new feature extraction procedure and the classification method were carried out on three open databases: the UAM P300 database (Universidad Autonoma Metropolitana, Mexico), BCI competition II (dataset IIb) and BCI competition III (dataset II). Main results. The inclusion of the classification of non-target adjacent stimuli improves target classification in the classical row/column paradigm. A gain in mean single-trial classification of 9.6% and an overall improvement of 25% in simulated spelling speed were achieved. Significance. We have provided further evidence that the ERPs produced by adjacent stimuli present discriminable features, which could provide additional information about the spatial location of intended symbols. This work motivates the search for information in peripheral stimulation responses to improve the performance of emerging visual ERP-based spellers.
A Visual mining based framework for classification accuracy estimation
NASA Astrophysics Data System (ADS)
Arun, Pattathal Vijayakumar
2013-12-01
Classification techniques have been widely used in different remote sensing applications, and correct classification of mixed pixels is a tedious task. Traditional approaches adopt various statistical parameters but do not facilitate effective visualisation. Data mining tools are proving very helpful in the classification process. We propose a visual mining based framework for accuracy assessment of classification techniques using open source tools such as WEKA and PREFUSE. Used in integration, these tools provide an efficient approach for obtaining information about improvements in classification accuracy and help in refining the training data set. We have illustrated the framework by investigating the effects of various resampling methods on classification accuracy and found that bilinear (BL) resampling is best suited for preserving radiometric characteristics. We have also investigated the optimal number of folds required for effective analysis of LISS-IV images.
Visual affective classification by combining visual and text features.
Liu, Ningning; Wang, Kai; Jin, Xin; Gao, Boyang; Dellandréa, Emmanuel; Chen, Liming
2017-01-01
Affective analysis of images in social networks has drawn much attention, and the texts surrounding images are proven to provide valuable semantic meanings about image content, which can hardly be represented by low-level visual features. In this paper, we propose a novel approach for visual affective classification (VAC) task. This approach combines visual representations along with novel text features through a fusion scheme based on Dempster-Shafer (D-S) Evidence Theory. Specifically, we not only investigate different types of visual features and fusion methods for VAC, but also propose textual features to effectively capture emotional semantics from the short text associated to images based on word similarity. Experiments are conducted on three public available databases: the International Affective Picture System (IAPS), the Artistic Photos and the MirFlickr Affect set. The results demonstrate that the proposed approach combining visual and textual features provides promising results for VAC task.
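The Dempster-Shafer fusion step at the heart of this approach can be sketched generically with Dempster's rule of combination. The two example mass functions in the test (a "visual" and a "text" source over a two-label frame) are illustrative, not taken from the paper.

```python
from itertools import product

def ds_combine(m1, m2):
    """Dempster's rule of combination for two mass functions, each given as a
    {frozenset_of_labels: mass} dict over the same frame of discernment."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass that falls on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    # Normalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}
```

In the paper's setting, each classifier (visual or textual) would contribute one mass function per image, and the fused masses drive the final affective label.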
Unsupervised classification of operator workload from brain signals.
Schultze-Kraft, Matthias; Dähne, Sven; Gugler, Manfred; Curio, Gabriel; Blankertz, Benjamin
2016-06-01
In this study we aimed for the classification of operator workload as it is expected in many real-life workplace environments. We explored brain-signal based workload predictors that differ with respect to the level of label information required for training, including entirely unsupervised approaches. Subjects executed a task on a touch screen that required continuous effort of visual and motor processing with alternating difficulty. We first employed classical approaches for workload state classification that operate on the sensor space of EEG and compared those to the performance of three state-of-the-art spatial filtering methods: common spatial patterns (CSPs) analysis, which requires binary label information; source power co-modulation (SPoC) analysis, which uses the subjects' error rate as a target function; and canonical SPoC (cSPoC) analysis, which solely makes use of cross-frequency power correlations induced by different states of workload and thus represents an unsupervised approach. Finally, we investigated the effects of fusing brain signals and peripheral physiological measures (PPMs) and examined the added value for improving classification performance. Mean classification accuracies of 94%, 92% and 82% were achieved with CSP, SPoC and cSPoC, respectively. These methods outperformed the approaches that did not use spatial filtering and they extracted physiologically plausible components. The performance of the unsupervised cSPoC is significantly increased by augmenting it with PPM features. Our analyses ensured that the signal sources used for classification were of cortical origin and not contaminated with artifacts. Our findings show that workload states can be successfully differentiated from brain signals, even when less and less information from the experimental paradigm is used, thus paving the way for real-world applications in which label information may be noisy or entirely unavailable.
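Of the spatial-filtering methods compared, CSP has a compact closed-form core: a generalized eigendecomposition of the two class-average channel covariance matrices. A minimal sketch on synthetic multichannel trials follows; the trial shapes and the variance structure of the toy data are assumptions for illustration.

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_a, trials_b):
    """Common spatial patterns via the generalized eigendecomposition of the
    class-average channel covariance matrices.
    trials_*: arrays of shape (n_trials, n_channels, n_samples)."""
    cov_a = np.mean([np.cov(t) for t in trials_a], axis=0)
    cov_b = np.mean([np.cov(t) for t in trials_b], axis=0)
    evals, evecs = eigh(cov_a, cov_a + cov_b)   # generalized eigenproblem
    order = np.argsort(evals)[::-1]             # first filter maximizes class-a variance
    return evecs[:, order]
```

Projecting each trial through the leading filters and taking log-variances then gives the low-dimensional features fed to a classifier, which is the standard CSP pipeline for binary workload labels.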
Tree species classification in subtropical forests using small-footprint full-waveform LiDAR data
NASA Astrophysics Data System (ADS)
Cao, Lin; Coops, Nicholas C.; Innes, John L.; Dai, Jinsong; Ruan, Honghua; She, Guanghui
2016-07-01
The accurate classification of tree species is critical for the management of forest ecosystems, particularly subtropical forests, which are highly diverse and complex ecosystems. While airborne Light Detection and Ranging (LiDAR) technology offers significant potential to estimate forest structural attributes, the capacity of this new tool to classify species is less well known. In this research, full-waveform metrics were extracted by a voxel-based composite waveform approach and examined with a Random Forests classifier to discriminate six subtropical tree species (including Masson pine (Pinus massoniana Lamb.), Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.), Slash pine (Pinus elliottii Engelm.), Sawtooth oak (Quercus acutissima Carruth.) and Chinese holly (Ilex chinensis Sims.)) at three levels of discrimination. As part of the analysis, the optimal voxel size for modelling the composite waveforms was investigated, the most important predictor metrics for species classification were assessed and the effect of scan angle on species discrimination was examined. Results demonstrate that all tree species were classified with relatively high accuracy (68.6% for six classes, 75.8% for the four main species and 86.2% for conifers versus broadleaved trees). Full-waveform metrics (based on height of median energy, waveform distance and number of waveform peaks) demonstrated high classification importance and were stable among various voxel sizes. The results also suggest that the voxel-based approach can alleviate some of the issues associated with large scan angles. In summary, the results indicate that full-waveform LiDAR data have significant potential for tree species classification in subtropical forests.
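The Random Forests classification step can be sketched as below. The two synthetic clusters stand in for voxel-derived full-waveform metrics of two species groups (e.g. conifers versus broadleaves); the feature values, cluster separation and forest size are illustrative assumptions, not the study's data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Synthetic stand-ins for three full-waveform metrics per crown
# (e.g. height of median energy, waveform distance, number of peaks):
# two species groups as shifted Gaussian clusters.
X = np.vstack([rng.normal(0.0, 1.0, (40, 3)), rng.normal(2.5, 1.0, (40, 3))])
y = np.repeat([0, 1], 40)  # 0 = conifer-like, 1 = broadleaf-like

rf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(rf, X, y, cv=5)          # cross-validated accuracy
importances = rf.fit(X, y).feature_importances_   # metric importance ranking
```

The `feature_importances_` vector is the mechanism behind the paper's ranking of which waveform metrics matter most for species discrimination.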
Mission-Based Scenario Research: Experimental Design and Analysis
2011-08-10
With Army Research Lab; DCS Corporation, Alexandria, VA and Warren, MI. In this paper, we discuss a neuroimaging experiment that... Development and Engineering Center, Warren, MI; Kelvin S. Oie, PhD, Army Research Laboratory, Aberdeen Proving Ground, MD. In this paper, we... experiment, this paper will emphasize analyses that employ a pattern classification analysis approach. These classification examples aim to identify
Combining TerraSAR-X and SPOT-5 data for object-based landslide detection
NASA Astrophysics Data System (ADS)
Friedl, B.; Hölbling, D.; Füreder, P.
2012-04-01
Landslide detection and classification is an essential requirement in pre- and post-disaster hazard analysis. In earlier studies, landslide detection was often achieved through time-consuming and cost-intensive field surveys and visual orthophoto interpretation. Recent studies show that Earth Observation (EO) data offer new opportunities for fast, reliable and accurate landslide detection and classification, which may contribute to effective landslide monitoring and landslide hazard management. To ensure the fast recognition and classification of landslides at a regional scale, a (semi-)automated object-based landslide detection approach is established for a study site situated in the Huaguoshan catchment, Southern Taiwan. The study site exhibits a high vulnerability to landslides and debris flows, which are predominantly typhoon-induced. Through the integration of optical satellite data (SPOT-5 with 2.5 m GSD), SAR (Synthetic Aperture Radar) data (TerraSAR-X Spotlight with 2.95 m GSD) and digital elevation information (DEM with 5 m GSD), including its derived products (e.g. slope, curvature, flow accumulation), landslides may be examined more efficiently than by relying on single data sources only. The combination of optical and SAR data in an object-based image analysis (OBIA) domain for landslide detection and classification has not been investigated so far, even though SAR imagery shows valuable properties for landslide detection that differ from those of optical data (e.g. high sensitivity to surface roughness and soil moisture). The main purpose of this study is to recognize and analyze existing landslides by applying object-based image analysis making use of eCognition software. OBIA provides a framework for examining features defined by spectral, spatial, textural, contextual as well as hierarchical properties.
Objects are derived through image segmentation and serve as input for the classification process, which relies on transparent rulesets representing knowledge. Through class modeling, an iterative process of segmentation and classification, objects can be addressed individually in a region-specific manner. The presented approach is marked by the comprehensive use of available data sets from various sources. This full integration of optical, SAR and DEM data contributes to the development of a robust method, which makes use of the most appropriate characteristics (e.g. spectral, textural, contextual) of each data set. The proposed method contributes to more rapid and accurate landslide mapping in order to assist disaster and crisis management. SAR data especially proves useful in the aftermath of an event, as radar sensors are largely independent of illumination and weather conditions and data is therefore more likely to be available. The full data integration allows for a robust approach to the detection and classification of landslides. However, more research is needed to make the best of the integration of SAR data in an object-based environment and to make the approach more easily adaptable to different study sites and data.
Automatic Identification of Critical Follow-Up Recommendation Sentences in Radiology Reports
Yetisgen-Yildiz, Meliha; Gunn, Martin L.; Xia, Fei; Payne, Thomas H.
2011-01-01
Communication of follow-up recommendations when abnormalities are identified on imaging studies is prone to error. When recommendations are not systematically identified and promptly communicated to referrers, poor patient outcomes can result. Using information technology can improve communication and improve patient safety. In this paper, we describe a text processing approach that uses natural language processing (NLP) and supervised text classification methods to automatically identify critical recommendation sentences in radiology reports. To increase the classification performance we enhanced the simple unigram token representation approach with lexical, semantic, knowledge-base, and structural features. We tested different combinations of those features with the Maximum Entropy (MaxEnt) classification algorithm. Classifiers were trained and tested with a gold standard corpus annotated by a domain expert. We applied 5-fold cross validation and our best performing classifier achieved 95.60% precision, 79.82% recall, 87.0% F-score, and 99.59% classification accuracy in identifying the critical recommendation sentences in radiology reports. PMID:22195225
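The pipeline described (n-gram features fed to a MaxEnt classifier) can be sketched with scikit-learn, where logistic regression serves as the MaxEnt model. The toy sentences and labels below are invented stand-ins for the annotated radiology corpus, and this sketch uses only the simple unigram/bigram representation, omitting the paper's lexical, semantic, knowledge-base and structural enhancements.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented stand-ins for radiology report sentences (1 = critical recommendation).
sentences = [
    "recommend follow-up CT in 3 months to reassess the nodule",
    "suggest clinical correlation and repeat imaging",
    "follow-up ultrasound is advised for the cystic lesion",
    "the lungs are clear without focal consolidation",
    "no acute osseous abnormality is identified",
    "heart size is within normal limits",
]
labels = [1, 1, 1, 0, 0, 0]

# Unigram + bigram bag-of-words, then a MaxEnt (logistic regression) classifier.
model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(sentences, labels)
```

In the paper's setting, such a model is trained on an expert-annotated gold standard and evaluated with 5-fold cross validation rather than on its own training data.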
Multiclass fMRI data decoding and visualization using supervised self-organizing maps.
Hausfeld, Lars; Valente, Giancarlo; Formisano, Elia
2014-08-01
When multivariate pattern decoding is applied to fMRI studies entailing more than two experimental conditions, a most common approach is to transform the multiclass classification problem into a series of binary problems. Furthermore, for decoding analyses, classification accuracy is often the only outcome reported, although the topology of activation patterns in the high-dimensional feature space may provide additional insights into underlying brain representations. Here we propose to decode and visualize voxel patterns of fMRI datasets consisting of multiple conditions with a supervised variant of self-organizing maps (SSOMs). Using simulations and real fMRI data, we evaluated the performance of our SSOM-based approach. Specifically, the analysis of simulated fMRI data with varying signal-to-noise and contrast-to-noise ratio suggested that SSOMs perform better than a k-nearest-neighbor classifier for medium and large numbers of features (i.e. 250 to 1000 or more voxels) and similarly to support vector machines (SVMs) for small and medium numbers of features (i.e. 100 to 600 voxels). However, for a larger number of features (>800 voxels), SSOMs performed worse than SVMs. When applied to a challenging 3-class fMRI classification problem with datasets collected to examine the neural representation of three human voices at the individual speaker level, the SSOM-based algorithm was able to decode speaker identity from auditory cortical activation patterns. Classification performances were similar between SSOMs and other decoding algorithms; however, the ability to visualize decoding models and the underlying data topology of SSOMs promotes a more comprehensive understanding of classification outcomes. We further illustrated this visualization ability of SSOMs with a re-analysis of a dataset examining the representation of visual categories in the ventral visual cortex (Haxby et al., 2001).
This analysis showed that SSOMs could retrieve and visualize topography and neighborhood relations of the brain representation of eight visual categories. We conclude that SSOMs are particularly suited for decoding datasets consisting of more than two classes and are optimally combined with approaches that reduce the number of voxels used for classification (e.g. region-of-interest or searchlight approaches). Copyright © 2014. Published by Elsevier Inc.
The classification based on intrahepatic portal system for congenital portosystemic shunts.
Kanazawa, Hiroyuki; Nosaka, Shunsuke; Miyazaki, Osamu; Sakamoto, Seisuke; Fukuda, Akinari; Shigeta, Takanobu; Nakazawa, Atsuko; Kasahara, Mureo
2015-04-01
Liver transplantation was previously indicated as a curative operation for congenital absence of the portal vein. Recent advances in radiological interventional techniques can precisely visualize the architecture of the intrahepatic portal system (IHPS). Therefore, the therapeutic approach for congenital portosystemic shunt (CPS) needs to be reevaluated from the viewpoint of radiological appearance. The aim of this study was to propose an IHPS classification which could explain the pathophysiological characteristics and play a complementary role in the therapeutic approach and management of CPS. Nineteen patients with CPS were retrospectively reviewed. The median age at diagnosis was 6.8 years. Eighteen of these patients underwent angiography with a shunt occlusion test and were classified based on the severity of the hypoplasia of the IHPS. The eighteen patients who could undergo the shunt occlusion test were classified into mild (n=7), moderate (n=6) and severe (n=5) types according to the IHPS classification. The IHPS classification correlated with the portal venous pressure under shunt occlusion, the histopathological findings, postoperative portal venous flow and liver regeneration. Shunt closure resulted in dramatic improvement in the laboratory data and subclinical encephalopathy. Two patients with the severe type suffered from sepsis associated with portal hypertension after treatment, and from the portal flow steal phenomenon because of the development of unexpected collateral vessels. Patients with the severe type had a high risk of postoperative complications after shunt closure in one step, even if the portal venous pressure was relatively low during the shunt occlusion test. The IHPS could be visualized by the shunt occlusion test. The IHPS classification reflected the clinicopathological features of CPS, and was useful in determining the therapeutic approach and management for CPS. Copyright © 2015 Elsevier Inc. All rights reserved.
Xia, Jiaqi; Peng, Zhenling; Qi, Dawei; Mu, Hongbo; Yang, Jianyi
2017-03-15
Protein fold classification is a critical step in protein structure prediction. There are two possible ways to classify protein folds. One is through template-based fold assignment and the other is ab-initio prediction using machine learning algorithms. The combination of both solutions to improve prediction accuracy had not been explored before. We developed two algorithms, HH-fold and SVM-fold, for protein fold classification. HH-fold is a template-based fold assignment algorithm using the HHsearch program. SVM-fold is a support vector machine-based ab-initio classification algorithm, in which a comprehensive set of features is extracted from three complementary sequence profiles. These two algorithms are then combined, resulting in the ensemble approach TA-fold. We performed a comprehensive assessment of the proposed methods by comparing with ab-initio methods and template-based threading methods on six benchmark datasets. An accuracy of 0.799 was achieved by TA-fold on the DD dataset, which consists of proteins from 27 folds. This represents an improvement of 5.4-11.7% over ab-initio methods. After updating this dataset to include more proteins in the same folds, the accuracy increased to 0.971. In addition, TA-fold achieved >0.9 accuracy on a large dataset consisting of 6451 proteins from 184 folds. Experiments on the LE dataset show that TA-fold consistently outperforms other threading methods at the family, superfamily and fold levels. The success of TA-fold is attributed to the combination of template-based fold assignment and ab-initio classification using features from complementary sequence profiles that contain rich evolutionary information. Availability: http://yanglab.nankai.edu.cn/TA-fold/. Contact: yangjy@nankai.edu.cn or mhb-506@163.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
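The general shape of combining a template-based assignment with an ab-initio fallback can be sketched abstractly. The function name, score threshold and stub predictors below are illustrative assumptions, not TA-fold's actual decision rule.

```python
def combined_fold_assignment(template_hit, template_score, ab_initio_predict,
                             features, score_threshold=0.5):
    """Hybrid fold classification in the spirit of combining template-based
    assignment with ab-initio prediction: trust the template hit when its
    score clears the threshold, otherwise fall back to the machine-learning
    classifier. All names and the threshold are illustrative."""
    if template_hit is not None and template_score >= score_threshold:
        return template_hit
    return ab_initio_predict(features)
```

The design rationale is that template searches are highly precise when a confident hit exists, while the ab-initio classifier generalizes to targets without detectable templates, which is why the ensemble outperforms either component alone.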
Detection of lobular structures in normal breast tissue.
Apou, Grégory; Schaadt, Nadine S; Naegel, Benoît; Forestier, Germain; Schönmeyer, Ralf; Feuerhake, Friedrich; Wemmert, Cédric; Grote, Anne
2016-07-01
Ongoing research into inflammatory conditions raises an increasing need to evaluate immune cells in histological sections in biologically relevant regions of interest (ROIs). Herein, we compare different approaches to automatically detect lobular structures in human normal breast tissue in digitized whole slide images (WSIs). This automation is required to perform objective and consistent quantitative studies on large data sets. In normal breast tissue from nine healthy patients immunohistochemically stained for different markers, we evaluated and compared three different image analysis methods to automatically detect lobular structures in WSIs: (1) a bottom-up approach using cell-based data for subsequent tissue-level classification, (2) a top-down method starting with texture classification at tissue level followed by analysis of cell densities in specific ROIs, and (3) direct texture classification using deep learning technology. All three methods result in comparable overall quality, allowing automated detection of lobular structures with minor advantages in sensitivity (approach 3), specificity (approach 2), or processing time (approach 1). Combining the outputs of the approaches further improved the precision. Different approaches to automated ROI detection are feasible and should be selected according to the individual needs of biomarker research. Additionally, detected ROIs could be used as a basis for quantification of immune infiltration in lobular structures. Copyright © 2016 Elsevier Ltd. All rights reserved.
Variable Selection for Road Segmentation in Aerial Images
NASA Astrophysics Data System (ADS)
Warnke, S.; Bulatov, D.
2017-05-01
For the extraction of road pixels from combined image and elevation data, Wegner et al. (2015) proposed classification of superpixels into road and non-road, after which a refinement of the classification results using minimum cost paths and non-local optimization methods took place. We believed that the variable set used for classification was to a certain extent suboptimal, because many variables were redundant while several features known to be useful in photogrammetry and remote sensing were missing. This motivated us to implement a variable selection approach which builds a model for classification using portions of the training data and subsets of features, evaluates this model, updates the feature set, and terminates when a stopping criterion is satisfied. The choice of classifier is flexible; however, we tested the approach with Logistic Regression and Random Forests, and tailored the evaluation module to the chosen classifier. To guarantee a fair comparison, we kept the segment-based approach and most of the variables from the related work, but extended them with additional, mostly higher-level features. Applying these superior features, removing the redundant ones, as well as using more accurately acquired 3D data allowed us to keep stable or even to reduce the misclassification error on a challenging dataset.
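The selection loop described (build a model on subsets of features, evaluate, update the feature set, stop on a criterion) can be sketched as greedy forward selection with a logistic-regression wrapper. The stopping rule, data and classifier settings below are assumptions for illustration, not the authors' exact procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def forward_select(X, y, max_features=3):
    """Greedy wrapper selection: repeatedly add the feature that most improves
    cross-validated accuracy; terminate when no candidate improves the model."""
    selected, best_score = [], 0.0
    while len(selected) < max_features:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        scores = {
            j: cross_val_score(LogisticRegression(max_iter=1000),
                               X[:, selected + [j]], y, cv=5).mean()
            for j in candidates
        }
        j_best = max(scores, key=scores.get)
        if scores[j_best] <= best_score:
            break  # stopping criterion: no candidate improves the model
        selected.append(j_best)
        best_score = scores[j_best]
    return selected, best_score
```

A wrapper of this kind naturally drops redundant variables, since a feature correlated with an already-selected one adds little cross-validated accuracy and therefore never enters the set.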
Going Deeper With Contextual CNN for Hyperspectral Image Classification.
Lee, Hyungtae; Kwon, Heesung
2017-10-01
In this paper, we describe a novel deep convolutional neural network (CNN) that is deeper and wider than other existing deep networks for hyperspectral image classification. Unlike current state-of-the-art approaches in CNN-based hyperspectral image classification, the proposed network, called contextual deep CNN, can optimally explore local contextual interactions by jointly exploiting local spatio-spectral relationships of neighboring individual pixel vectors. The joint exploitation of the spatio-spectral information is achieved by a multi-scale convolutional filter bank used as an initial component of the proposed CNN pipeline. The initial spatial and spectral feature maps obtained from the multi-scale filter bank are then combined together to form a joint spatio-spectral feature map. The joint feature map representing rich spectral and spatial properties of the hyperspectral image is then fed through a fully convolutional network that eventually predicts the corresponding label of each pixel vector. The proposed approach is tested on three benchmark data sets: the Indian Pines data set, the Salinas data set, and the University of Pavia data set. Performance comparison shows enhanced classification performance of the proposed approach over the current state-of-the-art on the three data sets.
Figueroa, Rosa L; Flores, Christopher A
2016-08-01
Obesity is a chronic disease with an increasing impact on the world's population. In this work, we present a method of identifying obesity automatically using text mining techniques and information related to body weight measures and obesity comorbidities. We used a dataset of 3015 de-identified medical records that contain labels for two classification problems. The first classification problem distinguishes between obesity, overweight, normal weight, and underweight. The second classification problem differentiates between obesity types: super obesity, morbid obesity, severe obesity and moderate obesity. We used a Bag of Words approach to represent the records together with unigram and bigram representations of the features. We implemented two approaches: a hierarchical method and a nonhierarchical one. We used Support Vector Machine and Naïve Bayes together with ten-fold cross validation to evaluate and compare performances. Our results indicate that the hierarchical approach does not work as well as the nonhierarchical one. In general, our results show that Support Vector Machine obtains better performances than Naïve Bayes for both classification problems. We also observed that bigram representation improves performance compared with unigram representation.
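A minimal sketch of the Bag of Words representation with unigram and bigram features; the tokenizer and example record are hypothetical, and the study's actual pipeline fed such vectors to SVM and Naïve Bayes under ten-fold cross validation:

```python
from collections import Counter

def bag_of_words(text, use_bigrams=True):
    """Unigram (and optionally bigram) term counts for one record."""
    tokens = text.lower().split()
    feats = Counter(tokens)                                          # unigrams
    if use_bigrams:
        feats.update(" ".join(p) for p in zip(tokens, tokens[1:]))   # bigrams
    return feats

rec = "patient with morbid obesity and morbid obesity history"
f = bag_of_words(rec)
print(f["morbid obesity"])  # → 2
```

The bigram count captures multi-word cues ("morbid obesity") that a unigram model splits apart, which is consistent with the reported gain from bigram representations.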
Zwemmer, J N P; Berkhof, J; Castelijns, J A; Barkhof, F; Polman, C H; Uitdehaag, B M J
2006-10-01
Disease heterogeneity is a major issue in multiple sclerosis (MS). Classification of MS patients is usually based on clinical characteristics. More recently, a pathological classification has been presented. While clinical subtypes differ by magnetic resonance imaging (MRI) signature on a group level, a classification of individual MS patients based purely on MRI characteristics has not been presented so far. To investigate whether a restricted classification of MS patients can be made based on a combination of quantitative and qualitative MRI characteristics and to test whether the resulting subgroups are associated with clinical and laboratory characteristics. MRI examinations of the brain and spinal cord of 50 patients were scored for 21 quantitative and qualitative characteristics. Using latent class analysis, subgroups were identified, for whom disease characteristics and laboratory measures were compared. Latent class analysis revealed two subgroups that mainly differed in the extent of lesion confluency and MRI correlates of neuronal loss in the brain. Demographics and disease characteristics were comparable except for cognitive deficits. No correlations with laboratory measures were found. Latent class analysis offers a feasible approach for classifying subgroups of MS patients based on the presence of MRI characteristics. The reproducibility, longitudinal evolution and further clinical or prognostic relevance of the observed classification will have to be explored in a larger and independent sample of patients.
NASA Astrophysics Data System (ADS)
Liu, Tao; Abd-Elrahman, Amr
2018-05-01
Deep convolutional neural network (DCNN) requires massive training datasets to trigger its image classification power, while collecting training samples for remote sensing applications is usually an expensive process. When DCNN is simply implemented with traditional object-based image analysis (OBIA) for classification of Unmanned Aerial Systems (UAS) orthoimagery, its power may be undermined if the number of training samples is relatively small. This research aims to develop a novel OBIA classification approach that can take advantage of DCNN by enriching the training dataset automatically using multi-view data. Specifically, this study introduces a Multi-View Object-based classification using Deep convolutional neural network (MODe) method to process UAS images for land cover classification. MODe conducts the classification on multi-view UAS images instead of directly on the orthoimage, and obtains the final results via a voting procedure. 10-fold cross validation results show the mean overall classification accuracy increasing substantially, from 65.32% when DCNN was applied to the orthoimage to 82.08% when MODe was implemented. This study also compared the performances of the support vector machine (SVM) and random forest (RF) classifiers with DCNN under the traditional OBIA and the proposed multi-view OBIA frameworks. The results indicate that the advantage of DCNN over traditional classifiers in terms of accuracy is more obvious when these classifiers were applied within the proposed multi-view OBIA framework than within the traditional OBIA framework.
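MODe's voting procedure over per-view predictions amounts to a majority vote per ground object; the class labels below are made up for illustration:

```python
from collections import Counter

def multi_view_vote(view_predictions):
    """Fuse per-view class predictions for one ground object by majority vote."""
    votes = Counter(view_predictions)
    return votes.most_common(1)[0][0]

# Hypothetical per-view labels for one segment seen in five overlapping UAS images
print(multi_view_vote(["grass", "road", "grass", "grass", "tree"]))  # → grass
```

Because each object appears in several overlapping views, the vote both multiplies the effective training data and averages out per-view misclassifications.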
NASA Astrophysics Data System (ADS)
Khan, Faisal; Enzmann, Frieder; Kersten, Michael
2016-03-01
Image processing of X-ray-computed polychromatic cone-beam micro-tomography (μXCT) data of geological samples mainly involves artefact reduction and phase segmentation. For the former, the main beam-hardening (BH) artefact is removed by applying a best-fit quadratic surface algorithm to a given image data set (reconstructed slice), which minimizes the BH offsets of the attenuation data points from that surface. A Matlab code for this approach is provided in the Appendix. The final BH-corrected image is extracted from the residual data, i.e., from the difference between the surface elevation values and the original grey-scale values. For the segmentation, we propose a novel least-squares support vector machine (LS-SVM, an algorithm for pixel-based multi-phase classification) approach. A receiver operating characteristic (ROC) analysis was performed on BH-corrected and uncorrected samples to show that BH correction is in fact an important prerequisite for accurate multi-phase classification. The combination of the two approaches was thus used to successfully classify three multi-phase rock core samples of varying complexity.
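The best-fit quadratic surface step (the original provides Matlab code in its Appendix) can be sketched as a Python/NumPy analogue, under the assumption that the surface z = a + bx + cy + dx² + exy + fy² is fitted by ordinary least squares and the residual is kept as the BH-corrected slice:

```python
import numpy as np

def remove_beam_hardening(slice_img):
    """Fit a quadratic surface to a reconstructed slice and return the
    residual image, which removes the smooth beam-hardening offset."""
    h, w = slice_img.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    # Design matrix for z = a + b*x + c*y + d*x^2 + e*x*y + f*y^2
    A = np.stack([np.ones_like(x), x, y, x**2, x * y, y**2], axis=-1).reshape(-1, 6)
    coef, *_ = np.linalg.lstsq(A, slice_img.ravel(), rcond=None)
    surface = (A @ coef).reshape(h, w)
    return slice_img - surface

# Synthetic slice: quadratic BH "cupping" plus a small dense grain
img = np.fromfunction(lambda i, j: 0.01 * (i - 16) ** 2 + 0.01 * (j - 16) ** 2, (32, 32))
img[10:14, 10:14] += 5.0
corrected = remove_beam_hardening(img)
```

After correction the smooth cupping is gone while the grain survives as a high residual, which is what makes the subsequent LS-SVM phase classification reliable.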
Advances in Risk Classification and Treatment Strategies for Neuroblastoma
Pinto, Navin R.; Applebaum, Mark A.; Volchenboum, Samuel L.; Matthay, Katherine K.; London, Wendy B.; Ambros, Peter F.; Nakagawara, Akira; Berthold, Frank; Schleiermacher, Gudrun; Park, Julie R.; Valteau-Couanet, Dominique; Pearson, Andrew D.J.
2015-01-01
Risk-based treatment approaches for neuroblastoma have been ongoing for decades. However, the criteria used to define risk in various institutional and cooperative groups were disparate, limiting the ability to compare clinical trial results. To mitigate this problem and enhance collaborative research, homogenous pretreatment patient cohorts have been defined by the International Neuroblastoma Risk Group classification system. During the past 30 years, increasingly intensive, multimodality approaches have been developed to treat patients who are classified as high risk, whereas patients with low- or intermediate-risk neuroblastoma have received reduced therapy. This treatment approach has resulted in improved outcome, although survival for high-risk patients remains poor, emphasizing the need for more effective treatments. Increased knowledge regarding the biology and genetic basis of neuroblastoma has led to the discovery of druggable targets and promising, new therapeutic approaches. Collaborative efforts of institutions and international cooperative groups have led to advances in our understanding of neuroblastoma biology, refinements in risk classification, and stratified treatment strategies, resulting in improved outcome. International collaboration will be even more critical when evaluating therapies designed to treat small cohorts of patients with rare actionable mutations. PMID:26304901
Myint, Soe W.; Yuan, May; Cerveny, Randall S.; Giri, Chandra P.
2008-01-01
Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event, in both near real-time and post-event analyses. This paper aims to compare the accuracy of common image processing techniques for detecting tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite image and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and an object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on the Kappa coefficient calculated from error matrices which cross-tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the object-oriented approach exhibits the highest degree of accuracy in tornado damage detection. PCA and image differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the object-oriented approach performs significantly better, with 15-20% higher accuracy than the other two techniques. © 2008 by MDPI. PMID:27879757
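The Kappa coefficient used for the accuracy assessment is computed from the error matrix as κ = (p_o − p_e)/(1 − p_e); the 2×2 damage/no-damage matrix below is a hypothetical example, not the paper's data:

```python
def cohens_kappa(matrix):
    """Kappa coefficient from a square error (confusion) matrix:
    observed agreement corrected for chance agreement."""
    n = sum(sum(row) for row in matrix)
    p_o = sum(matrix[i][i] for i in range(len(matrix))) / n          # observed
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(col) for col in zip(*matrix)]
    p_e = sum(r * c for r, c in zip(row_tot, col_tot)) / (n * n)     # chance
    return (p_o - p_e) / (1 - p_e)

# Hypothetical error matrix (rows: reference, cols: classified)
m = [[40, 10],
     [5, 45]]
print(round(cohens_kappa(m), 3))  # → 0.7
```

Off-diagonal cells of the matrix correspond to the commission and omission errors mentioned in the abstract.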
An incremental approach to genetic-algorithms-based classification.
Guan, Sheng-Uei; Zhu, Fangming
2005-04-01
Incremental learning has been widely addressed in the machine learning literature to cope with learning tasks where the learning environment is ever changing or training samples become available over time. However, most research work explores incremental learning with statistical algorithms or neural networks, rather than evolutionary algorithms. The work in this paper employs genetic algorithms (GAs) as basic learning algorithms for incremental learning within one or more classifier agents in a multiagent environment. Four new approaches with different initialization schemes are proposed. They keep the old solutions and use an "integration" operation to integrate them with new elements to accommodate new attributes, while biased mutation and crossover operations are adopted to further evolve a reinforced solution. The simulation results on benchmark classification data sets show that the proposed approaches can deal with the arrival of new input attributes and integrate them with the original input space. It is also shown that the proposed approaches can be successfully used for incremental learning and improve classification rates as compared to the retraining GA. Possible applications for continuous incremental training and feature selection are also discussed.
[Research on the methods for multi-class kernel CSP-based feature extraction].
Wang, Jinjia; Zhang, Lingzhi; Hu, Bei
2012-04-01
To relax the presumption of strictly linear patterns in common spatial patterns (CSP), we studied the kernel CSP (KCSP). A new multi-class KCSP (MKCSP) approach is proposed in this paper, which combines the kernel approach with the multi-class CSP technique. In this approach, we used kernel spatial patterns for each class against all others, and extracted signal components specific to one condition from EEG data sets of multiple conditions. We then performed classification using a logistic linear classifier. The brain computer interface (BCI) competition III_3a data set was used in the experiment. The experiment showed that this approach can decompose raw EEG signals into spatial patterns extracted from multiple classes of single-trial EEG, and can obtain good classification results.
Obsessive compulsive and related disorders: comparing DSM-5 and ICD-11.
Marras, Anna; Fineberg, Naomi; Pallanti, Stefano
2016-08-01
Obsessive-compulsive disorder (OCD) has been recognized as characterized mainly by compulsivity rather than anxiety and, therefore, was removed from the anxiety disorders chapter and given its own in both the American Psychiatric Association (APA) Diagnostic and Statistical Manual of Mental Disorders (DSM-5) and the Beta Draft Version of the 11th revision of the World Health Organization (WHO) International Classification of Diseases (ICD-11). This revised clustering is based on increasing evidence of commonly affected neurocircuits across disorders, in contrast to previous classification systems based on interrater agreement. In this article, we focus on the classification of obsessive-compulsive and related disorders (OCRDs), examining the differences in approach adopted by these two nosological systems, with particular attention to the proposed changes in the forthcoming ICD-11. At this stage, notable differences from the previous revision are emerging in the ICD classification, apparently converging toward a reformulation of OCRDs that is closer to the DSM-5.
Group-Based Active Learning of Classification Models.
Luo, Zhipeng; Hauskrecht, Milos
2017-05-01
Learning of classification models from real-world data often requires additional human expert effort to annotate the data. However, this process can be rather costly and finding ways of reducing the human annotation effort is critical for this task. The objective of this paper is to develop and study new ways of providing human feedback for efficient learning of classification models by labeling groups of examples. Briefly, unlike traditional active learning methods that seek feedback on individual examples, we develop a new group-based active learning framework that solicits label information on groups of multiple examples. In order to describe groups in a user-friendly way, conjunctive patterns are used to compactly represent groups. Our empirical study on 12 UCI data sets demonstrates the advantages and superiority of our approach over both classic instance-based active learning work, as well as existing group-based active-learning methods.
Cluster Method Analysis of K. S. C. Image
NASA Technical Reports Server (NTRS)
Rodriguez, Joe, Jr.; Desai, M.
1997-01-01
Information obtained from satellite-based systems has moved to the forefront as a method for identifying many land cover types. Identification of different land features through remote sensing is an effective tool for regional and global assessment of geometric characteristics. Classification data acquired from remote sensing images have a wide variety of applications. In particular, analysis of remote sensing images has special applications in the classification of various types of vegetation. Results obtained from classification studies of a particular area or region serve toward a greater understanding of which parameters (ecological, temporal, etc.) affect the region being analyzed. In this paper, we distinguish between both types of classification approaches, although focus is given to the unsupervised classification method using 1987 Thematic Mapper (TM) images of Kennedy Space Center.
Identification of terrain cover using the optimum polarimetric classifier
NASA Technical Reports Server (NTRS)
Kong, J. A.; Swartz, A. A.; Yueh, H. A.; Novak, L. M.; Shin, R. T.
1988-01-01
A systematic approach for the identification of terrain media such as vegetation canopy, forest, and snow-covered fields is developed using the optimum polarimetric classifier. The covariance matrices for various terrain cover are computed from theoretical models of random medium by evaluating the scattering matrix elements. The optimal classification scheme makes use of a quadratic distance measure and is applied to classify a vegetation canopy consisting of both trees and grass. Experimentally measured data are used to validate the classification scheme. Analytical and Monte Carlo simulated classification errors using the fully polarimetric feature vector are compared with classification based on single features which include the phase difference between the VV and HH polarization returns. It is shown that the full polarimetric results are optimal and provide better classification performance than single feature measurements.
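The quadratic distance measure behind the optimal polarimetric classifier can be sketched as a Gaussian log-likelihood distance to each class; note that real polarimetric data involve complex-valued covariance matrices, whereas this toy example uses two real-valued features as stand-ins:

```python
import numpy as np

def quadratic_distance(x, mean, cov):
    """Quadratic (Gaussian log-likelihood) distance of feature vector x
    to a class described by its mean and covariance matrix."""
    d = x - mean
    return d @ np.linalg.inv(cov) @ d + np.log(np.linalg.det(cov))

def classify(x, classes):
    """Assign x to the class with the minimum quadratic distance."""
    return min(classes, key=lambda c: quadratic_distance(x, *classes[c]))

# Hypothetical 2-feature stand-ins for tree/grass polarimetric statistics
classes = {
    "trees": (np.array([3.0, 1.0]), np.eye(2)),
    "grass": (np.array([0.0, 0.0]), np.eye(2)),
}
print(classify(np.array([2.5, 0.8]), classes))  # → trees
```

Using the full feature vector in this distance, rather than a single feature such as the VV-HH phase difference, is what the abstract reports as the optimal choice.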
Predicting Mycobacterium tuberculosis Complex Clades Using Knowledge-Based Bayesian Networks
Bennett, Kristin P.
2014-01-01
We develop a novel approach for incorporating expert rules into Bayesian networks for classification of Mycobacterium tuberculosis complex (MTBC) clades. The proposed knowledge-based Bayesian network (KBBN) treats sets of expert rules as prior distributions on the classes. Unlike prior knowledge-based support vector machine approaches which require rules expressed as polyhedral sets, KBBN directly incorporates the rules without any modification. KBBN uses data to refine rule-based classifiers when the rule set is incomplete or ambiguous. We develop a predictive KBBN model for 69 MTBC clades found in the SITVIT international collection. We validate the approach using two testbeds that model knowledge of the MTBC obtained from two different experts and large DNA fingerprint databases to predict MTBC genetic clades and sublineages. These models represent strains of MTBC using high-throughput biomarkers called spacer oligonucleotide types (spoligotypes), since these are routinely gathered from MTBC isolates of tuberculosis (TB) patients. Results show that incorporating rules into problems can drastically increase classification accuracy if data alone are insufficient. The SITVIT KBBN is publicly available for use on the World Wide Web. PMID:24864238
In-vivo determination of chewing patterns using FBG and artificial neural networks
NASA Astrophysics Data System (ADS)
Pegorini, Vinicius; Zen Karam, Leandro; Rocha Pitta, Christiano S.; Ribeiro, Richardson; Simioni Assmann, Tangriani; Cardozo da Silva, Jean Carlos; Bertotti, Fábio L.; Kalinowski, Hypolito J.; Cardoso, Rafael
2015-09-01
This paper reports the process of pattern classification of the chewing process of ruminants. We propose a simplified signal processing scheme for optical fiber Bragg grating (FBG) sensors based on machine learning techniques. The FBG sensors measure the biomechanical forces during jaw movements, and an artificial neural network is responsible for classifying the associated chewing pattern. In this study, three patterns associated with dietary supplement, hay and ryegrass were considered. Additionally, two other events important for ingestive behavior studies were monitored: rumination and idle period. Experimental results show that the proposed approach to pattern classification was capable of differentiating the materials involved in the chewing process with a small classification error.
Rey, Sergio J.; Stephens, Philip A.; Laura, Jason R.
2017-01-01
Large data contexts present a number of challenges to optimal choropleth map classifiers. Application of optimal classifiers to a sample of the attribute space is one proposed solution. The properties of alternative sampling-based classification methods are examined through a series of Monte Carlo simulations. The impacts of spatial autocorrelation, number of desired classes, and form of sampling are shown to have significant impacts on the accuracy of map classifications. Tradeoffs between improved speed of the sampling approaches and loss of accuracy are also considered. The results suggest the possibility of guiding the choice of classification scheme as a function of the properties of large data sets.
Real-time and simultaneous control of artificial limbs based on pattern recognition algorithms.
Ortiz-Catalan, Max; Håkansson, Bo; Brånemark, Rickard
2014-07-01
The prediction of simultaneous limb motions is a highly desirable feature for the control of artificial limbs. In this work, we investigate different classification strategies for individual and simultaneous movements based on pattern recognition of myoelectric signals. Our results suggest that any classifier can be potentially employed in the prediction of simultaneous movements if arranged in a distributed topology. On the other hand, classifiers inherently capable of simultaneous predictions, such as the multi-layer perceptron (MLP), were found to be more cost effective, as they can be successfully employed in their simplest form. In the prediction of individual movements, the one-vs-one (OVO) topology was found to improve classification accuracy across different classifiers and it was therefore used to benchmark the benefits of simultaneous control. As opposed to previous work reporting only offline accuracy, the classification performance and the resulting controllability are evaluated in real time using the motion test and target achievement control (TAC) test, respectively. We propose a simultaneous classification strategy based on MLP that outperformed a top classifier for individual movements (LDA-OVO), thus improving the state-of-the-art classification approach. Furthermore, all the presented classification strategies and data collected in this study are freely available in BioPatRec, an open source platform for the development of advanced prosthetic control strategies.
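The one-vs-one (OVO) topology used to benchmark individual movements reduces multi-class prediction to pairwise binary votes; the classes, centers, and binary rule below are illustrative stand-ins, not the study's myoelectric classifiers:

```python
from itertools import combinations
from collections import Counter

def ovo_predict(x, classes, binary_predict):
    """One-vs-one topology: every pair of classes votes via a binary
    classifier; the class collecting the most votes wins."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[binary_predict(x, a, b)] += 1
    return votes.most_common(1)[0][0]

# Toy binary rule on a 1-D feature with hypothetical class centers
centers = {"rest": 0.0, "open": 1.0, "close": 2.0}
binary = lambda x, a, b: a if abs(x - centers[a]) < abs(x - centers[b]) else b
print(ovo_predict(1.2, list(centers), binary))  # → open
```

Each binary classifier only has to separate two movements, which is why the OVO arrangement improved accuracy across the different base classifiers tested.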
NASA Astrophysics Data System (ADS)
Ahmed, H. O. A.; Wong, M. L. D.; Nandi, A. K.
2018-01-01
Condition classification of rolling element bearings in rotating machines is important to prevent the breakdown of industrial machinery. A considerable amount of literature has been published on bearing fault classification. These studies aim to determine automatically the current status of a roller element bearing. Of these studies, methods based on compressed sensing (CS) have received some attention recently due to their ability to allow one to sample below the Nyquist sampling rate. This technology has many possible uses in machine condition monitoring and has been investigated as a possible approach for fault detection and classification in the compressed domain, i.e., without reconstructing the original signal. However, previous CS based methods have been found to be too weak for highly compressed data. The present paper explores computationally, for the first time, the effects of sparse autoencoder based over-complete sparse representations on the classification performance of highly compressed measurements of bearing vibration signals. For this study, the CS method was used to produce highly compressed measurements of the original bearing dataset. Then, an effective deep neural network (DNN) with an unsupervised feature learning algorithm based on a sparse autoencoder is used for learning over-complete sparse representations of these compressed datasets. Finally, the fault classification is achieved in two stages: pre-training classification based on a stacked autoencoder and softmax regression layer forms the deep net stage (the first stage), and re-training classification based on the backpropagation (BP) algorithm forms the fine-tuning stage (the second stage). The experimental results show that the proposed method is able to achieve high levels of accuracy even with extremely compressed measurements compared with the existing techniques.
A three-way approach for protein function classification.
Ur Rehman, Hafeez; Azam, Nouman; Yao, JingTao; Benso, Alfredo
2017-01-01
The knowledge of protein functions plays an essential role in understanding biological cells and has a significant impact on human life in areas such as personalized medicine, better crops and improved therapeutic interventions. Due to the expense and inherent difficulty of biological experiments, intelligent methods are generally relied upon for the automatic assignment of functions to proteins. Technological advancements in the field of biology are improving our understanding of biological processes and regularly yield new features and characteristics that better describe the role of proteins. These anticipated features should not be neglected or overlooked in designing more effective classification techniques. A key issue in this context, which is not being sufficiently addressed, is how to build effective classification models and approaches for protein function prediction by incorporating and taking advantage of the ever-evolving biological information. In this article, we propose a three-way decision making approach which provides provisions for seeking and incorporating future information. We considered probabilistic rough set based models such as Game-Theoretic Rough Sets (GTRS) and Information-Theoretic Rough Sets (ITRS) for inducing three-way decisions. An architecture of protein function classification with probabilistic rough set based three-way decisions is proposed and explained. Experiments are carried out on a Saccharomyces cerevisiae species dataset obtained from the Uniprot database with the corresponding functional classes extracted from the Gene Ontology (GO) database. The results indicate that as the level of biological information increases, the number of deferred cases is reduced while a similar level of accuracy is maintained. PMID:28234929
NASA Astrophysics Data System (ADS)
Luo, Juhua; Duan, Hongtao; Ma, Ronghua; Jin, Xiuliang; Li, Fei; Hu, Weiping; Shi, Kun; Huang, Wenjiang
2017-05-01
Spatial information on the dominant species of submerged aquatic vegetation (SAV) is essential for restoration projects in eutrophic lakes, especially eutrophic Taihu Lake, China. Mapping the distribution of SAV species is very challenging and difficult using only multispectral satellite remote sensing. In this study, we proposed an approach to map the distribution of seven dominant species of SAV in Taihu Lake. Our approach involved information on the life histories of the seven SAV species and eight distribution maps of SAV from February to October. The life history information of the dominant SAV species was summarized from the literature and field surveys. Eight distribution maps of the SAV were extracted from eight 30 m HJ-CCD images from February to October 2013 based on classification tree models, and the overall classification accuracies for the SAV were greater than 80%. Finally, the spatial distribution of the SAV species in Taihu Lake in 2013 was mapped using a multilayer erasing approach. Based on validation, the overall classification accuracy for the seven species was 68.4% and kappa was 0.6306, which suggests that larger differences in life histories between species can produce higher identification accuracies. The classification results show that Potamogeton malaianus was the most widely distributed species in Taihu Lake, followed by Myriophyllum spicatum, Potamogeton maackianus, Potamogeton crispus, Elodea nuttallii, Ceratophyllum demersum and Vallisneria spiralis. This information is useful for planning shallow-water habitat restoration projects.
Rosellini, Anthony J.; Brown, Timothy A.
2014-12-01
Limitations in anxiety and mood disorder diagnostic reliability and validity due to the categorical approach to classification used by the Diagnostic and Statistical Manual of Mental Disorders (DSM) have been long recognized. Although these limitations have led researchers to forward alternative classification schemes, few have been empirically evaluated. In a sample of 1,218 outpatients with anxiety and mood disorders, the present study examined the validity of Brown and Barlow's (2009) proposal to classify the anxiety and mood disorders using an integrated dimensional-categorical approach based on transdiagnostic emotional disorder vulnerabilities and phenotypes. Latent class analyses of seven transdiagnostic dimensional indicators suggested that a six-class (i.e., profile) solution provided the best model fit and was the most conceptually interpretable. Interpretation of the classes was further supported when compared with DSM-IV diagnoses (i.e., within-class prevalence of diagnoses, using diagnoses to predict class membership). In addition, hierarchical multiple regression models were used to demonstrate the incremental validity of the profiles; class probabilities consistently accounted for unique variance in anxiety and mood disorder outcomes above and beyond DSM diagnoses. These results provide support for the potential development and utility of a hybrid dimensional-categorical profile approach to anxiety and mood disorder classification. In particular, the availability of dimensional indicators and corresponding profiles may serve as a useful complement to DSM diagnoses for both researchers and clinicians. (c) 2014 APA, all rights reserved. PMID:25265416
Quantifying tolerance indicator values for common stream fish species of the United States
Meador, M.R.; Carlisle, D.M.
2007-01-01
The classification of fish species tolerance to environmental disturbance is often used as a means to assess ecosystem conditions. Its use, however, may be problematic because the approach to tolerance classification is based on subjective judgment. We analyzed fish and physicochemical data from 773 stream sites collected as part of the U.S. Geological Survey's National Water-Quality Assessment Program to calculate tolerance indicator values for 10 physicochemical variables using weighted averaging. Tolerance indicator values (TIVs) for ammonia, chloride, dissolved oxygen, nitrite plus nitrate, pH, phosphorus, specific conductance, sulfate, suspended sediment, and water temperature were calculated for 105 common fish species of the United States. Tolerance indicator values for specific conductance and sulfate were correlated (rho = 0.87), and thus, fish species may be co-tolerant to these water-quality variables. We integrated TIVs for each species into an overall tolerance classification for comparisons with judgment-based tolerance classifications. Principal components analysis indicated that the distinction between tolerant and intolerant classifications was determined largely by tolerance to suspended sediment, specific conductance, chloride, and total phosphorus. Factors such as water temperature, dissolved oxygen, and pH may not be as important in distinguishing between tolerant and intolerant classifications, but may help to segregate species classified as moderate. Empirically derived tolerance classifications were 58.8% in agreement with judgment-derived tolerance classifications. Canonical discriminant analysis revealed that few TIVs, primarily chloride, could discriminate among judgment-derived tolerance classifications of tolerant, moderate, and intolerant. To our knowledge, this is the first empirically based understanding of fish species tolerance for stream fishes in the United States.
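The weighted-averaging step at the heart of this approach is simple to state: a species' tolerance indicator value for a variable is the abundance-weighted mean of that variable across the sites where the species occurs. A minimal sketch, with hypothetical abundance and conductance values (not the study's data):

```python
import numpy as np

def tolerance_indicator_value(abundance, env_value):
    """Weighted-average tolerance indicator value (TIV) for one species:
    the mean of an environmental variable across sites, weighted by the
    species' abundance at each site."""
    abundance = np.asarray(abundance, dtype=float)
    env_value = np.asarray(env_value, dtype=float)
    return np.sum(abundance * env_value) / np.sum(abundance)

# Hypothetical example: abundance of one species at 4 sites, and
# specific conductance (uS/cm) measured at those sites.
abund = [10, 0, 5, 5]
cond = [800, 120, 600, 400]
tiv = tolerance_indicator_value(abund, cond)  # -> 650.0
```

A species abundant mainly at high-conductance sites receives a high TIV for specific conductance, which is the sense in which the value indicates tolerance.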
NASA Technical Reports Server (NTRS)
Jung, Jinha; Pasolli, Edoardo; Prasad, Saurabh; Tilton, James C.; Crawford, Melba M.
2014-01-01
Acquiring current, accurate land-use information is critical for monitoring and understanding the impact of anthropogenic activities on natural environments. Remote sensing technologies are of increasing importance because of their capability to acquire information for large areas in a timely manner, enabling decision makers to be more effective in complex environments. Although optical imagery has been shown to be successful for land cover classification, active sensors, such as light detection and ranging (LiDAR), have distinct capabilities that can be exploited to improve classification results. However, LiDAR data have not been fully exploited for land cover classification. Moreover, spatial-spectral classification has recently gained significant attention, since classification accuracy can be improved by extracting additional information from neighboring pixels. Although spatial information has been widely used for spectral data, less attention has been given to LiDAR data. In this work, a new framework for land cover classification using discrete return LiDAR data is proposed. Pseudo-waveforms are generated from the LiDAR data and processed by hierarchical segmentation. Spatial features are extracted in a region-based way using a new unsupervised strategy for multiple pruning of the segmentation hierarchy. The proposed framework is validated experimentally on a real dataset acquired in an urban area. It yields better classification results than approaches using basic LiDAR products such as the digital surface model and intensity image. Moreover, the proposed region-based feature extraction strategy improves classification accuracy in comparison with a more traditional window-based approach.
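The pseudo-waveform idea can be illustrated by binning discrete-return intensities by elevation into a vertical intensity profile for each grid cell. This is a minimal sketch under assumed binning choices; the published framework's actual pseudo-waveform construction may differ:

```python
import numpy as np

def pseudo_waveform(z, intensity, z_min, z_max, n_bins=50):
    """Bin discrete LiDAR return intensities by elevation to form a
    vertical intensity profile (a 'pseudo-waveform') for one grid cell."""
    edges = np.linspace(z_min, z_max, n_bins + 1)
    idx = np.clip(np.digitize(z, edges) - 1, 0, n_bins - 1)
    wf = np.zeros(n_bins)
    np.add.at(wf, idx, intensity)  # accumulate intensity per height bin
    return wf

# Hypothetical returns in one grid cell: elevations (m) and intensities.
z = np.array([2.0, 2.1, 7.8, 8.0, 8.2])
inten = np.array([40, 35, 60, 55, 50])
wf = pseudo_waveform(z, inten, z_min=0.0, z_max=10.0, n_bins=10)
# Low returns accumulate in bin 2; canopy returns in bins 7 and 8.
```

The resulting fixed-length profile can then be segmented hierarchically and summarized per region, as the abstract describes.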
ERIC Educational Resources Information Center
Lassnigg, Lorenz; Vogtenhuber, Stefan
2011-01-01
The empirical approach referred to in this article describes the relationship between education and training (ET) supply and employment in Austria; the use of the new ISCED (International Standard Classification of Education) fields of study variable makes this approach applicable abroad. The purpose is to explore a system that produces timely…
Khellal, Atmane; Ma, Hongbin; Fei, Qing
2018-05-09
The success of deep learning models, notably convolutional neural networks (CNNs), makes them the favored solution for object recognition systems in both the visible and infrared domains. However, the lack of training data in maritime ship research leads to poor performance due to overfitting. In addition, the back-propagation algorithm used to train CNNs is very slow and requires tuning many hyperparameters. To overcome these weaknesses, we introduce a new approach fully based on the Extreme Learning Machine (ELM) to learn useful CNN features and perform fast, accurate classification, which is suitable for infrared-based recognition systems. The proposed approach combines an ELM-based learning algorithm to train the CNN for discriminative feature extraction with an ELM-based ensemble for classification. Experimental results on the VAIS dataset, the largest dataset of maritime ship imagery, confirm that the proposed approach outperforms state-of-the-art models in terms of generalization performance and training speed. For instance, the proposed model is up to 950 times faster than traditional back-propagation-based training of convolutional neural networks, primarily for low-level feature extraction.
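The speed advantage of ELM over back-propagation comes from its training procedure: hidden-layer weights are drawn at random and fixed, and only the output weights are solved in closed form via a pseudo-inverse. A basic ELM sketch on a toy two-class problem (hypothetical data, not the VAIS imagery, and not the paper's CNN-feature variant):

```python
import numpy as np

rng = np.random.default_rng(42)

def elm_train(X, T, n_hidden=64):
    """Basic Extreme Learning Machine: random, fixed hidden-layer
    weights; output weights solved in closed form (no back-propagation)."""
    d = X.shape[1]
    W = rng.normal(size=(d, n_hidden))   # random input weights, never updated
    b = rng.normal(size=n_hidden)        # random biases
    H = np.tanh(X @ W + b)               # hidden activations
    beta = np.linalg.pinv(H) @ T         # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy two-class problem with a linear decision boundary.
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
T = np.eye(2)[y]                         # one-hot targets
W, b, beta = elm_train(X, T)
pred = elm_predict(X, W, b, beta).argmax(axis=1)
acc = (pred == y).mean()
```

Because training reduces to a single linear solve, there are no learning-rate or epoch hyperparameters to tune, which is the efficiency the abstract contrasts with back-propagation.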