Landcover Classification Using Deep Fully Convolutional Neural Networks
NASA Astrophysics Data System (ADS)
Wang, J.; Li, X.; Zhou, S.; Tang, J.
2017-12-01
Land cover classification has always been an essential application in remote sensing. Certain image features are needed for land cover classification whether it is based on pixel or object-based methods. Different from other machine learning methods, deep learning model not only extracts useful information from multiple bands/attributes, but also learns spatial characteristics. In recent years, deep learning methods have been developed rapidly and widely applied in image recognition, semantic understanding, and other application domains. However, there are limited studies applying deep learning methods in land cover classification. In this research, we used fully convolutional networks (FCN) as the deep learning model to classify land covers. The National Land Cover Database (NLCD) within the state of Kansas was used as training dataset and Landsat images were classified using the trained FCN model. We also applied an image segmentation method to improve the original results from the FCN model. In addition, the pros and cons between deep learning and several machine learning methods were compared and explored. Our research indicates: (1) FCN is an effective classification model with an overall accuracy of 75%; (2) image segmentation improves the classification results with better match of spatial patterns; (3) FCN has an excellent ability of learning which can attains higher accuracy and better spatial patterns compared with several machine learning methods.
Estimation of Lithological Classification in Taipei Basin: A Bayesian Maximum Entropy Method
NASA Astrophysics Data System (ADS)
Wu, Meng-Ting; Lin, Yuan-Chien; Yu, Hwa-Lung
2015-04-01
In environmental or other scientific applications, we must have a certain understanding of geological lithological composition. Because of restrictions of real conditions, only limited amount of data can be acquired. To find out the lithological distribution in the study area, many spatial statistical methods used to estimate the lithological composition on unsampled points or grids. This study applied the Bayesian Maximum Entropy (BME method), which is an emerging method of the geological spatiotemporal statistics field. The BME method can identify the spatiotemporal correlation of the data, and combine not only the hard data but the soft data to improve estimation. The data of lithological classification is discrete categorical data. Therefore, this research applied Categorical BME to establish a complete three-dimensional Lithological estimation model. Apply the limited hard data from the cores and the soft data generated from the geological dating data and the virtual wells to estimate the three-dimensional lithological classification in Taipei Basin. Keywords: Categorical Bayesian Maximum Entropy method, Lithological Classification, Hydrogeological Setting
NASA Technical Reports Server (NTRS)
Kim, H.; Swain, P. H.
1991-01-01
A method of classifying multisource data in remote sensing is presented. The proposed method considers each data source as an information source providing a body of evidence, represents statistical evidence by interval-valued probabilities, and uses Dempster's rule to integrate information based on multiple data source. The method is applied to the problems of ground-cover classification of multispectral data combined with digital terrain data such as elevation, slope, and aspect. Then this method is applied to simulated 201-band High Resolution Imaging Spectrometer (HIRIS) data by dividing the dimensionally huge data source into smaller and more manageable pieces based on the global statistical correlation information. It produces higher classification accuracy than the Maximum Likelihood (ML) classification method when the Hughes phenomenon is apparent.
Zhang, He-Hua; Yang, Liuyang; Liu, Yuchuan; Wang, Pin; Yin, Jun; Li, Yongming; Qiu, Mingguo; Zhu, Xueru; Yan, Fang
2016-11-16
The use of speech based data in the classification of Parkinson disease (PD) has been shown to provide an effect, non-invasive mode of classification in recent years. Thus, there has been an increased interest in speech pattern analysis methods applicable to Parkinsonism for building predictive tele-diagnosis and tele-monitoring models. One of the obstacles in optimizing classifications is to reduce noise within the collected speech samples, thus ensuring better classification accuracy and stability. While the currently used methods are effect, the ability to invoke instance selection has been seldomly examined. In this study, a PD classification algorithm was proposed and examined that combines a multi-edit-nearest-neighbor (MENN) algorithm and an ensemble learning algorithm. First, the MENN algorithm is applied for selecting optimal training speech samples iteratively, thereby obtaining samples with high separability. Next, an ensemble learning algorithm, random forest (RF) or decorrelated neural network ensembles (DNNE), is used to generate trained samples from the collected training samples. Lastly, the trained ensemble learning algorithms are applied to the test samples for PD classification. This proposed method was examined using a more recently deposited public datasets and compared against other currently used algorithms for validation. Experimental results showed that the proposed algorithm obtained the highest degree of improved classification accuracy (29.44%) compared with the other algorithm that was examined. Furthermore, the MENN algorithm alone was found to improve classification accuracy by as much as 45.72%. Moreover, the proposed algorithm was found to exhibit a higher stability, particularly when combining the MENN and RF algorithms. This study showed that the proposed method could improve PD classification when using speech data and can be applied to future studies seeking to improve PD classification methods.
Wavelet-based multicomponent denoising on GPU to improve the classification of hyperspectral images
NASA Astrophysics Data System (ADS)
Quesada-Barriuso, Pablo; Heras, Dora B.; Argüello, Francisco; Mouriño, J. C.
2017-10-01
Supervised classification allows handling a wide range of remote sensing hyperspectral applications. Enhancing the spatial organization of the pixels over the image has proven to be beneficial for the interpretation of the image content, thus increasing the classification accuracy. Denoising in the spatial domain of the image has been shown as a technique that enhances the structures in the image. This paper proposes a multi-component denoising approach in order to increase the classification accuracy when a classification method is applied. It is computed on multicore CPUs and NVIDIA GPUs. The method combines feature extraction based on a 1Ddiscrete wavelet transform (DWT) applied in the spectral dimension followed by an Extended Morphological Profile (EMP) and a classifier (SVM or ELM). The multi-component noise reduction is applied to the EMP just before the classification. The denoising recursively applies a separable 2D DWT after which the number of wavelet coefficients is reduced by using a threshold. Finally, inverse 2D-DWT filters are applied to reconstruct the noise free original component. The computational cost of the classifiers as well as the cost of the whole classification chain is high but it is reduced achieving real-time behavior for some applications through their computation on NVIDIA multi-GPU platforms.
A new tool for post-AGB SED classification
NASA Astrophysics Data System (ADS)
Bendjoya, P.; Suarez, O.; Galluccio, L.; Michel, O.
We present the results of an unsupervised classification method applied on a set of 344 spectral energy distributions (SED) of post-AGB stars extracted from the Torun catalogue of Galactic post-AGB stars. This method aims to find a new unbiased method for post-AGB star classification based on the information contained in the IR region of the SED (fluxes, IR excess, colours). We used the data from IRAS and MSX satellites, and from the 2MASS survey. We applied a classification method based on the construction of the dataset of a minimal spanning tree (MST) with the Prim's algorithm. In order to build this tree, different metrics have been tested on both flux and color indices. Our method is able to classify the set of 344 post-AGB stars in 9 distinct groups according to their SEDs.
A contour-based shape descriptor for biomedical image classification and retrieval
NASA Astrophysics Data System (ADS)
You, Daekeun; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.
2013-12-01
Contours, object blobs, and specific feature points are utilized to represent object shapes and extract shape descriptors that can then be used for object detection or image classification. In this research we develop a shape descriptor for biomedical image type (or, modality) classification. We adapt a feature extraction method used in optical character recognition (OCR) for character shape representation, and apply various image preprocessing methods to successfully adapt the method to our application. The proposed shape descriptor is applied to radiology images (e.g., MRI, CT, ultrasound, X-ray, etc.) to assess its usefulness for modality classification. In our experiment we compare our method with other visual descriptors such as CEDD, CLD, Tamura, and PHOG that extract color, texture, or shape information from images. The proposed method achieved the highest classification accuracy of 74.1% among all other individual descriptors in the test, and when combined with CSD (color structure descriptor) showed better performance (78.9%) than using the shape descriptor alone.
Pham, Tuyen Danh; Lee, Dong Eun; Park, Kang Ryoung
2017-07-08
Automatic recognition of banknotes is applied in payment facilities, such as automated teller machines (ATMs) and banknote counters. Besides the popular approaches that focus on studying the methods applied to various individual types of currencies, there have been studies conducted on simultaneous classification of banknotes from multiple countries. However, their methods were conducted with limited numbers of banknote images, national currencies, and denominations. To address this issue, we propose a multi-national banknote classification method based on visible-light banknote images captured by a one-dimensional line sensor and classified by a convolutional neural network (CNN) considering the size information of each denomination. Experiments conducted on the combined banknote image database of six countries with 62 denominations gave a classification accuracy of 100%, and results show that our proposed algorithm outperforms previous methods.
Pham, Tuyen Danh; Lee, Dong Eun; Park, Kang Ryoung
2017-01-01
Automatic recognition of banknotes is applied in payment facilities, such as automated teller machines (ATMs) and banknote counters. Besides the popular approaches that focus on studying the methods applied to various individual types of currencies, there have been studies conducted on simultaneous classification of banknotes from multiple countries. However, their methods were conducted with limited numbers of banknote images, national currencies, and denominations. To address this issue, we propose a multi-national banknote classification method based on visible-light banknote images captured by a one-dimensional line sensor and classified by a convolutional neural network (CNN) considering the size information of each denomination. Experiments conducted on the combined banknote image database of six countries with 62 denominations gave a classification accuracy of 100%, and results show that our proposed algorithm outperforms previous methods. PMID:28698466
Best Merge Region Growing with Integrated Probabilistic Classification for Hyperspectral Imagery
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Tilton, James C.
2011-01-01
A new method for spectral-spatial classification of hyperspectral images is proposed. The method is based on the integration of probabilistic classification within the hierarchical best merge region growing algorithm. For this purpose, preliminary probabilistic support vector machines classification is performed. Then, hierarchical step-wise optimization algorithm is applied, by iteratively merging regions with the smallest Dissimilarity Criterion (DC). The main novelty of this method consists in defining a DC between regions as a function of region statistical and geometrical features along with classification probabilities. Experimental results are presented on a 200-band AVIRIS image of the Northwestern Indiana s vegetation area and compared with those obtained by recently proposed spectral-spatial classification techniques. The proposed method improves classification accuracies when compared to other classification approaches.
NASA Astrophysics Data System (ADS)
Yu, Xin; Wen, Zongyong; Zhu, Zhaorong; Xia, Qiang; Shun, Lan
2016-06-01
Image classification will still be a long way in the future, although it has gone almost half a century. In fact, researchers have gained many fruits in the image classification domain, but there is still a long distance between theory and practice. However, some new methods in the artificial intelligence domain will be absorbed into the image classification domain and draw on the strength of each to offset the weakness of the other, which will open up a new prospect. Usually, networks play the role of a high-level language, as is seen in Artificial Intelligence and statistics, because networks are used to build complex model from simple components. These years, Bayesian Networks, one of probabilistic networks, are a powerful data mining technique for handling uncertainty in complex domains. In this paper, we apply Tree Augmented Naive Bayesian Networks (TAN) to texture classification of High-resolution remote sensing images and put up a new method to construct the network topology structure in terms of training accuracy based on the training samples. Since 2013, China government has started the first national geographical information census project, which mainly interprets geographical information based on high-resolution remote sensing images. Therefore, this paper tries to apply Bayesian network to remote sensing image classification, in order to improve image interpretation in the first national geographical information census project. In the experiment, we choose some remote sensing images in Beijing. Experimental results demonstrate TAN outperform than Naive Bayesian Classifier (NBC) and Maximum Likelihood Classification Method (MLC) in the overall classification accuracy. In addition, the proposed method can reduce the workload of field workers and improve the work efficiency. Although it is time consuming, it will be an attractive and effective method for assisting office operation of image interpretation.
Jiménez-Carvelo, Ana M; Pérez-Castaño, Estefanía; González-Casado, Antonio; Cuadros-Rodríguez, Luis
2017-04-15
A new method for differentiation of olive oil (independently of the quality category) from other vegetable oils (canola, safflower, corn, peanut, seeds, grapeseed, palm, linseed, sesame and soybean) has been developed. The analytical procedure for chromatographic fingerprinting of the methyl-transesterified fraction of each vegetable oil, using normal-phase liquid chromatography, is described and the chemometric strategies applied and discussed. Some chemometric methods, such as k-nearest neighbours (kNN), partial least squared-discriminant analysis (PLS-DA), support vector machine classification analysis (SVM-C), and soft independent modelling of class analogies (SIMCA), were applied to build classification models. Performance of the classification was evaluated and ranked using several classification quality metrics. The discriminant analysis, based on the use of one input-class, (plus a dummy class) was applied for the first time in this study. Copyright © 2016 Elsevier Ltd. All rights reserved.
An information-based network approach for protein classification
Wan, Xiaogeng; Zhao, Xin; Yau, Stephen S. T.
2017-01-01
Protein classification is one of the critical problems in bioinformatics. Early studies used geometric distances and polygenetic-tree to classify proteins. These methods use binary trees to present protein classification. In this paper, we propose a new protein classification method, whereby theories of information and networks are used to classify the multivariate relationships of proteins. In this study, protein universe is modeled as an undirected network, where proteins are classified according to their connections. Our method is unsupervised, multivariate, and alignment-free. It can be applied to the classification of both protein sequences and structures. Nine examples are used to demonstrate the efficiency of our new method. PMID:28350835
Pairwise Classifier Ensemble with Adaptive Sub-Classifiers for fMRI Pattern Analysis.
Kim, Eunwoo; Park, HyunWook
2017-02-01
The multi-voxel pattern analysis technique is applied to fMRI data for classification of high-level brain functions using pattern information distributed over multiple voxels. In this paper, we propose a classifier ensemble for multiclass classification in fMRI analysis, exploiting the fact that specific neighboring voxels can contain spatial pattern information. The proposed method converts the multiclass classification to a pairwise classifier ensemble, and each pairwise classifier consists of multiple sub-classifiers using an adaptive feature set for each class-pair. Simulated and real fMRI data were used to verify the proposed method. Intra- and inter-subject analyses were performed to compare the proposed method with several well-known classifiers, including single and ensemble classifiers. The comparison results showed that the proposed method can be generally applied to multiclass classification in both simulations and real fMRI analyses.
NASA Astrophysics Data System (ADS)
Haaf, Ezra; Barthel, Roland
2018-04-01
Classification and similarity based methods, which have recently received major attention in the field of surface water hydrology, namely through the PUB (prediction in ungauged basins) initiative, have not yet been applied to groundwater systems. However, it can be hypothesised, that the principle of "similar systems responding similarly to similar forcing" applies in subsurface hydrology as well. One fundamental prerequisite to test this hypothesis and eventually to apply the principle to make "predictions for ungauged groundwater systems" is efficient methods to quantify the similarity of groundwater system responses, i.e. groundwater hydrographs. In this study, a large, spatially extensive, as well as geologically and geomorphologically diverse dataset from Southern Germany and Western Austria was used, to test and compare a set of 32 grouping methods, which have previously only been used individually in local-scale studies. The resulting groupings are compared to a heuristic visual classification, which serves as a baseline. A performance ranking of these classification methods is carried out and differences in homogeneity of grouping results were shown, whereby selected groups were related to hydrogeological indices and geological descriptors. This exploratory empirical study shows that the choice of grouping method has a large impact on the object distribution within groups, as well as on the homogeneity of patterns captured in groups. The study provides a comprehensive overview of a large number of grouping methods, which can guide researchers when attempting similarity-based groundwater hydrograph classification.
NASA Astrophysics Data System (ADS)
Pahlavani, Parham; Bigdeli, Behnaz
2017-12-01
Hyperspectral images contain extremely rich spectral information that offer great potential to discriminate between various land cover classes. However, these images are usually composed of tens or hundreds of spectrally close bands, which result in high redundancy and great amount of computation time in hyperspectral classification. Furthermore, in the presence of mixed coverage pixels, crisp classifiers produced errors, omission and commission. This paper presents a mutual information-Dempster-Shafer system through an ensemble classification approach for classification of hyperspectral data. First, mutual information is applied to split data into a few independent partitions to overcome high dimensionality. Then, a fuzzy maximum likelihood classifies each band subset. Finally, Dempster-Shafer is applied to fuse the results of the fuzzy classifiers. In order to assess the proposed method, a crisp ensemble system based on a support vector machine as the crisp classifier and weighted majority voting as the crisp fusion method are applied on hyperspectral data. Furthermore, a dimension reduction system is utilized to assess the effectiveness of mutual information band splitting of the proposed method. The proposed methodology provides interesting conclusions on the effectiveness and potentiality of mutual information-Dempster-Shafer based classification of hyperspectral data.
Improved Hierarchical Optimization-Based Classification of Hyperspectral Images Using Shape Analysis
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Tilton, James C.
2012-01-01
A new spectral-spatial method for classification of hyperspectral images is proposed. The HSegClas method is based on the integration of probabilistic classification and shape analysis within the hierarchical step-wise optimization algorithm. First, probabilistic support vector machines classification is applied. Then, at each iteration two neighboring regions with the smallest Dissimilarity Criterion (DC) are merged, and classification probabilities are recomputed. The important contribution of this work consists in estimating a DC between regions as a function of statistical, classification and geometrical (area and rectangularity) features. Experimental results are presented on a 102-band ROSIS image of the Center of Pavia, Italy. The developed approach yields more accurate classification results when compared to previously proposed methods.
NASA Technical Reports Server (NTRS)
Benediktsson, Jon A.; Swain, Philip H.; Ersoy, Okan K.
1990-01-01
Neural network learning procedures and statistical classificaiton methods are applied and compared empirically in classification of multisource remote sensing and geographic data. Statistical multisource classification by means of a method based on Bayesian classification theory is also investigated and modified. The modifications permit control of the influence of the data sources involved in the classification process. Reliability measures are introduced to rank the quality of the data sources. The data sources are then weighted according to these rankings in the statistical multisource classification. Four data sources are used in experiments: Landsat MSS data and three forms of topographic data (elevation, slope, and aspect). Experimental results show that two different approaches have unique advantages and disadvantages in this classification application.
Semi-supervised vibration-based classification and condition monitoring of compressors
NASA Astrophysics Data System (ADS)
Potočnik, Primož; Govekar, Edvard
2017-09-01
Semi-supervised vibration-based classification and condition monitoring of the reciprocating compressors installed in refrigeration appliances is proposed in this paper. The method addresses the problem of industrial condition monitoring where prior class definitions are often not available or difficult to obtain from local experts. The proposed method combines feature extraction, principal component analysis, and statistical analysis for the extraction of initial class representatives, and compares the capability of various classification methods, including discriminant analysis (DA), neural networks (NN), support vector machines (SVM), and extreme learning machines (ELM). The use of the method is demonstrated on a case study which was based on industrially acquired vibration measurements of reciprocating compressors during the production of refrigeration appliances. The paper presents a comparative qualitative analysis of the applied classifiers, confirming the good performance of several nonlinear classifiers. If the model parameters are properly selected, then very good classification performance can be obtained from NN trained by Bayesian regularization, SVM and ELM classifiers. The method can be effectively applied for the industrial condition monitoring of compressors.
NASA Astrophysics Data System (ADS)
Mandal, Shyamapada; Santhi, B.; Sridhar, S.; Vinolia, K.; Swaminathan, P.
2017-06-01
In this paper, an online fault detection and classification method is proposed for thermocouples used in nuclear power plants. In the proposed method, the fault data are detected by the classification method, which classifies the fault data from the normal data. Deep belief network (DBN), a technique for deep learning, is applied to classify the fault data. The DBN has a multilayer feature extraction scheme, which is highly sensitive to a small variation of data. Since the classification method is unable to detect the faulty sensor; therefore, a technique is proposed to identify the faulty sensor from the fault data. Finally, the composite statistical hypothesis test, namely generalized likelihood ratio test, is applied to compute the fault pattern of the faulty sensor signal based on the magnitude of the fault. The performance of the proposed method is validated by field data obtained from thermocouple sensors of the fast breeder test reactor.
Multivariate classification of the infrared spectra of cell and tissue samples
DOE Office of Scientific and Technical Information (OSTI.GOV)
Haaland, D.M.; Jones, H.D.; Thomas, E.V.
1997-03-01
Infrared microspectroscopy of biopsied canine lymph cells and tissue was performed to investigate the possibility of using IR spectra coupled with multivariate classification methods to classify the samples as normal, hyperplastic, or neoplastic (malignant). IR spectra were obtained in transmission mode through BaF{sub 2} windows and in reflection mode from samples prepared on gold-coated microscope slides. Cytology and histopathology samples were prepared by a variety of methods to identify the optimal methods of sample preparation. Cytospinning procedures that yielded a monolayer of cells on the BaF{sub 2} windows produced a limited set of IR transmission spectra. These transmission spectra weremore » converted to absorbance and formed the basis for a classification rule that yielded 100{percent} correct classification in a cross-validated context. Classifications of normal, hyperplastic, and neoplastic cell sample spectra were achieved by using both partial least-squares (PLS) and principal component regression (PCR) classification methods. Linear discriminant analysis applied to principal components obtained from the spectral data yielded a small number of misclassifications. PLS weight loading vectors yield valuable qualitative insight into the molecular changes that are responsible for the success of the infrared classification. These successful classification results show promise for assisting pathologists in the diagnosis of cell types and offer future potential for {ital in vivo} IR detection of some types of cancer. {copyright} {ital 1997} {ital Society for Applied Spectroscopy}« less
NASA Astrophysics Data System (ADS)
Sridhar, J.
2015-12-01
The focus of this work is to examine polarimetric decomposition techniques primarily focussed on Pauli decomposition and Sphere Di-Plane Helix (SDH) decomposition for forest resource assessment. The data processing methods adopted are Pre-processing (Geometric correction and Radiometric calibration), Speckle Reduction, Image Decomposition and Image Classification. Initially to classify forest regions, unsupervised classification was applied to determine different unknown classes. It was observed K-means clustering method gave better results in comparison with ISO Data method.Using the algorithm developed for Radar Tools, the code for decomposition and classification techniques were applied in Interactive Data Language (IDL) and was applied to RISAT-1 image of Mysore-Mandya region of Karnataka, India. This region is chosen for studying forest vegetation and consists of agricultural lands, water and hilly regions. Polarimetric SAR data possess a high potential for classification of earth surface.After applying the decomposition techniques, classification was done by selecting region of interests andpost-classification the over-all accuracy was observed to be higher in the SDH decomposed image, as it operates on individual pixels on a coherent basis and utilises the complete intrinsic coherent nature of polarimetric SAR data. Thereby, making SDH decomposition particularly suited for analysis of high-resolution SAR data. The Pauli Decomposition represents all the polarimetric information in a single SAR image however interpretation of the resulting image is difficult. The SDH decomposition technique seems to produce better results and interpretation as compared to Pauli Decomposition however more quantification and further analysis are being done in this area of research. The comparison of Polarimetric decomposition techniques and evolutionary classification techniques will be the scope of this work.
Fuzzy support vector machine for microarray imbalanced data classification
NASA Astrophysics Data System (ADS)
Ladayya, Faroh; Purnami, Santi Wulan; Irhamah
2017-11-01
DNA microarrays are data containing gene expression with small sample sizes and high number of features. Furthermore, imbalanced classes is a common problem in microarray data. This occurs when a dataset is dominated by a class which have significantly more instances than the other minority classes. Therefore, it is needed a classification method that solve the problem of high dimensional and imbalanced data. Support Vector Machine (SVM) is one of the classification methods that is capable of handling large or small samples, nonlinear, high dimensional, over learning and local minimum issues. SVM has been widely applied to DNA microarray data classification and it has been shown that SVM provides the best performance among other machine learning methods. However, imbalanced data will be a problem because SVM treats all samples in the same importance thus the results is bias for minority class. To overcome the imbalanced data, Fuzzy SVM (FSVM) is proposed. This method apply a fuzzy membership to each input point and reformulate the SVM such that different input points provide different contributions to the classifier. The minority classes have large fuzzy membership so FSVM can pay more attention to the samples with larger fuzzy membership. Given DNA microarray data is a high dimensional data with a very large number of features, it is necessary to do feature selection first using Fast Correlation based Filter (FCBF). In this study will be analyzed by SVM, FSVM and both methods by applying FCBF and get the classification performance of them. Based on the overall results, FSVM on selected features has the best classification performance compared to SVM.
Hierarchical classification method and its application in shape representation
NASA Astrophysics Data System (ADS)
Ireton, M. A.; Oakley, John P.; Xydeas, Costas S.
1992-04-01
In this paper we describe a technique for performing shaped-based content retrieval of images from a large database. In order to be able to formulate such user-generated queries about visual objects, we have developed an hierarchical classification technique. This hierarchical classification technique enables similarity matching between objects, with the position in the hierarchy signifying the level of generality to be used in the query. The classification technique is unsupervised, robust, and general; it can be applied to any suitable parameter set. To establish the potential of this classifier for aiding visual querying, we have applied it to the classification of the 2-D outlines of leaves.
NASA Astrophysics Data System (ADS)
Juniati, E.; Arrofiqoh, E. N.
2017-09-01
Information extraction from remote sensing data especially land cover can be obtained by digital classification. In practical some people are more comfortable using visual interpretation to retrieve land cover information. However, it is highly influenced by subjectivity and knowledge of interpreter, also takes time in the process. Digital classification can be done in several ways, depend on the defined mapping approach and assumptions on data distribution. The study compared several classifiers method for some data type at the same location. The data used Landsat 8 satellite imagery, SPOT 6 and Orthophotos. In practical, the data used to produce land cover map in 1:50,000 map scale for Landsat, 1:25,000 map scale for SPOT and 1:5,000 map scale for Orthophotos, but using visual interpretation to retrieve information. Maximum likelihood Classifiers (MLC) which use pixel-based and parameters approach applied to such data, and also Artificial Neural Network classifiers which use pixel-based and non-parameters approach applied too. Moreover, this study applied object-based classifiers to the data. The classification system implemented is land cover classification on Indonesia topographic map. The classification applied to data source, which is expected to recognize the pattern and to assess consistency of the land cover map produced by each data. Furthermore, the study analyse benefits and limitations the use of methods.
Rasch, Elizabeth K; Hirsch, Rosemarie; Paulose-Ram, Ryne; Hochberg, Marc C
2003-04-01
To determine prevalence estimates for rheumatoid arthritis (RA) in noninstitutionalized older adults in the US. Prevalence estimates were compared using 3 different classification methods based on current classification criteria for RA. Data from the Third National Health and Nutrition Examination Survey (NHANES-III) were used to generate prevalence estimates by 3 classification methods in persons 60 years of age and older (n = 5,302). Method 1 applied the "n of k" rule, such that subjects who met 3 of 6 of the American College of Rheumatology (ACR) 1987 criteria were classified as having RA (data from hand radiographs were not available). In method 2, the ACR classification tree algorithm was applied. For method 3, medication data were used to augment case identification via method 2. Population prevalence estimates and 95% confidence intervals (95% CIs) were determined using the 3 methods on data stratified by sex, race/ethnicity, age, and education. Overall prevalence estimates using the 3 classification methods were 2.03% (95% CI 1.30-2.76), 2.15% (95% CI 1.43-2.87), and 2.34% (95% CI 1.66-3.02), respectively. The prevalence of RA was generally greater in the following groups: women, Mexican Americans, respondents with less education, and respondents who were 70 years of age and older. The prevalence of RA in persons 60 years of age and older is approximately 2%, representing the proportion of the US elderly population who will most likely require medical intervention because of disease activity. Different classification methods yielded similar prevalence estimates, although detection of RA was enhanced by incorporation of data on use of prescription medications, an important consideration in large population surveys.
Behavior Based Social Dimensions Extraction for Multi-Label Classification
Li, Le; Xu, Junyi; Xiao, Weidong; Ge, Bin
2016-01-01
Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes’ behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes’ connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions. PMID:27049849
NASA Astrophysics Data System (ADS)
Ranaie, Mehrdad; Soffianian, Alireza; Pourmanafi, Saeid; Mirghaffari, Noorollah; Tarkesh, Mostafa
2018-03-01
In recent decade, analyzing the remotely sensed imagery is considered as one of the most common and widely used procedures in the environmental studies. In this case, supervised image classification techniques play a central role. Hence, taking a high resolution Worldview-3 over a mixed urbanized landscape in Iran, three less applied image classification methods including Bagged CART, Stochastic gradient boosting model and Neural network with feature extraction were tested and compared with two prevalent methods: random forest and support vector machine with linear kernel. To do so, each method was run ten time and three validation techniques was used to estimate the accuracy statistics consist of cross validation, independent validation and validation with total of train data. Moreover, using ANOVA and Tukey test, statistical difference significance between the classification methods was significantly surveyed. In general, the results showed that random forest with marginal difference compared to Bagged CART and stochastic gradient boosting model is the best performing method whilst based on independent validation there was no significant difference between the performances of classification methods. It should be finally noted that neural network with feature extraction and linear support vector machine had better processing speed than other.
NASA Astrophysics Data System (ADS)
Mohamed, Abdul Aziz; Hasan, Abu Bakar; Ghazali, Abu Bakar Mhd.
2017-01-01
Classification of large data into respected classes or groups could be carried out with the help of artificial intelligence (AI) tools readily available in the market. To get the optimum or best results, optimization tool could be applied on those data. Classification and optimization have been used by researchers throughout their works, and the outcomes were very encouraging indeed. Here, the authors are trying to share what they have experienced in three different areas of applied research.
Atzori, Manfredo; Cognolato, Matteo; Müller, Henning
2016-01-01
Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed to make several tests in order to evaluate the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate if larger networks can increase sEMG classification accuracy too. PMID:27656140
Atzori, Manfredo; Cognolato, Matteo; Müller, Henning
2016-01-01
Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed to make several tests in order to evaluate the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate if larger networks can increase sEMG classification accuracy too.
Arrhythmia Classification Based on Multi-Domain Feature Extraction for an ECG Recognition System.
Li, Hongqiang; Yuan, Danyang; Wang, Youxi; Cui, Dianyin; Cao, Lu
2016-10-20
Automatic recognition of arrhythmias is particularly important in the diagnosis of heart diseases. This study presents an electrocardiogram (ECG) recognition system based on multi-domain feature extraction to classify ECG beats. An improved wavelet threshold method for ECG signal pre-processing is applied to remove noise interference. A novel multi-domain feature extraction method is proposed; this method employs kernel-independent component analysis in nonlinear feature extraction and uses discrete wavelet transform to extract frequency domain features. The proposed system utilises a support vector machine classifier optimized with a genetic algorithm to recognize different types of heartbeats. An ECG acquisition experimental platform, in which ECG beats are collected as ECG data for classification, is constructed to demonstrate the effectiveness of the system in ECG beat classification. The presented system, when applied to the MIT-BIH arrhythmia database, achieves a high classification accuracy of 98.8%. Experimental results based on the ECG acquisition experimental platform show that the system obtains a satisfactory classification accuracy of 97.3% and is able to classify ECG beats efficiently for the automatic identification of cardiac arrhythmias.
Arrhythmia Classification Based on Multi-Domain Feature Extraction for an ECG Recognition System
Li, Hongqiang; Yuan, Danyang; Wang, Youxi; Cui, Dianyin; Cao, Lu
2016-01-01
Automatic recognition of arrhythmias is particularly important in the diagnosis of heart diseases. This study presents an electrocardiogram (ECG) recognition system based on multi-domain feature extraction to classify ECG beats. An improved wavelet threshold method for ECG signal pre-processing is applied to remove noise interference. A novel multi-domain feature extraction method is proposed; this method employs kernel-independent component analysis in nonlinear feature extraction and uses discrete wavelet transform to extract frequency domain features. The proposed system utilises a support vector machine classifier optimized with a genetic algorithm to recognize different types of heartbeats. An ECG acquisition experimental platform, in which ECG beats are collected as ECG data for classification, is constructed to demonstrate the effectiveness of the system in ECG beat classification. The presented system, when applied to the MIT-BIH arrhythmia database, achieves a high classification accuracy of 98.8%. Experimental results based on the ECG acquisition experimental platform show that the system obtains a satisfactory classification accuracy of 97.3% and is able to classify ECG beats efficiently for the automatic identification of cardiac arrhythmias. PMID:27775596
Using RNA Sequencing to Classify Organisms into Three Primary Kingdoms.
ERIC Educational Resources Information Center
Evans, Robert H.
1983-01-01
Using the biochemical record to class archaebacteria, eukaryotes, and eubacteria involves abstractions difficult for the concrete learner. Therefore, a method is provided in which students discover some basic tenets of biochemical classification and apply them in a "hands-on" classification problem. The method involves use of RNA…
Developing collaborative classifiers using an expert-based model
Mountrakis, G.; Watts, R.; Luo, L.; Wang, Jingyuan
2009-01-01
This paper presents a hierarchical, multi-stage adaptive strategy for image classification. We iteratively apply various classification methods (e.g., decision trees, neural networks), identify regions of parametric and geographic space where accuracy is low, and in these regions, test and apply alternate methods repeating the process until the entire image is classified. Currently, classifiers are evaluated through human input using an expert-based system; therefore, this paper acts as the proof of concept for collaborative classifiers. Because we decompose the problem into smaller, more manageable sub-tasks, our classification exhibits increased flexibility compared to existing methods since classification methods are tailored to the idiosyncrasies of specific regions. A major benefit of our approach is its scalability and collaborative support since selected low-accuracy classifiers can be easily replaced with others without affecting classification accuracy in high accuracy areas. At each stage, we develop spatially explicit accuracy metrics that provide straightforward assessment of results by non-experts and point to areas that need algorithmic improvement or ancillary data. Our approach is demonstrated in the task of detecting impervious surface areas, an important indicator for human-induced alterations to the environment, using a 2001 Landsat scene from Las Vegas, Nevada. ?? 2009 American Society for Photogrammetry and Remote Sensing.
Feature selection for the classification of traced neurons.
López-Cabrera, José D; Lorenzo-Ginori, Juan V
2018-06-01
The great availability of computational tools to calculate the properties of traced neurons leads to the existence of many descriptors which allow the automated classification of neurons from these reconstructions. This situation determines the necessity to eliminate irrelevant features as well as making a selection of the most appropriate among them, in order to improve the quality of the classification obtained. The dataset used contains a total of 318 traced neurons, classified by human experts in 192 GABAergic interneurons and 126 pyramidal cells. The features were extracted by means of the L-measure software, which is one of the most used computational tools in neuroinformatics to quantify traced neurons. We review some current feature selection techniques as filter, wrapper, embedded and ensemble methods. The stability of the feature selection methods was measured. For the ensemble methods, several aggregation methods based on different metrics were applied to combine the subsets obtained during the feature selection process. The subsets obtained applying feature selection methods were evaluated using supervised classifiers, among which Random Forest, C4.5, SVM, Naïve Bayes, Knn, Decision Table and the Logistic classifier were used as classification algorithms. Feature selection methods of types filter, embedded, wrappers and ensembles were compared and the subsets returned were tested in classification tasks for different classification algorithms. L-measure features EucDistanceSD, PathDistanceSD, Branch_pathlengthAve, Branch_pathlengthSD and EucDistanceAve were present in more than 60% of the selected subsets which provides evidence about their importance in the classification of this neurons. Copyright © 2018 Elsevier B.V. All rights reserved.
A comprehensive simulation study on classification of RNA-Seq data.
Zararsız, Gökmen; Goksuluk, Dincer; Korkmaz, Selcuk; Eldem, Vahap; Zararsiz, Gozde Erturk; Duru, Izzet Parug; Ozturk, Ahmet
2017-01-01
RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging powerful method for diagnosis, disease classification and monitoring at molecular level, as well as providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (eg. microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data since they violate both data structure and distributional assumptions. However, it is possible to apply these algorithms with appropriate modifications to RNA-Seq data. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis and negative binomial linear discriminant analysis. Another way is to bring the data closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method on model performances. A comprehensive simulation study is conducted and the results are compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size, differential-expression rate and decreasing the dispersion parameter and number of groups lead to an increase in classification accuracy. Similar with differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion. We conclude that, as a count-based classifier, the power transformed PLDA and, as a microarray-based classifier, vst or rlog transformed RF and SVM classifiers may be a good choice for classification. An R/BIOCONDUCTOR package, MLSeq, is freely available at https://www.bioconductor.org/packages/release/bioc/html/MLSeq.html.
Quirós, Elia; Felicísimo, Angel M; Cuartero, Aurora
2009-01-01
This work proposes a new method to classify multi-spectral satellite images based on multivariate adaptive regression splines (MARS) and compares this classification system with the more common parallelepiped and maximum likelihood (ML) methods. We apply the classification methods to the land cover classification of a test zone located in southwestern Spain. The basis of the MARS method and its associated procedures are explained in detail, and the area under the ROC curve (AUC) is compared for the three methods. The results show that the MARS method provides better results than the parallelepiped method in all cases, and it provides better results than the maximum likelihood method in 13 cases out of 17. These results demonstrate that the MARS method can be used in isolation or in combination with other methods to improve the accuracy of soil cover classification. The improvement is statistically significant according to the Wilcoxon signed rank test.
NASA Astrophysics Data System (ADS)
Liu, Yansong; Monteiro, Sildomar T.; Saber, Eli
2015-10-01
Changes in vegetation cover, building construction, road network and traffic conditions caused by urban expansion affect the human habitat as well as the natural environment in rapidly developing cities. It is crucial to assess these changes and respond accordingly by identifying man-made and natural structures with accurate classification algorithms. With the increase in use of multi-sensor remote sensing systems, researchers are able to obtain a more complete description of the scene of interest. By utilizing multi-sensor data, the accuracy of classification algorithms can be improved. In this paper, we propose a method for combining 3D LiDAR point clouds and high-resolution color images to classify urban areas using Gaussian processes (GP). GP classification is a powerful non-parametric classification method that yields probabilistic classification results. It makes predictions in a way that addresses the uncertainty of real world. In this paper, we attempt to identify man-made and natural objects in urban areas including buildings, roads, trees, grass, water and vehicles. LiDAR features are derived from the 3D point clouds and the spatial and color features are extracted from RGB images. For classification, we use the Laplacian approximation for GP binary classification on the new combined feature space. The multiclass classification has been implemented by using one-vs-all binary classification strategy. The result of applying support vector machines (SVMs) and logistic regression (LR) classifier is also provided for comparison. Our experiments show a clear improvement of classification results by using the two sensors combined instead of each sensor separately. Also we found the advantage of applying GP approach to handle the uncertainty in classification result without compromising accuracy compared to SVM, which is considered as the state-of-the-art classification method.
Rana, Mohit; Prasad, Vinod A.; Guan, Cuntai; Birbaumer, Niels; Sitaram, Ranganatha
2016-01-01
Recently, studies have reported the use of Near Infrared Spectroscopy (NIRS) for developing Brain–Computer Interface (BCI) by applying online pattern classification of brain states from subject-specific fNIRS signals. The purpose of the present study was to develop and test a real-time method for subject-specific and subject-independent classification of multi-channel fNIRS signals using support-vector machines (SVM), so as to determine its feasibility as an online neurofeedback system. Towards this goal, we used left versus right hand movement execution and movement imagery as study paradigms in a series of experiments. In the first two experiments, activations in the motor cortex during movement execution and movement imagery were used to develop subject-dependent models that obtained high classification accuracies thereby indicating the robustness of our classification method. In the third experiment, a generalized classifier-model was developed from the first two experimental data, which was then applied for subject-independent neurofeedback training. Application of this method in new participants showed mean classification accuracy of 63% for movement imagery tasks and 80% for movement execution tasks. These results, and their corresponding offline analysis reported in this study demonstrate that SVM based real-time subject-independent classification of fNIRS signals is feasible. This method has important applications in the field of hemodynamic BCIs, and neuro-rehabilitation where patients can be trained to learn spatio-temporal patterns of healthy brain activity. PMID:27467528
Classification of burn wounds using support vector machines
NASA Astrophysics Data System (ADS)
Acha, Begona; Serrano, Carmen; Palencia, Sergio; Murillo, Juan Jose
2004-05-01
The purpose of this work is to improve a previous method developed by the authors for the classification of burn wounds into their depths. The inputs of the system are color and texture information, as these are the characteristics observed by physicians in order to give a diagnosis. Our previous work consisted in segmenting the burn wound from the rest of the image and classifying the burn into its depth. In this paper we focus on the classification problem only. We already proposed to use a Fuzzy-ARTMAP neural network (NN). However, we may take advantage of new powerful classification tools such as Support Vector Machines (SVM). We apply the five-folded cross validation scheme to divide the database into training and validating sets. Then, we apply a feature selection method for each classifier, which will give us the set of features that yields the smallest classification error for each classifier. Features used to classify are first-order statistical parameters extracted from the L*, u* and v* color components of the image. The feature selection algorithms used are the Sequential Forward Selection (SFS) and the Sequential Backward Selection (SBS) methods. As data of the problem faced here are not linearly separable, the SVM was trained using some different kernels. The validating process shows that the SVM method, when using a Gaussian kernel of variance 1, outperforms classification results obtained with the rest of the classifiers, yielding an error classification rate of 0.7% whereas the Fuzzy-ARTMAP NN attained 1.6 %.
Cascaded deep decision networks for classification of endoscopic images
NASA Astrophysics Data System (ADS)
Murthy, Venkatesh N.; Singh, Vivek; Sun, Shanhui; Bhattacharya, Subhabrata; Chen, Terrence; Comaniciu, Dorin
2017-02-01
Both traditional and wireless capsule endoscopes can generate tens of thousands of images for each patient. It is desirable to have the majority of irrelevant images filtered out by automatic algorithms during an offline review process or to have automatic indication for highly suspicious areas during an online guidance. This also applies to the newly invented endomicroscopy, where online indication of tumor classification plays a significant role. Image classification is a standard pattern recognition problem and is well studied in the literature. However, performance on the challenging endoscopic images still has room for improvement. In this paper, we present a novel Cascaded Deep Decision Network (CDDN) to improve image classification performance over standard Deep neural network based methods. During the learning phase, CDDN automatically builds a network which discards samples that are classified with high confidence scores by a previously trained network and concentrates only on the challenging samples which would be handled by the subsequent expert shallow networks. We validate CDDN using two different types of endoscopic imaging, which includes a polyp classification dataset and a tumor classification dataset. From both datasets we show that CDDN can outperform other methods by about 10%. In addition, CDDN can also be applied to other image classification problems.
Gadermayr, M.; Liedlgruber, M.; Uhl, A.; Vécsei, A.
2013-01-01
Due to the optics used in endoscopes, a typical degradation observed in endoscopic images are barrel-type distortions. In this work we investigate the impact of methods used to correct such distortions in images on the classification accuracy in the context of automated celiac disease classification. For this purpose we compare various different distortion correction methods and apply them to endoscopic images, which are subsequently classified. Since the interpolation used in such methods is also assumed to have an influence on the resulting classification accuracies, we also investigate different interpolation methods and their impact on the classification performance. In order to be able to make solid statements about the benefit of distortion correction we use various different feature extraction methods used to obtain features for the classification. Our experiments show that it is not possible to make a clear statement about the usefulness of distortion correction methods in the context of an automated diagnosis of celiac disease. This is mainly due to the fact that an eventual benefit of distortion correction highly depends on the feature extraction method used for the classification. PMID:23981585
Jiang, Hao; Zhao, Dehua; Cai, Ying; An, Shuqing
2012-01-01
In previous attempts to identify aquatic vegetation from remotely-sensed images using classification trees (CT), the images used to apply CT models to different times or locations necessarily originated from the same satellite sensor as that from which the original images used in model development came, greatly limiting the application of CT. We have developed an effective normalization method to improve the robustness of CT models when applied to images originating from different sensors and dates. A total of 965 ground-truth samples of aquatic vegetation types were obtained in 2009 and 2010 in Taihu Lake, China. Using relevant spectral indices (SI) as classifiers, we manually developed a stable CT model structure and then applied a standard CT algorithm to obtain quantitative (optimal) thresholds from 2009 ground-truth data and images from Landsat7-ETM+, HJ-1B-CCD, Landsat5-TM and ALOS-AVNIR-2 sensors. Optimal CT thresholds produced average classification accuracies of 78.1%, 84.7% and 74.0% for emergent vegetation, floating-leaf vegetation and submerged vegetation, respectively. However, the optimal CT thresholds for different sensor images differed from each other, with an average relative variation (RV) of 6.40%. We developed and evaluated three new approaches to normalizing the images. The best-performing method (Method of 0.1% index scaling) normalized the SI images using tailored percentages of extreme pixel values. Using the images normalized by Method of 0.1% index scaling, CT models for a particular sensor in which thresholds were replaced by those from the models developed for images originating from other sensors provided average classification accuracies of 76.0%, 82.8% and 68.9% for emergent vegetation, floating-leaf vegetation and submerged vegetation, respectively. Applying the CT models developed for normalized 2009 images to 2010 images resulted in high classification (78.0%–93.3%) and overall (92.0%–93.1%) accuracies. Our results suggest that Method of 0.1% index scaling provides a feasible way to apply CT models directly to images from sensors or time periods that differ from those of the images used to develop the original models.
Semi-Supervised Marginal Fisher Analysis for Hyperspectral Image Classification
NASA Astrophysics Data System (ADS)
Huang, H.; Liu, J.; Pan, Y.
2012-07-01
The problem of learning with both labeled and unlabeled examples arises frequently in Hyperspectral image (HSI) classification. While marginal Fisher analysis is a supervised method, which cannot be directly applied for Semi-supervised classification. In this paper, we proposed a novel method, called semi-supervised marginal Fisher analysis (SSMFA), to process HSI of natural scenes, which uses a combination of semi-supervised learning and manifold learning. In SSMFA, a new difference-based optimization objective function with unlabeled samples has been designed. SSMFA preserves the manifold structure of labeled and unlabeled samples in addition to separating labeled samples in different classes from each other. The semi-supervised method has an analytic form of the globally optimal solution, and it can be computed based on eigen decomposition. Classification experiments with a challenging HSI task demonstrate that this method outperforms current state-of-the-art HSI-classification methods.
NASA Astrophysics Data System (ADS)
Chen, Fulong; Wang, Chao; Yang, Chengyun; Zhang, Hong; Wu, Fan; Lin, Wenjuan; Zhang, Bo
2008-11-01
This paper proposed a method that uses a case-based classification of remote sensing images and applied this method to abstract the information of suspected illegal land use in urban areas. Because of the discrete cases for imagery classification, the proposed method dealt with the oscillation of spectrum or backscatter within the same land use category, and it not only overcame the deficiency of maximum likelihood classification (the prior probability of land use could not be obtained) but also inherited the advantages of the knowledge-based classification system, such as artificial intelligence and automatic characteristics. Consequently, the proposed method could do the classifying better. Then the researchers used the object-oriented technique for shadow removal in highly dense city zones. With multi-temporal SPOT 5 images whose resolution was 2.5×2.5 meters, the researchers found that the method can abstract suspected illegal land use information in urban areas using post-classification comparison technique.
Classification Techniques for Digital Map Compression
1989-03-01
classification improved the performance of the K-means classification algorithm resulting in a compression of 8.06:1 with Lempel - Ziv coding. Run-length coding... compression performance are run-length coding [2], [8] and Lempel - Ziv coding 110], [11]. These techniques are chosen because they are most efficient when...investigated. After the classification, some standard file compression methods, such as Lempel - Ziv and run-length encoding were applied to the
Mallants, Dirk; Batelaan, Okke; Gedeon, Matej; Huysmans, Marijke; Dassargues, Alain
2017-01-01
Cone penetration testing (CPT) is one of the most efficient and versatile methods currently available for geotechnical, lithostratigraphic and hydrogeological site characterization. Currently available methods for soil behaviour type classification (SBT) of CPT data however have severe limitations, often restricting their application to a local scale. For parameterization of regional groundwater flow or geotechnical models, and delineation of regional hydro- or lithostratigraphy, regional SBT classification would be very useful. This paper investigates the use of model-based clustering for SBT classification, and the influence of different clustering approaches on the properties and spatial distribution of the obtained soil classes. We additionally propose a methodology for automated lithostratigraphic mapping of regionally occurring sedimentary units using SBT classification. The methodology is applied to a large CPT dataset, covering a groundwater basin of ~60 km2 with predominantly unconsolidated sandy sediments in northern Belgium. Results show that the model-based approach is superior in detecting the true lithological classes when compared to more frequently applied unsupervised classification approaches or literature classification diagrams. We demonstrate that automated mapping of lithostratigraphic units using advanced SBT classification techniques can provide a large gain in efficiency, compared to more time-consuming manual approaches and yields at least equally accurate results. PMID:28467468
Rogiers, Bart; Mallants, Dirk; Batelaan, Okke; Gedeon, Matej; Huysmans, Marijke; Dassargues, Alain
2017-01-01
Cone penetration testing (CPT) is one of the most efficient and versatile methods currently available for geotechnical, lithostratigraphic and hydrogeological site characterization. Currently available methods for soil behaviour type classification (SBT) of CPT data however have severe limitations, often restricting their application to a local scale. For parameterization of regional groundwater flow or geotechnical models, and delineation of regional hydro- or lithostratigraphy, regional SBT classification would be very useful. This paper investigates the use of model-based clustering for SBT classification, and the influence of different clustering approaches on the properties and spatial distribution of the obtained soil classes. We additionally propose a methodology for automated lithostratigraphic mapping of regionally occurring sedimentary units using SBT classification. The methodology is applied to a large CPT dataset, covering a groundwater basin of ~60 km2 with predominantly unconsolidated sandy sediments in northern Belgium. Results show that the model-based approach is superior in detecting the true lithological classes when compared to more frequently applied unsupervised classification approaches or literature classification diagrams. We demonstrate that automated mapping of lithostratigraphic units using advanced SBT classification techniques can provide a large gain in efficiency, compared to more time-consuming manual approaches and yields at least equally accurate results.
NASA Astrophysics Data System (ADS)
Esteban, Pere; Beck, Christoph; Philipp, Andreas
2010-05-01
Using data associated with accidents or damages caused by snow avalanches over the eastern Pyrenees (Andorra and Catalonia) several atmospheric circulation type catalogues have been obtained. For this purpose, different circulation type classification methods based on Principal Component Analysis (T-mode and S-mode using the extreme scores) and on optimization procedures (Improved K-means and SANDRA) were applied . Considering the characteristics of the phenomena studied, not only single day circulation patterns were taken into account but also sequences of circulation types of varying length. Thus different classifications with different numbers of types and for different sequence lengths were obtained using the different classification methods. Simple between type variability, within type variability, and outlier detection procedures have been applied for selecting the best result concerning snow avalanches type classifications. Furthermore, days without occurrence of the hazards were also related to the avalanche centroids using pattern-correlations, facilitating the calculation of the anomalies between hazardous and no hazardous days, and also frequencies of occurrence of hazardous events for each circulation type. Finally, the catalogues statistically considered the best results are evaluated using the avalanche forecaster expert knowledge. Consistent explanation of snow avalanches occurrence by means of circulation sequences is obtained, but always considering results from classifications with different sequence length. This work has been developed in the framework of the COST Action 733 (Harmonisation and Applications of Weather Type Classifications for European regions).
Support vector machine based classification of fast Fourier transform spectroscopy of proteins
NASA Astrophysics Data System (ADS)
Lazarevic, Aleksandar; Pokrajac, Dragoljub; Marcano, Aristides; Melikechi, Noureddine
2009-02-01
Fast Fourier transform spectroscopy has proved to be a powerful method for study of the secondary structure of proteins since peak positions and their relative amplitude are affected by the number of hydrogen bridges that sustain this secondary structure. However, to our best knowledge, the method has not been used yet for identification of proteins within a complex matrix like a blood sample. The principal reason is the apparent similarity of protein infrared spectra with actual differences usually masked by the solvent contribution and other interactions. In this paper, we propose a novel machine learning based method that uses protein spectra for classification and identification of such proteins within a given sample. The proposed method uses principal component analysis (PCA) to identify most important linear combinations of original spectral components and then employs support vector machine (SVM) classification model applied on such identified combinations to categorize proteins into one of given groups. Our experiments have been performed on the set of four different proteins, namely: Bovine Serum Albumin, Leptin, Insulin-like Growth Factor 2 and Osteopontin. Our proposed method of applying principal component analysis along with support vector machines exhibits excellent classification accuracy when identifying proteins using their infrared spectra.
Classification of skin cancer images using local binary pattern and SVM classifier
NASA Astrophysics Data System (ADS)
Adjed, Faouzi; Faye, Ibrahima; Ababsa, Fakhreddine; Gardezi, Syed Jamal; Dass, Sarat Chandra
2016-11-01
In this paper, a classification method for melanoma and non-melanoma skin cancer images has been presented using the local binary patterns (LBP). The LBP computes the local texture information from the skin cancer images, which is later used to compute some statistical features that have capability to discriminate the melanoma and non-melanoma skin tissues. Support vector machine (SVM) is applied on the feature matrix for classification into two skin image classes (malignant and benign). The method achieves good classification accuracy of 76.1% with sensitivity of 75.6% and specificity of 76.7%.
Jiménez-Carvelo, Ana M; González-Casado, Antonio; Pérez-Castaño, Estefanía; Cuadros-Rodríguez, Luis
2017-03-01
A new analytical method for the differentiation of olive oil from other vegetable oils using reversed-phase LC and applying chemometric techniques was developed. A 3 cm short column was used to obtain the chromatographic fingerprint of the methyl-transesterified fraction of each vegetable oil. The chromatographic analysis took only 4 min. The multivariate classification methods used were k-nearest neighbors, partial least-squares (PLS) discriminant analysis, one-class PLS, support vector machine classification, and soft independent modeling of class analogies. The discrimination of olive oil from other vegetable edible oils was evaluated by several classification quality metrics. Several strategies for the classification of the olive oil were used: one input-class, two input-class, and pseudo two input-class.
Vidaurre, D.; Rodríguez, E. E.; Bielza, C.; Larrañaga, P.; Rudomin, P.
2012-01-01
In the spinal cord of the anesthetized cat, spontaneous cord dorsum potentials (CDPs) appear synchronously along the lumbo-sacral segments. These CDPs have different shapes and magnitudes. Previous work has indicated that some CDPs appear to be specially associated with the activation of spinal pathways that lead to primary afferent depolarization and presynaptic inhibition. Visual detection and classification of these CDPs provides relevant information on the functional organization of the neural networks involved in the control of sensory information and allows the characterization of the changes produced by acute nerve and spinal lesions. We now present a novel feature extraction approach for signal classification, applied to CDP detection. The method is based on an intuitive procedure. We first remove by convolution the noise from the CDPs recorded in each given spinal segment. Then, we assign a coefficient for each main local maximum of the signal using its amplitude and distance to the most important maximum of the signal. These coefficients will be the input for the subsequent classification algorithm. In particular, we employ gradient boosting classification trees. This combination of approaches allows a faster and more accurate discrimination of CDPs than is obtained by other methods. PMID:22929924
Vidaurre, D; Rodríguez, E E; Bielza, C; Larrañaga, P; Rudomin, P
2012-10-01
In the spinal cord of the anesthetized cat, spontaneous cord dorsum potentials (CDPs) appear synchronously along the lumbo-sacral segments. These CDPs have different shapes and magnitudes. Previous work has indicated that some CDPs appear to be specially associated with the activation of spinal pathways that lead to primary afferent depolarization and presynaptic inhibition. Visual detection and classification of these CDPs provides relevant information on the functional organization of the neural networks involved in the control of sensory information and allows the characterization of the changes produced by acute nerve and spinal lesions. We now present a novel feature extraction approach for signal classification, applied to CDP detection. The method is based on an intuitive procedure. We first remove by convolution the noise from the CDPs recorded in each given spinal segment. Then, we assign a coefficient for each main local maximum of the signal using its amplitude and distance to the most important maximum of the signal. These coefficients will be the input for the subsequent classification algorithm. In particular, we employ gradient boosting classification trees. This combination of approaches allows a faster and more accurate discrimination of CDPs than is obtained by other methods.
Protein classification based on text document classification techniques.
Cheng, Betty Yee Man; Carbonell, Jaime G; Klein-Seetharaman, Judith
2005-03-01
The need for accurate, automated protein classification methods continues to increase as advances in biotechnology uncover new proteins. G-protein coupled receptors (GPCRs) are a particularly difficult superfamily of proteins to classify due to extreme diversity among its members. Previous comparisons of BLAST, k-nearest neighbor (k-NN), hidden markov model (HMM) and support vector machine (SVM) using alignment-based features have suggested that classifiers at the complexity of SVM are needed to attain high accuracy. Here, analogous to document classification, we applied Decision Tree and Naive Bayes classifiers with chi-square feature selection on counts of n-grams (i.e. short peptide sequences of length n) to this classification task. Using the GPCR dataset and evaluation protocol from the previous study, the Naive Bayes classifier attained an accuracy of 93.0 and 92.4% in level I and level II subfamily classification respectively, while SVM has a reported accuracy of 88.4 and 86.3%. This is a 39.7 and 44.5% reduction in residual error for level I and level II subfamily classification, respectively. The Decision Tree, while inferior to SVM, outperforms HMM in both level I and level II subfamily classification. For those GPCR families whose profiles are stored in the Protein FAMilies database of alignments and HMMs (PFAM), our method performs comparably to a search against those profiles. Finally, our method can be generalized to other protein families by applying it to the superfamily of nuclear receptors with 94.5, 97.8 and 93.6% accuracy in family, level I and level II subfamily classification respectively. Copyright 2005 Wiley-Liss, Inc.
NASA Astrophysics Data System (ADS)
Zagouras, Athanassios; Argiriou, Athanassios A.; Flocas, Helena A.; Economou, George; Fotopoulos, Spiros
2012-11-01
Classification of weather maps at various isobaric levels as a methodological tool is used in several problems related to meteorology, climatology, atmospheric pollution and to other fields for many years. Initially the classification was performed manually. The criteria used by the person performing the classification are features of isobars or isopleths of geopotential height, depending on the type of maps to be classified. Although manual classifications integrate the perceptual experience and other unquantifiable qualities of the meteorology specialists involved, these are typically subjective and time consuming. Furthermore, during the last years different approaches of automated methods for atmospheric circulation classification have been proposed, which present automated and so-called objective classifications. In this paper a new method of atmospheric circulation classification of isobaric maps is presented. The method is based on graph theory. It starts with an intelligent prototype selection using an over-partitioning mode of fuzzy c-means (FCM) algorithm, proceeds to a graph formulation for the entire dataset and produces the clusters based on the contemporary dominant sets clustering method. Graph theory is a novel mathematical approach, allowing a more efficient representation of spatially correlated data, compared to the classical Euclidian space representation approaches, used in conventional classification methods. The method has been applied to the classification of 850 hPa atmospheric circulation over the Eastern Mediterranean. The evaluation of the automated methods is performed by statistical indexes; results indicate that the classification is adequately comparable with other state-of-the-art automated map classification methods, for a variable number of clusters.
Griffiths, Jason I.; Fronhofer, Emanuel A.; Garnier, Aurélie; Seymour, Mathew; Altermatt, Florian; Petchey, Owen L.
2017-01-01
The development of video-based monitoring methods allows for rapid, dynamic and accurate monitoring of individuals or communities, compared to slower traditional methods, with far reaching ecological and evolutionary applications. Large amounts of data are generated using video-based methods, which can be effectively processed using machine learning (ML) algorithms into meaningful ecological information. ML uses user defined classes (e.g. species), derived from a subset (i.e. training data) of video-observed quantitative features (e.g. phenotypic variation), to infer classes in subsequent observations. However, phenotypic variation often changes due to environmental conditions, which may lead to poor classification, if environmentally induced variation in phenotypes is not accounted for. Here we describe a framework for classifying species under changing environmental conditions based on the random forest classification. A sliding window approach was developed that restricts temporal and environmentally conditions to improve the classification. We tested our approach by applying the classification framework to experimental data. The experiment used a set of six ciliate species to monitor changes in community structure and behavior over hundreds of generations, in dozens of species combinations and across a temperature gradient. Differences in biotic and abiotic conditions caused simplistic classification approaches to be unsuccessful. In contrast, the sliding window approach allowed classification to be highly successful, as phenotypic differences driven by environmental change, could be captured by the classifier. Importantly, classification using the random forest algorithm showed comparable success when validated against traditional, slower, manual identification. Our framework allows for reliable classification in dynamic environments, and may help to improve strategies for long-term monitoring of species in changing environments. Our classification pipeline can be applied in fields assessing species community dynamics, such as eco-toxicology, ecology and evolutionary ecology. PMID:28472193
Traffic sign classification with dataset augmentation and convolutional neural network
NASA Astrophysics Data System (ADS)
Tang, Qing; Kurnianggoro, Laksono; Jo, Kang-Hyun
2018-04-01
This paper presents a method for traffic sign classification using a convolutional neural network (CNN). In this method, firstly we transfer a color image into grayscale, and then normalize it in the range (-1,1) as the preprocessing step. To increase robustness of classification model, we apply a dataset augmentation algorithm and create new images to train the model. To avoid overfitting, we utilize a dropout module before the last fully connection layer. To assess the performance of the proposed method, the German traffic sign recognition benchmark (GTSRB) dataset is utilized. Experimental results show that the method is effective in classifying traffic signs.
Gu, Yao; Ni, Yongnian; Kokot, Serge
2012-09-13
A novel, simple and direct fluorescence method for analysis of complex substances and their potential substitutes has been researched and developed. Measurements involved excitation and emission (EEM) fluorescence spectra of powdered, complex, medicinal herbs, Cortex Phellodendri Chinensis (CPC) and the similar Cortex Phellodendri Amurensis (CPA); these substances were compared and discriminated from each other and the potentially adulterated samples (Caulis mahoniae (CM) and David poplar bark (DPB)). Different chemometrics methods were applied for resolution of the complex spectra, and the excitation spectra were found to be the most informative; only the rank-ordering PROMETHEE method was able to classify the samples with single ingredients (CPA, CPC, CM) or those with binary mixtures (CPA/CPC, CPA/CM, CPC/CM). Interestingly, it was essential to use the geometrical analysis for interactive aid (GAIA) display for a full understanding of the classification results. However, these two methods, like the other chemometrics models, were unable to classify composite spectral matrices consisting of data from samples of single ingredients and binary mixtures; this suggested that the excitation spectra of the different samples were very similar. However, the method is useful for classification of single-ingredient samples and, separately, their binary mixtures; it may also be applied for similar classification work with other complex substances.
NASA Astrophysics Data System (ADS)
Bukreeva, Ekaterina B.; Bulanova, Anna A.; Kistenev, Yury V.; Kuzmin, Dmitry A.; Tuzikov, Sergei A.; Yumov, Evgeny L.
2014-11-01
The results of the joint use of laser photoacoustic spectroscopy and chemometrics methods in gas analysis of exhaled air of patients with respiratory diseases (chronic obstructive pulmonary disease, pneumonia and lung cancer) are presented. The absorption spectra of exhaled breath of all volunteers were measured, the classification methods of the scans of the absorption spectra were applied, the sensitivity/specificity of the classification results were determined. It were obtained a result of nosological in pairs classification for all investigated volunteers, indices of sensitivity and specificity.
Meta-learning framework applied in bioinformatics inference system design.
Arredondo, Tomás; Ormazábal, Wladimir
2015-01-01
This paper describes a meta-learner inference system development framework which is applied and tested in the implementation of bioinformatic inference systems. These inference systems are used for the systematic classification of the best candidates for inclusion in bacterial metabolic pathway maps. This meta-learner-based approach utilises a workflow where the user provides feedback with final classification decisions which are stored in conjunction with analysed genetic sequences for periodic inference system training. The inference systems were trained and tested with three different data sets related to the bacterial degradation of aromatic compounds. The analysis of the meta-learner-based framework involved contrasting several different optimisation methods with various different parameters. The obtained inference systems were also contrasted with other standard classification methods with accurate prediction capabilities observed.
Zhang, Chi; Zhang, Ge; Chen, Ke-ji; Lu, Ai-ping
2016-04-01
The development of an effective classification method for human health conditions is essential for precise diagnosis and delivery of tailored therapy to individuals. Contemporary classification of disease systems has properties that limit its information content and usability. Chinese medicine pattern classification has been incorporated with disease classification, and this integrated classification method became more precise because of the increased understanding of the molecular mechanisms. However, we are still facing the complexity of diseases and patterns in the classification of health conditions. With continuing advances in omics methodologies and instrumentation, we are proposing a new classification approach: molecular module classification, which is applying molecular modules to classifying human health status. The initiative would be precisely defining the health status, providing accurate diagnoses, optimizing the therapeutics and improving new drug discovery strategy. Therefore, there would be no current disease diagnosis, no disease pattern classification, and in the future, a new medicine based on this classification, molecular module medicine, could redefine health statuses and reshape the clinical practice.
NASA Astrophysics Data System (ADS)
Fusco, Terence; Bi, Yaxin; Nugent, Chris; Wu, Shengli
2016-08-01
We can see that the data imputation approach using the Regression CTA has performed more favourably when compared with the alternative methods on this dataset. We now have the evidence to show that this method is viable moving forward with further research in this area. The weighted distribution experiments have provided us with a more balanced and appropriate ratio for snail density classification purposes when using either the 3 or 5 category combination. The most desirable results are found when using 3 categories of SD with the weighted distribution of classes being 20-60-20. This information reflects the optimum classification accuracy across the data range and can be applied to any novel environment feature dataset pertaining to Schistosomiasis vector classification. ITSVM has provided us with a method of labelling SD data which we can use for classification with epidemic disease prediction research. The confidence level selection enables consistent labelling accuracy for bespoke requirements when classifying the data from each year. The SMOTE Equilibrium proposed method has yielded a slight increase with each multiple of synthetic instances that are compounded to the training dataset. The reduction of overfitting and increase of data instances has shown a gradual classification accuracy increase across the data for each year. We will now test to see what the optimum synthetic instance incremental increase is across our data and apply this to our experiments with this research.
ERIC Educational Resources Information Center
Liu, Boquan; Polce, Evan; Sprott, Julien C.; Jiang, Jack J.
2018-01-01
Purpose: The purpose of this study is to introduce a chaos level test to evaluate linear and nonlinear voice type classification method performances under varying signal chaos conditions without subjective impression. Study Design: Voice signals were constructed with differing degrees of noise to model signal chaos. Within each noise power, 100…
Quantitative computed tomography applied to interstitial lung diseases.
Obert, Martin; Kampschulte, Marian; Limburg, Rebekka; Barańczuk, Stefan; Krombach, Gabriele A
2018-03-01
To evaluate a new image marker that retrieves information from computed tomography (CT) density histograms, with respect to classification properties between different lung parenchyma groups. Furthermore, to conduct a comparison of the new image marker with conventional markers. Density histograms from 220 different subjects (normal = 71; emphysema = 73; fibrotic = 76) were used to compare the conventionally applied emphysema index (EI), 15 th percentile value (PV), mean value (MV), variance (V), skewness (S), kurtosis (K), with a new histogram's functional shape (HFS) method. Multinomial logistic regression (MLR) analyses was performed to calculate predictions of different lung parenchyma group membership using the individual methods, as well as combinations thereof, as covariates. Overall correct assigned subjects (OCA), sensitivity (sens), specificity (spec), and Nagelkerke's pseudo R 2 (NR 2 ) effect size were estimated. NR 2 was used to set up a ranking list of the different methods. MLR indicates the highest classification power (OCA of 92%; sens 0.95; spec 0.89; NR 2 0.95) when all histogram analyses methods were applied together in the MLR. Highest classification power among individually applied methods was found using the HFS concept (OCA 86%; sens 0.93; spec 0.79; NR 2 0.80). Conventional methods achieved lower classification potential on their own: EI (OCA 69%; sens 0.95; spec 0.26; NR 2 0.52); PV (OCA 69%; sens 0.90; spec 0.37; NR 2 0.57); MV (OCA 65%; sens 0.71; spec 0.58; NR 2 0.61); V (OCA 66%; sens 0.72; spec 0.53; NR 2 0.66); S (OCA 65%; sens 0.88; spec 0.26; NR 2 0.55); and K (OCA 63%; sens 0.90; spec 0.16; NR 2 0.48). The HFS method, which was so far applied to a CT bone density curve analysis, is also a remarkable information extraction tool for lung density histograms. Presumably, being a principle mathematical approach, the HFS method can extract valuable health related information also from histograms from complete different areas. Copyright © 2018 Elsevier B.V. All rights reserved.
Identification of an Efficient Gene Expression Panel for Glioblastoma Classification
Zelaya, Ivette; Laks, Dan R.; Zhao, Yining; Kawaguchi, Riki; Gao, Fuying; Kornblum, Harley I.; Coppola, Giovanni
2016-01-01
We present here a novel genetic algorithm-based random forest (GARF) modeling technique that enables a reduction in the complexity of large gene disease signatures to highly accurate, greatly simplified gene panels. When applied to 803 glioblastoma multiforme samples, this method allowed the 840-gene Verhaak et al. gene panel (the standard in the field) to be reduced to a 48-gene classifier, while retaining 90.91% classification accuracy, and outperforming the best available alternative methods. Additionally, using this approach we produced a 32-gene panel which allows for better consistency between RNA-seq and microarray-based classifications, improving cross-platform classification retention from 69.67% to 86.07%. A webpage producing these classifications is available at http://simplegbm.semel.ucla.edu. PMID:27855170
Classification of weld defect based on information fusion technology for radiographic testing system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jiang, Hongquan; Liang, Zeming, E-mail: heavenlzm@126.com; Gao, Jianmin
Improving the efficiency and accuracy of weld defect classification is an important technical problem in developing the radiographic testing system. This paper proposes a novel weld defect classification method based on information fusion technology, Dempster–Shafer evidence theory. First, to characterize weld defects and improve the accuracy of their classification, 11 weld defect features were defined based on the sub-pixel level edges of radiographic images, four of which are presented for the first time in this paper. Second, we applied information fusion technology to combine different features for weld defect classification, including a mass function defined based on the weld defectmore » feature information and the quartile-method-based calculation of standard weld defect class which is to solve a sample problem involving a limited number of training samples. A steam turbine weld defect classification case study is also presented herein to illustrate our technique. The results show that the proposed method can increase the correct classification rate with limited training samples and address the uncertainties associated with weld defect classification.« less
Jiang, Hongquan; Liang, Zeming; Gao, Jianmin; Dang, Changying
2016-03-01
Improving the efficiency and accuracy of weld defect classification is an important technical problem in developing the radiographic testing system. This paper proposes a novel weld defect classification method based on information fusion technology, Dempster-Shafer evidence theory. First, to characterize weld defects and improve the accuracy of their classification, 11 weld defect features were defined based on the sub-pixel level edges of radiographic images, four of which are presented for the first time in this paper. Second, we applied information fusion technology to combine different features for weld defect classification, including a mass function defined based on the weld defect feature information and the quartile-method-based calculation of standard weld defect class which is to solve a sample problem involving a limited number of training samples. A steam turbine weld defect classification case study is also presented herein to illustrate our technique. The results show that the proposed method can increase the correct classification rate with limited training samples and address the uncertainties associated with weld defect classification.
Wang, Yi; Xiang, Ma; Wen, Ya-Dong; Yu, Chun-Xia; Wang, Luo-Ping; Zhao, Long-Lian; Li, Jun-Hui
2012-11-01
In this study, tobacco quality analysis of main Industrial classification of different years was carried out applying spectrum projection and correlation methods. The group of data was near-infrared (NIR) spectrum from Hongta Tobacco (Group) Co., Ltd. 5730 tobacco leaf Industrial classification samples from Yuxi in Yunnan province from 2007 to 2010 year were collected using near infrared spectroscopy, which from different parts and colors and all belong to tobacco varieties of HONGDA. The conclusion showed that, when the samples were divided to two part by the ratio of 2:1 randomly as analysis and verification sets in the same year, the verification set corresponded with the analysis set applying spectrum projection because their correlation coefficients were above 0.98. The correlation coefficients between two different years applying spectrum projection were above 0.97. The highest correlation coefficient was the one between 2008 and 2009 year and the lowest correlation coefficient was the one between 2007 and 2010 year. At the same time, The study discussed a method to get the quantitative similarity values of different industrial classification samples. The similarity and consistency values were instructive in combination and replacement of tobacco leaf blending.
Genetic algorithm for the optimization of features and neural networks in ECG signals classification
NASA Astrophysics Data System (ADS)
Li, Hongqiang; Yuan, Danyang; Ma, Xiangdong; Cui, Dianyin; Cao, Lu
2017-01-01
Feature extraction and classification of electrocardiogram (ECG) signals are necessary for the automatic diagnosis of cardiac diseases. In this study, a novel method based on genetic algorithm-back propagation neural network (GA-BPNN) for classifying ECG signals with feature extraction using wavelet packet decomposition (WPD) is proposed. WPD combined with the statistical method is utilized to extract the effective features of ECG signals. The statistical features of the wavelet packet coefficients are calculated as the feature sets. GA is employed to decrease the dimensions of the feature sets and to optimize the weights and biases of the back propagation neural network (BPNN). Thereafter, the optimized BPNN classifier is applied to classify six types of ECG signals. In addition, an experimental platform is constructed for ECG signal acquisition to supply the ECG data for verifying the effectiveness of the proposed method. The GA-BPNN method with the MIT-BIH arrhythmia database achieved a dimension reduction of nearly 50% and produced good classification results with an accuracy of 97.78%. The experimental results based on the established acquisition platform indicated that the GA-BPNN method achieved a high classification accuracy of 99.33% and could be efficiently applied in the automatic identification of cardiac arrhythmias.
Controlling protected designation of origin of wine by Raman spectroscopy.
Mandrile, Luisa; Zeppa, Giuseppe; Giovannozzi, Andrea Mario; Rossi, Andrea Mario
2016-11-15
In this paper, a Fourier Transform Raman spectroscopy method, to authenticate the provenience of wine, for food traceability applications was developed. In particular, due to the specific chemical fingerprint of the Raman spectrum, it was possible to discriminate different wines produced in the Piedmont area (North West Italy) in accordance with i) grape varieties, ii) production area and iii) ageing time. In order to create a consistent training set, more than 300 samples from tens of different producers were analyzed, and a chemometric treatment of raw spectra was applied. A discriminant analysis method was employed in the classification procedures, providing a classification capability (percentage of correct answers) of 90% for validation of grape analysis and geographical area provenance, and a classification capability of 84% for ageing time classification. The present methodology was applied successfully to raw materials without any preliminary treatment of the sample, providing a response in a very short time. Copyright © 2016 Elsevier Ltd. All rights reserved.
Comparative analysis of image classification methods for automatic diagnosis of ophthalmic images
NASA Astrophysics Data System (ADS)
Wang, Liming; Zhang, Kai; Liu, Xiyang; Long, Erping; Jiang, Jiewei; An, Yingying; Zhang, Jia; Liu, Zhenzhen; Lin, Zhuoling; Li, Xiaoyan; Chen, Jingjing; Cao, Qianzhong; Li, Jing; Wu, Xiaohang; Wang, Dongni; Li, Wangting; Lin, Haotian
2017-01-01
There are many image classification methods, but it remains unclear which methods are most helpful for analyzing and intelligently identifying ophthalmic images. We select representative slit-lamp images which show the complexity of ocular images as research material to compare image classification algorithms for diagnosing ophthalmic diseases. To facilitate this study, some feature extraction algorithms and classifiers are combined to automatic diagnose pediatric cataract with same dataset and then their performance are compared using multiple criteria. This comparative study reveals the general characteristics of the existing methods for automatic identification of ophthalmic images and provides new insights into the strengths and shortcomings of these methods. The relevant methods (local binary pattern +SVMs, wavelet transformation +SVMs) which achieve an average accuracy of 87% and can be adopted in specific situations to aid doctors in preliminarily disease screening. Furthermore, some methods requiring fewer computational resources and less time could be applied in remote places or mobile devices to assist individuals in understanding the condition of their body. In addition, it would be helpful to accelerate the development of innovative approaches and to apply these methods to assist doctors in diagnosing ophthalmic disease.
A modified method for MRF segmentation and bias correction of MR image with intensity inhomogeneity.
Xie, Mei; Gao, Jingjing; Zhu, Chongjin; Zhou, Yan
2015-01-01
Markov random field (MRF) model is an effective method for brain tissue classification, which has been applied in MR image segmentation for decades. However, it falls short of the expected classification in MR images with intensity inhomogeneity for the bias field is not considered in the formulation. In this paper, we propose an interleaved method joining a modified MRF classification and bias field estimation in an energy minimization framework, whose initial estimation is based on k-means algorithm in view of prior information on MRI. The proposed method has a salient advantage of overcoming the misclassifications from the non-interleaved MRF classification for the MR image with intensity inhomogeneity. In contrast to other baseline methods, experimental results also have demonstrated the effectiveness and advantages of our algorithm via its applications in the real and the synthetic MR images.
Some sequential, distribution-free pattern classification procedures with applications
NASA Technical Reports Server (NTRS)
Poage, J. L.
1971-01-01
Some sequential, distribution-free pattern classification techniques are presented. The decision problem to which the proposed classification methods are applied is that of discriminating between two kinds of electroencephalogram responses recorded from a human subject: spontaneous EEG and EEG driven by a stroboscopic light stimulus at the alpha frequency. The classification procedures proposed make use of the theory of order statistics. Estimates of the probabilities of misclassification are given. The procedures were tested on Gaussian samples and the EEG responses.
Höller, Yvonne; Bergmann, Jürgen; Thomschewski, Aljoscha; Kronbichler, Martin; Höller, Peter; Crone, Julia S.; Schmid, Elisabeth V.; Butz, Kevin; Nardone, Raffaele; Trinka, Eugen
2013-01-01
Current research aims at identifying voluntary brain activation in patients who are behaviorally diagnosed as being unconscious, but are able to perform commands by modulating their brain activity patterns. This involves machine learning techniques and feature extraction methods such as applied in brain computer interfaces. In this study, we try to answer the question if features/classification methods which show advantages in healthy participants are also accurate when applied to data of patients with disorders of consciousness. A sample of healthy participants (N = 22), patients in a minimally conscious state (MCS; N = 5), and with unresponsive wakefulness syndrome (UWS; N = 9) was examined with a motor imagery task which involved imagery of moving both hands and an instruction to hold both hands firm. We extracted a set of 20 features from the electroencephalogram and used linear discriminant analysis, k-nearest neighbor classification, and support vector machines (SVM) as classification methods. In healthy participants, the best classification accuracies were seen with coherences (mean = .79; range = .53−.94) and power spectra (mean = .69; range = .40−.85). The coherence patterns in healthy participants did not match the expectation of central modulated -rhythm. Instead, coherence involved mainly frontal regions. In healthy participants, the best classification tool was SVM. Five patients had at least one feature-classifier outcome with p0.05 (none of which were coherence or power spectra), though none remained significant after false-discovery rate correction for multiple comparisons. The present work suggests the use of coherences in patients with disorders of consciousness because they show high reliability among healthy subjects and patient groups. However, feature extraction and classification is a challenging task in unresponsive patients because there is no ground truth to validate the results. PMID:24282545
A Cluster-then-label Semi-supervised Learning Approach for Pathology Image Classification.
Peikari, Mohammad; Salama, Sherine; Nofech-Mozes, Sharon; Martel, Anne L
2018-05-08
Completely labeled pathology datasets are often challenging and time-consuming to obtain. Semi-supervised learning (SSL) methods are able to learn from fewer labeled data points with the help of a large number of unlabeled data points. In this paper, we investigated the possibility of using clustering analysis to identify the underlying structure of the data space for SSL. A cluster-then-label method was proposed to identify high-density regions in the data space which were then used to help a supervised SVM in finding the decision boundary. We have compared our method with other supervised and semi-supervised state-of-the-art techniques using two different classification tasks applied to breast pathology datasets. We found that compared with other state-of-the-art supervised and semi-supervised methods, our SSL method is able to improve classification performance when a limited number of labeled data instances are made available. We also showed that it is important to examine the underlying distribution of the data space before applying SSL techniques to ensure semi-supervised learning assumptions are not violated by the data.
Dasgupta, Nilanjan; Carin, Lawrence
2005-04-01
Time-reversal imaging (TRI) is analogous to matched-field processing, although TRI is typically very wideband and is appropriate for subsequent target classification (in addition to localization). Time-reversal techniques, as applied to acoustic target classification, are highly sensitive to channel mismatch. Hence, it is crucial to estimate the channel parameters before time-reversal imaging is performed. The channel-parameter statistics are estimated here by applying a geoacoustic inversion technique based on Gibbs sampling. The maximum a posteriori (MAP) estimate of the channel parameters are then used to perform time-reversal imaging. Time-reversal implementation requires a fast forward model, implemented here by a normal-mode framework. In addition to imaging, extraction of features from the time-reversed images is explored, with these applied to subsequent target classification. The classification of time-reversed signatures is performed by the relevance vector machine (RVM). The efficacy of the technique is analyzed on simulated in-channel data generated by a free-field finite element method (FEM) code, in conjunction with a channel propagation model, wherein the final classification performance is demonstrated to be relatively insensitive to the associated channel parameters. The underlying theory of Gibbs sampling and TRI are presented along with the feature extraction and target classification via the RVM.
Natural fracture systems on planetary surfaces: Genetic classification and pattern randomness
NASA Technical Reports Server (NTRS)
Rossbacher, Lisa A.
1987-01-01
One method for classifying natural fracture systems is by fracture genesis. This approach involves the physics of the formation process, and it has been used most frequently in attempts to predict subsurface fractures and petroleum reservoir productivity. This classification system can also be applied to larger fracture systems on any planetary surface. One problem in applying this classification system to planetary surfaces is that it was developed for ralatively small-scale fractures that would influence porosity, particularly as observed in a core sample. Planetary studies also require consideration of large-scale fractures. Nevertheless, this system offers some valuable perspectives on fracture systems of any size.
Molecular cancer classification using a meta-sample-based regularized robust coding method.
Wang, Shu-Lin; Sun, Liuchao; Fang, Jianwen
2014-01-01
Previous studies have demonstrated that machine learning based molecular cancer classification using gene expression profiling (GEP) data is promising for the clinic diagnosis and treatment of cancer. Novel classification methods with high efficiency and prediction accuracy are still needed to deal with high dimensionality and small sample size of typical GEP data. Recently the sparse representation (SR) method has been successfully applied to the cancer classification. Nevertheless, its efficiency needs to be improved when analyzing large-scale GEP data. In this paper we present the meta-sample-based regularized robust coding classification (MRRCC), a novel effective cancer classification technique that combines the idea of meta-sample-based cluster method with regularized robust coding (RRC) method. It assumes that the coding residual and the coding coefficient are respectively independent and identically distributed. Similar to meta-sample-based SR classification (MSRC), MRRCC extracts a set of meta-samples from the training samples, and then encodes a testing sample as the sparse linear combination of these meta-samples. The representation fidelity is measured by the l2-norm or l1-norm of the coding residual. Extensive experiments on publicly available GEP datasets demonstrate that the proposed method is more efficient while its prediction accuracy is equivalent to existing MSRC-based methods and better than other state-of-the-art dimension reduction based methods.
NASA Astrophysics Data System (ADS)
Ma, L.; Zhou, M.; Li, C.
2017-09-01
In this study, a Random Forest (RF) based land covers classification method is presented to predict the types of land covers in Miyun area. The returned full-waveforms which were acquired by a LiteMapper 5600 airborne LiDAR system were processed, including waveform filtering, waveform decomposition and features extraction. The commonly used features that were distance, intensity, Full Width at Half Maximum (FWHM), skewness and kurtosis were extracted. These waveform features were used as attributes of training data for generating the RF prediction model. The RF prediction model was applied to predict the types of land covers in Miyun area as trees, buildings, farmland and ground. The classification results of these four types of land covers were obtained according to the ground truth information acquired from CCD image data of the same region. The RF classification results were compared with that of SVM method and show better results. The RF classification accuracy reached 89.73% and the classification Kappa was 0.8631.
Wu, Zi Yi; Xie, Ping; Sang, Yan Fang; Gu, Hai Ting
2018-04-01
The phenomenon of jump is one of the importantly external forms of hydrological variabi-lity under environmental changes, representing the adaption of hydrological nonlinear systems to the influence of external disturbances. Presently, the related studies mainly focus on the methods for identifying the jump positions and jump times in hydrological time series. In contrast, few studies have focused on the quantitative description and classification of jump degree in hydrological time series, which make it difficult to understand the environmental changes and evaluate its potential impacts. Here, we proposed a theatrically reliable and easy-to-apply method for the classification of jump degree in hydrological time series, using the correlation coefficient as a basic index. The statistical tests verified the accuracy, reasonability, and applicability of this method. The relationship between the correlation coefficient and the jump degree of series were described using mathematical equation by derivation. After that, several thresholds of correlation coefficients under different statistical significance levels were chosen, based on which the jump degree could be classified into five levels: no, weak, moderate, strong and very strong. Finally, our method was applied to five diffe-rent observed hydrological time series, with diverse geographic and hydrological conditions in China. The results of the classification of jump degrees in those series were closely accorded with their physically hydrological mechanisms, indicating the practicability of our method.
Xu, Xiayu; Ding, Wenxiang; Abràmoff, Michael D; Cao, Ruofan
2017-04-01
Retinal artery and vein classification is an important task for the automatic computer-aided diagnosis of various eye diseases and systemic diseases. This paper presents an improved supervised artery and vein classification method in retinal image. Intra-image regularization and inter-subject normalization is applied to reduce the differences in feature space. Novel features, including first-order and second-order texture features, are utilized to capture the discriminating characteristics of arteries and veins. The proposed method was tested on the DRIVE dataset and achieved an overall accuracy of 0.923. This retinal artery and vein classification algorithm serves as a potentially important tool for the early diagnosis of various diseases, including diabetic retinopathy and cardiovascular diseases. Copyright © 2017 Elsevier B.V. All rights reserved.
Systematization method for distinguishing plastic groups by using NIR spectroscopy.
Kaihara, Mikio; Satoh, Minami; Satoh, Minoru
2007-07-01
A systematic classification method for polymers is not yet available in case of using near infrared spectra (NIR). That is why we have been searching for a systematic method. Because raw NIR spectra usually have few obvious peaks, NIR spectra have been pretreated by 2nd derivation for taking well modulated spectra. After the pretreatment, we applied classification and regression trees (CART) to the discrimination between the spectra and the species of polymers. As a result, we obtained a relatively simple classification tree. Judging from the obtained splitting conditions and the classified polymers, we concluded that obtained knowledge on the chemical function groups estimated by the important wavelength regions is not always applicable to this classification tree. However, we clarified the splitting rules for polymer species from the NIR spectral point of view.
Wu, Shang-Lin; Liao, Lun-De; Lu, Shao-Wei; Jiang, Wei-Ling; Chen, Shi-An; Lin, Chin-Teng
2013-08-01
Electrooculography (EOG) signals can be used to control human-computer interface (HCI) systems, if properly classified. The ability to measure and process these signals may help HCI users to overcome many of the physical limitations and inconveniences in daily life. However, there are currently no effective multidirectional classification methods for monitoring eye movements. Here, we describe a classification method used in a wireless EOG-based HCI device for detecting eye movements in eight directions. This device includes wireless EOG signal acquisition components, wet electrodes and an EOG signal classification algorithm. The EOG classification algorithm is based on extracting features from the electrical signals corresponding to eight directions of eye movement (up, down, left, right, up-left, down-left, up-right, and down-right) and blinking. The recognition and processing of these eight different features were achieved in real-life conditions, demonstrating that this device can reliably measure the features of EOG signals. This system and its classification procedure provide an effective method for identifying eye movements. Additionally, it may be applied to study eye functions in real-life conditions in the near future.
NASA Astrophysics Data System (ADS)
Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryojiro; Kanematsu, Masayuki; Fujita, Hiroshi
2012-03-01
We aim at using a new texton based texture classification method in the classification of pulmonary emphysema in computed tomography (CT) images of the lungs. Different from conventional computer-aided diagnosis (CAD) pulmonary emphysema classification methods, in this paper, firstly, the dictionary of texton is learned via applying sparse representation(SR) to image patches in the training dataset. Then the SR coefficients of the test images over the dictionary are used to construct the histograms for texture presentations. Finally, classification is performed by using a nearest neighbor classifier with a histogram dissimilarity measure as distance. The proposed approach is tested on 3840 annotated regions of interest consisting of normal tissue and mild, moderate and severe pulmonary emphysema of three subtypes. The performance of the proposed system, with an accuracy of about 88%, is comparably higher than state of the art method based on the basic rotation invariant local binary pattern histograms and the texture classification method based on texton learning by k-means, which performs almost the best among other approaches in the literature.
Adaptive phase k-means algorithm for waveform classification
NASA Astrophysics Data System (ADS)
Song, Chengyun; Liu, Zhining; Wang, Yaojun; Xu, Feng; Li, Xingming; Hu, Guangmin
2018-01-01
Waveform classification is a powerful technique for seismic facies analysis that describes the heterogeneity and compartments within a reservoir. Horizon interpretation is a critical step in waveform classification. However, the horizon often produces inconsistent waveform phase, and thus results in an unsatisfied classification. To alleviate this problem, an adaptive phase waveform classification method called the adaptive phase k-means is introduced in this paper. Our method improves the traditional k-means algorithm using an adaptive phase distance for waveform similarity measure. The proposed distance is a measure with variable phases as it moves from sample to sample along the traces. Model traces are also updated with the best phase interference in the iterative process. Therefore, our method is robust to phase variations caused by the interpretation horizon. We tested the effectiveness of our algorithm by applying it to synthetic and real data. The satisfactory results reveal that the proposed method tolerates certain waveform phase variation and is a good tool for seismic facies analysis.
Isentropic Bulk Modulus: Development of a Federal Test Method
2016-01-01
ranging from 30-80 °C and applied pressures of 1,000-18,000 psi. This method has been applied successfully to aviation turbine fuels and diesel fuels...FFP), aviation fuel, diesel fuel, 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF ABSTRACT 18. NUMBER OF PAGES 19a. NAME OF RESPONSIBLE...pressures of 1,000-18,000 psi. This method has been applied successfully to aviation turbine fuels and diesel fuels composed of petroleum, synthetic
Rantalainen, Timo; Chivers, Paola; Beck, Belinda R; Robertson, Sam; Hart, Nicolas H; Nimphius, Sophia; Weeks, Benjamin K; McIntyre, Fleur; Hands, Beth; Siafarikas, Aris
Most imaging methods, including peripheral quantitative computed tomography (pQCT), are susceptible to motion artifacts particularly in fidgety pediatric populations. Methods currently used to address motion artifact include manual screening (visual inspection) and objective assessments of the scans. However, previously reported objective methods either cannot be applied on the reconstructed image or have not been tested for distal bone sites. Therefore, the purpose of the present study was to develop and validate motion artifact classifiers to quantify motion artifact in pQCT scans. Whether textural features could provide adequate motion artifact classification performance in 2 adolescent datasets with pQCT scans from tibial and radial diaphyses and epiphyses was tested. The first dataset was split into training (66% of sample) and validation (33% of sample) datasets. Visual classification was used as the ground truth. Moderate to substantial classification performance (J48 classifier, kappa coefficients from 0.57 to 0.80) was observed in the validation dataset with the novel texture-based classifier. In applying the same classifier to the second cross-sectional dataset, a slight-to-fair (κ = 0.01-0.39) classification performance was observed. Overall, this novel textural analysis-based classifier provided a moderate-to-substantial classification of motion artifact when the classifier was specifically trained for the measurement device and population. Classification based on textural features may be used to prescreen obviously acceptable and unacceptable scans, with a subsequent human-operated visual classification of any remaining scans. Copyright © 2017 The International Society for Clinical Densitometry. Published by Elsevier Inc. All rights reserved.
Mode of Action (MOA) Assignment Classifications for Ecotoxicology: Evaluation of Available Methods
There are various structure-based classification schemes to categorize chemicals based on mode of action (MOA) which have been applied for both eco and human toxicology. With increasing calls to assess 1000s of chemicals, some of which have little available information other tha...
Remote sensing imagery classification using multi-objective gravitational search algorithm
NASA Astrophysics Data System (ADS)
Zhang, Aizhu; Sun, Genyun; Wang, Zhenjie
2016-10-01
Simultaneous optimization of different validity measures can capture different data characteristics of remote sensing imagery (RSI) and thereby achieving high quality classification results. In this paper, two conflicting cluster validity indices, the Xie-Beni (XB) index and the fuzzy C-means (FCM) (Jm) measure, are integrated with a diversity-enhanced and memory-based multi-objective gravitational search algorithm (DMMOGSA) to present a novel multi-objective optimization based RSI classification method. In this method, the Gabor filter method is firstly implemented to extract texture features of RSI. Then, the texture features are syncretized with the spectral features to construct the spatial-spectral feature space/set of the RSI. Afterwards, cluster of the spectral-spatial feature set is carried out on the basis of the proposed method. To be specific, cluster centers are randomly generated initially. After that, the cluster centers are updated and optimized adaptively by employing the DMMOGSA. Accordingly, a set of non-dominated cluster centers are obtained. Therefore, numbers of image classification results of RSI are produced and users can pick up the most promising one according to their problem requirements. To quantitatively and qualitatively validate the effectiveness of the proposed method, the proposed classification method was applied to classifier two aerial high-resolution remote sensing imageries. The obtained classification results are compared with that produced by two single cluster validity index based and two state-of-the-art multi-objective optimization algorithms based classification results. Comparison results show that the proposed method can achieve more accurate RSI classification.
Caprihan, A; Pearlson, G D; Calhoun, V D
2008-08-15
Principal component analysis (PCA) is often used to reduce the dimension of data before applying more sophisticated data analysis methods such as non-linear classification algorithms or independent component analysis. This practice is based on selecting components corresponding to the largest eigenvalues. If the ultimate goal is separation of data in two groups, then these set of components need not have the most discriminatory power. We measured the distance between two such populations using Mahalanobis distance and chose the eigenvectors to maximize it, a modified PCA method, which we call the discriminant PCA (DPCA). DPCA was applied to diffusion tensor-based fractional anisotropy images to distinguish age-matched schizophrenia subjects from healthy controls. The performance of the proposed method was evaluated by the one-leave-out method. We show that for this fractional anisotropy data set, the classification error with 60 components was close to the minimum error and that the Mahalanobis distance was twice as large with DPCA, than with PCA. Finally, by masking the discriminant function with the white matter tracts of the Johns Hopkins University atlas, we identified left superior longitudinal fasciculus as the tract which gave the least classification error. In addition, with six optimally chosen tracts the classification error was zero.
Classification of neocortical interneurons using affinity propagation.
Santana, Roberto; McGarry, Laura M; Bielza, Concha; Larrañaga, Pedro; Yuste, Rafael
2013-01-01
In spite of over a century of research on cortical circuits, it is still unknown how many classes of cortical neurons exist. In fact, neuronal classification is a difficult problem because it is unclear how to designate a neuronal cell class and what are the best characteristics to define them. Recently, unsupervised classifications using cluster analysis based on morphological, physiological, or molecular characteristics, have provided quantitative and unbiased identification of distinct neuronal subtypes, when applied to selected datasets. However, better and more robust classification methods are needed for increasingly complex and larger datasets. Here, we explored the use of affinity propagation, a recently developed unsupervised classification algorithm imported from machine learning, which gives a representative example or exemplar for each cluster. As a case study, we applied affinity propagation to a test dataset of 337 interneurons belonging to four subtypes, previously identified based on morphological and physiological characteristics. We found that affinity propagation correctly classified most of the neurons in a blind, non-supervised manner. Affinity propagation outperformed Ward's method, a current standard clustering approach, in classifying the neurons into 4 subtypes. Affinity propagation could therefore be used in future studies to validly classify neurons, as a first step to help reverse engineer neural circuits.
NASA Astrophysics Data System (ADS)
Äijälä, Mikko; Heikkinen, Liine; Fröhlich, Roman; Canonaco, Francesco; Prévôt, André S. H.; Junninen, Heikki; Petäjä, Tuukka; Kulmala, Markku; Worsnop, Douglas; Ehn, Mikael
2017-03-01
Mass spectrometric measurements commonly yield data on hundreds of variables over thousands of points in time. Refining and synthesizing this raw data into chemical information necessitates the use of advanced, statistics-based data analytical techniques. In the field of analytical aerosol chemistry, statistical, dimensionality reductive methods have become widespread in the last decade, yet comparable advanced chemometric techniques for data classification and identification remain marginal. Here we present an example of combining data dimensionality reduction (factorization) with exploratory classification (clustering), and show that the results cannot only reproduce and corroborate earlier findings, but also complement and broaden our current perspectives on aerosol chemical classification. We find that applying positive matrix factorization to extract spectral characteristics of the organic component of air pollution plumes, together with an unsupervised clustering algorithm, k-means+ + , for classification, reproduces classical organic aerosol speciation schemes. Applying appropriately chosen metrics for spectral dissimilarity along with optimized data weighting, the source-specific pollution characteristics can be statistically resolved even for spectrally very similar aerosol types, such as different combustion-related anthropogenic aerosol species and atmospheric aerosols with similar degree of oxidation. In addition to the typical oxidation level and source-driven aerosol classification, we were also able to classify and characterize outlier groups that would likely be disregarded in a more conventional analysis. Evaluating solution quality for the classification also provides means to assess the performance of mass spectral similarity metrics and optimize weighting for mass spectral variables. This facilitates algorithm-based evaluation of aerosol spectra, which may prove invaluable for future development of automatic methods for spectra identification and classification. Robust, statistics-based results and data visualizations also provide important clues to a human analyst on the existence and chemical interpretation of data structures. Applying these methods to a test set of data, aerosol mass spectrometric data of organic aerosol from a boreal forest site, yielded five to seven different recurring pollution types from various sources, including traffic, cooking, biomass burning and nearby sawmills. Additionally, three distinct, minor pollution types were discovered and identified as amine-dominated aerosols.
Dimensional Representation and Gradient Boosting for Seismic Event Classification
NASA Astrophysics Data System (ADS)
Semmelmayer, F. C.; Kappedal, R. D.; Magana-Zook, S. A.
2017-12-01
In this research, we conducted experiments of representational structures on 5009 seismic signals with the intent of finding a method to classify signals as either an explosion or an earthquake in an automated fashion. We also applied a gradient boosted classifier. While perfect classification was not attained (approximately 88% was our best model), some cases demonstrate that many events can be filtered out as very high probability being explosions or earthquakes, diminishing subject-matter experts'(SME) workload for first stage analysis. It is our hope that these methods can be refined, further increasing the classification probability.
Pashaei, Elnaz; Ozen, Mustafa; Aydin, Nizamettin
2015-08-01
Improving accuracy of supervised classification algorithms in biomedical applications is one of active area of research. In this study, we improve the performance of Particle Swarm Optimization (PSO) combined with C4.5 decision tree (PSO+C4.5) classifier by applying Boosted C5.0 decision tree as the fitness function. To evaluate the effectiveness of our proposed method, it is implemented on 1 microarray dataset and 5 different medical data sets obtained from UCI machine learning databases. Moreover, the results of PSO + Boosted C5.0 implementation are compared to eight well-known benchmark classification methods (PSO+C4.5, support vector machine under the kernel of Radial Basis Function, Classification And Regression Tree (CART), C4.5 decision tree, C5.0 decision tree, Boosted C5.0 decision tree, Naive Bayes and Weighted K-Nearest neighbor). Repeated five-fold cross-validation method was used to justify the performance of classifiers. Experimental results show that our proposed method not only improve the performance of PSO+C4.5 but also obtains higher classification accuracy compared to the other classification methods.
Cloud cover determination in polar regions from satellite imagery
NASA Technical Reports Server (NTRS)
Barry, R. G.; Maslanik, J. A.; Key, J. R.
1987-01-01
A definition is undertaken of the spectral and spatial characteristics of clouds and surface conditions in the polar regions, and to the creation of calibrated, geometrically correct data sets suitable for quantitative analysis. Ways are explored in which this information can be applied to cloud classifications as new methods or as extensions to existing classification schemes. A methodology is developed that uses automated techniques to merge Advanced Very High Resolution Radiometer (AVHRR) and Scanning Multichannel Microwave Radiometer (SMMR) data, and to apply first-order calibration and zenith angle corrections to the AVHRR imagery. Cloud cover and surface types are manually interpreted, and manual methods are used to define relatively pure training areas to describe the textural and multispectral characteristics of clouds over several surface conditions. The effects of viewing angle and bidirectional reflectance differences are studied for several classes, and the effectiveness of some key components of existing classification schemes is tested.
Matthews, M E; Waldvogel, C F; Mahaffey, M J; Zemel, P C
1978-06-01
Preparation procedures of standardized quantity formulas were analyzed for similarities and differences in production activities, and three entrée classifications were developed, based on these activities. Two formulas from each classification were selected, preparation procedures were divided into elements of production, and the MSD Quantity Food Production Code was applied. Macro elements not included in the existing Code were simulated, coded, assigned associated Time Measurement Units, and added to the MSD Quantity Food Production Code. Repeated occurrence of similar elements within production methods indicated that macro elements could be synthesized for use within one or more entrée classifications. Basic elements were grouped, simulated, and macro elements were derived. Macro elements were applied in the simulated production of 100 portions of each entrée formula. Total production time for each formula and average production time for each entrée classification were calculated. Application of macro elements indicated that this method of predetermining production time was feasible and could be adapted by quantity foodservice managers as a decision technique used to evaluate menu mix, production personnel schedules, and allocation of equipment usage. These macro elements could serve as a basis for further development and refinement of other macro elements which could be applied to a variety of menu item formulas.
Abnormality detection of mammograms by discriminative dictionary learning on DSIFT descriptors.
Tavakoli, Nasrin; Karimi, Maryam; Nejati, Mansour; Karimi, Nader; Reza Soroushmehr, S M; Samavi, Shadrokh; Najarian, Kayvan
2017-07-01
Detection and classification of breast lesions using mammographic images are one of the most difficult studies in medical image processing. A number of learning and non-learning methods have been proposed for detecting and classifying these lesions. However, the accuracy of the detection/classification still needs improvement. In this paper we propose a powerful classification method based on sparse learning to diagnose breast cancer in mammograms. For this purpose, a supervised discriminative dictionary learning approach is applied on dense scale invariant feature transform (DSIFT) features. A linear classifier is also simultaneously learned with the dictionary which can effectively classify the sparse representations. Our experimental results show the superior performance of our method compared to existing approaches.
Yang, Xiaofeng; Wu, Shengyong; Sechopoulos, Ioannis; Fei, Baowei
2012-01-01
Purpose: To develop and test an automated algorithm to classify the different tissues present in dedicated breast CT images. Methods: The original CT images are first corrected to overcome cupping artifacts, and then a multiscale bilateral filter is used to reduce noise while keeping edge information on the images. As skin and glandular tissues have similar CT values on breast CT images, morphologic processing is used to identify the skin mask based on its position information. A modified fuzzy C-means (FCM) classification method is then used to classify breast tissue as fat and glandular tissue. By combining the results of the skin mask with the FCM, the breast tissue is classified as skin, fat, and glandular tissue. To evaluate the authors’ classification method, the authors use Dice overlap ratios to compare the results of the automated classification to those obtained by manual segmentation on eight patient images. Results: The correction method was able to correct the cupping artifacts and improve the quality of the breast CT images. For glandular tissue, the overlap ratios between the authors’ automatic classification and manual segmentation were 91.6% ± 2.0%. Conclusions: A cupping artifact correction method and an automatic classification method were applied and evaluated for high-resolution dedicated breast CT images. Breast tissue classification can provide quantitative measurements regarding breast composition, density, and tissue distribution. PMID:23039675
NASA Technical Reports Server (NTRS)
Sabol, Donald E., Jr.; Roberts, Dar A.; Adams, John B.; Smith, Milton O.
1993-01-01
An important application of remote sensing is to map and monitor changes over large areas of the land surface. This is particularly significant with the current interest in monitoring vegetation communities. Most of traditional methods for mapping different types of plant communities are based upon statistical classification techniques (i.e., parallel piped, nearest-neighbor, etc.) applied to uncalibrated multispectral data. Classes from these techniques are typically difficult to interpret (particularly to a field ecologist/botanist). Also, classes derived for one image can be very different from those derived from another image of the same area, making interpretation of observed temporal changes nearly impossible. More recently, neural networks have been applied to classification. Neural network classification, based upon spectral matching, is weak in dealing with spectral mixtures (a condition prevalent in images of natural surfaces). Another approach to mapping vegetation communities is based on spectral mixture analysis, which can provide a consistent framework for image interpretation. Roberts et al. (1990) mapped vegetation using the band residuals from a simple mixing model (the same spectral endmembers applied to all image pixels). Sabol et al. (1992b) and Roberts et al. (1992) used different methods to apply the most appropriate spectral endmembers to each image pixel, thereby allowing mapping of vegetation based upon the the different endmember spectra. In this paper, we describe a new approach to classification of vegetation communities based upon the spectra fractions derived from spectral mixture analysis. This approach was applied to three 1992 AVIRIS images of Jasper Ridge, California to observe seasonal changes in surface composition.
NASA Astrophysics Data System (ADS)
Setiyoko, A.; Dharma, I. G. W. S.; Haryanto, T.
2017-01-01
Multispectral data and hyperspectral data acquired from satellite sensor have the ability in detecting various objects on the earth ranging from low scale to high scale modeling. These data are increasingly being used to produce geospatial information for rapid analysis by running feature extraction or classification process. Applying the most suited model for this data mining is still challenging because there are issues regarding accuracy and computational cost. This research aim is to develop a better understanding regarding object feature extraction and classification applied for satellite image by systematically reviewing related recent research projects. A method used in this research is based on PRISMA statement. After deriving important points from trusted sources, pixel based and texture-based feature extraction techniques are promising technique to be analyzed more in recent development of feature extraction and classification.
Classifier dependent feature preprocessing methods
NASA Astrophysics Data System (ADS)
Rodriguez, Benjamin M., II; Peterson, Gilbert L.
2008-04-01
In mobile applications, computational complexity is an issue that limits sophisticated algorithms from being implemented on these devices. This paper provides an initial solution to applying pattern recognition systems on mobile devices by combining existing preprocessing algorithms for recognition. In pattern recognition systems, it is essential to properly apply feature preprocessing tools prior to training classification models in an attempt to reduce computational complexity and improve the overall classification accuracy. The feature preprocessing tools extended for the mobile environment are feature ranking, feature extraction, data preparation and outlier removal. Most desktop systems today are capable of processing a majority of the available classification algorithms without concern of processing while the same is not true on mobile platforms. As an application of pattern recognition for mobile devices, the recognition system targets the problem of steganalysis, determining if an image contains hidden information. The measure of performance shows that feature preprocessing increases the overall steganalysis classification accuracy by an average of 22%. The methods in this paper are tested on a workstation and a Nokia 6620 (Symbian operating system) camera phone with similar results.
NASA Astrophysics Data System (ADS)
Wan, Xiaoqing; Zhao, Chunhui; Gao, Bing
2017-11-01
The integration of an edge-preserving filtering technique in the classification of a hyperspectral image (HSI) has been proven effective in enhancing classification performance. This paper proposes an ensemble strategy for HSI classification using an edge-preserving filter along with a deep learning model and edge detection. First, an adaptive guided filter is applied to the original HSI to reduce the noise in degraded images and to extract powerful spectral-spatial features. Second, the extracted features are fed as input to a stacked sparse autoencoder to adaptively exploit more invariant and deep feature representations; then, a random forest classifier is applied to fine-tune the entire pretrained network and determine the classification output. Third, a Prewitt compass operator is further performed on the HSI to extract the edges of the first principal component after dimension reduction. Moreover, the regional growth rule is applied to the resulting edge logical image to determine the local region for each unlabeled pixel. Finally, the categories of the corresponding neighborhood samples are determined in the original classification map; then, the major voting mechanism is implemented to generate the final output. Extensive experiments proved that the proposed method achieves competitive performance compared with several traditional approaches.
Link prediction boosted psychiatry disorder classification for functional connectivity network
NASA Astrophysics Data System (ADS)
Li, Weiwei; Mei, Xue; Wang, Hao; Zhou, Yu; Huang, Jiashuang
2017-02-01
Functional connectivity network (FCN) is an effective tool in psychiatry disorders classification, and represents cross-correlation of the regional blood oxygenation level dependent signal. However, FCN is often incomplete for suffering from missing and spurious edges. To accurate classify psychiatry disorders and health control with the incomplete FCN, we first `repair' the FCN with link prediction, and then exact the clustering coefficients as features to build a weak classifier for every FCN. Finally, we apply a boosting algorithm to combine these weak classifiers for improving classification accuracy. Our method tested by three datasets of psychiatry disorder, including Alzheimer's Disease, Schizophrenia and Attention Deficit Hyperactivity Disorder. The experimental results show our method not only significantly improves the classification accuracy, but also efficiently reconstructs the incomplete FCN.
Neural net applied to anthropological material: a methodical study on the human nasal skeleton.
Prescher, Andreas; Meyers, Anne; Gerf von Keyserlingk, Diedrich
2005-07-01
A new information processing method, an artificial neural net, was applied to characterise the variability of anthropological features of the human nasal skeleton. The aim was to find different types of nasal skeletons. A neural net with 15*15 nodes was trained by 17 standard anthropological parameters taken from 184 skulls of the Aachen collection. The trained neural net delivers its classification in a two-dimensional map. Different types of noses were locally separated within the map. Rare and frequent types may be distinguished after one passage of the complete collection through the net. Statistical descriptive analysis, hierarchical cluster analysis, and discriminant analysis were applied to the same data set. These parallel applications allowed comparison of the new approach to the more traditional ones. In general the classification by the neural net is in correspondence with cluster analysis and discriminant analysis. However, it goes beyond these classifications because of the possibility of differentiating the types in multi-dimensional dependencies. Furthermore, places in the map are kept blank for intermediate forms, which may be theoretically expected, but were not included in the training set. In conclusion, the application of a neural network is a suitable method for investigating large collections of biological material. The gained classification may be helpful in anatomy and anthropology as well as in forensic medicine. It may be used to characterise the peculiarity of a whole set as well as to find particular cases within the set.
NASA Astrophysics Data System (ADS)
Liu, Tao; Abd-Elrahman, Amr
2018-05-01
Deep convolutional neural network (DCNN) requires massive training datasets to trigger its image classification power, while collecting training samples for remote sensing application is usually an expensive process. When DCNN is simply implemented with traditional object-based image analysis (OBIA) for classification of Unmanned Aerial systems (UAS) orthoimage, its power may be undermined if the number training samples is relatively small. This research aims to develop a novel OBIA classification approach that can take advantage of DCNN by enriching the training dataset automatically using multi-view data. Specifically, this study introduces a Multi-View Object-based classification using Deep convolutional neural network (MODe) method to process UAS images for land cover classification. MODe conducts the classification on multi-view UAS images instead of directly on the orthoimage, and gets the final results via a voting procedure. 10-fold cross validation results show the mean overall classification accuracy increasing substantially from 65.32%, when DCNN was applied on the orthoimage to 82.08% achieved when MODe was implemented. This study also compared the performances of the support vector machine (SVM) and random forest (RF) classifiers with DCNN under traditional OBIA and the proposed multi-view OBIA frameworks. The results indicate that the advantage of DCNN over traditional classifiers in terms of accuracy is more obvious when these classifiers were applied with the proposed multi-view OBIA framework than when these classifiers were applied within the traditional OBIA framework.
SAR-based change detection using hypothesis testing and Markov random field modelling
NASA Astrophysics Data System (ADS)
Cao, W.; Martinis, S.
2015-04-01
The objective of this study is to automatically detect changed areas caused by natural disasters from bi-temporal co-registered and calibrated TerraSAR-X data. The technique in this paper consists of two steps: Firstly, an automatic coarse detection step is applied based on a statistical hypothesis test for initializing the classification. The original analytical formula as proposed in the constant false alarm rate (CFAR) edge detector is reviewed and rewritten in a compact form of the incomplete beta function, which is a builtin routine in commercial scientific software such as MATLAB and IDL. Secondly, a post-classification step is introduced to optimize the noisy classification result in the previous step. Generally, an optimization problem can be formulated as a Markov random field (MRF) on which the quality of a classification is measured by an energy function. The optimal classification based on the MRF is related to the lowest energy value. Previous studies provide methods for the optimization problem using MRFs, such as the iterated conditional modes (ICM) algorithm. Recently, a novel algorithm was presented based on graph-cut theory. This method transforms a MRF to an equivalent graph and solves the optimization problem by a max-flow/min-cut algorithm on the graph. In this study this graph-cut algorithm is applied iteratively to improve the coarse classification. At each iteration the parameters of the energy function for the current classification are set by the logarithmic probability density function (PDF). The relevant parameters are estimated by the method of logarithmic cumulants (MoLC). Experiments are performed using two flood events in Germany and Australia in 2011 and a forest fire on La Palma in 2009 using pre- and post-event TerraSAR-X data. The results show convincing coarse classifications and considerable improvement by the graph-cut post-classification step.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Möller, A.; Ruhlmann-Kleider, V.; Leloup, C.
In the era of large astronomical surveys, photometric classification of supernovae (SNe) has become an important research field due to limited spectroscopic resources for candidate follow-up and classification. In this work, we present a method to photometrically classify type Ia supernovae based on machine learning with redshifts that are derived from the SN light-curves. This method is implemented on real data from the SNLS deferred pipeline, a purely photometric pipeline that identifies SNe Ia at high-redshifts (0.2 < z < 1.1). Our method consists of two stages: feature extraction (obtaining the SN redshift from photometry and estimating light-curve shape parameters)more » and machine learning classification. We study the performance of different algorithms such as Random Forest and Boosted Decision Trees. We evaluate the performance using SN simulations and real data from the first 3 years of the Supernova Legacy Survey (SNLS), which contains large spectroscopically and photometrically classified type Ia samples. Using the Area Under the Curve (AUC) metric, where perfect classification is given by 1, we find that our best-performing classifier (Extreme Gradient Boosting Decision Tree) has an AUC of 0.98.We show that it is possible to obtain a large photometrically selected type Ia SN sample with an estimated contamination of less than 5%. When applied to data from the first three years of SNLS, we obtain 529 events. We investigate the differences between classifying simulated SNe, and real SN survey data. In particular, we find that applying a thorough set of selection cuts to the SN sample is essential for good classification. This work demonstrates for the first time the feasibility of machine learning classification in a high- z SN survey with application to real SN data.« less
Martínez-Domingo, Miguel Ángel; Valero, Eva M; Hernández-Andrés, Javier; Tominaga, Shoji; Horiuchi, Takahiko; Hirai, Keita
2017-11-27
We propose a method for the capture of high dynamic range (HDR), multispectral (MS), polarimetric (Pol) images of indoor scenes using a liquid crystal tunable filter (LCTF). We have included the adaptive exposure estimation (AEE) method to fully automatize the capturing process. We also propose a pre-processing method which can be applied for the registration of HDR images after they are already built as the result of combining different low dynamic range (LDR) images. This method is applied to ensure a correct alignment of the different polarization HDR images for each spectral band. We have focused our efforts in two main applications: object segmentation and classification into metal and dielectric classes. We have simplified the segmentation using mean shift combined with cluster averaging and region merging techniques. We compare the performance of our segmentation with that of Ncut and Watershed methods. For the classification task, we propose to use information not only in the highlight regions but also in their surrounding area, extracted from the degree of linear polarization (DoLP) maps. We present experimental results which proof that the proposed image processing pipeline outperforms previous techniques developed specifically for MSHDRPol image cubes.
Gao, Xiang; Lin, Huaiying; Revanna, Kashi; Dong, Qunfeng
2017-05-10
Species-level classification for 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification, or their classification results are unreliable. The unreliable results are due to the limitations in the existing methods which either lack solid probabilistic-based criteria to evaluate the confidence of their taxonomic assignments, or use nucleotide k-mer frequency as the proxy for sequence similarity measurement. We have developed a method that shows significantly improved species-level classification results over existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species to the phylum levels based on the lowest common ancestors of multiple database hits for each query sequence, and further classification reliabilities are evaluated by bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based upon the degree of sequence similarity of the database hit to the query sequence. Our method does not need any training datasets specific for different taxonomic groups. Instead only a reference database is required for aligning to the query sequences, making our method easily applicable for different regions of the 16S rRNA gene or other phylogenetic marker genes. Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than the existing tools and we provide probabilistic-based confidence scores to evaluate the reliability of our taxonomic classification assignments based on multiple database matches to query sequences. Despite its higher computational costs, our method is still suitable for analyzing large-scale microbiome datasets for practical purposes. Furthermore, our method can be applied for taxonomic classification of any phylogenetic marker gene sequences. Our software, called BLCA, is freely available at https://github.com/qunfengdong/BLCA .
Non-linear molecular pattern classification using molecular beacons with multiple targets.
Lee, In-Hee; Lee, Seung Hwan; Park, Tai Hyun; Zhang, Byoung-Tak
2013-12-01
In vitro pattern classification has been highlighted as an important future application of DNA computing. Previous work has demonstrated the feasibility of linear classifiers using DNA-based molecular computing. However, complex tasks require non-linear classification capability. Here we design a molecular beacon that can interact with multiple targets and experimentally shows that its fluorescent signals form a complex radial-basis function, enabling it to be used as a building block for non-linear molecular classification in vitro. The proposed method was successfully applied to solving artificial and real-world classification problems: XOR and microRNA expression patterns. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Tissue classification for laparoscopic image understanding based on multispectral texture analysis
NASA Astrophysics Data System (ADS)
Zhang, Yan; Wirkert, Sebastian J.; Iszatt, Justin; Kenngott, Hannes; Wagner, Martin; Mayer, Benjamin; Stock, Christian; Clancy, Neil T.; Elson, Daniel S.; Maier-Hein, Lena
2016-03-01
Intra-operative tissue classification is one of the prerequisites for providing context-aware visualization in computer-assisted minimally invasive surgeries. As many anatomical structures are difficult to differentiate in conventional RGB medical images, we propose a classification method based on multispectral image patches. In a comprehensive ex vivo study we show (1) that multispectral imaging data is superior to RGB data for organ tissue classification when used in conjunction with widely applied feature descriptors and (2) that combining the tissue texture with the reflectance spectrum improves the classification performance. Multispectral tissue analysis could thus evolve as a key enabling technique in computer-assisted laparoscopy.
Joint Feature Selection and Classification for Multilabel Learning.
Huang, Jun; Li, Guorong; Huang, Qingming; Wu, Xindong
2018-03-01
Multilabel learning deals with examples having multiple class labels simultaneously. It has been applied to a variety of applications, such as text categorization and image annotation. A large number of algorithms have been proposed for multilabel learning, most of which concentrate on multilabel classification problems and only a few of them are feature selection algorithms. Current multilabel classification models are mainly built on a single data representation composed of all the features which are shared by all the class labels. Since each class label might be decided by some specific features of its own, and the problems of classification and feature selection are often addressed independently, in this paper, we propose a novel method which can perform joint feature selection and classification for multilabel learning, named JFSC. Different from many existing methods, JFSC learns both shared features and label-specific features by considering pairwise label correlations, and builds the multilabel classifier on the learned low-dimensional data representations simultaneously. A comparative study with state-of-the-art approaches manifests a competitive performance of our proposed method both in classification and feature selection for multilabel learning.
Classification of radiolarian images with hand-crafted and deep features
NASA Astrophysics Data System (ADS)
Keçeli, Ali Seydi; Kaya, Aydın; Keçeli, Seda Uzunçimen
2017-12-01
Radiolarians are planktonic protozoa and are important biostratigraphic and paleoenvironmental indicators for paleogeographic reconstructions. Radiolarian paleontology still remains as a low cost and the one of the most convenient way to obtain dating of deep ocean sediments. Traditional methods for identifying radiolarians are time-consuming and cannot scale to the granularity or scope necessary for large-scale studies. Automated image classification will allow making these analyses promptly. In this study, a method for automatic radiolarian image classification is proposed on Scanning Electron Microscope (SEM) images of radiolarians to ease species identification of fossilized radiolarians. The proposed method uses both hand-crafted features like invariant moments, wavelet moments, Gabor features, basic morphological features and deep features obtained from a pre-trained Convolutional Neural Network (CNN). Feature selection is applied over deep features to reduce high dimensionality. Classification outcomes are analyzed to compare hand-crafted features, deep features, and their combinations. Results show that the deep features obtained from a pre-trained CNN are more discriminative comparing to hand-crafted ones. Additionally, feature selection utilizes to the computational cost of classification algorithms and have no negative effect on classification accuracy.
A hybrid method for classifying cognitive states from fMRI data.
Parida, S; Dehuri, S; Cho, S-B; Cacha, L A; Poznanski, R R
2015-09-01
Functional magnetic resonance imaging (fMRI) makes it possible to detect brain activities in order to elucidate cognitive-states. The complex nature of fMRI data requires under-standing of the analyses applied to produce possible avenues for developing models of cognitive state classification and improving brain activity prediction. While many models of classification task of fMRI data analysis have been developed, in this paper, we present a novel hybrid technique through combining the best attributes of genetic algorithms (GAs) and ensemble decision tree technique that consistently outperforms all other methods which are being used for cognitive-state classification. Specifically, this paper illustrates the combined effort of decision-trees ensemble and GAs for feature selection through an extensive simulation study and discusses the classification performance with respect to fMRI data. We have shown that our proposed method exhibits significant reduction of the number of features with clear edge classification accuracy over ensemble of decision-trees.
The use of the modified Cholesky decomposition in divergence and classification calculations
NASA Technical Reports Server (NTRS)
Vanroony, D. L.; Lynn, M. S.; Snyder, C. H.
1973-01-01
The use of the Cholesky decomposition technique is analyzed as applied to the feature selection and classification algorithms used in the analysis of remote sensing data (e.g. as in LARSYS). This technique is approximately 30% faster in classification and a factor of 2-3 faster in divergence, as compared with LARSYS. Also numerical stability and accuracy are slightly improved. Other methods necessary to deal with numerical stablity problems are briefly discussed.
The use of the modified Cholesky decomposition in divergence and classification calculations
NASA Technical Reports Server (NTRS)
Van Rooy, D. L.; Lynn, M. S.; Snyder, C. H.
1973-01-01
This report analyzes the use of the modified Cholesky decomposition technique as applied to the feature selection and classification algorithms used in the analysis of remote sensing data (e.g., as in LARSYS). This technique is approximately 30% faster in classification and a factor of 2-3 faster in divergence, as compared with LARSYS. Also numerical stability and accuracy are slightly improved. Other methods necessary to deal with numerical stability problems are briefly discussed.
Chung, Sukhoon; Rhee, Hyunsill; Suh, Yongmoo
2010-01-01
Objectives This study sought to find answers to the following questions: 1) Can we predict whether a patient will revisit a healthcare center? 2) Can we anticipate diseases of patients who revisit the center? Methods For the first question, we applied 5 classification algorithms (decision tree, artificial neural network, logistic regression, Bayesian networks, and Naïve Bayes) and the stacking-bagging method for building classification models. To solve the second question, we performed sequential pattern analysis. Results We determined: 1) In general, the most influential variables which impact whether a patient of a public healthcare center will revisit it or not are personal burden, insurance bill, period of prescription, age, systolic pressure, name of disease, and postal code. 2) The best plain classification model is dependent on the dataset. 3) Based on average of classification accuracy, the proposed stacking-bagging method outperformed all traditional classification models and our sequential pattern analysis revealed 16 sequential patterns. Conclusions Classification models and sequential patterns can help public healthcare centers plan and implement healthcare service programs and businesses that are more appropriate to local residents, encouraging them to revisit public health centers. PMID:21818426
Yang, Xiaofeng; Wu, Shengyong; Sechopoulos, Ioannis; Fei, Baowei
2012-10-01
To develop and test an automated algorithm to classify the different tissues present in dedicated breast CT images. The original CT images are first corrected to overcome cupping artifacts, and then a multiscale bilateral filter is used to reduce noise while keeping edge information on the images. As skin and glandular tissues have similar CT values on breast CT images, morphologic processing is used to identify the skin mask based on its position information. A modified fuzzy C-means (FCM) classification method is then used to classify breast tissue as fat and glandular tissue. By combining the results of the skin mask with the FCM, the breast tissue is classified as skin, fat, and glandular tissue. To evaluate the authors' classification method, the authors use Dice overlap ratios to compare the results of the automated classification to those obtained by manual segmentation on eight patient images. The correction method was able to correct the cupping artifacts and improve the quality of the breast CT images. For glandular tissue, the overlap ratios between the authors' automatic classification and manual segmentation were 91.6% ± 2.0%. A cupping artifact correction method and an automatic classification method were applied and evaluated for high-resolution dedicated breast CT images. Breast tissue classification can provide quantitative measurements regarding breast composition, density, and tissue distribution.
NASA Astrophysics Data System (ADS)
Madokoro, H.; Yamanashi, A.; Sato, K.
2013-08-01
This paper presents an unsupervised scene classification method for actualizing semantic recognition of indoor scenes. Background and foreground features are respectively extracted using Gist and color scale-invariant feature transform (SIFT) as feature representations based on context. We used hue, saturation, and value SIFT (HSV-SIFT) because of its simple algorithm with low calculation costs. Our method creates bags of features for voting visual words created from both feature descriptors to a two-dimensional histogram. Moreover, our method generates labels as candidates of categories for time-series images while maintaining stability and plasticity together. Automatic labeling of category maps can be realized using labels created using adaptive resonance theory (ART) as teaching signals for counter propagation networks (CPNs). We evaluated our method for semantic scene classification using KTH's image database for robot localization (KTH-IDOL), which is popularly used for robot localization and navigation. The mean classification accuracies of Gist, gray SIFT, one class support vector machines (OC-SVM), position-invariant robust features (PIRF), and our method are, respectively, 39.7, 58.0, 56.0, 63.6, and 79.4%. The result of our method is 15.8% higher than that of PIRF. Moreover, we applied our method for fine classification using our original mobile robot. We obtained mean classification accuracy of 83.2% for six zones.
Classification of Dynamical Diffusion States in Single Molecule Tracking Microscopy
Bosch, Peter J.; Kanger, Johannes S.; Subramaniam, Vinod
2014-01-01
Single molecule tracking of membrane proteins by fluorescence microscopy is a promising method to investigate dynamic processes in live cells. Translating the trajectories of proteins to biological implications, such as protein interactions, requires the classification of protein motion within the trajectories. Spatial information of protein motion may reveal where the protein interacts with cellular structures, because binding of proteins to such structures often alters their diffusion speed. For dynamic diffusion systems, we provide an analytical framework to determine in which diffusion state a molecule is residing during the course of its trajectory. We compare different methods for the quantification of motion to utilize this framework for the classification of two diffusion states (two populations with different diffusion speed). We found that a gyration quantification method and a Bayesian statistics-based method are the most accurate in diffusion-state classification for realistic experimentally obtained datasets, of which the gyration method is much less computationally demanding. After classification of the diffusion, the lifetime of the states can be determined, and images of the diffusion states can be reconstructed at high resolution. Simulations validate these applications. We apply the classification and its applications to experimental data to demonstrate the potential of this approach to obtain further insights into the dynamics of cell membrane proteins. PMID:25099798
NASA Astrophysics Data System (ADS)
Hafizt, M.; Manessa, M. D. M.; Adi, N. S.; Prayudha, B.
2017-12-01
Benthic habitat mapping using satellite data is one challenging task for practitioners and academician as benthic objects are covered by light-attenuating water column obscuring object discrimination. One common method to reduce this water-column effect is by using depth-invariant index (DII) image. However, the application of the correction in shallow coastal areas is challenging as a dark object such as seagrass could have a very low pixel value, preventing its reliable identification and classification. This limitation can be solved by specifically applying a classification process to areas with different water depth levels. The water depth level can be extracted from satellite imagery using Relative Water Depth Index (RWDI). This study proposed a new approach to improve the mapping accuracy, particularly for benthic dark objects by combining the DII of Lyzenga’s water column correction method and the RWDI of Stumpt’s method. This research was conducted in Lintea Island which has a high variation of benthic cover using Sentinel-2A imagery. To assess the effectiveness of the proposed new approach for benthic habitat mapping two different classification procedures are implemented. The first procedure is the commonly applied method in benthic habitat mapping where DII image is used as input data to all coastal area for image classification process regardless of depth variation. The second procedure is the proposed new approach where its initial step begins with the separation of the study area into shallow and deep waters using the RWDI image. Shallow area was then classified using the sunglint-corrected image as input data and the deep area was classified using DII image as input data. The final classification maps of those two areas were merged as a single benthic habitat map. A confusion matrix was then applied to evaluate the mapping accuracy of the final map. The result shows that the new proposed mapping approach can be used to map all benthic objects in all depth ranges and shows a better accuracy compared to that of classification map produced using only with DII.
NASA Astrophysics Data System (ADS)
Klomp, Sander; van der Sommen, Fons; Swager, Anne-Fré; Zinger, Svitlana; Schoon, Erik J.; Curvers, Wouter L.; Bergman, Jacques J.; de With, Peter H. N.
2017-03-01
Volumetric Laser Endomicroscopy (VLE) is a promising technique for the detection of early neoplasia in Barrett's Esophagus (BE). VLE generates hundreds of high resolution, grayscale, cross-sectional images of the esophagus. However, at present, classifying these images is a time consuming and cumbersome effort performed by an expert using a clinical prediction model. This paper explores the feasibility of using computer vision techniques to accurately predict the presence of dysplastic tissue in VLE BE images. Our contribution is threefold. First, a benchmarking is performed for widely applied machine learning techniques and feature extraction methods. Second, three new features based on the clinical detection model are proposed, having superior classification accuracy and speed, compared to earlier work. Third, we evaluate automated parameter tuning by applying simple grid search and feature selection methods. The results are evaluated on a clinically validated dataset of 30 dysplastic and 30 non-dysplastic VLE images. Optimal classification accuracy is obtained by applying a support vector machine and using our modified Haralick features and optimal image cropping, obtaining an area under the receiver operating characteristic of 0.95 compared to the clinical prediction model at 0.81. Optimal execution time is achieved using a proposed mean and median feature, which is extracted at least factor 2.5 faster than alternative features with comparable performance.
NASA Astrophysics Data System (ADS)
Tamimi, E.; Ebadi, H.; Kiani, A.
2017-09-01
Automatic building detection from High Spatial Resolution (HSR) images is one of the most important issues in Remote Sensing (RS). Due to the limited number of spectral bands in HSR images, using other features will lead to improve accuracy. By adding these features, the presence probability of dependent features will be increased, which leads to accuracy reduction. In addition, some parameters should be determined in Support Vector Machine (SVM) classification. Therefore, it is necessary to simultaneously determine classification parameters and select independent features according to image type. Optimization algorithm is an efficient method to solve this problem. On the other hand, pixel-based classification faces several challenges such as producing salt-paper results and high computational time in high dimensional data. Hence, in this paper, a novel method is proposed to optimize object-based SVM classification by applying continuous Ant Colony Optimization (ACO) algorithm. The advantages of the proposed method are relatively high automation level, independency of image scene and type, post processing reduction for building edge reconstruction and accuracy improvement. The proposed method was evaluated by pixel-based SVM and Random Forest (RF) classification in terms of accuracy. In comparison with optimized pixel-based SVM classification, the results showed that the proposed method improved quality factor and overall accuracy by 17% and 10%, respectively. Also, in the proposed method, Kappa coefficient was improved by 6% rather than RF classification. Time processing of the proposed method was relatively low because of unit of image analysis (image object). These showed the superiority of the proposed method in terms of time and accuracy.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-02-01
... Classification as Refugee; OMB Control No. 1615- 0068. The Department of Homeland Security, U.S. Citizenship and... information collection. (2) Title of the Form/Collection: Registration for Classification as Refugee. (3... Households. Form I- 590 provides a uniform method for applicants to apply for refugee status and contains the...
ERIC Educational Resources Information Center
Strobl, Carolin; Malley, James; Tutz, Gerhard
2009-01-01
Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Especially random forests, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine, and…
A new machine classification method applied to human peripheral blood leukocytes
NASA Technical Reports Server (NTRS)
Rorvig, Mark E.; Fitzpatrick, Steven J.; Vitthal, Sanjay; Ladoulis, Charles T.
1994-01-01
Human beings judge images by complex mental processes, whereas computing machines extract features. By reducing scaled human judgments and machine extracted features to a common metric space and fitting them by regression, the judgments of human experts rendered on a sample of images may be imposed on an image population to provide automatic classification.
Quantitation of flavonoid constituents in citrus fruits.
Kawaii, S; Tomono, Y; Katase, E; Ogawa, K; Yano, M
1999-09-01
Twenty-four flavonoids have been determined in 66 Citrus species and near-citrus relatives, grown in the same field and year, by means of reversed phase high-performance liquid chromatography analysis. Statistical methods have been applied to find relations among the species. The F ratios of 21 flavonoids obtained by applying ANOVA analysis are significant, indicating that a classification of the species using these variables is reasonable to pursue. Principal component analysis revealed that the distributions of Citrus species belonging to different classes were largely in accordance with Tanaka's classification system.
New feature extraction method for classification of agricultural products from x-ray images
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.; Lee, Ha-Woon; Keagy, Pamela M.; Schatzki, Thomas F.
1999-01-01
Classification of real-time x-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a system for automated non- invasive detection of defective product items on a conveyor belt. We discuss the extraction of new features that allow better discrimination between damaged and clean items. This feature extraction and classification stage is the new aspect of this paper; our new maximum representation and discrimination between damaged and clean items. This feature extraction and classification stage is the new aspect of this paper; our new maximum representation and discriminating feature (MRDF) extraction method computes nonlinear features that are used as inputs to a new modified k nearest neighbor classifier. In this work the MRDF is applied to standard features. The MRDF is robust to various probability distributions of the input class and is shown to provide good classification and new ROC data.
Liu, Yanqiu; Lu, Huijuan; Yan, Ke; Xia, Haixia; An, Chunlin
2016-01-01
Embedding cost-sensitive factors into the classifiers increases the classification stability and reduces the classification costs for classifying high-scale, redundant, and imbalanced datasets, such as the gene expression data. In this study, we extend our previous work, that is, Dissimilar ELM (D-ELM), by introducing misclassification costs into the classifier. We name the proposed algorithm as the cost-sensitive D-ELM (CS-D-ELM). Furthermore, we embed rejection cost into the CS-D-ELM to increase the classification stability of the proposed algorithm. Experimental results show that the rejection cost embedded CS-D-ELM algorithm effectively reduces the average and overall cost of the classification process, while the classification accuracy still remains competitive. The proposed method can be extended to classification problems of other redundant and imbalanced data.
14 CFR 1203.501 - Applying derivative classification markings.
Code of Federal Regulations, 2011 CFR
2011-01-01
... INFORMATION SECURITY PROGRAM Derivative Classification § 1203.501 Applying derivative classification markings... classification decisions: (b) Verify the information's current level of classification so far as practicable...
14 CFR 1203.501 - Applying derivative classification markings.
Code of Federal Regulations, 2010 CFR
2010-01-01
... INFORMATION SECURITY PROGRAM Derivative Classification § 1203.501 Applying derivative classification markings... classification decisions: (b) Verify the information's current level of classification so far as practicable...
Zhang, Li; Qian, Liqiang; Ding, Chuntao; Zhou, Weida; Li, Fanzhang
2015-09-01
The family of discriminant neighborhood embedding (DNE) methods is typical graph-based methods for dimension reduction, and has been successfully applied to face recognition. This paper proposes a new variant of DNE, called similarity-balanced discriminant neighborhood embedding (SBDNE) and applies it to cancer classification using gene expression data. By introducing a novel similarity function, SBDNE deals with two data points in the same class and the different classes with different ways. The homogeneous and heterogeneous neighbors are selected according to the new similarity function instead of the Euclidean distance. SBDNE constructs two adjacent graphs, or between-class adjacent graph and within-class adjacent graph, using the new similarity function. According to these two adjacent graphs, we can generate the local between-class scatter and the local within-class scatter, respectively. Thus, SBDNE can maximize the between-class scatter and simultaneously minimize the within-class scatter to find the optimal projection matrix. Experimental results on six microarray datasets show that SBDNE is a promising method for cancer classification. Copyright © 2015 Elsevier Ltd. All rights reserved.
Zhang, Xin; Cui, Jintian; Wang, Weisheng; Lin, Chao
2017-01-01
To address the problem of image texture feature extraction, a direction measure statistic that is based on the directionality of image texture is constructed, and a new method of texture feature extraction, which is based on the direction measure and a gray level co-occurrence matrix (GLCM) fusion algorithm, is proposed in this paper. This method applies the GLCM to extract the texture feature value of an image and integrates the weight factor that is introduced by the direction measure to obtain the final texture feature of an image. A set of classification experiments for the high-resolution remote sensing images were performed by using support vector machine (SVM) classifier with the direction measure and gray level co-occurrence matrix fusion algorithm. Both qualitative and quantitative approaches were applied to assess the classification results. The experimental results demonstrated that texture feature extraction based on the fusion algorithm achieved a better image recognition, and the accuracy of classification based on this method has been significantly improved. PMID:28640181
A Wavelet Polarization Decomposition Net Model for Polarimetric SAR Image Classification
NASA Astrophysics Data System (ADS)
He, Chu; Ou, Dan; Yang, Teng; Wu, Kun; Liao, Mingsheng; Chen, Erxue
2014-11-01
In this paper, a deep model based on wavelet texture has been proposed for Polarimetric Synthetic Aperture Radar (PolSAR) image classification inspired by recent successful deep learning method. Our model is supposed to learn powerful and informative representations to improve the generalization ability for the complex scene classification tasks. Given the influence of speckle noise in Polarimetric SAR image, wavelet polarization decomposition is applied first to obtain basic and discriminative texture features which are then embedded into a Deep Neural Network (DNN) in order to compose multi-layer higher representations. We demonstrate that the model can produce a powerful representation which can capture some untraceable information from Polarimetric SAR images and show a promising achievement in comparison with other traditional SAR image classification methods for the SAR image dataset.
Tabu search and binary particle swarm optimization for feature selection using microarray data.
Chuang, Li-Yeh; Yang, Cheng-Huei; Yang, Cheng-Hong
2009-12-01
Gene expression profiles have great potential as a medical diagnosis tool because they represent the state of a cell at the molecular level. In the classification of cancer type research, available training datasets generally have a fairly small sample size compared to the number of genes involved. This fact poses an unprecedented challenge to some classification methodologies due to training data limitations. Therefore, a good selection method for genes relevant for sample classification is needed to improve the predictive accuracy, and to avoid incomprehensibility due to the large number of genes investigated. In this article, we propose to combine tabu search (TS) and binary particle swarm optimization (BPSO) for feature selection. BPSO acts as a local optimizer each time the TS has been run for a single generation. The K-nearest neighbor method with leave-one-out cross-validation and support vector machine with one-versus-rest serve as evaluators of the TS and BPSO. The proposed method is applied and compared to the 11 classification problems taken from the literature. Experimental results show that our method simplifies features effectively and either obtains higher classification accuracy or uses fewer features compared to other feature selection methods.
High Accuracy Human Activity Recognition Based on Sparse Locality Preserving Projections.
Zhu, Xiangbin; Qiu, Huiling
2016-01-01
Human activity recognition(HAR) from the temporal streams of sensory data has been applied to many fields, such as healthcare services, intelligent environments and cyber security. However, the classification accuracy of most existed methods is not enough in some applications, especially for healthcare services. In order to improving accuracy, it is necessary to develop a novel method which will take full account of the intrinsic sequential characteristics for time-series sensory data. Moreover, each human activity may has correlated feature relationship at different levels. Therefore, in this paper, we propose a three-stage continuous hidden Markov model (TSCHMM) approach to recognize human activities. The proposed method contains coarse, fine and accurate classification. The feature reduction is an important step in classification processing. In this paper, sparse locality preserving projections (SpLPP) is exploited to determine the optimal feature subsets for accurate classification of the stationary-activity data. It can extract more discriminative activities features from the sensor data compared with locality preserving projections. Furthermore, all of the gyro-based features are used for accurate classification of the moving data. Compared with other methods, our method uses significantly less number of features, and the over-all accuracy has been obviously improved.
High Accuracy Human Activity Recognition Based on Sparse Locality Preserving Projections
2016-01-01
Human activity recognition(HAR) from the temporal streams of sensory data has been applied to many fields, such as healthcare services, intelligent environments and cyber security. However, the classification accuracy of most existed methods is not enough in some applications, especially for healthcare services. In order to improving accuracy, it is necessary to develop a novel method which will take full account of the intrinsic sequential characteristics for time-series sensory data. Moreover, each human activity may has correlated feature relationship at different levels. Therefore, in this paper, we propose a three-stage continuous hidden Markov model (TSCHMM) approach to recognize human activities. The proposed method contains coarse, fine and accurate classification. The feature reduction is an important step in classification processing. In this paper, sparse locality preserving projections (SpLPP) is exploited to determine the optimal feature subsets for accurate classification of the stationary-activity data. It can extract more discriminative activities features from the sensor data compared with locality preserving projections. Furthermore, all of the gyro-based features are used for accurate classification of the moving data. Compared with other methods, our method uses significantly less number of features, and the over-all accuracy has been obviously improved. PMID:27893761
Computer-aided diagnosis system: a Bayesian hybrid classification method.
Calle-Alonso, F; Pérez, C J; Arias-Nicolás, J P; Martín, J
2013-10-01
A novel method to classify multi-class biomedical objects is presented. The method is based on a hybrid approach which combines pairwise comparison, Bayesian regression and the k-nearest neighbor technique. It can be applied in a fully automatic way or in a relevance feedback framework. In the latter case, the information obtained from both an expert and the automatic classification is iteratively used to improve the results until a certain accuracy level is achieved, then, the learning process is finished and new classifications can be automatically performed. The method has been applied in two biomedical contexts by following the same cross-validation schemes as in the original studies. The first one refers to cancer diagnosis, leading to an accuracy of 77.35% versus 66.37%, originally obtained. The second one considers the diagnosis of pathologies of the vertebral column. The original method achieves accuracies ranging from 76.5% to 96.7%, and from 82.3% to 97.1% in two different cross-validation schemes. Even with no supervision, the proposed method reaches 96.71% and 97.32% in these two cases. By using a supervised framework the achieved accuracy is 97.74%. Furthermore, all abnormal cases were correctly classified. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Li, Zhao-Liang
2018-01-01
Few studies have examined hyperspectral remote-sensing image classification with type-II fuzzy sets. This paper addresses image classification based on a hyperspectral remote-sensing technique using an improved interval type-II fuzzy c-means (IT2FCM*) approach. In this study, in contrast to other traditional fuzzy c-means-based approaches, the IT2FCM* algorithm considers the ranking of interval numbers and the spectral uncertainty. The classification results based on a hyperspectral dataset using the FCM, IT2FCM, and the proposed improved IT2FCM* algorithms show that the IT2FCM* method plays the best performance according to the clustering accuracy. In this paper, in order to validate and demonstrate the separability of the IT2FCM*, four type-I fuzzy validity indexes are employed, and a comparative analysis of these fuzzy validity indexes also applied in FCM and IT2FCM methods are made. These four indexes are also applied into different spatial and spectral resolution datasets to analyze the effects of spectral and spatial scaling factors on the separability of FCM, IT2FCM, and IT2FCM* methods. The results of these validity indexes from the hyperspectral datasets show that the improved IT2FCM* algorithm have the best values among these three algorithms in general. The results demonstrate that the IT2FCM* exhibits good performance in hyperspectral remote-sensing image classification because of its ability to handle hyperspectral uncertainty. PMID:29373548
NASA Astrophysics Data System (ADS)
Sukawattanavijit, Chanika; Srestasathiern, Panu
2017-10-01
Land Use and Land Cover (LULC) information are significant to observe and evaluate environmental change. LULC classification applying remotely sensed data is a technique popularly employed on a global and local dimension particularly, in urban areas which have diverse land cover types. These are essential components of the urban terrain and ecosystem. In the present, object-based image analysis (OBIA) is becoming widely popular for land cover classification using the high-resolution image. COSMO-SkyMed SAR data was fused with THAICHOTE (namely, THEOS: Thailand Earth Observation Satellite) optical data for land cover classification using object-based. This paper indicates a comparison between object-based and pixel-based approaches in image fusion. The per-pixel method, support vector machines (SVM) was implemented to the fused image based on Principal Component Analysis (PCA). For the objectbased classification was applied to the fused images to separate land cover classes by using nearest neighbor (NN) classifier. Finally, the accuracy assessment was employed by comparing with the classification of land cover mapping generated from fused image dataset and THAICHOTE image. The object-based data fused COSMO-SkyMed with THAICHOTE images demonstrated the best classification accuracies, well over 85%. As the results, an object-based data fusion provides higher land cover classification accuracy than per-pixel data fusion.
Differentiation of tea varieties using UV-Vis spectra and pattern recognition techniques
NASA Astrophysics Data System (ADS)
Palacios-Morillo, Ana; Alcázar, Ángela.; de Pablos, Fernando; Jurado, José Marcos
2013-02-01
Tea, one of the most consumed beverages all over the world, is of great importance in the economies of a number of countries. Several methods have been developed to classify tea varieties or origins based in pattern recognition techniques applied to chemical data, such as metal profile, amino acids, catechins and volatile compounds. Some of these analytical methods become tedious and expensive to be applied in routine works. The use of UV-Vis spectral data as discriminant variables, highly influenced by the chemical composition, can be an alternative to these methods. UV-Vis spectra of methanol-water extracts of tea have been obtained in the interval 250-800 nm. Absorbances have been used as input variables. Principal component analysis was used to reduce the number of variables and several pattern recognition methods, such as linear discriminant analysis, support vector machines and artificial neural networks, have been applied in order to differentiate the most common tea varieties. A successful classification model was built by combining principal component analysis and multilayer perceptron artificial neural networks, allowing the differentiation between tea varieties. This rapid and simple methodology can be applied to solve classification problems in food industry saving economic resources.
A model-based test for treatment effects with probabilistic classifications.
Cavagnaro, Daniel R; Davis-Stober, Clintin P
2018-05-21
Within modern psychology, computational and statistical models play an important role in describing a wide variety of human behavior. Model selection analyses are typically used to classify individuals according to the model(s) that best describe their behavior. These classifications are inherently probabilistic, which presents challenges for performing group-level analyses, such as quantifying the effect of an experimental manipulation. We answer this challenge by presenting a method for quantifying treatment effects in terms of distributional changes in model-based (i.e., probabilistic) classifications across treatment conditions. The method uses hierarchical Bayesian mixture modeling to incorporate classification uncertainty at the individual level into the test for a treatment effect at the group level. We illustrate the method with several worked examples, including a reanalysis of the data from Kellen, Mata, and Davis-Stober (2017), and analyze its performance more generally through simulation studies. Our simulations show that the method is both more powerful and less prone to type-1 errors than Fisher's exact test when classifications are uncertain. In the special case where classifications are deterministic, we find a near-perfect power-law relationship between the Bayes factor, derived from our method, and the p value obtained from Fisher's exact test. We provide code in an online supplement that allows researchers to apply the method to their own data. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Feature selection gait-based gender classification under different circumstances
NASA Astrophysics Data System (ADS)
Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah
2014-05-01
This paper proposes a gender classification based on human gait features and investigates the problem of two variations: clothing (wearing coats) and carrying bag condition as addition to the normal gait sequence. The feature vectors in the proposed system are constructed after applying wavelet transform. Three different sets of feature are proposed in this method. First, Spatio-temporal distance that is dealing with the distance of different parts of the human body (like feet, knees, hand, Human Height and shoulder) during one gait cycle. The second and third feature sets are constructed from approximation and non-approximation coefficient of human body respectively. To extract these two sets of feature we divided the human body into two parts, upper and lower body part, based on the golden ratio proportion. In this paper, we have adopted a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced based on the Fisher score as a feature selection method to optimize their discriminating significance. Finally k-Nearest Neighbor is applied as a classification method. Experimental results demonstrate that our approach is providing more realistic scenario and relatively better performance compared with the existing approaches.
General methodology for simultaneous representation and discrimination of multiple object classes
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.
1998-03-01
We address a new general method for linear and nonlinear feature extraction for simultaneous representation and classification. We call this approach the maximum representation and discrimination feature (MRDF) method. We develop a novel nonlinear eigenfeature extraction technique to represent data with closed-form solutions and use it to derive a nonlinear MRDF algorithm. Results of the MRDF method on synthetic databases are shown and compared with results from standard Fukunaga-Koontz transform and Fisher discriminant function methods. The method is also applied to an automated product inspection problem and for classification and pose estimation of two similar objects under 3D aspect angle variations.
NASA Astrophysics Data System (ADS)
Luna, Aderval S.; da Silva, Arnaldo P.; Pinho, Jéssica S. A.; Ferré, Joan; Boqué, Ricard
Near infrared (NIR) spectroscopy and multivariate classification were applied to discriminate soybean oil samples into non-transgenic and transgenic. Principal Component Analysis (PCA) was applied to extract relevant features from the spectral data and to remove the anomalous samples. The best results were obtained when with Support Vectors Machine-Discriminant Analysis (SVM-DA) and Partial Least Squares-Discriminant Analysis (PLS-DA) after mean centering plus multiplicative scatter correction. For SVM-DA the percentage of successful classification was 100% for the training group and 100% and 90% in validation group for non transgenic and transgenic soybean oil samples respectively. For PLS-DA the percentage of successful classification was 95% and 100% in training group for non transgenic and transgenic soybean oil samples respectively and 100% and 80% in validation group for non transgenic and transgenic respectively. The results demonstrate that NIR spectroscopy can provide a rapid, nondestructive and reliable method to distinguish non-transgenic and transgenic soybean oils.
Age and gender classification in the wild with unsupervised feature learning
NASA Astrophysics Data System (ADS)
Wan, Lihong; Huo, Hong; Fang, Tao
2017-03-01
Inspired by unsupervised feature learning (UFL) within the self-taught learning framework, we propose a method based on UFL, convolution representation, and part-based dimensionality reduction to handle facial age and gender classification, which are two challenging problems under unconstrained circumstances. First, UFL is introduced to learn selective receptive fields (filters) automatically by applying whitening transformation and spherical k-means on random patches collected from unlabeled data. The learning process is fast and has no hyperparameters to tune. Then, the input image is convolved with these filters to obtain filtering responses on which local contrast normalization is applied. Average pooling and feature concatenation are then used to form global face representation. Finally, linear discriminant analysis with part-based strategy is presented to reduce the dimensions of the global representation and to improve classification performances further. Experiments on three challenging databases, namely, Labeled faces in the wild, Gallagher group photos, and Adience, demonstrate the effectiveness of the proposed method relative to that of state-of-the-art approaches.
UNDERSTANDING AND APPLYING ENVIRONMENTAL RELATIVE MOLDINESS INDEX - ERMI
This study compared two binary classification methods to evaluate the mold condition in 271 homes of infants, 144 of which later developed symptoms of respiratory illness. A method using on-site visual mold inspection was compared to another method using a quantitative index of ...
Vehicle detection in aerial surveillance using dynamic Bayesian networks.
Cheng, Hsu-Yung; Weng, Chih-Chia; Chen, Yi-Ying
2012-04-01
We present an automatic vehicle detection system for aerial surveillance in this paper. In this system, we escape from the stereotype and existing frameworks of vehicle detection in aerial surveillance, which are either region based or sliding window based. We design a pixelwise classification method for vehicle detection. The novelty lies in the fact that, in spite of performing pixelwise classification, relations among neighboring pixels in a region are preserved in the feature extraction process. We consider features including vehicle colors and local features. For vehicle color extraction, we utilize a color transform to separate vehicle colors and nonvehicle colors effectively. For edge detection, we apply moment preserving to adjust the thresholds of the Canny edge detector automatically, which increases the adaptability and the accuracy for detection in various aerial images. Afterward, a dynamic Bayesian network (DBN) is constructed for the classification purpose. We convert regional local features into quantitative observations that can be referenced when applying pixelwise classification via DBN. Experiments were conducted on a wide variety of aerial videos. The results demonstrate flexibility and good generalization abilities of the proposed method on a challenging data set with aerial surveillance images taken at different heights and under different camera angles.
Applying the Multiple Signal Classification Method to Silent Object Detection Using Ambient Noise
NASA Astrophysics Data System (ADS)
Mori, Kazuyoshi; Yokoyama, Tomoki; Hasegawa, Akio; Matsuda, Minoru
2004-05-01
The revolutionary concept of using ocean ambient noise positively to detect objects, called acoustic daylight imaging, has attracted much attention. The authors attempted the detection of a silent target object using ambient noise and a wide-band beam former consisting of an array of receivers. In experimental results obtained in air, using the wide-band beam former, we successfully applied the delay-sum array (DSA) method to detect a silent target object in an acoustic noise field generated by a large number of transducers. This paper reports some experimental results obtained by applying the multiple signal classification (MUSIC) method to a wide-band beam former to detect silent targets. The ocean ambient noise was simulated by transducers decentralized to many points in air. Both MUSIC and DSA detected a spherical target object in the noise field. The relative power levels near the target obtained with MUSIC were compared with those obtained by DSA. Then the effectiveness of the MUSIC method was evaluated according to the rate of increase in the maximum and minimum relative power levels.
Eitrich, T; Kless, A; Druska, C; Meyer, W; Grotendorst, J
2007-01-01
In this paper, we study the classifications of unbalanced data sets of drugs. As an example we chose a data set of 2D6 inhibitors of cytochrome P450. The human cytochrome P450 2D6 isoform plays a key role in the metabolism of many drugs in the preclinical drug discovery process. We have collected a data set from annotated public data and calculated physicochemical properties with chemoinformatics methods. On top of this data, we have built classifiers based on machine learning methods. Data sets with different class distributions lead to the effect that conventional machine learning methods are biased toward the larger class. To overcome this problem and to obtain sensitive but also accurate classifiers we combine machine learning and feature selection methods with techniques addressing the problem of unbalanced classification, such as oversampling and threshold moving. We have used our own implementation of a support vector machine algorithm as well as the maximum entropy method. Our feature selection is based on the unsupervised McCabe method. The classification results from our test set are compared structurally with compounds from the training set. We show that the applied algorithms enable the effective high throughput in silico classification of potential drug candidates.
Mejia Tobar, Alejandra; Hyoudou, Rikiya; Kita, Kahori; Nakamura, Tatsuhiro; Kambara, Hiroyuki; Ogata, Yousuke; Hanakawa, Takashi; Koike, Yasuharu; Yoshimura, Natsue
2017-01-01
The classification of ankle movements from non-invasive brain recordings can be applied to a brain-computer interface (BCI) to control exoskeletons, prosthesis, and functional electrical stimulators for the benefit of patients with walking impairments. In this research, ankle flexion and extension tasks at two force levels in both legs, were classified from cortical current sources estimated by a hierarchical variational Bayesian method, using electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) recordings. The hierarchical prior for the current source estimation from EEG was obtained from activated brain areas and their intensities from an fMRI group (second-level) analysis. The fMRI group analysis was performed on regions of interest defined over the primary motor cortex, the supplementary motor area, and the somatosensory area, which are well-known to contribute to movement control. A sparse logistic regression method was applied for a nine-class classification (eight active tasks and a resting control task) obtaining a mean accuracy of 65.64% for time series of current sources, estimated from the EEG and the fMRI signals using a variational Bayesian method, and a mean accuracy of 22.19% for the classification of the pre-processed of EEG sensor signals, with a chance level of 11.11%. The higher classification accuracy of current sources, when compared to EEG classification accuracy, was attributed to the high number of sources and the different signal patterns obtained in the same vertex for different motor tasks. Since the inverse filter estimation for current sources can be done offline with the present method, the present method is applicable to real-time BCIs. Finally, due to the highly enhanced spatial distribution of current sources over the brain cortex, this method has the potential to identify activation patterns to design BCIs for the control of an affected limb in patients with stroke, or BCIs from motor imagery in patients with spinal cord injury.
Multi-level discriminative dictionary learning with application to large scale image classification.
Shen, Li; Sun, Gang; Huang, Qingming; Wang, Shuhui; Lin, Zhouchen; Wu, Enhua
2015-10-01
The sparse coding technique has shown flexibility and capability in image representation and analysis. It is a powerful tool in many visual applications. Some recent work has shown that incorporating the properties of task (such as discrimination for classification task) into dictionary learning is effective for improving the accuracy. However, the traditional supervised dictionary learning methods suffer from high computation complexity when dealing with large number of categories, making them less satisfactory in large scale applications. In this paper, we propose a novel multi-level discriminative dictionary learning method and apply it to large scale image classification. Our method takes advantage of hierarchical category correlation to encode multi-level discriminative information. Each internal node of the category hierarchy is associated with a discriminative dictionary and a classification model. The dictionaries at different layers are learnt to capture the information of different scales. Moreover, each node at lower layers also inherits the dictionary of its parent, so that the categories at lower layers can be described with multi-scale information. The learning of dictionaries and associated classification models is jointly conducted by minimizing an overall tree loss. The experimental results on challenging data sets demonstrate that our approach achieves excellent accuracy and competitive computation cost compared with other sparse coding methods for large scale image classification.
Comparison of Random Forest and Support Vector Machine classifiers using UAV remote sensing imagery
NASA Astrophysics Data System (ADS)
Piragnolo, Marco; Masiero, Andrea; Pirotti, Francesco
2017-04-01
Since recent years surveying with unmanned aerial vehicles (UAV) is getting a great amount of attention due to decreasing costs, higher precision and flexibility of usage. UAVs have been applied for geomorphological investigations, forestry, precision agriculture, cultural heritage assessment and for archaeological purposes. It can be used for land use and land cover classification (LULC). In literature, there are two main types of approaches for classification of remote sensing imagery: pixel-based and object-based. On one hand, pixel-based approach mostly uses training areas to define classes and respective spectral signatures. On the other hand, object-based classification considers pixels, scale, spatial information and texture information for creating homogeneous objects. Machine learning methods have been applied successfully for classification, and their use is increasing due to the availability of faster computing capabilities. The methods learn and train the model from previous computation. Two machine learning methods which have given good results in previous investigations are Random Forest (RF) and Support Vector Machine (SVM). The goal of this work is to compare RF and SVM methods for classifying LULC using images collected with a fixed wing UAV. The processing chain regarding classification uses packages in R, an open source scripting language for data analysis, which provides all necessary algorithms. The imagery was acquired and processed in November 2015 with cameras providing information over the red, blue, green and near infrared wavelength reflectivity over a testing area in the campus of Agripolis, in Italy. Images were elaborated and ortho-rectified through Agisoft Photoscan. The ortho-rectified image is the full data set, and the test set is derived from partial sub-setting of the full data set. Different tests have been carried out, using a percentage from 2 % to 20 % of the total. Ten training sets and ten validation sets are obtained from each test set. The control dataset consist of an independent visual classification done by an expert over the whole area. The classes are (i) broadleaf, (ii) building, (iii) grass, (iv) headland access path, (v) road, (vi) sowed land, (vii) vegetable. The RF and SVM are applied to the test set. The performances of the methods are evaluated using the three following accuracy metrics: Kappa index, Classification accuracy and Classification Error. All three are calculated in three different ways: with K-fold cross validation, using the validation test set and using the full test set. The analysis indicates that SVM gets better results in terms of good scores using K-fold cross or validation test set. Using the full test set, RF achieves a better result in comparison to SVM. It also seems that SVM performs better with smaller training sets, whereas RF performs better as training sets get larger.
NASA Astrophysics Data System (ADS)
Anker, Y.; Hershkovitz, Y.; Gasith, A.; Ben-Dor, E.
2011-12-01
Although remote sensing of fluvial ecosystems is well developed, the tradeoff between spectral and spatial resolutions prevents its application in small streams (<3m width). In the current study, a remote sensing approach for monitoring and research of small ecosystem was developed. The method is based on differentiation between two indicative vegetation species out of the ecosystem flora. Since when studied, the channel was covered mostly by a filamentous green alga (Cladophora glomerata) and watercress (Nasturtium officinale), these species were chosen as indicative; nonetheless, common reed (Phragmites australis) was also classified in order to exclude it from the stream ROI. The procedure included: A. For both section and habitat scales classifications, acquisition of aerial digital RGB datasets. B. For section scale classification, hyperspectral (HSR) dataset acquisition. C. For calibration, HSR reflectance measurements of specific ground targets, in close proximity to each dataset acquisition swath. D. For habitat scale classification, manual, in-stream flora grid transects classification. The digital RGB datasets were converted to reflectance units by spectral calibration against colored reference plates. These red, green, blue, white, and black EVA foam reference plates were measured by an ASD field spectrometer and each was given a spectral value. Each spectral value was later applied to the spectral calibration and radiometric correction of spectral RGB (SRGB) cube. Spectral calibration of the HSR dataset was done using the empirical line method, based on reference values of progressive grey scale targets. Differentiation between the vegetation species was done by supervised classification both for the HSR and for the SRGB datasets. This procedure was done using the Spectral Angle Mapper function with the spectral pattern of each vegetation species as a spectral end member. Comparison between the two remote sensing techniques and between the SRGB classification and the in-situ transects indicates that: A. Stream vegetation classification resolution is about 4 cm by the SRGB method compared to about 1 m by HSR. Moreover, this resolution is also higher than of the manual grid transect classification. B. The SRGB method is by far the most cost-efficient. The combination of spectral information (rather than the cognitive color) and high spatial resolution of aerial photography provides noise filtration and better sub-water detection capabilities than the HSR technique. C. Only the SRGB method applies for habitat and section scales; hence, its application together with in-situ grid transects for validation, may be optimal for use in similar scenarios.
The HSR dataset was first degraded to 17 bands with the same spectral range as the RGB dataset and also to a dataset with 3 equivalent bands
Multi-label spacecraft electrical signal classification method based on DBN and random forest
Li, Ke; Yu, Nan; Li, Pengfei; Song, Shimin; Wu, Yalei; Li, Yang; Liu, Meng
2017-01-01
In spacecraft electrical signal characteristic data, there exists a large amount of data with high-dimensional features, a high computational complexity degree, and a low rate of identification problems, which causes great difficulty in fault diagnosis of spacecraft electronic load systems. This paper proposes a feature extraction method that is based on deep belief networks (DBN) and a classification method that is based on the random forest (RF) algorithm; The proposed algorithm mainly employs a multi-layer neural network to reduce the dimension of the original data, and then, classification is applied. Firstly, we use the method of wavelet denoising, which was used to pre-process the data. Secondly, the deep belief network is used to reduce the feature dimension and improve the rate of classification for the electrical characteristics data. Finally, we used the random forest algorithm to classify the data and comparing it with other algorithms. The experimental results show that compared with other algorithms, the proposed method shows excellent performance in terms of accuracy, computational efficiency, and stability in addressing spacecraft electrical signal data. PMID:28486479
Multi-label spacecraft electrical signal classification method based on DBN and random forest.
Li, Ke; Yu, Nan; Li, Pengfei; Song, Shimin; Wu, Yalei; Li, Yang; Liu, Meng
2017-01-01
In spacecraft electrical signal characteristic data, there exists a large amount of data with high-dimensional features, a high computational complexity degree, and a low rate of identification problems, which causes great difficulty in fault diagnosis of spacecraft electronic load systems. This paper proposes a feature extraction method that is based on deep belief networks (DBN) and a classification method that is based on the random forest (RF) algorithm; The proposed algorithm mainly employs a multi-layer neural network to reduce the dimension of the original data, and then, classification is applied. Firstly, we use the method of wavelet denoising, which was used to pre-process the data. Secondly, the deep belief network is used to reduce the feature dimension and improve the rate of classification for the electrical characteristics data. Finally, we used the random forest algorithm to classify the data and comparing it with other algorithms. The experimental results show that compared with other algorithms, the proposed method shows excellent performance in terms of accuracy, computational efficiency, and stability in addressing spacecraft electrical signal data.
NASA Technical Reports Server (NTRS)
Wardroper, A. M. K.; Brooks, P. W.; Humberston, M. J.; Maxwell, J. R.
1977-01-01
A computer method is described for the automatic classification of triterpanes and steranes into gross structural type from their mass spectral characteristics. The method has been applied to the spectra obtained by gas-chromatographic/mass-spectroscopic analysis of two mixtures of standards and of hydrocarbon fractions isolated from Green River and Messel oil shales. Almost all of the steranes and triterpanes identified previously in both shales were classified, in addition to a number of new components. The results indicate that classification of such alkanes is possible with a laboratory computer system. The method has application to diagenesis and maturation studies as well as to oil/oil and oil/source rock correlations in which rapid screening of large numbers of samples is required.
Musmeci, Nicoló; Aste, Tomaso; Di Matteo, T
2015-01-01
We quantify the amount of information filtered by different hierarchical clustering methods on correlations between stock returns comparing the clustering structure with the underlying industrial activity classification. We apply, for the first time to financial data, a novel hierarchical clustering approach, the Directed Bubble Hierarchical Tree and we compare it with other methods including the Linkage and k-medoids. By taking the industrial sector classification of stocks as a benchmark partition, we evaluate how the different methods retrieve this classification. The results show that the Directed Bubble Hierarchical Tree can outperform other methods, being able to retrieve more information with fewer clusters. Moreover,we show that the economic information is hidden at different levels of the hierarchical structures depending on the clustering method. The dynamical analysis on a rolling window also reveals that the different methods show different degrees of sensitivity to events affecting financial markets, like crises. These results can be of interest for all the applications of clustering methods to portfolio optimization and risk hedging [corrected].
Musmeci, Nicoló; Aste, Tomaso; Di Matteo, T.
2015-01-01
We quantify the amount of information filtered by different hierarchical clustering methods on correlations between stock returns comparing the clustering structure with the underlying industrial activity classification. We apply, for the first time to financial data, a novel hierarchical clustering approach, the Directed Bubble Hierarchical Tree and we compare it with other methods including the Linkage and k-medoids. By taking the industrial sector classification of stocks as a benchmark partition, we evaluate how the different methods retrieve this classification. The results show that the Directed Bubble Hierarchical Tree can outperform other methods, being able to retrieve more information with fewer clusters. Moreover, we show that the economic information is hidden at different levels of the hierarchical structures depending on the clustering method. The dynamical analysis on a rolling window also reveals that the different methods show different degrees of sensitivity to events affecting financial markets, like crises. These results can be of interest for all the applications of clustering methods to portfolio optimization and risk hedging. PMID:25786703
The construction of support vector machine classifier using the firefly algorithm.
Chao, Chih-Feng; Horng, Ming-Huwi
2015-01-01
The setting of parameters in the support vector machines (SVMs) is very important with regard to its accuracy and efficiency. In this paper, we employ the firefly algorithm to train all parameters of the SVM simultaneously, including the penalty parameter, smoothness parameter, and Lagrangian multiplier. The proposed method is called the firefly-based SVM (firefly-SVM). This tool is not considered the feature selection, because the SVM, together with feature selection, is not suitable for the application in a multiclass classification, especially for the one-against-all multiclass SVM. In experiments, binary and multiclass classifications are explored. In the experiments on binary classification, ten of the benchmark data sets of the University of California, Irvine (UCI), machine learning repository are used; additionally the firefly-SVM is applied to the multiclass diagnosis of ultrasonic supraspinatus images. The classification performance of firefly-SVM is also compared to the original LIBSVM method associated with the grid search method and the particle swarm optimization based SVM (PSO-SVM). The experimental results advocate the use of firefly-SVM to classify pattern classifications for maximum accuracy.
The Construction of Support Vector Machine Classifier Using the Firefly Algorithm
Chao, Chih-Feng; Horng, Ming-Huwi
2015-01-01
The setting of parameters in the support vector machines (SVMs) is very important with regard to its accuracy and efficiency. In this paper, we employ the firefly algorithm to train all parameters of the SVM simultaneously, including the penalty parameter, smoothness parameter, and Lagrangian multiplier. The proposed method is called the firefly-based SVM (firefly-SVM). This tool is not considered the feature selection, because the SVM, together with feature selection, is not suitable for the application in a multiclass classification, especially for the one-against-all multiclass SVM. In experiments, binary and multiclass classifications are explored. In the experiments on binary classification, ten of the benchmark data sets of the University of California, Irvine (UCI), machine learning repository are used; additionally the firefly-SVM is applied to the multiclass diagnosis of ultrasonic supraspinatus images. The classification performance of firefly-SVM is also compared to the original LIBSVM method associated with the grid search method and the particle swarm optimization based SVM (PSO-SVM). The experimental results advocate the use of firefly-SVM to classify pattern classifications for maximum accuracy. PMID:25802511
Song, Wen; Ade, Carl; Broxterman, Ryan; Barstow, Thomas; Nelson, Thomas; Warren, Steve
2012-01-01
Accelerometer data provide useful information about subject activity in many different application scenarios. For this study, single-accelerometer data were acquired from subjects participating in field tests that mimic tasks that astronauts might encounter in reduced gravity environments. The primary goal of this effort was to apply classification algorithms that could identify these tasks based on features present in their corresponding accelerometer data, where the end goal is to establish methods to unobtrusively gauge subject well-being based on sensors that reside in their local environment. In this initial analysis, six different activities that involve leg movement are classified. The k-Nearest Neighbors (kNN) algorithm was found to be the most effective, with an overall classification success rate of 90.8%.
[Object-oriented aquatic vegetation extracting approach based on visible vegetation indices.
Jing, Ran; Deng, Lei; Zhao, Wen Ji; Gong, Zhao Ning
2016-05-01
Using the estimation of scale parameters (ESP) image segmentation tool to determine the ideal image segmentation scale, the optimal segmented image was created by the multi-scale segmentation method. Based on the visible vegetation indices derived from mini-UAV imaging data, we chose a set of optimal vegetation indices from a series of visible vegetation indices, and built up a decision tree rule. A membership function was used to automatically classify the study area and an aquatic vegetation map was generated. The results showed the overall accuracy of image classification using the supervised classification was 53.7%, and the overall accuracy of object-oriented image analysis (OBIA) was 91.7%. Compared with pixel-based supervised classification method, the OBIA method improved significantly the image classification result and further increased the accuracy of extracting the aquatic vegetation. The Kappa value of supervised classification was 0.4, and the Kappa value based OBIA was 0.9. The experimental results demonstrated that using visible vegetation indices derived from the mini-UAV data and OBIA method extracting the aquatic vegetation developed in this study was feasible and could be applied in other physically similar areas.
Multi-Pixel Simultaneous Classification of PolSAR Image Using Convolutional Neural Networks
Xu, Xin; Gui, Rong; Pu, Fangling
2018-01-01
Convolutional neural networks (CNN) have achieved great success in the optical image processing field. Because of the excellent performance of CNN, more and more methods based on CNN are applied to polarimetric synthetic aperture radar (PolSAR) image classification. Most CNN-based PolSAR image classification methods can only classify one pixel each time. Because all the pixels of a PolSAR image are classified independently, the inherent interrelation of different land covers is ignored. We use a fixed-feature-size CNN (FFS-CNN) to classify all pixels in a patch simultaneously. The proposed method has several advantages. First, FFS-CNN can classify all the pixels in a small patch simultaneously. When classifying a whole PolSAR image, it is faster than common CNNs. Second, FFS-CNN is trained to learn the interrelation of different land covers in a patch, so it can use the interrelation of land covers to improve the classification results. The experiments of FFS-CNN are evaluated on a Chinese Gaofen-3 PolSAR image and other two real PolSAR images. Experiment results show that FFS-CNN is comparable with the state-of-the-art PolSAR image classification methods. PMID:29510499
Multi-Pixel Simultaneous Classification of PolSAR Image Using Convolutional Neural Networks.
Wang, Lei; Xu, Xin; Dong, Hao; Gui, Rong; Pu, Fangling
2018-03-03
Convolutional neural networks (CNN) have achieved great success in the optical image processing field. Because of the excellent performance of CNN, more and more methods based on CNN are applied to polarimetric synthetic aperture radar (PolSAR) image classification. Most CNN-based PolSAR image classification methods can only classify one pixel each time. Because all the pixels of a PolSAR image are classified independently, the inherent interrelation of different land covers is ignored. We use a fixed-feature-size CNN (FFS-CNN) to classify all pixels in a patch simultaneously. The proposed method has several advantages. First, FFS-CNN can classify all the pixels in a small patch simultaneously. When classifying a whole PolSAR image, it is faster than common CNNs. Second, FFS-CNN is trained to learn the interrelation of different land covers in a patch, so it can use the interrelation of land covers to improve the classification results. The experiments of FFS-CNN are evaluated on a Chinese Gaofen-3 PolSAR image and other two real PolSAR images. Experiment results show that FFS-CNN is comparable with the state-of-the-art PolSAR image classification methods.
2013-01-01
Background Gene expression data could likely be a momentous help in the progress of proficient cancer diagnoses and classification platforms. Lately, many researchers analyze gene expression data using diverse computational intelligence methods, for selecting a small subset of informative genes from the data for cancer classification. Many computational methods face difficulties in selecting small subsets due to the small number of samples compared to the huge number of genes (high-dimension), irrelevant genes, and noisy genes. Methods We propose an enhanced binary particle swarm optimization to perform the selection of small subsets of informative genes which is significant for cancer classification. Particle speed, rule, and modified sigmoid function are introduced in this proposed method to increase the probability of the bits in a particle’s position to be zero. The method was empirically applied to a suite of ten well-known benchmark gene expression data sets. Results The performance of the proposed method proved to be superior to other previous related works, including the conventional version of binary particle swarm optimization (BPSO) in terms of classification accuracy and the number of selected genes. The proposed method also requires lower computational time compared to BPSO. PMID:23617960
Riba Ruiz, Jordi-Roger; Canals, Trini; Cantero, Rosa
2017-01-01
Ethylene propylene diene monomer (EPDM) rubber is widely used in a diverse type of applications, such as the automotive, industrial and construction sectors among others. Due to its appealing features, the consumption of vulcanized EPDM rubber is growing significantly. However, environmental issues are forcing the application of devulcanization processes to facilitate recovery, which has led rubber manufacturers to implement strict quality controls. Consequently, it is important to develop methods for supervising the vulcanizing and recovery processes of such products. This paper deals with the supervision process of EPDM compounds by means of Fourier transform mid-infrared (FT-IR) spectroscopy and suitable multivariate statistical methods. An expedited and nondestructive classification approach was applied to a sufficient number of EPDM samples with different applied processes, that is, with and without application of vulcanizing agents, vulcanized samples, and microwave treated samples. First the FT-IR spectra of the samples is acquired and next it is processed by applying suitable feature extraction methods, i.e., principal component analysis and canonical variate analysis to obtain the latent variables to be used for classifying test EPDM samples. Finally, the k nearest neighbor algorithm was used in the classification stage. Experimental results prove the accuracy of the proposed method and the potential of FT-IR spectroscopy in this area, since the classification accuracy can be as high as 100%.
Doukas, Charalampos; Goudas, Theodosis; Fischer, Simon; Mierswa, Ingo; Chatziioannou, Aristotle; Maglogiannis, Ilias
2010-01-01
This paper presents an open image-mining framework that provides access to tools and methods for the characterization of medical images. Several image processing and feature extraction operators have been implemented and exposed through Web Services. Rapid-Miner, an open source data mining system has been utilized for applying classification operators and creating the essential processing workflows. The proposed framework has been applied for the detection of salient objects in Obstructive Nephropathy microscopy images. Initial classification results are quite promising demonstrating the feasibility of automated characterization of kidney biopsy images.
Infinitely many symmetries and conservation laws for quad-graph equations via the Gardner method
NASA Astrophysics Data System (ADS)
Rasin, Alexander G.
2010-06-01
The application of the Gardner method for the generation of conservation laws to all the ABS equations is considered. It is shown that all the necessary information for the application of the Gardner method, namely Bäcklund transformations and initial conservation laws, follows from the multidimensional consistency of ABS equations. We also apply the Gardner method to an asymmetric equation which is not included in the ABS classification. An analog of the Gardner method for the generation of symmetries is developed and applied to the discrete Korteweg-de Vries equation. It can also be applied to all the other ABS equations.
NASA Astrophysics Data System (ADS)
Phinyomark, A.; Hu, H.; Phukpattaranont, P.; Limsakul, C.
2012-01-01
The classification of upper-limb movements based on surface electromyography (EMG) signals is an important issue in the control of assistive devices and rehabilitation systems. Increasing the number of EMG channels and features in order to increase the number of control commands can yield a high dimensional feature vector. To cope with the accuracy and computation problems associated with high dimensionality, it is commonplace to apply a processing step that transforms the data to a space of significantly lower dimensions with only a limited loss of useful information. Linear discriminant analysis (LDA) has been successfully applied as an EMG feature projection method. Recently, a number of extended LDA-based algorithms have been proposed, which are more competitive in terms of both classification accuracy and computational costs/times with classical LDA. This paper presents the findings of a comparative study of classical LDA and five extended LDA methods. From a quantitative comparison based on seven multi-feature sets, three extended LDA-based algorithms, consisting of uncorrelated LDA, orthogonal LDA and orthogonal fuzzy neighborhood discriminant analysis, produce better class separability when compared with a baseline system (without feature projection), principle component analysis (PCA), and classical LDA. Based on a 7-dimension time domain and time-scale feature vectors, these methods achieved respectively 95.2% and 93.2% classification accuracy by using a linear discriminant classifier.
Discrimination of different sub-basins on Tajo River based on water influence factor
NASA Astrophysics Data System (ADS)
Bermudez, R.; Gascó, J. M.; Tarquis, A. M.; Saa-Requejo, A.
2009-04-01
Numeric taxonomy has been applied to classify Tajo basin water (Spain) till Portugal border. Several stations, a total of 52, that estimate 15 water variables have been used in this study. The different groups have been obtained applying a Euclidean distance among stations (distance classification) and a Euclidean distance between each station and the centroid estimated among them (centroid classification), varying the number of parameters and with or without variable typification. In order to compare the classification a log-log relation has been established, between number of groups created and distances, to select the best one. It has been observed that centroid classification is more appropriate following in a more logic way the natural constrictions than the minimum distance among stations. Variable typification doesn't improve the classification except when the centroid method is applied. Taking in consideration the ions and the sum of them as variables, the classification improved. Stations are grouped based on electric conductivity (CE), total anions (TA), total cations (TC) and ions ratio (Na/Ca and Mg/Ca). For a given classification and comparing the different groups created a certain variation in ions concentration and ions ratio are observed. However, the variation in each ion among groups is different depending on the case. For the last group, regardless the classification, the increase in all ions is general. Comparing the dendrograms, and groups that originated, Tajo river basin can be sub dived in five sub-basins differentiated by the main influence on water: 1. With a higher ombrogenic influence (rain fed). 2. With ombrogenic and pedogenic influence (rain and groundwater fed). 3. With pedogenic influence. 4. With lithogenic influence (geological bedrock). 5. With a higher ombrogenic and lithogenic influence added.
Spectral-Spatial Classification of Hyperspectral Images Using Hierarchical Optimization
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Tilton, James C.
2011-01-01
A new spectral-spatial method for hyperspectral data classification is proposed. For a given hyperspectral image, probabilistic pixelwise classification is first applied. Then, hierarchical step-wise optimization algorithm is performed, by iteratively merging neighboring regions with the smallest Dissimilarity Criterion (DC) and recomputing class labels for new regions. The DC is computed by comparing region mean vectors, class labels and a number of pixels in the two regions under consideration. The algorithm is converged when all the pixels get involved in the region merging procedure. Experimental results are presented on two remote sensing hyperspectral images acquired by the AVIRIS and ROSIS sensors. The proposed approach improves classification accuracies and provides maps with more homogeneous regions, when compared to previously proposed classification techniques.
Fernandez-Lozano, C.; Canto, C.; Gestal, M.; Andrade-Garda, J. M.; Rabuñal, J. R.; Dorado, J.; Pazos, A.
2013-01-01
Given the background of the use of Neural Networks in problems of apple juice classification, this paper aim at implementing a newly developed method in the field of machine learning: the Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using SVM as a fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected. PMID:24453933
a Hyperspectral Image Classification Method Using Isomap and Rvm
NASA Astrophysics Data System (ADS)
Chang, H.; Wang, T.; Fang, H.; Su, Y.
2018-04-01
Classification is one of the most significant applications of hyperspectral image processing and even remote sensing. Though various algorithms have been proposed to implement and improve this application, there are still drawbacks in traditional classification methods. Thus further investigations on some aspects, such as dimension reduction, data mining, and rational use of spatial information, should be developed. In this paper, we used a widely utilized global manifold learning approach, isometric feature mapping (ISOMAP), to address the intrinsic nonlinearities of hyperspectral image for dimension reduction. Considering the impropriety of Euclidean distance in spectral measurement, we applied spectral angle (SA) for substitute when constructed the neighbourhood graph. Then, relevance vector machines (RVM) was introduced to implement classification instead of support vector machines (SVM) for simplicity, generalization and sparsity. Therefore, a probability result could be obtained rather than a less convincing binary result. Moreover, taking into account the spatial information of the hyperspectral image, we employ a spatial vector formed by different classes' ratios around the pixel. At last, we combined the probability results and spatial factors with a criterion to decide the final classification result. To verify the proposed method, we have implemented multiple experiments with standard hyperspectral images compared with some other methods. The results and different evaluation indexes illustrated the effectiveness of our method.
On the difficulty to delimit disease risk hot spots
NASA Astrophysics Data System (ADS)
Charras-Garrido, M.; Azizi, L.; Forbes, F.; Doyle, S.; Peyrard, N.; Abrial, D.
2013-06-01
Representing the health state of a region is a helpful tool to highlight spatial heterogeneity and localize high risk areas. For ease of interpretation and to determine where to apply control procedures, we need to clearly identify and delineate homogeneous regions in terms of disease risk, and in particular disease risk hot spots. However, even if practical purposes require the delineation of different risk classes, such a classification does not correspond to a reality and is thus difficult to estimate. Working with grouped data, a first natural choice is to apply disease mapping models. We apply a usual disease mapping model, producing continuous estimations of the risks that requires a post-processing classification step to obtain clearly delimited risk zones. We also apply a risk partition model that build a classification of the risk levels in a one step procedure. Working with point data, we will focus on the scan statistic clustering method. We illustrate our article with a real example concerning the bovin spongiform encephalopathy (BSE) an animal disease whose zones at risk are well known by the epidemiologists. We show that in this difficult case of a rare disease and a very heterogeneous population, the different methods provide risk zones that are globally coherent. But, related to the dichotomy between the need and the reality, the exact delimitation of the risk zones, as well as the corresponding estimated risks are quite different.
Lagrangian methods of cosmic web classification
NASA Astrophysics Data System (ADS)
Fisher, J. D.; Faltenbacher, A.; Johnson, M. S. T.
2016-05-01
The cosmic web defines the large-scale distribution of matter we see in the Universe today. Classifying the cosmic web into voids, sheets, filaments and nodes allows one to explore structure formation and the role environmental factors have on halo and galaxy properties. While existing studies of cosmic web classification concentrate on grid-based methods, this work explores a Lagrangian approach where the V-web algorithm proposed by Hoffman et al. is implemented with techniques borrowed from smoothed particle hydrodynamics. The Lagrangian approach allows one to classify individual objects (e.g. particles or haloes) based on properties of their nearest neighbours in an adaptive manner. It can be applied directly to a halo sample which dramatically reduces computational cost and potentially allows an application of this classification scheme to observed galaxy samples. Finally, the Lagrangian nature admits a straightforward inclusion of the Hubble flow negating the necessity of a visually defined threshold value which is commonly employed by grid-based classification methods.
Pathological brain detection based on wavelet entropy and Hu moment invariants.
Zhang, Yudong; Wang, Shuihua; Sun, Ping; Phillips, Preetha
2015-01-01
With the aim of developing an accurate pathological brain detection system, we proposed a novel automatic computer-aided diagnosis (CAD) to detect pathological brains from normal brains obtained by magnetic resonance imaging (MRI) scanning. The problem still remained a challenge for technicians and clinicians, since MR imaging generated an exceptionally large information dataset. A new two-step approach was proposed in this study. We used wavelet entropy (WE) and Hu moment invariants (HMI) for feature extraction, and the generalized eigenvalue proximal support vector machine (GEPSVM) for classification. To further enhance classification accuracy, the popular radial basis function (RBF) kernel was employed. The 10 runs of k-fold stratified cross validation result showed that the proposed "WE + HMI + GEPSVM + RBF" method was superior to existing methods w.r.t. classification accuracy. It obtained the average classification accuracies of 100%, 100%, and 99.45% over Dataset-66, Dataset-160, and Dataset-255, respectively. The proposed method is effective and can be applied to realistic use.
Li, Ke; Liu, Yi; Wang, Quanxin; Wu, Yalei; Song, Shimin; Sun, Yi; Liu, Tengchong; Wang, Jun; Li, Yang; Du, Shaoyi
2015-01-01
This paper proposes a novel multi-label classification method for resolving the spacecraft electrical characteristics problems which involve many unlabeled test data processing, high-dimensional features, long computing time and identification of slow rate. Firstly, both the fuzzy c-means (FCM) offline clustering and the principal component feature extraction algorithms are applied for the feature selection process. Secondly, the approximate weighted proximal support vector machine (WPSVM) online classification algorithms is used to reduce the feature dimension and further improve the rate of recognition for electrical characteristics spacecraft. Finally, the data capture contribution method by using thresholds is proposed to guarantee the validity and consistency of the data selection. The experimental results indicate that the method proposed can obtain better data features of the spacecraft electrical characteristics, improve the accuracy of identification and shorten the computing time effectively. PMID:26544549
Brain Tumor Segmentation Using Deep Belief Networks and Pathological Knowledge.
Zhan, Tianming; Chen, Yi; Hong, Xunning; Lu, Zhenyu; Chen, Yunjie
2017-01-01
In this paper, we propose an automatic brain tumor segmentation method based on Deep Belief Networks (DBNs) and pathological knowledge. The proposed method is targeted against gliomas (both low and high grade) obtained in multi-sequence magnetic resonance images (MRIs). Firstly, a novel deep architecture is proposed to combine the multi-sequences intensities feature extraction with classification to get the classification probabilities of each voxel. Then, graph cut based optimization is executed on the classification probabilities to strengthen the spatial relationships of voxels. At last, pathological knowledge of gliomas is applied to remove some false positives. Our method was validated in the Brain Tumor Segmentation Challenge 2012 and 2013 databases (BRATS 2012, 2013). The performance of segmentation results demonstrates our proposal providing a competitive solution with stateof- the-art methods. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Incremental Transductive Learning Approaches to Schistosomiasis Vector Classification
NASA Astrophysics Data System (ADS)
Fusco, Terence; Bi, Yaxin; Wang, Haiying; Browne, Fiona
2016-08-01
The key issues pertaining to collection of epidemic disease data for our analysis purposes are that it is a labour intensive, time consuming and expensive process resulting in availability of sparse sample data which we use to develop prediction models. To address this sparse data issue, we present the novel Incremental Transductive methods to circumvent the data collection process by applying previously acquired data to provide consistent, confidence-based labelling alternatives to field survey research. We investigated various reasoning approaches for semi-supervised machine learning including Bayesian models for labelling data. The results show that using the proposed methods, we can label instances of data with a class of vector density at a high level of confidence. By applying the Liberal and Strict Training Approaches, we provide a labelling and classification alternative to standalone algorithms. The methods in this paper are components in the process of reducing the proliferation of the Schistosomiasis disease and its effects.
NASA Technical Reports Server (NTRS)
Tarabalka, Y.; Tilton, J. C.; Benediktsson, J. A.; Chanussot, J.
2012-01-01
The Hierarchical SEGmentation (HSEG) algorithm, which combines region object finding with region object clustering, has given good performances for multi- and hyperspectral image analysis. This technique produces at its output a hierarchical set of image segmentations. The automated selection of a single segmentation level is often necessary. We propose and investigate the use of automatically selected markers for this purpose. In this paper, a novel Marker-based HSEG (M-HSEG) method for spectral-spatial classification of hyperspectral images is proposed. Two classification-based approaches for automatic marker selection are adapted and compared for this purpose. Then, a novel constrained marker-based HSEG algorithm is applied, resulting in a spectral-spatial classification map. Three different implementations of the M-HSEG method are proposed and their performances in terms of classification accuracies are compared. The experimental results, presented for three hyperspectral airborne images, demonstrate that the proposed approach yields accurate segmentation and classification maps, and thus is attractive for remote sensing image analysis.
Wang, Yi; Ma, Xiang; Wen, Ya-Dong; Yu, Chun-Xia; Wang, Luo-Ping; Zhao, Long-Lian; Li, Jun-Hui
2012-10-01
In this study, tobacco quality analysis of industrial classification of different producing area was carried out applying spectrum projection and correlation methods. The group of industrial classification data was near-infrared (NIR) spectrum in 2010 year from different tobacco plant parts and colors of Hongta Tobacco (Group) Co., Ltd. 6 064 tobacco leaf samples of 17 classes from Yuxi, Chuxiong and Zhaotong, in Yunnan province and 6 industrial classifications were collected using near infrared spectroscopy, which from different parts and colors and all belong to tobacco varieties of K326. The conclusion showed that, the probability of the grading belonging by the first dimension was 84%, the probability of the producing area belonging by the second dimension was 71%. The study can explain the difference of tobacco quality of industrial classification and producing area by a projection method to get the quantitative similarity values. The quantitative similarity values were instructive in combination of tobacco leaf blending.
Automotive System for Remote Surface Classification.
Bystrov, Aleksandr; Hoare, Edward; Tran, Thuy-Yung; Clarke, Nigel; Gashinova, Marina; Cherniakov, Mikhail
2017-04-01
In this paper we shall discuss a novel approach to road surface recognition, based on the analysis of backscattered microwave and ultrasonic signals. The novelty of our method is sonar and polarimetric radar data fusion, extraction of features for separate swathes of illuminated surface (segmentation), and using of multi-stage artificial neural network for surface classification. The developed system consists of 24 GHz radar and 40 kHz ultrasonic sensor. The features are extracted from backscattered signals and then the procedures of principal component analysis and supervised classification are applied to feature data. The special attention is paid to multi-stage artificial neural network which allows an overall increase in classification accuracy. The proposed technique was tested for recognition of a large number of real surfaces in different weather conditions with the average accuracy of correct classification of 95%. The obtained results thereby demonstrate that the use of proposed system architecture and statistical methods allow for reliable discrimination of various road surfaces in real conditions.
Automotive System for Remote Surface Classification
Bystrov, Aleksandr; Hoare, Edward; Tran, Thuy-Yung; Clarke, Nigel; Gashinova, Marina; Cherniakov, Mikhail
2017-01-01
In this paper we shall discuss a novel approach to road surface recognition, based on the analysis of backscattered microwave and ultrasonic signals. The novelty of our method is sonar and polarimetric radar data fusion, extraction of features for separate swathes of illuminated surface (segmentation), and using of multi-stage artificial neural network for surface classification. The developed system consists of 24 GHz radar and 40 kHz ultrasonic sensor. The features are extracted from backscattered signals and then the procedures of principal component analysis and supervised classification are applied to feature data. The special attention is paid to multi-stage artificial neural network which allows an overall increase in classification accuracy. The proposed technique was tested for recognition of a large number of real surfaces in different weather conditions with the average accuracy of correct classification of 95%. The obtained results thereby demonstrate that the use of proposed system architecture and statistical methods allow for reliable discrimination of various road surfaces in real conditions. PMID:28368297
NASA Astrophysics Data System (ADS)
Bates, Matthew E.; Keisler, Jeffrey M.; Zussblatt, Niels P.; Plourde, Kenton J.; Wender, Ben A.; Linkov, Igor
2016-02-01
Risk research for nanomaterials is currently prioritized by means of expert workshops and other deliberative processes. However, analytical techniques that quantify and compare alternative research investments are increasingly recommended. Here, we apply value of information and portfolio decision analysis—methods commonly applied in financial and operations management—to prioritize risk research for multiwalled carbon nanotubes and nanoparticulate silver and titanium dioxide. We modify the widely accepted CB Nanotool hazard evaluation framework, which combines nano- and bulk-material properties into a hazard score, to operate probabilistically with uncertain inputs. Literature is reviewed to develop uncertain estimates for each input parameter, and a Monte Carlo simulation is applied to assess how different research strategies can improve hazard classification. The relative cost of each research experiment is elicited from experts, which enables identification of efficient research portfolios—combinations of experiments that lead to the greatest improvement in hazard classification at the lowest cost. Nanoparticle shape, diameter, solubility and surface reactivity were most frequently identified within efficient portfolios in our results.
Bates, Matthew E; Keisler, Jeffrey M; Zussblatt, Niels P; Plourde, Kenton J; Wender, Ben A; Linkov, Igor
2016-02-01
Risk research for nanomaterials is currently prioritized by means of expert workshops and other deliberative processes. However, analytical techniques that quantify and compare alternative research investments are increasingly recommended. Here, we apply value of information and portfolio decision analysis-methods commonly applied in financial and operations management-to prioritize risk research for multiwalled carbon nanotubes and nanoparticulate silver and titanium dioxide. We modify the widely accepted CB Nanotool hazard evaluation framework, which combines nano- and bulk-material properties into a hazard score, to operate probabilistically with uncertain inputs. Literature is reviewed to develop uncertain estimates for each input parameter, and a Monte Carlo simulation is applied to assess how different research strategies can improve hazard classification. The relative cost of each research experiment is elicited from experts, which enables identification of efficient research portfolios-combinations of experiments that lead to the greatest improvement in hazard classification at the lowest cost. Nanoparticle shape, diameter, solubility and surface reactivity were most frequently identified within efficient portfolios in our results.
Lu, Shen; Xia, Yong; Cai, Tom Weidong; Feng, David Dagan
2015-01-01
Dementia, Alzheimer's disease (AD) in particular is a global problem and big threat to the aging population. An image based computer-aided dementia diagnosis method is needed to providing doctors help during medical image examination. Many machine learning based dementia classification methods using medical imaging have been proposed and most of them achieve accurate results. However, most of these methods make use of supervised learning requiring fully labeled image dataset, which usually is not practical in real clinical environment. Using large amount of unlabeled images can improve the dementia classification performance. In this study we propose a new semi-supervised dementia classification method based on random manifold learning with affinity regularization. Three groups of spatial features are extracted from positron emission tomography (PET) images to construct an unsupervised random forest which is then used to regularize the manifold learning objective function. The proposed method, stat-of-the-art Laplacian support vector machine (LapSVM) and supervised SVM are applied to classify AD and normal controls (NC). The experiment results show that learning with unlabeled images indeed improves the classification performance. And our method outperforms LapSVM on the same dataset.
Genetic Algorithms and Classification Trees in Feature Discovery: Diabetes and the NHANES database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heredia-Langner, Alejandro; Jarman, Kristin H.; Amidan, Brett G.
2013-09-01
This paper presents a feature selection methodology that can be applied to datasets containing a mixture of continuous and categorical variables. Using a Genetic Algorithm (GA), this method explores a dataset and selects a small set of features relevant for the prediction of a binary (1/0) response. Binary classification trees and an objective function based on conditional probabilities are used to measure the fitness of a given subset of features. The method is applied to health data in order to find factors useful for the prediction of diabetes. Results show that our algorithm is capable of narrowing down the setmore » of predictors to around 8 factors that can be validated using reputable medical and public health resources.« less
Park, Jong-Uk; Lee, Hyo-Ki; Lee, Junghun; Urtnasan, Erdenebayar; Kim, Hojoong; Lee, Kyoung-Joung
2015-09-01
This study proposes a method of automatically classifying sleep apnea/hypopnea events based on sleep states and the severity of sleep-disordered breathing (SDB) using photoplethysmogram (PPG) and oxygen saturation (SpO2) signals acquired from a pulse oximeter. The PPG was used to classify sleep state, while the severity of SDB was estimated by detecting events of SpO2 oxygen desaturation. Furthermore, we classified sleep apnea/hypopnea events by applying different categorisations according to the severity of SDB based on a support vector machine. The classification results showed sensitivity performances and positivity predictive values of 74.2% and 87.5% for apnea, 87.5% and 63.4% for hypopnea, and 92.4% and 92.8% for apnea + hypopnea, respectively. These results represent better or comparable outcomes compared to those of previous studies. In addition, our classification method reliably detected sleep apnea/hypopnea events in all patient groups without bias in particular patient groups when our algorithm was applied to a variety of patient groups. Therefore, this method has the potential to diagnose SDB more reliably and conveniently using a pulse oximeter.
Mudali, D; Teune, L K; Renken, R J; Leenders, K L; Roerdink, J B T M
2015-01-01
Medical imaging techniques like fluorodeoxyglucose positron emission tomography (FDG-PET) have been used to aid in the differential diagnosis of neurodegenerative brain diseases. In this study, the objective is to classify FDG-PET brain scans of subjects with Parkinsonian syndromes (Parkinson's disease, multiple system atrophy, and progressive supranuclear palsy) compared to healthy controls. The scaled subprofile model/principal component analysis (SSM/PCA) method was applied to FDG-PET brain image data to obtain covariance patterns and corresponding subject scores. The latter were used as features for supervised classification by the C4.5 decision tree method. Leave-one-out cross validation was applied to determine classifier performance. We carried out a comparison with other types of classifiers. The big advantage of decision tree classification is that the results are easy to understand by humans. A visual representation of decision trees strongly supports the interpretation process, which is very important in the context of medical diagnosis. Further improvements are suggested based on enlarging the number of the training data, enhancing the decision tree method by bagging, and adding additional features based on (f)MRI data.
A Pruning Neural Network Model in Credit Classification Analysis
Tang, Yajiao; Ji, Junkai; Dai, Hongwei; Yu, Yang; Todo, Yuki
2018-01-01
Nowadays, credit classification models are widely applied because they can help financial decision-makers to handle credit classification issues. Among them, artificial neural networks (ANNs) have been widely accepted as the convincing methods in the credit industry. In this paper, we propose a pruning neural network (PNN) and apply it to solve credit classification problem by adopting the well-known Australian and Japanese credit datasets. The model is inspired by synaptic nonlinearity of a dendritic tree in a biological neural model. And it is trained by an error back-propagation algorithm. The model is capable of realizing a neuronal pruning function by removing the superfluous synapses and useless dendrites and forms a tidy dendritic morphology at the end of learning. Furthermore, we utilize logic circuits (LCs) to simulate the dendritic structures successfully which makes PNN be implemented on the hardware effectively. The statistical results of our experiments have verified that PNN obtains superior performance in comparison with other classical algorithms in terms of accuracy and computational efficiency. PMID:29606961
NASA Astrophysics Data System (ADS)
Sanhouse-García, Antonio J.; Rangel-Peraza, Jesús Gabriel; Bustos-Terrones, Yaneth; García-Ferrer, Alfonso; Mesas-Carrascosa, Francisco J.
2016-02-01
Land cover classification is often based on different characteristics between their classes, but with great homogeneity within each one of them. This cover is obtained through field work or by mean of processing satellite images. Field work involves high costs; therefore, digital image processing techniques have become an important alternative to perform this task. However, in some developing countries and particularly in Casacoima municipality in Venezuela, there is a lack of geographic information systems due to the lack of updated information and high costs in software license acquisition. This research proposes a low cost methodology to develop thematic mapping of local land use and types of coverage in areas with scarce resources. Thematic mapping was developed from CBERS-2 images and spatial information available on the network using open source tools. The supervised classification method per pixel and per region was applied using different classification algorithms and comparing them among themselves. Classification method per pixel was based on Maxver algorithms (maximum likelihood) and Euclidean distance (minimum distance), while per region classification was based on the Bhattacharya algorithm. Satisfactory results were obtained from per region classification, where overall reliability of 83.93% and kappa index of 0.81% were observed. Maxver algorithm showed a reliability value of 73.36% and kappa index 0.69%, while Euclidean distance obtained values of 67.17% and 0.61% for reliability and kappa index, respectively. It was demonstrated that the proposed methodology was very useful in cartographic processing and updating, which in turn serve as a support to develop management plans and land management. Hence, open source tools showed to be an economically viable alternative not only for forestry organizations, but for the general public, allowing them to develop projects in economically depressed and/or environmentally threatened areas.
Nawaz, Tabassam; Mehmood, Zahid; Rashid, Muhammad; Habib, Hafiz Adnan
2018-01-01
Recent research on speech segregation and music fingerprinting has led to improvements in speech segregation and music identification algorithms. Speech and music segregation generally involves the identification of music followed by speech segregation. However, music segregation becomes a challenging task in the presence of noise. This paper proposes a novel method of speech segregation for unlabelled stationary noisy audio signals using the deep belief network (DBN) model. The proposed method successfully segregates a music signal from noisy audio streams. A recurrent neural network (RNN)-based hidden layer segregation model is applied to remove stationary noise. Dictionary-based fisher algorithms are employed for speech classification. The proposed method is tested on three datasets (TIMIT, MIR-1K, and MusicBrainz), and the results indicate the robustness of proposed method for speech segregation. The qualitative and quantitative analysis carried out on three datasets demonstrate the efficiency of the proposed method compared to the state-of-the-art speech segregation and classification-based methods. PMID:29558485
Application of Classification Methods for Forecasting Mid-Term Power Load Patterns
NASA Astrophysics Data System (ADS)
Piao, Minghao; Lee, Heon Gyu; Park, Jin Hyoung; Ryu, Keun Ho
Currently an automated methodology based on data mining techniques is presented for the prediction of customer load patterns in long duration load profiles. The proposed approach in this paper consists of three stages: (i) data preprocessing: noise or outlier is removed and the continuous attribute-valued features are transformed to discrete values, (ii) cluster analysis: k-means clustering is used to create load pattern classes and the representative load profiles for each class and (iii) classification: we evaluated several supervised learning methods in order to select a suitable prediction method. According to the proposed methodology, power load measured from AMR (automatic meter reading) system, as well as customer indexes, were used as inputs for clustering. The output of clustering was the classification of representative load profiles (or classes). In order to evaluate the result of forecasting load patterns, the several classification methods were applied on a set of high voltage customers of the Korea power system and derived class labels from clustering and other features are used as input to produce classifiers. Lastly, the result of our experiments was presented.
McRoy, Susan; Jones, Sean; Kurmally, Adam
2016-09-01
This article examines methods for automated question classification applied to cancer-related questions that people have asked on the web. This work is part of a broader effort to provide automated question answering for health education. We created a new corpus of consumer-health questions related to cancer and a new taxonomy for those questions. We then compared the effectiveness of different statistical methods for developing classifiers, including weighted classification and resampling. Basic methods for building classifiers were limited by the high variability in the natural distribution of questions and typical refinement approaches of feature selection and merging categories achieved only small improvements to classifier accuracy. Best performance was achieved using weighted classification and resampling methods, the latter yielding an accuracy of F1 = 0.963. Thus, it would appear that statistical classifiers can be trained on natural data, but only if natural distributions of classes are smoothed. Such classifiers would be useful for automated question answering, for enriching web-based content, or assisting clinical professionals to answer questions. © The Author(s) 2015.
A Fault Alarm and Diagnosis Method Based on Sensitive Parameters and Support Vector Machine
NASA Astrophysics Data System (ADS)
Zhang, Jinjie; Yao, Ziyun; Lv, Zhiquan; Zhu, Qunxiong; Xu, Fengtian; Jiang, Zhinong
2015-08-01
Study on the extraction of fault feature and the diagnostic technique of reciprocating compressor is one of the hot research topics in the field of reciprocating machinery fault diagnosis at present. A large number of feature extraction and classification methods have been widely applied in the related research, but the practical fault alarm and the accuracy of diagnosis have not been effectively improved. Developing feature extraction and classification methods to meet the requirements of typical fault alarm and automatic diagnosis in practical engineering is urgent task. The typical mechanical faults of reciprocating compressor are presented in the paper, and the existing data of online monitoring system is used to extract fault feature parameters within 15 types in total; the inner sensitive connection between faults and the feature parameters has been made clear by using the distance evaluation technique, also sensitive characteristic parameters of different faults have been obtained. On this basis, a method based on fault feature parameters and support vector machine (SVM) is developed, which will be applied to practical fault diagnosis. A better ability of early fault warning has been proved by the experiment and the practical fault cases. Automatic classification by using the SVM to the data of fault alarm has obtained better diagnostic accuracy.
NASA Astrophysics Data System (ADS)
Cui, Binge; Ma, Xiudan; Xie, Xiaoyun; Ren, Guangbo; Ma, Yi
2017-03-01
The classification of hyperspectral images with a few labeled samples is a major challenge which is difficult to meet unless some spatial characteristics can be exploited. In this study, we proposed a novel spectral-spatial hyperspectral image classification method that exploited spatial autocorrelation of hyperspectral images. First, image segmentation is performed on the hyperspectral image to assign each pixel to a homogeneous region. Second, the visible and infrared bands of hyperspectral image are partitioned into multiple subsets of adjacent bands, and each subset is merged into one band. Recursive edge-preserving filtering is performed on each merged band which utilizes the spectral information of neighborhood pixels. Third, the resulting spectral and spatial feature band set is classified using the SVM classifier. Finally, bilateral filtering is performed to remove "salt-and-pepper" noise in the classification result. To preserve the spatial structure of hyperspectral image, edge-preserving filtering is applied independently before and after the classification process. Experimental results on different hyperspectral images prove that the proposed spectral-spatial classification approach is robust and offers more classification accuracy than state-of-the-art methods when the number of labeled samples is small.
Modeling ready biodegradability of fragrance materials.
Ceriani, Lidia; Papa, Ester; Kovarich, Simona; Boethling, Robert; Gramatica, Paola
2015-06-01
In the present study, quantitative structure activity relationships were developed for predicting ready biodegradability of approximately 200 heterogeneous fragrance materials. Two classification methods, classification and regression tree (CART) and k-nearest neighbors (kNN), were applied to perform the modeling. The models were validated with multiple external prediction sets, and the structural applicability domain was verified by the leverage approach. The best models had good sensitivity (internal ≥80%; external ≥68%), specificity (internal ≥80%; external 73%), and overall accuracy (≥75%). Results from the comparison with BIOWIN global models, based on group contribution method, show that specific models developed in the present study perform better in prediction than BIOWIN6, in particular for the correct classification of not readily biodegradable fragrance materials. © 2015 SETAC.
A thyroid nodule classification method based on TI-RADS
NASA Astrophysics Data System (ADS)
Wang, Hao; Yang, Yang; Peng, Bo; Chen, Qin
2017-07-01
Thyroid Imaging Reporting and Data System(TI-RADS) is a valuable tool for differentiating the benign and the malignant thyroid nodules. In clinic, doctors can determine the extent of being benign or malignant in terms of different classes by using TI-RADS. Classification represents the degree of malignancy of thyroid nodules. TI-RADS as a classification standard can be used to guide the ultrasonic doctor to examine thyroid nodules more accurately and reliably. In this paper, we aim to classify the thyroid nodules with the help of TI-RADS. To this end, four ultrasound signs, i.e., cystic and solid, echo pattern, boundary feature and calcification of thyroid nodules are extracted and converted into feature vectors. Then semi-supervised fuzzy C-means ensemble (SS-FCME) model is applied to obtain the classification results. The experimental results demonstrate that the proposed method can help doctors diagnose the thyroid nodules effectively.
Machine learning in soil classification.
Bhattacharya, B; Solomatine, D P
2006-03-01
In a number of engineering problems, e.g. in geotechnics, petroleum engineering, etc. intervals of measured series data (signals) are to be attributed a class maintaining the constraint of contiguity and standard classification methods could be inadequate. Classification in this case needs involvement of an expert who observes the magnitude and trends of the signals in addition to any a priori information that might be available. In this paper, an approach for automating this classification procedure is presented. Firstly, a segmentation algorithm is developed and applied to segment the measured signals. Secondly, the salient features of these segments are extracted using boundary energy method. Based on the measured data and extracted features to assign classes to the segments classifiers are built; they employ Decision Trees, ANN and Support Vector Machines. The methodology was tested in classifying sub-surface soil using measured data from Cone Penetration Testing and satisfactory results were obtained.
NASA Astrophysics Data System (ADS)
Mohammadimanesh, F.; Salehi, B.; Mahdianpari, M.; Homayouni, S.
2016-06-01
Polarimetric Synthetic Aperture Radar (PolSAR) imagery is a complex multi-dimensional dataset, which is an important source of information for various natural resources and environmental classification and monitoring applications. PolSAR imagery produces valuable information by observing scattering mechanisms from different natural and man-made objects. Land cover mapping using PolSAR data classification is one of the most important applications of SAR remote sensing earth observations, which have gained increasing attention in the recent years. However, one of the most challenging aspects of classification is selecting features with maximum discrimination capability. To address this challenge, a statistical approach based on the Fisher Linear Discriminant Analysis (FLDA) and the incorporation of physical interpretation of PolSAR data into classification is proposed in this paper. After pre-processing of PolSAR data, including the speckle reduction, the H/α classification is used in order to classify the basic scattering mechanisms. Then, a new method for feature weighting, based on the fusion of FLDA and physical interpretation, is implemented. This method proves to increase the classification accuracy as well as increasing between-class discrimination in the final Wishart classification. The proposed method was applied to a full polarimetric C-band RADARSAT-2 data set from Avalon area, Newfoundland and Labrador, Canada. This imagery has been acquired in June 2015, and covers various types of wetlands including bogs, fens, marshes and shallow water. The results were compared with the standard Wishart classification, and an improvement of about 20% was achieved in the overall accuracy. This method provides an opportunity for operational wetland classification in northern latitude with high accuracy using only SAR polarimetric data.
NASA Astrophysics Data System (ADS)
Wu, Shulian; Peng, Yuanyuan; Hu, Liangjun; Zhang, Xiaoman; Li, Hui
2016-01-01
Second harmonic generation microscopy (SHGM) was used to monitor the process of chronological aging skin in vivo. The collagen structures of mice model with different ages were obtained using SHGM. Then, texture feature with contrast, correlation and entropy were extracted and analysed using the grey level co-occurrence matrix. At last, the neural network tool of Matlab was applied to train the texture of collagen in different statues during the aging process. And the simulation of mice collagen texture was carried out. The results indicated that the classification accuracy reach 85%. Results demonstrated that the proposed approach effectively detected the target object in the collagen texture image during the chronological aging process and the analysis tool based on neural network applied the skin of classification and feature extraction method is feasible.
Object Detection and Classification by Decision-Level Fusion for Intelligent Vehicle Systems.
Oh, Sang-Il; Kang, Hang-Bong
2017-01-22
To understand driving environments effectively, it is important to achieve accurate detection and classification of objects detected by sensor-based intelligent vehicle systems, which are significantly important tasks. Object detection is performed for the localization of objects, whereas object classification recognizes object classes from detected object regions. For accurate object detection and classification, fusing multiple sensor information into a key component of the representation and perception processes is necessary. In this paper, we propose a new object-detection and classification method using decision-level fusion. We fuse the classification outputs from independent unary classifiers, such as 3D point clouds and image data using a convolutional neural network (CNN). The unary classifiers for the two sensors are the CNN with five layers, which use more than two pre-trained convolutional layers to consider local to global features as data representation. To represent data using convolutional layers, we apply region of interest (ROI) pooling to the outputs of each layer on the object candidate regions generated using object proposal generation to realize color flattening and semantic grouping for charge-coupled device and Light Detection And Ranging (LiDAR) sensors. We evaluate our proposed method on a KITTI benchmark dataset to detect and classify three object classes: cars, pedestrians and cyclists. The evaluation results show that the proposed method achieves better performance than the previous methods. Our proposed method extracted approximately 500 proposals on a 1226 × 370 image, whereas the original selective search method extracted approximately 10 6 × n proposals. We obtained classification performance with 77.72% mean average precision over the entirety of the classes in the moderate detection level of the KITTI benchmark dataset.
Object Detection and Classification by Decision-Level Fusion for Intelligent Vehicle Systems
Oh, Sang-Il; Kang, Hang-Bong
2017-01-01
To understand driving environments effectively, it is important to achieve accurate detection and classification of objects detected by sensor-based intelligent vehicle systems, which are significantly important tasks. Object detection is performed for the localization of objects, whereas object classification recognizes object classes from detected object regions. For accurate object detection and classification, fusing multiple sensor information into a key component of the representation and perception processes is necessary. In this paper, we propose a new object-detection and classification method using decision-level fusion. We fuse the classification outputs from independent unary classifiers, such as 3D point clouds and image data using a convolutional neural network (CNN). The unary classifiers for the two sensors are the CNN with five layers, which use more than two pre-trained convolutional layers to consider local to global features as data representation. To represent data using convolutional layers, we apply region of interest (ROI) pooling to the outputs of each layer on the object candidate regions generated using object proposal generation to realize color flattening and semantic grouping for charge-coupled device and Light Detection And Ranging (LiDAR) sensors. We evaluate our proposed method on a KITTI benchmark dataset to detect and classify three object classes: cars, pedestrians and cyclists. The evaluation results show that the proposed method achieves better performance than the previous methods. Our proposed method extracted approximately 500 proposals on a 1226×370 image, whereas the original selective search method extracted approximately 106×n proposals. We obtained classification performance with 77.72% mean average precision over the entirety of the classes in the moderate detection level of the KITTI benchmark dataset. PMID:28117742
Radar modulation classification using time-frequency representation and nonlinear regression
NASA Astrophysics Data System (ADS)
De Luigi, Christophe; Arques, Pierre-Yves; Lopez, Jean-Marc; Moreau, Eric
1999-09-01
In naval electronic environment, pulses emitted by radars are collected by ESM receivers. For most of them the intrapulse signal is modulated by a particular law. To help the classical identification process, a classification and estimation of this modulation law is applied on the intrapulse signal measurements. To estimate with a good accuracy the time-varying frequency of a signal corrupted by an additive noise, one method has been chosen. This method consists on the Wigner distribution calculation, the instantaneous frequency is then estimated by the peak location of the distribution. Bias and variance of the estimator are performed by computed simulations. In a estimated sequence of frequencies, we assume the presence of false and good estimated ones, the hypothesis of Gaussian distribution is made on the errors. A robust non linear regression method, based on the Levenberg-Marquardt algorithm, is thus applied on these estimated frequencies using a Maximum Likelihood Estimator. The performances of the method are tested by using varied modulation laws and different signal to noise ratios.
A methodology for space-time classification of groundwater quality.
Passarella, G; Caputo, M C
2006-04-01
Safeguarding groundwater from civil, agricultural and industrial contamination is matter of great interest in water resource management. During recent years, much legislation has been produced stating the importance of groundwater as a source for drinking water supplies, underlining its vulnerability and defining the required quality standards. Thus, schematic tools, able to characterise the quality and quantity of groundwater systems, are of very great interest in any territorial planning and/or water resource management activity. This paper proposes a groundwater quality classification method which has been applied to a real aquifer, starting from several studies published by the Italian National Hydrogeologic Catastrophe Defence Group (GNDCI). The methodology is based on the concentration values of several parameters used as indexes of the natural hydro-chemical water condition and of potential man-induced modifications of groundwater quality. The resulting maps, although representative of the quality, do not include any information on its evolution in time. In this paper, this "stationary" classification method has been improved by crossing the quality classes with three indexes of temporal behaviour during recent years. It was then applied to data from monitoring campaigns, performed in spring and autumn, from 1990 to 1996, in the plain of Modena aquifer (central Italy). The results are reported in the form of space-time classification table and maps.
ASTM clustering for improving coal analysis by near-infrared spectroscopy.
Andrés, J M; Bona, M T
2006-11-15
Multivariate analysis techniques have been applied to near-infrared (NIR) spectra coals to investigate the relationship between nine coal properties (moisture (%), ash (%), volatile matter (%), fixed carbon (%), heating value (kcal/kg), carbon (%), hydrogen (%), nitrogen (%) and sulphur (%)) and the corresponding predictor variables. In this work, a whole set of coal samples was grouped into six more homogeneous clusters following the ASTM reference method for classification prior to the application of calibration methods to each coal set. The results obtained showed a considerable improvement of the error determination compared with the calibration for the whole sample set. For some groups, the established calibrations approached the quality required by the ASTM/ISO norms for laboratory analysis. To predict property values for a new coal sample it is necessary the assignation of that sample to its respective group. Thus, the discrimination and classification ability of coal samples by Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS) in the NIR range was also studied by applying Soft Independent Modelling of Class Analogy (SIMCA) and Linear Discriminant Analysis (LDA) techniques. Modelling of the groups by SIMCA led to overlapping models that cannot discriminate for unique classification. On the other hand, the application of Linear Discriminant Analysis improved the classification of the samples but not enough to be satisfactory for every group considered.
Natural Language Processing Based Instrument for Classification of Free Text Medical Records
2016-01-01
According to the Ministry of Labor, Health and Social Affairs of Georgia a new health management system has to be introduced in the nearest future. In this context arises the problem of structuring and classifying documents containing all the history of medical services provided. The present work introduces the instrument for classification of medical records based on the Georgian language. It is the first attempt of such classification of the Georgian language based medical records. On the whole 24.855 examination records have been studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results obtained demonstrated that both machine learning methods performed successfully, with a little supremacy of SVM. In the process of classification a “shrink” method, based on features selection, was introduced and applied. At the first stage of classification the results of the “shrink” case were better; however, on the second stage of classification into subclasses 23% of all documents could not be linked to only one definite individual subclass (liver or binary system) due to common features characterizing these subclasses. The overall results of the study were successful. PMID:27668260
Cloud classification from satellite data using a fuzzy sets algorithm: A polar example
NASA Technical Reports Server (NTRS)
Key, J. R.; Maslanik, J. A.; Barry, R. G.
1988-01-01
Where spatial boundaries between phenomena are diffuse, classification methods which construct mutually exclusive clusters seem inappropriate. The Fuzzy c-means (FCM) algorithm assigns each observation to all clusters, with membership values as a function of distance to the cluster center. The FCM algorithm is applied to AVHRR data for the purpose of classifying polar clouds and surfaces. Careful analysis of the fuzzy sets can provide information on which spectral channels are best suited to the classification of particular features, and can help determine likely areas of misclassification. General agreement in the resulting classes and cloud fraction was found between the FCM algorithm, a manual classification, and an unsupervised maximum likelihood classifier.
Deep learning for tumor classification in imaging mass spectrometry.
Behrmann, Jens; Etmann, Christian; Boskamp, Tobias; Casadonte, Rita; Kriegsmann, Jörg; Maaß, Peter
2018-04-01
Tumor classification using imaging mass spectrometry (IMS) data has a high potential for future applications in pathology. Due to the complexity and size of the data, automated feature extraction and classification steps are required to fully process the data. Since mass spectra exhibit certain structural similarities to image data, deep learning may offer a promising strategy for classification of IMS data as it has been successfully applied to image classification. Methodologically, we propose an adapted architecture based on deep convolutional networks to handle the characteristics of mass spectrometry data, as well as a strategy to interpret the learned model in the spectral domain based on a sensitivity analysis. The proposed methods are evaluated on two algorithmically challenging tumor classification tasks and compared to a baseline approach. Competitiveness of the proposed methods is shown on both tasks by studying the performance via cross-validation. Moreover, the learned models are analyzed by the proposed sensitivity analysis revealing biologically plausible effects as well as confounding factors of the considered tasks. Thus, this study may serve as a starting point for further development of deep learning approaches in IMS classification tasks. https://gitlab.informatik.uni-bremen.de/digipath/Deep_Learning_for_Tumor_Classification_in_IMS. jbehrmann@uni-bremen.de or christianetmann@uni-bremen.de. Supplementary data are available at Bioinformatics online.
Ivanov, Iliya V; Leitritz, Martin A; Norrenberg, Lars A; Völker, Michael; Dynowski, Marek; Ueffing, Marius; Dietter, Johannes
2016-02-01
Abnormalities of blood vessel anatomy, morphology, and ratio can serve as important diagnostic markers for retinal diseases such as AMD or diabetic retinopathy. Large cohort studies demand automated and quantitative image analysis of vascular abnormalities. Therefore, we developed an analytical software tool to enable automated standardized classification of blood vessels supporting clinical reading. A dataset of 61 images was collected from a total of 33 women and 8 men with a median age of 38 years. The pupils were not dilated, and images were taken after dark adaption. In contrast to current methods in which classification is based on vessel profile intensity averages, and similar to human vision, local color contrast was chosen as a discriminator to allow artery vein discrimination and arterial-venous ratio (AVR) calculation without vessel tracking. With 83% ± 1 standard error of the mean for our dataset, we achieved best classification for weighted lightness information from a combination of the red, green, and blue channels. Tested on an independent dataset, our method reached 89% correct classification, which, when benchmarked against conventional ophthalmologic classification, shows significantly improved classification scores. Our study demonstrates that vessel classification based on local color contrast can cope with inter- or intraimage lightness variability and allows consistent AVR calculation. We offer an open-source implementation of this method upon request, which can be integrated into existing tool sets and applied to general diagnostic exams.
A statistical approach to root system classification
Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter
2013-01-01
Plant root systems have a key role in ecology and agronomy. In spite of fast increase in root studies, still there is no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for “plant functional type” identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong, but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Rooting types emerging from measured data, mainly distinguished by diameter/weight and density dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-define classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity efforts in architectural measurement techniques are essential. PMID:23914200
A statistical approach to root system classification.
Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter
2013-01-01
Plant root systems have a key role in ecology and agronomy. In spite of fast increase in root studies, still there is no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for "plant functional type" identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong, but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Rooting types emerging from measured data, mainly distinguished by diameter/weight and density dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-define classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity efforts in architectural measurement techniques are essential.
3D Texture Analysis in Renal Cell Carcinoma Tissue Image Grading
Cho, Nam-Hoon; Choi, Heung-Kook
2014-01-01
One of the most significant processes in cancer cell and tissue image analysis is the efficient extraction of features for grading purposes. This research applied two types of three-dimensional texture analysis methods to the extraction of feature values from renal cell carcinoma tissue images, and then evaluated the validity of the methods statistically through grade classification. First, we used a confocal laser scanning microscope to obtain image slices of four grades of renal cell carcinoma, which were then reconstructed into 3D volumes. Next, we extracted quantitative values using a 3D gray level cooccurrence matrix (GLCM) and a 3D wavelet based on two types of basis functions. To evaluate their validity, we predefined 6 different statistical classifiers and applied these to the extracted feature sets. In the grade classification results, 3D Haar wavelet texture features combined with principal component analysis showed the best discrimination results. Classification using 3D wavelet texture features was significantly better than 3D GLCM, suggesting that the former has potential for use in a computer-based grading system. PMID:25371701
Bahraminejad, Behzad; Basri, Shahnor; Isa, Maryam; Hambli, Zarida
2010-01-01
In this study, the ability of the Capillary-attached conductive gas sensor (CGS) in real-time gas identification was investigated. The structure of the prototype fabricated CGS is presented. Portions were selected from the beginning of the CGS transient response including the first 11 samples to the first 100 samples. Different feature extraction and classification methods were applied on the selected portions. Validation of methods was evaluated to study the ability of an early portion of the CGS transient response in target gas (TG) identification. Experimental results proved that applying extracted features from an early part of the CGS transient response along with a classifier can distinguish short-chain alcohols from each other perfectly. Decreasing time of exposition in the interaction between target gas and sensing element improved the reliability of the sensor. Classification rate was also improved and time of identification was decreased. Moreover, the results indicated the optimum interval of the early transient response of the CGS for selecting portions to achieve the best classification rates. PMID:22219666
NASA Astrophysics Data System (ADS)
Cheng, Yayun; Qi, Bo; Liu, Siyuan; Hu, Fei; Gui, Liangqi; Peng, Xiaohui
2016-10-01
Polarimetric measurements can provide additional information as compared to unpolarized ones. In this paper, linear polarization ratio (LPR) is created to be a feature discriminator. The LPR properties of several materials are investigated using Fresnel theory. The theoretical results show that LPR is sensitive to the material type (metal or dielectric). Then a linear polarization ratio-based (LPR-based) method is presented to distinguish between metal and dielectric materials. In order to apply this method to practical applications, the optimal range of incident angle have been discussed. The typical outdoor experiments including various objects such as aluminum plate, grass, concrete, soil and wood, have been conducted to validate the presented classification method.
A Three-Dimensional Receiver Operator Characteristic Surface Diagnostic Metric
NASA Technical Reports Server (NTRS)
Simon, Donald L.
2011-01-01
Receiver Operator Characteristic (ROC) curves are commonly applied as metrics for quantifying the performance of binary fault detection systems. An ROC curve provides a visual representation of a detection system s True Positive Rate versus False Positive Rate sensitivity as the detection threshold is varied. The area under the curve provides a measure of fault detection performance independent of the applied detection threshold. While the standard ROC curve is well suited for quantifying binary fault detection performance, it is not suitable for quantifying the classification performance of multi-fault classification problems. Furthermore, it does not provide a measure of diagnostic latency. To address these shortcomings, a novel three-dimensional receiver operator characteristic (3D ROC) surface metric has been developed. This is done by generating and applying two separate curves: the standard ROC curve reflecting fault detection performance, and a second curve reflecting fault classification performance. A third dimension, diagnostic latency, is added giving rise to 3D ROC surfaces. Applying numerical integration techniques, the volumes under and between the surfaces are calculated to produce metrics of the diagnostic system s detection and classification performance. This paper will describe the 3D ROC surface metric in detail, and present an example of its application for quantifying the performance of aircraft engine gas path diagnostic methods. Metric limitations and potential enhancements are also discussed
Spectral-Spatial Shared Linear Regression for Hyperspectral Image Classification.
Haoliang Yuan; Yuan Yan Tang
2017-04-01
Classification of the pixels in hyperspectral image (HSI) is an important task and has been popularly applied in many practical applications. Its major challenge is the high-dimensional small-sized problem. To deal with this problem, lots of subspace learning (SL) methods are developed to reduce the dimension of the pixels while preserving the important discriminant information. Motivated by ridge linear regression (RLR) framework for SL, we propose a spectral-spatial shared linear regression method (SSSLR) for extracting the feature representation. Comparing with RLR, our proposed SSSLR has the following two advantages. First, we utilize a convex set to explore the spatial structure for computing the linear projection matrix. Second, we utilize a shared structure learning model, which is formed by original data space and a hidden feature space, to learn a more discriminant linear projection matrix for classification. To optimize our proposed method, an efficient iterative algorithm is proposed. Experimental results on two popular HSI data sets, i.e., Indian Pines and Salinas demonstrate that our proposed methods outperform many SL methods.
NASA Astrophysics Data System (ADS)
Zheng, Wenli; Wang, Chaojian; Chang, Shufang; Zhang, Shiwu; Xu, Ronald X.
2015-12-01
Hyperspectral reflectance imaging technique has been used for in vivo detection of cervical intraepithelial neoplasia. However, the clinical outcome of this technique is suboptimal owing to multiple limitations such as nonuniform illumination, high-cost and bulky setup, and time-consuming data acquisition and processing. To overcome these limitations, we acquired the hyperspectral data cube in a wavelength ranging from 600 to 800 nm and processed it by a wide gap second derivative analysis method. This method effectively reduced the image artifacts caused by nonuniform illumination and background absorption. Furthermore, with second derivative analysis, only three specific wavelengths (620, 696, and 772 nm) are needed for tissue classification with optimal separability. Clinical feasibility of the proposed image analysis and classification method was tested in a clinical trial where cervical hyperspectral images from three patients were used for classification analysis. Our proposed method successfully classified the cervix tissue into three categories of normal, inflammation and high-grade lesion. These classification results were coincident with those by an experienced gynecology oncologist after applying acetic acid. Our preliminary clinical study has demonstrated the technical feasibility for in vivo and noninvasive detection of cervical neoplasia without acetic acid. Further clinical research is needed in order to establish a large-scale diagnostic database and optimize the tissue classification technique.
Zheng, Wenli; Wang, Chaojian; Chang, Shufang; Zhang, Shiwu; Xu, Ronald X
2015-12-01
Hyperspectral reflectance imaging technique has been used for in vivo detection of cervical intraepithelial neoplasia. However, the clinical outcome of this technique is suboptimal owing to multiple limitations such as nonuniform illumination, high-cost and bulky setup, and time-consuming data acquisition and processing. To overcome these limitations, we acquired the hyperspectral data cube in a wavelength ranging from 600 to 800 nm and processed it by a wide gap second derivative analysis method. This method effectively reduced the image artifacts caused by nonuniform illumination and background absorption. Furthermore, with second derivative analysis, only three specific wavelengths (620, 696, and 772 nm) are needed for tissue classification with optimal separability. Clinical feasibility of the proposed image analysis and classification method was tested in a clinical trial where cervical hyperspectral images from three patients were used for classification analysis. Our proposed method successfully classified the cervix tissue into three categories of normal, inflammation and high-grade lesion. These classification results were coincident with those by an experienced gynecology oncologist after applying acetic acid. Our preliminary clinical study has demonstrated the technical feasibility for in vivo and noninvasive detection of cervical neoplasia without acetic acid. Further clinical research is needed in order to establish a large-scale diagnostic database and optimize the tissue classification technique.
A Novel Feature Selection Technique for Text Classification Using Naïve Bayes.
Dey Sarkar, Subhajit; Goswami, Saptarsi; Agarwal, Aman; Aktar, Javed
2014-01-01
With the proliferation of unstructured data, text classification or text categorization has found many applications in topic classification, sentiment analysis, authorship identification, spam detection, and so on. There are many classification algorithms available. Naïve Bayes remains one of the oldest and most popular classifiers. On one hand, implementation of naïve Bayes is simple and, on the other hand, this also requires fewer amounts of training data. From the literature review, it is found that naïve Bayes performs poorly compared to other classifiers in text classification. As a result, this makes the naïve Bayes classifier unusable in spite of the simplicity and intuitiveness of the model. In this paper, we propose a two-step feature selection method based on firstly a univariate feature selection and then feature clustering, where we use the univariate feature selection method to reduce the search space and then apply clustering to select relatively independent feature sets. We demonstrate the effectiveness of our method by a thorough evaluation and comparison over 13 datasets. The performance improvement thus achieved makes naïve Bayes comparable or superior to other classifiers. The proposed algorithm is shown to outperform other traditional methods like greedy search based wrapper or CFS.
Galaxy Zoo: quantitative visual morphological classifications for 48 000 galaxies from CANDELS
NASA Astrophysics Data System (ADS)
Simmons, B. D.; Lintott, Chris; Willett, Kyle W.; Masters, Karen L.; Kartaltepe, Jeyhan S.; Häußler, Boris; Kaviraj, Sugata; Krawczyk, Coleman; Kruk, S. J.; McIntosh, Daniel H.; Smethurst, R. J.; Nichol, Robert C.; Scarlata, Claudia; Schawinski, Kevin; Conselice, Christopher J.; Almaini, Omar; Ferguson, Henry C.; Fortson, Lucy; Hartley, William; Kocevski, Dale; Koekemoer, Anton M.; Mortlock, Alice; Newman, Jeffrey A.; Bamford, Steven P.; Grogin, N. A.; Lucas, Ray A.; Hathi, Nimish P.; McGrath, Elizabeth; Peth, Michael; Pforr, Janine; Rizer, Zachary; Wuyts, Stijn; Barro, Guillermo; Bell, Eric F.; Castellano, Marco; Dahlen, Tomas; Dekel, Avishai; Ownsworth, Jamie; Faber, Sandra M.; Finkelstein, Steven L.; Fontana, Adriano; Galametz, Audrey; Grützbauch, Ruth; Koo, David; Lotz, Jennifer; Mobasher, Bahram; Mozena, Mark; Salvato, Mara; Wiklind, Tommy
2017-02-01
We present quantified visual morphologies of approximately 48 000 galaxies observed in three Hubble Space Telescope legacy fields by the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS) and classified by participants in the Galaxy Zoo project. 90 per cent of galaxies have z ≤ 3 and are observed in rest-frame optical wavelengths by CANDELS. Each galaxy received an average of 40 independent classifications, which we combine into detailed morphological information on galaxy features such as clumpiness, bar instabilities, spiral structure, and merger and tidal signatures. We apply a consensus-based classifier weighting method that preserves classifier independence while effectively down-weighting significantly outlying classifications. After analysing the effect of varying image depth on reported classifications, we also provide depth-corrected classifications which both preserve the information in the deepest observations and also enable the use of classifications at comparable depths across the full survey. Comparing the Galaxy Zoo classifications to previous classifications of the same galaxies shows very good agreement; for some applications, the high number of independent classifications provided by Galaxy Zoo provides an advantage in selecting galaxies with a particular morphological profile, while in others the combination of Galaxy Zoo with other classifications is a more promising approach than using any one method alone. We combine the Galaxy Zoo classifications of `smooth' galaxies with parametric morphologies to select a sample of featureless discs at 1 ≤ z ≤ 3, which may represent a dynamically warmer progenitor population to the settled disc galaxies seen at later epochs.
Development of a template for the classification of traditional medical knowledge in Korea.
Kim, Sungha; Kim, Boyoung; Mun, Sujeong; Park, Jeong Hwan; Kim, Min-Kyeoung; Choi, Sunmi; Lee, Sanghun
2016-02-03
Traditional Medical Knowledge (TMK) is a form of Traditional Knowledge associated with medicine that is handed down orally or by written material. There are efforts to document TMK, and make database to conserve Traditional Medicine and facilitate future research to validate traditional use. Despite of these efforts, there is no widely accepted template in data file format that is specific for TMK and, at the same time, helpful for understanding and organizing TMK. We aimed to develop a template to classify TMK. First, we reviewed books, articles, and health-related classification systems, and used focus group discussion to establish the definition, scope, and constituents of TMK. Second, we developed an initial version of the template to classify TMK, and applied it to TMK data. Third, we revised the template, based on the results of the initial template and input from experts, and applied it to the data. We developed the template for classification of TMK. The constituents of the template were summary, properties, tools/ingredients, indication/preparation/application, and international standard classification. We applied International Patent Classification, International Classification of Diseases (Korea version), and Classification of Korean Traditional Knowledge Resources to provide legal protection of TMK and facilitate academic research. The template provides standard terms for ingredients, preparation, administration route, and procedure method to assess safety and efficacy. This is the first template that is specialized for TMK for arranging and classifying TMK. The template would have important roles in preserving TMK, and protecting intellectual property. TMK data classified with the template could be used as the preliminary data to screen potential candidates for new pharmaceuticals. Copyright © 2015 The Authors. Published by Elsevier Ireland Ltd.. All rights reserved.
NASA Astrophysics Data System (ADS)
Shvelidze, Teimuraz; Malyuto, Valeri
2015-08-01
Quantitative spectral classification of F, G and K stars with the 70-cm telescope of the Ambastumani Astrophysical Observatory in areas of the main meridional section of the Galaxy, and for which proper motion data are available, has been performed. Fundamental parameters have been obtained for several hundred stars. Space densities of stars of different spectral types, the stellar luminosity function and the relationships between the kinematics and metallicity of stars have been studied. The results have confirmed and completed the conclusions made on the basis of some previous spectroscopic and photometric surveys. Many plates have been obtained for other important directions in the sky: the Kapteyn areas, the Galactic anticentre, the main meridional section of the Galaxy and etc. Very rich collection of photographic objective spectral plates (30,000 were accumulated during last 60 years) is available at Abastumani Observatory-wavelength range 3900-4900 A, about 2A resolution. Availability of new devices for automatic registration of spectra from photographic plates as well as some recently developed classification techniques may allow now to create a modern system of automatic spectral classification and with expension of classification techniques to additional types (B-A, M spectral classes). The data can be treated with the same quantitative method applied here. This method may also be applied to other available and future spectroscopic data of similar resolution, notably that obtained with large format CCD detectors on Schmidt-type telescopes.
Skimming Digits: Neuromorphic Classification of Spike-Encoded Images
Cohen, Gregory K.; Orchard, Garrick; Leng, Sio-Hoi; Tapson, Jonathan; Benosman, Ryad B.; van Schaik, André
2016-01-01
The growing demands placed upon the field of computer vision have renewed the focus on alternative visual scene representations and processing paradigms. Silicon retinea provide an alternative means of imaging the visual environment, and produce frame-free spatio-temporal data. This paper presents an investigation into event-based digit classification using N-MNIST, a neuromorphic dataset created with a silicon retina, and the Synaptic Kernel Inverse Method (SKIM), a learning method based on principles of dendritic computation. As this work represents the first large-scale and multi-class classification task performed using the SKIM network, it explores different training patterns and output determination methods necessary to extend the original SKIM method to support multi-class problems. Making use of SKIM networks applied to real-world datasets, implementing the largest hidden layer sizes and simultaneously training the largest number of output neurons, the classification system achieved a best-case accuracy of 92.87% for a network containing 10,000 hidden layer neurons. These results represent the highest accuracies achieved against the dataset to date and serve to validate the application of the SKIM method to event-based visual classification tasks. Additionally, the study found that using a square pulse as the supervisory training signal produced the highest accuracy for most output determination methods, but the results also demonstrate that an exponential pattern is better suited to hardware implementations as it makes use of the simplest output determination method based on the maximum value. PMID:27199646
Estimation and classification by sigmoids based on mutual information
NASA Technical Reports Server (NTRS)
Baram, Yoram
1994-01-01
An estimate of the probability density function of a random vector is obtained by maximizing the mutual information between the input and the output of a feedforward network of sigmoidal units with respect to the input weights. Classification problems can be solved by selecting the class associated with the maximal estimated density. Newton's s method, applied to an estimated density, yields a recursive maximum likelihood estimator, consisting of a single internal layer of sigmoids, for a random variable or a random sequence. Applications to the diamond classification and to the prediction of a sun-spot process are demonstrated.
NASA Technical Reports Server (NTRS)
Harwood, P. (Principal Investigator); Finley, R.; Mcculloch, S.; Marphy, D.; Hupp, B.
1976-01-01
The author has identified the following significant results. Image interpretation mapping techniques were successfully applied to test site 5, an area with a semi-arid climate. The land cover/land use classification required further modification. A new program, HGROUP, added to the ADP classification schedule provides a convenient method for examining the spectral similarity between classes. This capability greatly simplifies the task of combining 25-30 unsupervised subclasses into about 15 major classes that approximately correspond to the land use/land cover classification scheme.
NASA Astrophysics Data System (ADS)
Ness, P. H.; Jacobson, H.
1984-10-01
The thrust of 'group technology' is toward the exploitation of similarities in component design and manufacturing process plans to achieve assembly line flow cost efficiencies for small batch production. The systematic method devised for the identification of similarities in component geometry and processing steps is a coding and classification scheme implemented by interactive CAD/CAM systems. This coding and classification scheme has led to significant increases in computer processing power, allowing rapid searches and retrievals on the basis of a 30-digit code together with user-friendly computer graphics.
The use of a projection method to simplify portal and hepatic vein segmentation in liver anatomy.
Huang, Shaohui; Wang, Boliang; Cheng, Ming; Huang, Xiaoyang; Ju, Ying
2008-12-01
In living donor liver transplantation, the volume of the potential graft must be measured to ensure sufficient liver function after surgery. Couinaud divided the liver into 8 functionally independent segments. However, this method is not simple to perform in 3D space directly. Thus, we propose a rapid method to segment the liver based on the hepatic vessel tree. The most important step of this method is vascular projection. By carefully selecting a projection plane, a 3D point can be fixed in the projection plane. This greatly helps in rapid classification. This method was validated by applying it to a 3D liver depicted on CT images, and the result was in good agreement with Couinaud's classification.
NASA Astrophysics Data System (ADS)
Kerr, Laura T.; Adams, Aine; O'Dea, Shirley; Domijan, Katarina; Cullen, Ivor; Hennelly, Bryan M.
2014-05-01
Raman microspectroscopy can be applied to the urinary bladder for highly accurate classification and diagnosis of bladder cancer. This technique can be applied in vitro to bladder epithelial cells obtained from urine cytology or in vivo as an optical biopsy" to provide results in real-time with higher sensitivity and specificity than current clinical methods. However, there exists a high degree of variability across experimental parameters which need to be standardised before this technique can be utilized in an everyday clinical environment. In this study, we investigate different laser wavelengths (473 nm and 532 nm), sample substrates (glass, fused silica and calcium fluoride) and multivariate statistical methods in order to gain insight into how these various experimental parameters impact on the sensitivity and specificity of Raman cytology.
Boskamp, Tobias; Lachmund, Delf; Oetjen, Janina; Cordero Hernandez, Yovany; Trede, Dennis; Maass, Peter; Casadonte, Rita; Kriegsmann, Jörg; Warth, Arne; Dienemann, Hendrik; Weichert, Wilko; Kriegsmann, Mark
2017-07-01
Matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) shows a high potential for applications in histopathological diagnosis, and in particular for supporting tumor typing and subtyping. The development of such applications requires the extraction of spectral fingerprints that are relevant for the given tissue and the identification of biomarkers associated with these spectral patterns. We propose a novel data analysis method based on the extraction of characteristic spectral patterns (CSPs) that allow automated generation of classification models for spectral data. Formalin-fixed paraffin embedded (FFPE) tissue samples from N=445 patients assembled on 12 tissue microarrays were analyzed. The method was applied to discriminate primary lung and pancreatic cancer, as well as adenocarcinoma and squamous cell carcinoma of the lung. A classification accuracy of 100% and 82.8%, resp., could be achieved on core level, assessed by cross-validation. The method outperformed the more conventional classification method based on the extraction of individual m/z values in the first application, while achieving a comparable accuracy in the second. LC-MS/MS peptide identification demonstrated that the spectral features present in selected CSPs correspond to peptides relevant for the respective classification. This article is part of a Special Issue entitled: MALDI Imaging, edited by Dr. Corinna Henkel and Prof. Peter Hoffmann. Copyright © 2016 Elsevier B.V. All rights reserved.
Object Manifold Alignment for Multi-Temporal High Resolution Remote Sensing Images Classification
NASA Astrophysics Data System (ADS)
Gao, G.; Zhang, M.; Gu, Y.
2017-05-01
Multi-temporal remote sensing images classification is very useful for monitoring the land cover changes. Traditional approaches in this field mainly face to limited labelled samples and spectral drift of image information. With spatial resolution improvement, "pepper and salt" appears and classification results will be effected when the pixelwise classification algorithms are applied to high-resolution satellite images, in which the spatial relationship among the pixels is ignored. For classifying the multi-temporal high resolution images with limited labelled samples, spectral drift and "pepper and salt" problem, an object-based manifold alignment method is proposed. Firstly, multi-temporal multispectral images are cut to superpixels by simple linear iterative clustering (SLIC) respectively. Secondly, some features obtained from superpixels are formed as vector. Thirdly, a majority voting manifold alignment method aiming at solving high resolution problem is proposed and mapping the vector data to alignment space. At last, all the data in the alignment space are classified by using KNN method. Multi-temporal images from different areas or the same area are both considered in this paper. In the experiments, 2 groups of multi-temporal HR images collected by China GF1 and GF2 satellites are used for performance evaluation. Experimental results indicate that the proposed method not only has significantly outperforms than traditional domain adaptation methods in classification accuracy, but also effectively overcome the problem of "pepper and salt".
Post-boosting of classification boundary for imbalanced data using geometric mean.
Du, Jie; Vong, Chi-Man; Pun, Chi-Man; Wong, Pak-Kin; Ip, Weng-Fai
2017-12-01
In this paper, a novel imbalance learning method for binary classes is proposed, named as Post-Boosting of classification boundary for Imbalanced data (PBI), which can significantly improve the performance of any trained neural networks (NN) classification boundary. The procedure of PBI simply consists of two steps: an (imbalanced) NN learning method is first applied to produce a classification boundary, which is then adjusted by PBI under the geometric mean (G-mean). For data imbalance, the geometric mean of the accuracies of both minority and majority classes is considered, that is statistically more suitable than the common metric accuracy. PBI also has the following advantages over traditional imbalance methods: (i) PBI can significantly improve the classification accuracy on minority class while improving or keeping that on majority class as well; (ii) PBI is suitable for large data even with high imbalance ratio (up to 0.001). For evaluation of (i), a new metric called Majority loss/Minority advance ratio (MMR) is proposed that evaluates the loss ratio of majority class to minority class. Experiments have been conducted for PBI and several imbalance learning methods over benchmark datasets of different sizes, different imbalance ratios, and different dimensionalities. By analyzing the experimental results, PBI is shown to outperform other imbalance learning methods on almost all datasets. Copyright © 2017 Elsevier Ltd. All rights reserved.
Desert plains classification based on Geomorphometrical parameters (Case study: Aghda, Yazd)
NASA Astrophysics Data System (ADS)
Tazeh, mahdi; Kalantari, Saeideh
2013-04-01
This research focuses on plains. There are several tremendous methods and classification which presented for plain classification. One of The natural resource based classification which is mostly using in Iran, classified plains into three types, Erosional Pediment, Denudation Pediment Aggradational Piedmont. The qualitative and quantitative factors to differentiate them from each other are also used appropriately. In this study effective Geomorphometrical parameters in differentiate landforms were applied for plain. Geomorphometrical parameters are calculable and can be extracted using mathematical equations and the corresponding relations on digital elevation model. Geomorphometrical parameters used in this study included Percent of Slope, Plan Curvature, Profile Curvature, Minimum Curvature, the Maximum Curvature, Cross sectional Curvature, Longitudinal Curvature and Gaussian Curvature. The results indicated that the most important affecting Geomorphometrical parameters for plain and desert classifications includes: Percent of Slope, Minimum Curvature, Profile Curvature, and Longitudinal Curvature. Key Words: Plain, Geomorphometry, Classification, Biophysical, Yazd Khezarabad.
Handwritten digits recognition based on immune network
NASA Astrophysics Data System (ADS)
Li, Yangyang; Wu, Yunhui; Jiao, Lc; Wu, Jianshe
2011-11-01
With the development of society, handwritten digits recognition technique has been widely applied to production and daily life. It is a very difficult task to solve these problems in the field of pattern recognition. In this paper, a new method is presented for handwritten digit recognition. The digit samples firstly are processed and features extraction. Based on these features, a novel immune network classification algorithm is designed and implemented to the handwritten digits recognition. The proposed algorithm is developed by Jerne's immune network model for feature selection and KNN method for classification. Its characteristic is the novel network with parallel commutating and learning. The performance of the proposed method is experimented to the handwritten number datasets MNIST and compared with some other recognition algorithms-KNN, ANN and SVM algorithm. The result shows that the novel classification algorithm based on immune network gives promising performance and stable behavior for handwritten digits recognition.
Spectral-spatial classification of hyperspectral image using three-dimensional convolution network
NASA Astrophysics Data System (ADS)
Liu, Bing; Yu, Xuchu; Zhang, Pengqiang; Tan, Xiong; Wang, Ruirui; Zhi, Lu
2018-01-01
Recently, hyperspectral image (HSI) classification has become a focus of research. However, the complex structure of an HSI makes feature extraction difficult to achieve. Most current methods build classifiers based on complex handcrafted features computed from the raw inputs. The design of an improved 3-D convolutional neural network (3D-CNN) model for HSI classification is described. This model extracts features from both the spectral and spatial dimensions through the application of 3-D convolutions, thereby capturing the important discrimination information encoded in multiple adjacent bands. The designed model views the HSI cube data altogether without relying on any pre- or postprocessing. In addition, the model is trained in an end-to-end fashion without any handcrafted features. The designed model was applied to three widely used HSI datasets. The experimental results demonstrate that the 3D-CNN-based method outperforms conventional methods even with limited labeled training samples.
Liu, Boquan; Polce, Evan; Sprott, Julien C; Jiang, Jack J
2018-05-17
The purpose of this study is to introduce a chaos level test to evaluate linear and nonlinear voice type classification method performances under varying signal chaos conditions without subjective impression. Voice signals were constructed with differing degrees of noise to model signal chaos. Within each noise power, 100 Monte Carlo experiments were applied to analyze the output of jitter, shimmer, correlation dimension, and spectrum convergence ratio. The computational output of the 4 classifiers was then plotted against signal chaos level to investigate the performance of these acoustic analysis methods under varying degrees of signal chaos. A diffusive behavior detection-based chaos level test was used to investigate the performances of different voice classification methods. Voice signals were constructed by varying the signal-to-noise ratio to establish differing signal chaos conditions. Chaos level increased sigmoidally with increasing noise power. Jitter and shimmer performed optimally when the chaos level was less than or equal to 0.01, whereas correlation dimension was capable of analyzing signals with chaos levels of less than or equal to 0.0179. Spectrum convergence ratio demonstrated proficiency in analyzing voice signals with all chaos levels investigated in this study. The results of this study corroborate the performance relationships observed in previous studies and, therefore, demonstrate the validity of the validation test method. The presented chaos level validation test could be broadly utilized to evaluate acoustic analysis methods and establish the most appropriate methodology for objective voice analysis in clinical practice.
CP-CHARM: segmentation-free image classification made accessible.
Uhlmann, Virginie; Singh, Shantanu; Carpenter, Anne E
2016-01-27
Automated classification using machine learning often relies on features derived from segmenting individual objects, which can be difficult to automate. WND-CHARM is a previously developed classification algorithm in which features are computed on the whole image, thereby avoiding the need for segmentation. The algorithm obtained encouraging results but requires considerable computational expertise to execute. Furthermore, some benchmark sets have been shown to be subject to confounding artifacts that overestimate classification accuracy. We developed CP-CHARM, a user-friendly image-based classification algorithm inspired by WND-CHARM in (i) its ability to capture a wide variety of morphological aspects of the image, and (ii) the absence of requirement for segmentation. In order to make such an image-based classification method easily accessible to the biological research community, CP-CHARM relies on the widely-used open-source image analysis software CellProfiler for feature extraction. To validate our method, we reproduced WND-CHARM's results and ensured that CP-CHARM obtained comparable performance. We then successfully applied our approach on cell-based assay data and on tissue images. We designed these new training and test sets to reduce the effect of batch-related artifacts. The proposed method preserves the strengths of WND-CHARM - it extracts a wide variety of morphological features directly on whole images thereby avoiding the need for cell segmentation, but additionally, it makes the methods easily accessible for researchers without computational expertise by implementing them as a CellProfiler pipeline. It has been demonstrated to perform well on a wide range of bioimage classification problems, including on new datasets that have been carefully selected and annotated to minimize batch effects. This provides for the first time a realistic and reliable assessment of the whole image classification strategy.
Motor Oil Classification using Color Histograms and Pattern Recognition Techniques.
Ahmadi, Shiva; Mani-Varnosfaderani, Ahmad; Habibi, Biuck
2018-04-20
Motor oil classification is important for quality control and the identification of oil adulteration. In thiswork, we propose a simple, rapid, inexpensive and nondestructive approach based on image analysis and pattern recognition techniques for the classification of nine different types of motor oils according to their corresponding color histograms. For this, we applied color histogram in different color spaces such as red green blue (RGB), grayscale, and hue saturation intensity (HSI) in order to extract features that can help with the classification procedure. These color histograms and their combinations were used as input for model development and then were statistically evaluated by using linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM) techniques. Here, two common solutions for solving a multiclass classification problem were applied: (1) transformation to binary classification problem using a one-against-all (OAA) approach and (2) extension from binary classifiers to a single globally optimized multilabel classification model. In the OAA strategy, LDA, QDA, and SVM reached up to 97% in terms of accuracy, sensitivity, and specificity for both the training and test sets. In extension from binary case, despite good performances by the SVM classification model, QDA and LDA provided better results up to 92% for RGB-grayscale-HSI color histograms and up to 93% for the HSI color map, respectively. In order to reduce the numbers of independent variables for modeling, a principle component analysis algorithm was used. Our results suggest that the proposed method is promising for the identification and classification of different types of motor oils.
Spectral-spatial classification for noninvasive cancer detection using hyperspectral imaging
NASA Astrophysics Data System (ADS)
Lu, Guolan; Halig, Luma; Wang, Dongsheng; Qin, Xulei; Chen, Zhuo Georgia; Fei, Baowei
2014-10-01
Early detection of malignant lesions could improve both survival and quality of life of cancer patients. Hyperspectral imaging (HSI) has emerged as a powerful tool for noninvasive cancer detection and diagnosis, with the advantage of avoiding tissue biopsy and providing diagnostic signatures without the need of a contrast agent in real time. We developed a spectral-spatial classification method to distinguish cancer from normal tissue on hyperspectral images. We acquire hyperspectral reflectance images from 450 to 900 nm with a 2-nm increment from tumor-bearing mice. In our animal experiments, the HSI and classification method achieved a sensitivity of 93.7% and a specificity of 91.3%. The preliminary study demonstrated that HSI has the potential to be applied in vivo for noninvasive detection of tumors.
Unsupervised Feature Learning for Heart Sounds Classification Using Autoencoder
NASA Astrophysics Data System (ADS)
Hu, Wei; Lv, Jiancheng; Liu, Dongbo; Chen, Yao
2018-04-01
Cardiovascular disease seriously threatens the health of many people. It is usually diagnosed during cardiac auscultation, which is a fast and efficient method of cardiovascular disease diagnosis. In recent years, deep learning approach using unsupervised learning has made significant breakthroughs in many fields. However, to our knowledge, deep learning has not yet been used for heart sound classification. In this paper, we first use the average Shannon energy to extract the envelope of the heart sounds, then find the highest point of S1 to extract the cardiac cycle. We convert the time-domain signals of the cardiac cycle into spectrograms and apply principal component analysis whitening to reduce the dimensionality of the spectrogram. Finally, we apply a two-layer autoencoder to extract the features of the spectrogram. The experimental results demonstrate that the features from the autoencoder are suitable for heart sound classification.
NASA Astrophysics Data System (ADS)
Yang, Y.; Tenenbaum, D. E.
2009-12-01
The process of urbanization has major effects on both human and natural systems. In order to monitor these changes and better understand how urban ecological systems work, urban spatial structure and the variation needs to be first quantified at a fine scale. Because the land-use and land-cover (LULC) in urbanizing areas is highly heterogeneous, the classification of urbanizing environments is the most challenging field in remote sensing. Although a pixel-based method is a common way to do classification, the results are not good enough for many research objectives which require more accurate classification data in fine scales. Transect sampling and object-oriented classification methods are more appropriate for urbanizing areas. Tenenbaum used a transect sampling method using a computer-based facility within a widely available commercial GIS in the Glyndon Catchment and the Upper Baismans Run Catchment, Baltimore, Maryland. It was a two-tiered classification system, including a primary level (which includes 7 classes) and a secondary level (which includes 37 categories). The statistical information of LULC was collected. W. Zhou applied an object-oriented method at the parcel level in Gwynn’s Falls Watershed which includes the two previously mentioned catchments and six classes were extracted. The two urbanizing catchments are located in greater Baltimore, Maryland and drain into Chesapeake Bay. In this research, the two different methods are compared for 6 classes (woody, herbaceous, water, ground, pavement and structure). The comparison method uses the segments in the transect method to extract LULC information from the results of the object-oriented method. Classification results were compared in order to evaluate the difference between the two methods. The overall proportions of LULC classes from the two studies show that there is overestimation of structures in the object-oriented method. For the other five classes, the results from the two methods are similar, except for a difference in the proportions of the woody class. The segment to segment comparison shows that the resolution of the light detection and ranging (LIDAR) data used in the object-oriented method does affect the accuracy of the classification. Shadows of trees and structures are still a big problem in the object-oriented method. For classes that make up a small proportion of the catchments, such as water, neither method was capable of detecting them.
Signal analysis techniques for incipient failure detection in turbomachinery
NASA Technical Reports Server (NTRS)
Coffin, T.
1985-01-01
Signal analysis techniques for the detection and classification of incipient mechanical failures in turbomachinery were developed, implemented and evaluated. Signal analysis techniques available to describe dynamic measurement characteristics are reviewed. Time domain and spectral methods are described, and statistical classification in terms of moments is discussed. Several of these waveform analysis techniques were implemented on a computer and applied to dynamic signals. A laboratory evaluation of the methods with respect to signal detection capability is described. Plans for further technique evaluation and data base development to characterize turbopump incipient failure modes from Space Shuttle main engine (SSME) hot firing measurements are outlined.
A hybrid clustering and classification approach for predicting crash injury severity on rural roads.
Hasheminejad, Seyed Hessam-Allah; Zahedi, Mohsen; Hasheminejad, Seyed Mohammad Hossein
2018-03-01
As a threat for transportation system, traffic crashes have a wide range of social consequences for governments. Traffic crashes are increasing in developing countries and Iran as a developing country is not immune from this risk. There are several researches in the literature to predict traffic crash severity based on artificial neural networks (ANNs), support vector machines and decision trees. This paper attempts to investigate the crash injury severity of rural roads by using a hybrid clustering and classification approach to compare the performance of classification algorithms before and after applying the clustering. In this paper, a novel rule-based genetic algorithm (GA) is proposed to predict crash injury severity, which is evaluated by performance criteria in comparison with classification algorithms like ANN. The results obtained from analysis of 13,673 crashes (5600 property damage, 778 fatal crashes, 4690 slight injuries and 2605 severe injuries) on rural roads in Tehran Province of Iran during 2011-2013 revealed that the proposed GA method outperforms other classification algorithms based on classification metrics like precision (86%), recall (88%) and accuracy (87%). Moreover, the proposed GA method has the highest level of interpretation, is easy to understand and provides feedback to analysts.
Cell dynamic morphology classification using deep convolutional neural networks.
Li, Heng; Pang, Fengqian; Shi, Yonggang; Liu, Zhiwen
2018-05-15
Cell morphology is often used as a proxy measurement of cell status to understand cell physiology. Hence, interpretation of cell dynamic morphology is a meaningful task in biomedical research. Inspired by the recent success of deep learning, we here explore the application of convolutional neural networks (CNNs) to cell dynamic morphology classification. An innovative strategy for the implementation of CNNs is introduced in this study. Mouse lymphocytes were collected to observe the dynamic morphology, and two datasets were thus set up to investigate the performances of CNNs. Considering the installation of deep learning, the classification problem was simplified from video data to image data, and was then solved by CNNs in a self-taught manner with the generated image data. CNNs were separately performed in three installation scenarios and compared with existing methods. Experimental results demonstrated the potential of CNNs in cell dynamic morphology classification, and validated the effectiveness of the proposed strategy. CNNs were successfully applied to the classification problem, and outperformed the existing methods in the classification accuracy. For the installation of CNNs, transfer learning was proved to be a promising scheme. © 2018 International Society for Advancement of Cytometry. © 2018 International Society for Advancement of Cytometry.
Applying matching pursuit decomposition time-frequency processing to UGS footstep classification
NASA Astrophysics Data System (ADS)
Larsen, Brett W.; Chung, Hugh; Dominguez, Alfonso; Sciacca, Jacob; Kovvali, Narayan; Papandreou-Suppappola, Antonia; Allee, David R.
2013-06-01
The challenge of rapid footstep detection and classification in remote locations has long been an important area of study for defense technology and national security. Also, as the military seeks to create effective and disposable unattended ground sensors (UGS), computational complexity and power consumption have become essential considerations in the development of classification techniques. In response to these issues, a research project at the Flexible Display Center at Arizona State University (ASU) has experimented with footstep classification using the matching pursuit decomposition (MPD) time-frequency analysis method. The MPD provides a parsimonious signal representation by iteratively selecting matched signal components from a pre-determined dictionary. The resulting time-frequency representation of the decomposed signal provides distinctive features for different types of footsteps, including footsteps during walking or running activities. The MPD features were used in a Bayesian classification method to successfully distinguish between the different activities. The computational cost of the iterative MPD algorithm was reduced, without significant loss in performance, using a modified MPD with a dictionary consisting of signals matched to cadence temporal gait patterns obtained from real seismic measurements. The classification results were demonstrated with real data from footsteps under various conditions recorded using a low-cost seismic sensor.
Garland, Ellen C; Castellote, Manuel; Berchok, Catherine L
2015-06-01
Beluga whales, Delphinapterus leucas, have a graded call system; call types exist on a continuum making classification challenging. A description of vocalizations from the eastern Beaufort Sea beluga population during its spring migration are presented here, using both a non-parametric classification tree analysis (CART), and a Random Forest analysis. Twelve frequency and duration measurements were made on 1019 calls recorded over 14 days off Icy Cape, Alaska, resulting in 34 identifiable call types with 83% agreement in classification for both CART and Random Forest analyses. This high level of agreement in classification, with an initial subjective classification of calls into 36 categories, demonstrates that the methods applied here provide a quantitative analysis of a graded call dataset. Further, as calls cannot be attributed to individuals using single sensor passive acoustic monitoring efforts, these methods provide a comprehensive analysis of data where the influence of pseudo-replication of calls from individuals is unknown. This study is the first to describe the vocal repertoire of a beluga population using a robust and repeatable methodology. A baseline eastern Beaufort Sea beluga population repertoire is presented here, against which the call repertoire of other seasonally sympatric Alaskan beluga populations can be compared.
Automated classification of articular cartilage surfaces based on surface texture.
Stachowiak, G P; Stachowiak, G W; Podsiadlo, P
2006-11-01
In this study the automated classification system previously developed by the authors was used to classify articular cartilage surfaces with different degrees of wear. This automated system classifies surfaces based on their texture. Plug samples of sheep cartilage (pins) were run on stainless steel discs under various conditions using a pin-on-disc tribometer. Testing conditions were specifically designed to produce different severities of cartilage damage due to wear. Environmental scanning electron microscope (SEM) (ESEM) images of cartilage surfaces, that formed a database for pattern recognition analysis, were acquired. The ESEM images of cartilage were divided into five groups (classes), each class representing different wear conditions or wear severity. Each class was first examined and assessed visually. Next, the automated classification system (pattern recognition) was applied to all classes. The results of the automated surface texture classification were compared to those based on visual assessment of surface morphology. It was shown that the texture-based automated classification system was an efficient and accurate method of distinguishing between various cartilage surfaces generated under different wear conditions. It appears that the texture-based classification method has potential to become a useful tool in medical diagnostics.
A Partial Least Squares Based Procedure for Upstream Sequence Classification in Prokaryotes.
Mehmood, Tahir; Bohlin, Jon; Snipen, Lars
2015-01-01
The upstream region of coding genes is important for several reasons, for instance locating transcription factor, binding sites, and start site initiation in genomic DNA. Motivated by a recently conducted study, where multivariate approach was successfully applied to coding sequence modeling, we have introduced a partial least squares (PLS) based procedure for the classification of true upstream prokaryotic sequence from background upstream sequence. The upstream sequences of conserved coding genes over genomes were considered in analysis, where conserved coding genes were found by using pan-genomics concept for each considered prokaryotic species. PLS uses position specific scoring matrix (PSSM) to study the characteristics of upstream region. Results obtained by PLS based method were compared with Gini importance of random forest (RF) and support vector machine (SVM), which is much used method for sequence classification. The upstream sequence classification performance was evaluated by using cross validation, and suggested approach identifies prokaryotic upstream region significantly better to RF (p-value < 0.01) and SVM (p-value < 0.01). Further, the proposed method also produced results that concurred with known biological characteristics of the upstream region.
Comprehension and reproducibility of the Judet and Letournel classification
Polesello, Giancarlo Cavalli; Nunes, Marcus Aurelius Araujo; Azuaga, Thiago Leonardi; de Queiroz, Marcelo Cavalheiro; Honda, Emerson Kyoshi; Ono, Nelson Keiske
2012-01-01
Objective To evaluate the effectiveness of the method of radiographic interpretation of acetabular fractures, according to the classification of Judet and Letournel, used by a group of residents of Orthopedics at a university hospital. Methods We selected ten orthopedic residents, who were divided into two groups; one group received training in a methodology for the classification of acetabular fractures, which involves transposing the radiographic images to a graphic two-dimensional representation. We classified fifty cases of acetabular fracture on two separate occasions, and determined the intraobserver and interobserver agreement. Result The success rate was 16.2% (10-26%) for the trained group and 22.8% (10-36%) for the untrained group. The mean kappa coefficients for interobserver and intraobserver agreement in the trained group were 0.08 and 0.12, respectively, and for the untrained group, 0.14 and 0.29. Conclusion Training in the method of radiographic interpretation of acetabular fractures was not effective for assisting in the classification of acetabular fractures. Level of evidence I, Testing of previously developed diagnostic criteria on consecutive patients (with universally applied reference "gold" standard). PMID:24453583
G0-WISHART Distribution Based Classification from Polarimetric SAR Images
NASA Astrophysics Data System (ADS)
Hu, G. C.; Zhao, Q. H.
2017-09-01
Enormous scientific and technical developments have been carried out to further improve the remote sensing for decades, particularly Polarimetric Synthetic Aperture Radar(PolSAR) technique, so classification method based on PolSAR images has getted much more attention from scholars and related department around the world. The multilook polarmetric G0-Wishart model is a more flexible model which describe homogeneous, heterogeneous and extremely heterogeneous regions in the image. Moreover, the polarmetric G0-Wishart distribution dose not include the modified Bessel function of the second kind. It is a kind of simple statistical distribution model with less parameter. To prove its feasibility, a process of classification has been tested with the full-polarized Synthetic Aperture Radar (SAR) image by the method. First, apply multilook polarimetric SAR data process and speckle filter to reduce speckle influence for classification result. Initially classify the image into sixteen classes by H/A/α decomposition. Using the ICM algorithm to classify feature based on the G0-Wshart distance. Qualitative and quantitative results show that the proposed method can classify polaimetric SAR data effectively and efficiently.
NASA Astrophysics Data System (ADS)
Hong, Liang
2013-10-01
The availability of high spatial resolution remote sensing data provides new opportunities for urban land-cover classification. More geometric details can be observed in the high resolution remote sensing image, Also Ground objects in the high resolution remote sensing image have displayed rich texture, structure, shape and hierarchical semantic characters. More landscape elements are represented by a small group of pixels. Recently years, the an object-based remote sensing analysis methodology is widely accepted and applied in high resolution remote sensing image processing. The classification method based on Geo-ontology and conditional random fields is presented in this paper. The proposed method is made up of four blocks: (1) the hierarchical ground objects semantic framework is constructed based on geoontology; (2) segmentation by mean-shift algorithm, which image objects are generated. And the mean-shift method is to get boundary preserved and spectrally homogeneous over-segmentation regions ;(3) the relations between the hierarchical ground objects semantic and over-segmentation regions are defined based on conditional random fields framework ;(4) the hierarchical classification results are obtained based on geo-ontology and conditional random fields. Finally, high-resolution remote sensed image data -GeoEye, is used to testify the performance of the presented method. And the experimental results have shown the superiority of this method to the eCognition method both on the effectively and accuracy, which implies it is suitable for the classification of high resolution remote sensing image.
Methods for assessing the quality of mammalian embryos: How far we are from the gold standard?
Rocha, José C; Passalia, Felipe; Matos, Felipe D; Maserati, Marc P; Alves, Mayra F; Almeida, Tamie G de; Cardoso, Bruna L; Basso, Andrea C; Nogueira, Marcelo F G
2016-08-01
Morphological embryo classification is of great importance for many laboratory techniques, from basic research to the ones applied to assisted reproductive technology. However, the standard classification method for both human and cattle embryos, is based on quality parameters that reflect the overall morphological quality of the embryo in cattle, or the quality of the individual embryonic structures, more relevant in human embryo classification. This assessment method is biased by the subjectivity of the evaluator and even though several guidelines exist to standardize the classification, it is not a method capable of giving reliable and trustworthy results. Latest approaches for the improvement of quality assessment include the use of data from cellular metabolism, a new morphological grading system, development kinetics and cleavage symmetry, embryo cell biopsy followed by pre-implantation genetic diagnosis, zona pellucida birefringence, ion release by the embryo cells and so forth. Nowadays there exists a great need for evaluation methods that are practical and non-invasive while being accurate and objective. A method along these lines would be of great importance to embryo evaluation by embryologists, clinicians and other professionals who work with assisted reproductive technology. Several techniques shows promising results in this sense, one being the use of digital images of the embryo as basis for features extraction and classification by means of artificial intelligence techniques (as genetic algorithms and artificial neural networks). This process has the potential to become an accurate and objective standard for embryo quality assessment.
Methods for assessing the quality of mammalian embryos: How far we are from the gold standard?
Rocha, José C.; Passalia, Felipe; Matos, Felipe D.; Maserati Jr, Marc P.; Alves, Mayra F.; de Almeida, Tamie G.; Cardoso, Bruna L.; Basso, Andrea C.; Nogueira, Marcelo F. G.
2016-01-01
Morphological embryo classification is of great importance for many laboratory techniques, from basic research to the ones applied to assisted reproductive technology. However, the standard classification method for both human and cattle embryos, is based on quality parameters that reflect the overall morphological quality of the embryo in cattle, or the quality of the individual embryonic structures, more relevant in human embryo classification. This assessment method is biased by the subjectivity of the evaluator and even though several guidelines exist to standardize the classification, it is not a method capable of giving reliable and trustworthy results. Latest approaches for the improvement of quality assessment include the use of data from cellular metabolism, a new morphological grading system, development kinetics and cleavage symmetry, embryo cell biopsy followed by pre-implantation genetic diagnosis, zona pellucida birefringence, ion release by the embryo cells and so forth. Nowadays there exists a great need for evaluation methods that are practical and non-invasive while being accurate and objective. A method along these lines would be of great importance to embryo evaluation by embryologists, clinicians and other professionals who work with assisted reproductive technology. Several techniques shows promising results in this sense, one being the use of digital images of the embryo as basis for features extraction and classification by means of artificial intelligence techniques (as genetic algorithms and artificial neural networks). This process has the potential to become an accurate and objective standard for embryo quality assessment. PMID:27584609
A Comprehensive Study of Retinal Vessel Classification Methods in Fundus Images
Miri, Maliheh; Amini, Zahra; Rabbani, Hossein; Kafieh, Raheleh
2017-01-01
Nowadays, it is obvious that there is a relationship between changes in the retinal vessel structure and diseases such as diabetic, hypertension, stroke, and the other cardiovascular diseases in adults as well as retinopathy of prematurity in infants. Retinal fundus images provide non-invasive visualization of the retinal vessel structure. Applying image processing techniques in the study of digital color fundus photographs and analyzing their vasculature is a reliable approach for early diagnosis of the aforementioned diseases. Reduction in the arteriolar–venular ratio of retina is one of the primary signs of hypertension, diabetic, and cardiovascular diseases which can be calculated by analyzing the fundus images. To achieve a precise measuring of this parameter and meaningful diagnostic results, accurate classification of arteries and veins is necessary. Classification of vessels in fundus images faces with some challenges that make it difficult. In this paper, a comprehensive study of the proposed methods for classification of arteries and veins in fundus images is presented. Considering that these methods are evaluated on different datasets and use different evaluation criteria, it is not possible to conduct a fair comparison of their performance. Therefore, we evaluate the classification methods from modeling perspective. This analysis reveals that most of the proposed approaches have focused on statistics, and geometric models in spatial domain and transform domain models have received less attention. This could suggest the possibility of using transform models, especially data adaptive ones, for modeling of the fundus images in future classification approaches. PMID:28553578
Applying Active Learning to Assertion Classification of Concepts in Clinical Text
Chen, Yukun; Mani, Subramani; Xu, Hua
2012-01-01
Supervised machine learning methods for clinical natural language processing (NLP) research require a large number of annotated samples, which are very expensive to build because of the involvement of physicians. Active learning, an approach that actively samples from a large pool, provides an alternative solution. Its major goal in classification is to reduce the annotation effort while maintaining the quality of the predictive model. However, few studies have investigated its uses in clinical NLP. This paper reports an application of active learning to a clinical text classification task: to determine the assertion status of clinical concepts. The annotated corpus for the assertion classification task in the 2010 i2b2/VA Clinical NLP Challenge was used in this study. We implemented several existing and newly developed active learning algorithms and assessed their uses. The outcome is reported in the global ALC score, based on the Area under the average Learning Curve of the AUC (Area Under the Curve) score. Results showed that when the same number of annotated samples was used, active learning strategies could generate better classification models (best ALC – 0.7715) than the passive learning method (random sampling) (ALC – 0.7411). Moreover, to achieve the same classification performance, active learning strategies required fewer samples than the random sampling method. For example, to achieve an AUC of 0.79, the random sampling method used 32 samples, while our best active learning algorithm required only 12 samples, a reduction of 62.5% in manual annotation effort. PMID:22127105
Jordan, Alan; Rees, Tony; Gowlett-Holmes, Karen
2015-01-01
Imagery collected by still and video cameras is an increasingly important tool for minimal impact, repeatable observations in the marine environment. Data generated from imagery includes identification, annotation and quantification of biological subjects and environmental features within an image. To be long-lived and useful beyond their project-specific initial purpose, and to maximize their utility across studies and disciplines, marine imagery data should use a standardised vocabulary of defined terms. This would enable the compilation of regional, national and/or global data sets from multiple sources, contributing to broad-scale management studies and development of automated annotation algorithms. The classification scheme developed under the Collaborative and Automated Tools for Analysis of Marine Imagery (CATAMI) project provides such a vocabulary. The CATAMI classification scheme introduces Australian-wide acknowledged, standardised terminology for annotating benthic substrates and biota in marine imagery. It combines coarse-level taxonomy and morphology, and is a flexible, hierarchical classification that bridges the gap between habitat/biotope characterisation and taxonomy, acknowledging limitations when describing biological taxa through imagery. It is fully described, documented, and maintained through curated online databases, and can be applied across benthic image collection methods, annotation platforms and scoring methods. Following release in 2013, the CATAMI classification scheme was taken up by a wide variety of users, including government, academia and industry. This rapid acceptance highlights the scheme’s utility and the potential to facilitate broad-scale multidisciplinary studies of marine ecosystems when applied globally. Here we present the CATAMI classification scheme, describe its conception and features, and discuss its utility and the opportunities as well as challenges arising from its use. PMID:26509918
Tang, Yunwei; Jing, Linhai; Li, Hui; Liu, Qingjie; Yan, Qi; Li, Xiuxia
2016-11-22
This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large area of this place is covered by shadows in the image, and only a few sampled points derived were useful. In order to identify bamboos based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using the principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k -nearest neighbor ( k -NN) method produced the greatest accuracy. A geostatistically-weighted k -NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved 82.65% and 93.10% of the producer's and user's accuracies respectively for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboos given limited known samples, and the resulting bamboo distribution facilitates the assessments of the habitats of giant pandas.
Comparison of Classification Methods for Detecting Emotion from Mandarin Speech
NASA Astrophysics Data System (ADS)
Pao, Tsang-Long; Chen, Yu-Te; Yeh, Jun-Heng
It is said that technology comes out from humanity. What is humanity? The very definition of humanity is emotion. Emotion is the basis for all human expression and the underlying theme behind everything that is done, said, thought or imagined. Making computers being able to perceive and respond to human emotion, the human-computer interaction will be more natural. Several classifiers are adopted for automatically assigning an emotion category, such as anger, happiness or sadness, to a speech utterance. These classifiers were designed independently and tested on various emotional speech corpora, making it difficult to compare and evaluate their performance. In this paper, we first compared several popular classification methods and evaluated their performance by applying them to a Mandarin speech corpus consisting of five basic emotions, including anger, happiness, boredom, sadness and neutral. The extracted feature streams contain MFCC, LPCC, and LPC. The experimental results show that the proposed WD-MKNN classifier achieves an accuracy of 81.4% for the 5-class emotion recognition and outperforms other classification techniques, including KNN, MKNN, DW-KNN, LDA, QDA, GMM, HMM, SVM, and BPNN. Then, to verify the advantage of the proposed method, we compared these classifiers by applying them to another Mandarin expressive speech corpus consisting of two emotions. The experimental results still show that the proposed WD-MKNN outperforms others.
Derivation of an artificial gene to improve classification accuracy upon gene selection.
Seo, Minseok; Oh, Sejong
2012-02-01
Classification analysis has been developed continuously since 1936. This research field has advanced as a result of development of classifiers such as KNN, ANN, and SVM, as well as through data preprocessing areas. Feature (gene) selection is required for very high dimensional data such as microarray before classification work. The goal of feature selection is to choose a subset of informative features that reduces processing time and provides higher classification accuracy. In this study, we devised a method of artificial gene making (AGM) for microarray data to improve classification accuracy. Our artificial gene was derived from a whole microarray dataset, and combined with a result of gene selection for classification analysis. We experimentally confirmed a clear improvement of classification accuracy after inserting artificial gene. Our artificial gene worked well for popular feature (gene) selection algorithms and classifiers. The proposed approach can be applied to any type of high dimensional dataset. Copyright © 2011 Elsevier Ltd. All rights reserved.
Marker-Based Hierarchical Segmentation and Classification Approach for Hyperspectral Imagery
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Tilton, James C.; Benediktsson, Jon Atli; Chanussot, Jocelyn
2011-01-01
The Hierarchical SEGmentation (HSEG) algorithm, which is a combination of hierarchical step-wise optimization and spectral clustering, has given good performances for hyperspectral image analysis. This technique produces at its output a hierarchical set of image segmentations. The automated selection of a single segmentation level is often necessary. We propose and investigate the use of automatically selected markers for this purpose. In this paper, a novel Marker-based HSEG (M-HSEG) method for spectral-spatial classification of hyperspectral images is proposed. First, pixelwise classification is performed and the most reliably classified pixels are selected as markers, with the corresponding class labels. Then, a novel constrained marker-based HSEG algorithm is applied, resulting in a spectral-spatial classification map. The experimental results show that the proposed approach yields accurate segmentation and classification maps, and thus is attractive for hyperspectral image analysis.
Zhou, Zhen; Wang, Jian-Bao; Zang, Yu-Feng; Pan, Gang
2018-01-01
Classification approaches have been increasingly applied to differentiate patients and normal controls using resting-state functional magnetic resonance imaging data (RS-fMRI). Although most previous classification studies have reported promising accuracy within individual datasets, achieving high levels of accuracy with multiple datasets remains challenging for two main reasons: high dimensionality, and high variability across subjects. We used two independent RS-fMRI datasets (n = 31, 46, respectively) both with eyes closed (EC) and eyes open (EO) conditions. For each dataset, we first reduced the number of features to a small number of brain regions with paired t-tests, using the amplitude of low frequency fluctuation (ALFF) as a metric. Second, we employed a new method for feature extraction, named the PAIR method, examining EC and EO as paired conditions rather than independent conditions. Specifically, for each dataset, we obtained EC minus EO (EC—EO) maps of ALFF from half of subjects (n = 15 for dataset-1, n = 23 for dataset-2) and obtained EO—EC maps from the other half (n = 16 for dataset-1, n = 23 for dataset-2). A support vector machine (SVM) method was used for classification of EC RS-fMRI mapping and EO mapping. The mean classification accuracy of the PAIR method was 91.40% for dataset-1, and 92.75% for dataset-2 in the conventional frequency band of 0.01–0.08 Hz. For cross-dataset validation, we applied the classifier from dataset-1 directly to dataset-2, and vice versa. The mean accuracy of cross-dataset validation was 94.93% for dataset-1 to dataset-2 and 90.32% for dataset-2 to dataset-1 in the 0.01–0.08 Hz range. For the UNPAIR method, classification accuracy was substantially lower (mean 69.89% for dataset-1 and 82.97% for dataset-2), and was much lower for cross-dataset validation (64.69% for dataset-1 to dataset-2 and 64.98% for dataset-2 to dataset-1) in the 0.01–0.08 Hz range. In conclusion, for within-group design studies (e.g., paired conditions or follow-up studies), we recommend the PAIR method for feature extraction. In addition, dimensionality reduction with strong prior knowledge of specific brain regions should also be considered for feature selection in neuroimaging studies. PMID:29375288
An Evaluation of Feature Learning Methods for High Resolution Image Classification
NASA Astrophysics Data System (ADS)
Tokarczyk, P.; Montoya, J.; Schindler, K.
2012-07-01
Automatic image classification is one of the fundamental problems of remote sensing research. The classification problem is even more challenging in high-resolution images of urban areas, where the objects are small and heterogeneous. Two questions arise, namely which features to extract from the raw sensor data to capture the local radiometry and image structure at each pixel or segment, and which classification method to apply to the feature vectors. While classifiers are nowadays well understood, selecting the right features remains a largely empirical process. Here we concentrate on the features. Several methods are evaluated which allow one to learn suitable features from unlabelled image data by analysing the image statistics. In a comparative study, we evaluate unsupervised feature learning with different linear and non-linear learning methods, including principal component analysis (PCA) and deep belief networks (DBN). We also compare these automatically learned features with popular choices of ad-hoc features including raw intensity values, standard combinations like the NDVI, a few PCA channels, and texture filters. The comparison is done in a unified framework using the same images, the target classes, reference data and a Random Forest classifier.
Deep learning for brain tumor classification
NASA Astrophysics Data System (ADS)
Paul, Justin S.; Plassard, Andrew J.; Landman, Bennett A.; Fabbri, Daniel
2017-03-01
Recent research has shown that deep learning methods have performed well on supervised machine learning, image classification tasks. The purpose of this study is to apply deep learning methods to classify brain images with different tumor types: meningioma, glioma, and pituitary. A dataset was publicly released containing 3,064 T1-weighted contrast enhanced MRI (CE-MRI) brain images from 233 patients with either meningioma, glioma, or pituitary tumors split across axial, coronal, or sagittal planes. This research focuses on the 989 axial images from 191 patients in order to avoid confusing the neural networks with three different planes containing the same diagnosis. Two types of neural networks were used in classification: fully connected and convolutional neural networks. Within these two categories, further tests were computed via the augmentation of the original 512×512 axial images. Training neural networks over the axial data has proven to be accurate in its classifications with an average five-fold cross validation of 91.43% on the best trained neural network. This result demonstrates that a more general method (i.e. deep learning) can outperform specialized methods that require image dilation and ring-forming subregions on tumors.
Reduction in training time of a deep learning model in detection of lesions in CT
NASA Astrophysics Data System (ADS)
Makkinejad, Nazanin; Tajbakhsh, Nima; Zarshenas, Amin; Khokhar, Ashfaq; Suzuki, Kenji
2018-02-01
Deep learning (DL) emerged as a powerful tool for object detection and classification in medical images. Building a well-performing DL model, however, requires a huge number of images for training, and it takes days to train a DL model even on a cutting edge high-performance computing platform. This study is aimed at developing a method for selecting a "small" number of representative samples from a large collection of training samples to train a DL model for the could be used to detect polyps in CT colonography (CTC), without compromising the classification performance. Our proposed method for representative sample selection (RSS) consists of a K-means clustering algorithm. For the performance evaluation, we applied the proposed method to select samples for the training of a massive training artificial neural network based DL model, to be used for the classification of polyps and non-polyps in CTC. Our results show that the proposed method reduce the training time by a factor of 15, while maintaining the classification performance equivalent to the model trained using the full training set. We compare the performance using area under the receiveroperating- characteristic curve (AUC).
HMMBinder: DNA-Binding Protein Prediction Using HMM Profile Based Features.
Zaman, Rianon; Chowdhury, Shahana Yasmin; Rashid, Mahmood A; Sharma, Alok; Dehzangi, Abdollah; Shatabda, Swakkhar
2017-01-01
DNA-binding proteins often play important role in various processes within the cell. Over the last decade, a wide range of classification algorithms and feature extraction techniques have been used to solve this problem. In this paper, we propose a novel DNA-binding protein prediction method called HMMBinder. HMMBinder uses monogram and bigram features extracted from the HMM profiles of the protein sequences. To the best of our knowledge, this is the first application of HMM profile based features for the DNA-binding protein prediction problem. We applied Support Vector Machines (SVM) as a classification technique in HMMBinder. Our method was tested on standard benchmark datasets. We experimentally show that our method outperforms the state-of-the-art methods found in the literature.
Gene selection for tumor classification using neighborhood rough sets and entropy measures.
Chen, Yumin; Zhang, Zunjun; Zheng, Jianzhong; Ma, Ying; Xue, Yu
2017-03-01
With the development of bioinformatics, tumor classification from gene expression data becomes an important useful technology for cancer diagnosis. Since a gene expression data often contains thousands of genes and a small number of samples, gene selection from gene expression data becomes a key step for tumor classification. Attribute reduction of rough sets has been successfully applied to gene selection field, as it has the characters of data driving and requiring no additional information. However, traditional rough set method deals with discrete data only. As for the gene expression data containing real-value or noisy data, they are usually employed by a discrete preprocessing, which may result in poor classification accuracy. In this paper, we propose a novel gene selection method based on the neighborhood rough set model, which has the ability of dealing with real-value data whilst maintaining the original gene classification information. Moreover, this paper addresses an entropy measure under the frame of neighborhood rough sets for tackling the uncertainty and noisy of gene expression data. The utilization of this measure can bring about a discovery of compact gene subsets. Finally, a gene selection algorithm is designed based on neighborhood granules and the entropy measure. Some experiments on two gene expression data show that the proposed gene selection is an effective method for improving the accuracy of tumor classification. Copyright © 2017 Elsevier Inc. All rights reserved.
Strength in Numbers: Using Big Data to Simplify Sentiment Classification.
Filippas, Apostolos; Lappas, Theodoros
2017-09-01
Sentiment classification, the task of assigning a positive or negative label to a text segment, is a key component of mainstream applications such as reputation monitoring, sentiment summarization, and item recommendation. Even though the performance of sentiment classification methods has steadily improved over time, their ever-increasing complexity renders them comprehensible by only a shrinking minority of expert practitioners. For all others, such highly complex methods are black-box predictors that are hard to tune and even harder to justify to decision makers. Motivated by these shortcomings, we introduce BigCounter: a new algorithm for sentiment classification that substitutes algorithmic complexity with Big Data. Our algorithm combines standard data structures with statistical testing to deliver accurate and interpretable predictions. It is also parameter free and suitable for use virtually "out of the box," which makes it appealing for organizations wanting to leverage their troves of unstructured data without incurring the significant expense of creating in-house teams of data scientists. Finally, BigCounter's efficient and parallelizable design makes it applicable to very large data sets. We apply our method on such data sets toward a study on the limits of Big Data for sentiment classification. Our study finds that, after a certain point, predictive performance tends to converge and additional data have little benefit. Our algorithmic design and findings provide the foundations for future research on the data-over-computation paradigm for classification problems.
Systematic Model-in-the-Loop Test of Embedded Control Systems
NASA Astrophysics Data System (ADS)
Krupp, Alexander; Müller, Wolfgang
Current model-based development processes offer new opportunities for verification automation, e.g., in automotive development. The duty of functional verification is the detection of design flaws. Current functional verification approaches exhibit a major gap between requirement definition and formal property definition, especially when analog signals are involved. Besides lack of methodical support for natural language formalization, there does not exist a standardized and accepted means for formal property definition as a target for verification planning. This article addresses several shortcomings of embedded system verification. An Enhanced Classification Tree Method is developed based on the established Classification Tree Method for Embeded Systems CTM/ES which applies a hardware verification language to define a verification environment.
The Ex Vivo Eye Irritation Test as an alternative test method for serious eye damage/eye irritation.
Spöler, Felix; Kray, Oya; Kray, Stefan; Panfil, Claudia; Schrage, Norbert F
2015-07-01
Ocular irritation testing is a common requirement for the classification, labelling and packaging of chemicals (substances and mixtures). The in vivo Draize rabbit eye test (OECD Test Guideline 405) is considered to be the regulatory reference method for the classification of chemicals according to their potential to induce eye injury. In the Draize test, chemicals are applied to rabbit eyes in vivo, and changes are monitored over time. If no damage is observed, the chemical is not categorised. Otherwise, the classification depends on the severity and reversibility of the damage. Alternative test methods have to be designed to match the classifications from the in vivo reference method. However, observation of damage reversibility is usually not possible in vitro. Within the present study, a new organotypic method based on rabbit corneas obtained from food production is demonstrated to close this gap. The Ex Vivo Eye Irritation Test (EVEIT) retains the full biochemical activity of the corneal epithelium, epithelial stem cells and endothelium. This permits the in-depth analysis of ocular chemical trauma beyond that achievable by using established in vitro methods. In particular, the EVEIT is the first test to permit the direct monitoring of recovery of all corneal layers after damage. To develop a prediction model for the EVEIT that is comparable to the GHS system, 37 reference chemicals were analysed. The experimental data were used to derive a three-level potency ranking of eye irritation and corrosion that best fits the GHS categorisation. In vivo data available in the literature were used for comparison. When compared with GHS classification predictions, the overall accuracy of the three-level potency ranking was 78%. The classification of chemicals as irritating versus non-irritating resulted in 96% sensitivity, 91% specificity and 95% accuracy. 2015 FRAME.
Fault classification method for the driving safety of electrified vehicles
NASA Astrophysics Data System (ADS)
Wanner, Daniel; Drugge, Lars; Stensson Trigell, Annika
2014-05-01
A fault classification method is proposed which has been applied to an electric vehicle. Potential faults in the different subsystems that can affect the vehicle directional stability were collected in a failure mode and effect analysis. Similar driveline faults were grouped together if they resembled each other with respect to their influence on the vehicle dynamic behaviour. The faults were physically modelled in a simulation environment before they were induced in a detailed vehicle model under normal driving conditions. A special focus was placed on faults in the driveline of electric vehicles employing in-wheel motors of the permanent magnet type. Several failures caused by mechanical and other faults were analysed as well. The fault classification method consists of a controllability ranking developed according to the functional safety standard ISO 26262. The controllability of a fault was determined with three parameters covering the influence of the longitudinal, lateral and yaw motion of the vehicle. The simulation results were analysed and the faults were classified according to their controllability using the proposed method. It was shown that the controllability decreased specifically with increasing lateral acceleration and increasing speed. The results for the electric driveline faults show that this trend cannot be generalised for all the faults, as the controllability deteriorated for some faults during manoeuvres with low lateral acceleration and low speed. The proposed method is generic and can be applied to various other types of road vehicles and faults.
Applying Machine Learning to Star Cluster Classification
NASA Astrophysics Data System (ADS)
Fedorenko, Kristina; Grasha, Kathryn; Calzetti, Daniela; Mahadevan, Sridhar
2016-01-01
Catalogs describing populations of star clusters are essential in investigating a range of important issues, from star formation to galaxy evolution. Star cluster catalogs are typically created in a two-step process: in the first step, a catalog of sources is automatically produced; in the second step, each of the extracted sources is visually inspected by 3-to-5 human classifiers and assigned a category. Classification by humans is labor-intensive and time consuming, thus it creates a bottleneck, and substantially slows down progress in star cluster research.We seek to automate the process of labeling star clusters (the second step) through applying supervised machine learning techniques. This will provide a fast, objective, and reproducible classification. Our data is HST (WFC3 and ACS) images of galaxies in the distance range of 3.5-12 Mpc, with a few thousand star clusters already classified by humans as a part of the LEGUS (Legacy ExtraGalactic UV Survey) project. The classification is based on 4 labels (Class 1 - symmetric, compact cluster; Class 2 - concentrated object with some degree of asymmetry; Class 3 - multiple peak system, diffuse; and Class 4 - spurious detection). We start by looking at basic machine learning methods such as decision trees. We then proceed to evaluate performance of more advanced techniques, focusing on convolutional neural networks and other Deep Learning methods. We analyze the results, and suggest several directions for further improvement.
Towards the Optimal Pixel Size of dem for Automatic Mapping of Landslide Areas
NASA Astrophysics Data System (ADS)
Pawłuszek, K.; Borkowski, A.; Tarolli, P.
2017-05-01
Determining appropriate spatial resolution of digital elevation model (DEM) is a key step for effective landslide analysis based on remote sensing data. Several studies demonstrated that choosing the finest DEM resolution is not always the best solution. Various DEM resolutions can be applicable for diverse landslide applications. Thus, this study aims to assess the influence of special resolution on automatic landslide mapping. Pixel-based approach using parametric and non-parametric classification methods, namely feed forward neural network (FFNN) and maximum likelihood classification (ML), were applied in this study. Additionally, this allowed to determine the impact of used classification method for selection of DEM resolution. Landslide affected areas were mapped based on four DEMs generated at 1 m, 2 m, 5 m and 10 m spatial resolution from airborne laser scanning (ALS) data. The performance of the landslide mapping was then evaluated by applying landslide inventory map and computation of confusion matrix. The results of this study suggests that the finest scale of DEM is not always the best fit, however working at 1 m DEM resolution on micro-topography scale, can show different results. The best performance was found at 5 m DEM-resolution for FFNN and 1 m DEM resolution for results. The best performance was found to be using 5 m DEM-resolution for FFNN and 1 m DEM resolution for ML classification.
NASA Astrophysics Data System (ADS)
Baccar, D.; Söffker, D.
2017-11-01
Acoustic Emission (AE) is a suitable method to monitor the health of composite structures in real-time. However, AE-based failure mode identification and classification are still complex to apply due to the fact that AE waves are generally released simultaneously from all AE-emitting damage sources. Hence, the use of advanced signal processing techniques in combination with pattern recognition approaches is required. In this paper, AE signals generated from laminated carbon fiber reinforced polymer (CFRP) subjected to indentation test are examined and analyzed. A new pattern recognition approach involving a number of processing steps able to be implemented in real-time is developed. Unlike common classification approaches, here only CWT coefficients are extracted as relevant features. Firstly, Continuous Wavelet Transform (CWT) is applied to the AE signals. Furthermore, dimensionality reduction process using Principal Component Analysis (PCA) is carried out on the coefficient matrices. The PCA-based feature distribution is analyzed using Kernel Density Estimation (KDE) allowing the determination of a specific pattern for each fault-specific AE signal. Moreover, waveform and frequency content of AE signals are in depth examined and compared with fundamental assumptions reported in this field. A correlation between the identified patterns and failure modes is achieved. The introduced method improves the damage classification and can be used as a non-destructive evaluation tool.
Comparing Pattern Recognition Feature Sets for Sorting Triples in the FIRST Database
NASA Astrophysics Data System (ADS)
Proctor, D. D.
2006-07-01
Pattern recognition techniques have been used with increasing success for coping with the tremendous amounts of data being generated by automated surveys. Usually this process involves construction of training sets, the typical examples of data with known classifications. Given a feature set, along with the training set, statistical methods can be employed to generate a classifier. The classifier is then applied to process the remaining data. Feature set selection, however, is still an issue. This paper presents techniques developed for accommodating data for which a substantive portion of the training set cannot be classified unambiguously, a typical case for low-resolution data. Significance tests on the sort-ordered, sample-size-normalized vote distribution of an ensemble of decision trees is introduced as a method of evaluating relative quality of feature sets. The technique is applied to comparing feature sets for sorting a particular radio galaxy morphology, bent-doubles, from the Faint Images of the Radio Sky at Twenty Centimeters (FIRST) database. Also examined are alternative functional forms for feature sets. Associated standard deviations provide the means to evaluate the effect of the number of folds, the number of classifiers per fold, and the sample size on the resulting classifications. The technique also may be applied to situations for which, although accurate classifications are available, the feature set is clearly inadequate, but is desired nonetheless to make the best of available information.
Wang, Qian; Liu, Zhen; Ziegler, Sibylle I; Shi, Kuangyu
2015-07-07
Position-sensitive positron cameras using silicon pixel detectors have been applied for some preclinical and intraoperative clinical applications. However, the spatial resolution of a positron camera is limited by positron multiple scattering in the detector. An incident positron may fire a number of successive pixels on the imaging plane. It is still impossible to capture the primary fired pixel along a particle trajectory by hardware or to perceive the pixel firing sequence by direct observation. Here, we propose a novel data-driven method to improve the spatial resolution by classifying the primary pixels within the detector using support vector machine. A classification model is constructed by learning the features of positron trajectories based on Monte-Carlo simulations using Geant4. Topological and energy features of pixels fired by (18)F positrons were considered for the training and classification. After applying the classification model on measurements, the primary fired pixels of the positron tracks in the silicon detector were estimated. The method was tested and assessed for [(18)F]FDG imaging of an absorbing edge protocol and a leaf sample. The proposed method improved the spatial resolution from 154.6 ± 4.2 µm (energy weighted centroid approximation) to 132.3 ± 3.5 µm in the absorbing edge measurements. For the positron imaging of a leaf sample, the proposed method achieved lower root mean square error relative to phosphor plate imaging, and higher similarity with the reference optical image. The improvements of the preliminary results support further investigation of the proposed algorithm for the enhancement of positron imaging in clinical and preclinical applications.
NASA Astrophysics Data System (ADS)
Wang, Qian; Liu, Zhen; Ziegler, Sibylle I.; Shi, Kuangyu
2015-07-01
Position-sensitive positron cameras using silicon pixel detectors have been applied for some preclinical and intraoperative clinical applications. However, the spatial resolution of a positron camera is limited by positron multiple scattering in the detector. An incident positron may fire a number of successive pixels on the imaging plane. It is still impossible to capture the primary fired pixel along a particle trajectory by hardware or to perceive the pixel firing sequence by direct observation. Here, we propose a novel data-driven method to improve the spatial resolution by classifying the primary pixels within the detector using support vector machine. A classification model is constructed by learning the features of positron trajectories based on Monte-Carlo simulations using Geant4. Topological and energy features of pixels fired by 18F positrons were considered for the training and classification. After applying the classification model on measurements, the primary fired pixels of the positron tracks in the silicon detector were estimated. The method was tested and assessed for [18F]FDG imaging of an absorbing edge protocol and a leaf sample. The proposed method improved the spatial resolution from 154.6 ± 4.2 µm (energy weighted centroid approximation) to 132.3 ± 3.5 µm in the absorbing edge measurements. For the positron imaging of a leaf sample, the proposed method achieved lower root mean square error relative to phosphor plate imaging, and higher similarity with the reference optical image. The improvements of the preliminary results support further investigation of the proposed algorithm for the enhancement of positron imaging in clinical and preclinical applications.
Identification of memory reactivation during sleep by EEG classification.
Belal, Suliman; Cousins, James; El-Deredy, Wael; Parkes, Laura; Schneider, Jules; Tsujimura, Hikaru; Zoumpoulaki, Alexia; Perapoch, Marta; Santamaria, Lorena; Lewis, Penelope
2018-04-17
Memory reactivation during sleep is critical for consolidation, but also extremely difficult to measure as it is subtle, distributed and temporally unpredictable. This article reports a novel method for detecting such reactivation in standard sleep recordings. During learning, participants produced a complex sequence of finger presses, with each finger cued by a distinct audio-visual stimulus. Auditory cues were then re-played during subsequent sleep to trigger neural reactivation through a method known as targeted memory reactivation (TMR). Next, we used electroencephalography data from the learning session to train a machine learning classifier, and then applied this classifier to sleep data to determine how successfully each tone had elicited memory reactivation. Neural reactivation was classified above chance in all participants when TMR was applied in SWS, and in 5 of the 14 participants to whom TMR was applied in N2. Classification success reduced across numerous repetitions of the tone cue, suggesting either a gradually reducing responsiveness to such cues or a plasticity-related change in the neural signature as a result of cueing. We believe this method will be valuable for future investigations of memory consolidation. Copyright © 2018 Elsevier Inc. All rights reserved.
Wisaijohn, Thunthita; Pimkhaokham, Atiphan; Lapying, Phenkhae; Itthichaisri, Chumpot; Pannarunothai, Supasit; Igarashi, Isao; Kawabuchi, Koichi
2010-01-01
This study aimed to develop a new casemix classification system as an alternative method for the budget allocation of oral healthcare service (OHCS). Initially, the International Statistical of Diseases and Related Health Problem, 10th revision, Thai Modification (ICD-10-TM) related to OHCS was used for developing the software “Grouper”. This model was designed to allow the translation of dental procedures into eight-digit codes. Multiple regression analysis was used to analyze the relationship between the factors used for developing the model and the resource consumption. Furthermore, the coefficient of variance, reduction in variance, and relative weight (RW) were applied to test the validity. The results demonstrated that 1,624 OHCS classifications, according to the diagnoses and the procedures performed, showed high homogeneity within groups and heterogeneity between groups. Moreover, the RW of the OHCS could be used to predict and control the production costs. In conclusion, this new OHCS casemix classification has a potential use in a global decision making. PMID:20936134
Wisaijohn, Thunthita; Pimkhaokham, Atiphan; Lapying, Phenkhae; Itthichaisri, Chumpot; Pannarunothai, Supasit; Igarashi, Isao; Kawabuchi, Koichi
2010-01-01
This study aimed to develop a new casemix classification system as an alternative method for the budget allocation of oral healthcare service (OHCS). Initially, the International Statistical of Diseases and Related Health Problem, 10th revision, Thai Modification (ICD-10-TM) related to OHCS was used for developing the software "Grouper". This model was designed to allow the translation of dental procedures into eight-digit codes. Multiple regression analysis was used to analyze the relationship between the factors used for developing the model and the resource consumption. Furthermore, the coefficient of variance, reduction in variance, and relative weight (RW) were applied to test the validity. The results demonstrated that 1,624 OHCS classifications, according to the diagnoses and the procedures performed, showed high homogeneity within groups and heterogeneity between groups. Moreover, the RW of the OHCS could be used to predict and control the production costs. In conclusion, this new OHCS casemix classification has a potential use in a global decision making.
Climate of Hungary in the twentieth century according to Feddema
NASA Astrophysics Data System (ADS)
Ács, Ferenc; Breuer, Hajnalka; Skarbit, Nóra
2015-01-01
Feddema's (Physical Geography 26:442-466, 2005) bioclimatic classification scheme is applied to Hungary for the twentieth century using the Climatic Research Unit (CRU) data series. The method is tested in two modes. In the first, its original form is used which is suitable for global scale analysis. In the second, the criteria used in the method are slightly modified for mesoscale classification purposes. In both versions, potential evapotranspiration (PET) is calculated using McKenney and Rosenberg's (Meteorol 64:81-110, 1993) formula. We showed that McKenney and Rosenberg's formula could be applied to Hungary. According to Feddema's global scale application, local climates of the three main geographical regions, the Great Hungarian Plain, the North Hungarian Mountains, and Transdanubia, can be distinguished. However, the spatial distribution pattern within the regions is poorly reproduced, if at all. According to Feddema's mesoscale application, a picture of climatic subregions could be observed.
Pollettini, Juliana T; Panico, Sylvia R G; Daneluzzi, Julio C; Tinós, Renato; Baranauskas, José A; Macedo, Alessandra A
2012-12-01
Surveillance Levels (SLs) are categories for medical patients (used in Brazil) that represent different types of medical recommendations. SLs are defined according to risk factors and the medical and developmental history of patients. Each SL is associated with specific educational and clinical measures. The objective of the present paper was to verify computer-aided, automatic assignment of SLs. The present paper proposes a computer-aided approach for automatic recommendation of SLs. The approach is based on the classification of information from patient electronic records. For this purpose, a software architecture composed of three layers was developed. The architecture is formed by a classification layer that includes a linguistic module and machine learning classification modules. The classification layer allows for the use of different classification methods, including the use of preprocessed, normalized language data drawn from the linguistic module. We report the verification and validation of the software architecture in a Brazilian pediatric healthcare institution. The results indicate that selection of attributes can have a great effect on the performance of the system. Nonetheless, our automatic recommendation of surveillance level can still benefit from improvements in processing procedures when the linguistic module is applied prior to classification. Results from our efforts can be applied to different types of medical systems. The results of systems supported by the framework presented in this paper may be used by healthcare and governmental institutions to improve healthcare services in terms of establishing preventive measures and alerting authorities about the possibility of an epidemic.
A novel deep learning approach for classification of EEG motor imagery signals.
Tabar, Yousef Rezaei; Halici, Ugur
2017-02-01
Signal classification is an important issue in brain computer interface (BCI) systems. Deep learning approaches have been used successfully in many recent studies to learn features and classify different types of data. However, the number of studies that employ these approaches on BCI applications is very limited. In this study we aim to use deep learning methods to improve classification performance of EEG motor imagery signals. In this study we investigate convolutional neural networks (CNN) and stacked autoencoders (SAE) to classify EEG Motor Imagery signals. A new form of input is introduced to combine time, frequency and location information extracted from EEG signal and it is used in CNN having one 1D convolutional and one max-pooling layers. We also proposed a new deep network by combining CNN and SAE. In this network, the features that are extracted in CNN are classified through the deep network SAE. The classification performance obtained by the proposed method on BCI competition IV dataset 2b in terms of kappa value is 0.547. Our approach yields 9% improvement over the winner algorithm of the competition. Our results show that deep learning methods provide better classification performance compared to other state of art approaches. These methods can be applied successfully to BCI systems where the amount of data is large due to daily recording.
Aided diagnosis methods of breast cancer based on machine learning
NASA Astrophysics Data System (ADS)
Zhao, Yue; Wang, Nian; Cui, Xiaoyu
2017-08-01
In the field of medicine, quickly and accurately determining whether the patient is malignant or benign is the key to treatment. In this paper, K-Nearest Neighbor, Linear Discriminant Analysis, Logistic Regression were applied to predict the classification of thyroid,Her-2,PR,ER,Ki67,metastasis and lymph nodes in breast cancer, in order to recognize the benign and malignant breast tumors and achieve the purpose of aided diagnosis of breast cancer. The results showed that the highest classification accuracy of LDA was 88.56%, while the classification effect of KNN and Logistic Regression were better than that of LDA, the best accuracy reached 96.30%.
NASA Astrophysics Data System (ADS)
Wang, Le
2003-10-01
Modern forest management poses an increasing need for detailed knowledge of forest information at different spatial scales. At the forest level, the information for tree species assemblage is desired whereas at or below the stand level, individual tree related information is preferred. Remote Sensing provides an effective tool to extract the above information at multiple spatial scales in the continuous time domain. To date, the increasing volume and readily availability of high-spatial-resolution data have lead to a much wider application of remotely sensed products. Nevertheless, to make effective use of the improving spatial resolution, conventional pixel-based classification methods are far from satisfactory. Correspondingly, developing object-based methods becomes a central challenge for researchers in the field of Remote Sensing. This thesis focuses on the development of methods for accurate individual tree identification and tree species classification. We develop a method in which individual tree crown boundaries and treetop locations are derived under a unified framework. We apply a two-stage approach with edge detection followed by marker-controlled watershed segmentation. Treetops are modeled from radiometry and geometry aspects. Specifically, treetops are assumed to be represented by local radiation maxima and to be located near the center of the tree-crown. As a result, a marker image was created from the derived treetop to guide a watershed segmentation to further differentiate overlapping trees and to produce a segmented image comprised of individual tree crowns. The image segmentation method developed achieves a promising result for a 256 x 256 CASI image. Then further effort is made to extend our methods to the multiscales which are constructed from a wavelet decomposition. A scale consistency and geometric consistency are designed to examine the gradients along the scale-space for the purpose of separating true crown boundary from unwanted textures occurring due to branches and twigs. As a result from the inverse wavelet transform, the tree crown boundary is enhanced while the unwanted textures are suppressed. Based on the enhanced image, an improvement is achieved when applying the two-stage methods to a high resolution aerial photograph. To improve tree species classification, we develop a new method to choose the optimal scale parameter with the aid of Bhattacharya Distance (BD), a well-known index of class separability in traditional pixel-based classification. The optimal scale parameter is then fed in the process of a region-growing-based segmentation as a break-off value. Our object classification achieves a better accuracy in separating tree species when compared to the conventional Maximum Likelihood Classification (MLC). In summary, we develop two object-based methods for identifying individual trees and classifying tree species from high-spatial resolution imagery. Both methods achieve promising results and will promote integration of Remote Sensing and GIS in forest applications.
Handling Imbalanced Data Sets in Multistage Classification
NASA Astrophysics Data System (ADS)
López, M.
Multistage classification is a logical approach, based on a divide-and-conquer solution, for dealing with problems with a high number of classes. The classification problem is divided into several sequential steps, each one associated to a single classifier that works with subgroups of the original classes. In each level, the current set of classes is split into smaller subgroups of classes until they (the subgroups) are composed of only one class. The resulting chain of classifiers can be represented as a tree, which (1) simplifies the classification process by using fewer categories in each classifier and (2) makes it possible to combine several algorithms or use different attributes in each stage. Most of the classification algorithms can be biased in the sense of selecting the most populated class in overlapping areas of the input space. This can degrade a multistage classifier performance if the training set sample frequencies do not reflect the real prevalence in the population. Several techniques such as applying prior probabilities, assigning weights to the classes, or replicating instances have been developed to overcome this handicap. Most of them are designed for two-class (accept-reject) problems. In this article, we evaluate several of these techniques as applied to multistage classification and analyze how they can be useful for astronomy. We compare the results obtained by classifying a data set based on Hipparcos with and without these methods.
Prostate segmentation by sparse representation based classification
Gao, Yaozong; Liao, Shu; Shen, Dinggang
2012-01-01
Purpose: The segmentation of prostate in CT images is of essential importance to external beam radiotherapy, which is one of the major treatments for prostate cancer nowadays. During the radiotherapy, the prostate is radiated by high-energy x rays from different directions. In order to maximize the dose to the cancer and minimize the dose to the surrounding healthy tissues (e.g., bladder and rectum), the prostate in the new treatment image needs to be accurately localized. Therefore, the effectiveness and efficiency of external beam radiotherapy highly depend on the accurate localization of the prostate. However, due to the low contrast of the prostate with its surrounding tissues (e.g., bladder), the unpredicted prostate motion, and the large appearance variations across different treatment days, it is challenging to segment the prostate in CT images. In this paper, the authors present a novel classification based segmentation method to address these problems. Methods: To segment the prostate, the proposed method first uses sparse representation based classification (SRC) to enhance the prostate in CT images by pixel-wise classification, in order to overcome the limitation of poor contrast of the prostate images. Then, based on the classification results, previous segmented prostates of the same patient are used as patient-specific atlases to align onto the current treatment image and the majority voting strategy is finally adopted to segment the prostate. In order to address the limitations of the traditional SRC in pixel-wise classification, especially for the purpose of segmentation, the authors extend SRC from the following four aspects: (1) A discriminant subdictionary learning method is proposed to learn a discriminant and compact representation of training samples for each class so that the discriminant power of SRC can be increased and also SRC can be applied to the large-scale pixel-wise classification. (2) The L1 regularized sparse coding is replaced by the elastic net in order to obtain a smooth and clear prostate boundary in the classification result. (3) Residue-based linear regression is incorporated to improve the classification performance and to extend SRC from hard classification to soft classification. (4) Iterative SRC is proposed by using context information to iteratively refine the classification results. Results: The proposed method has been comprehensively evaluated on a dataset consisting of 330 CT images from 24 patients. The effectiveness of the extended SRC has been validated by comparing it with the traditional SRC based on the proposed four extensions. The experimental results show that our extended SRC can obtain not only more accurate classification results but also smoother and clearer prostate boundary than the traditional SRC. Besides, the comparison with other five state-of-the-art prostate segmentation methods indicates that our method can achieve better performance than other methods under comparison. Conclusions: The authors have proposed a novel prostate segmentation method based on the sparse representation based classification, which can achieve considerably accurate segmentation results in CT prostate segmentation. PMID:23039673
Analysis on the Utility of Satellite Imagery for Detection of Agricultural Facility
NASA Astrophysics Data System (ADS)
Kang, J.-M.; Baek, S.-H.; Jung, K.-Y.
2012-07-01
Now that the agricultural facilities are being increase owing to development of technology and diversification of agriculture and the ratio of garden crops that are imported a lot and the crops cultivated in facilities are raised in Korea, the number of vinyl greenhouses is tending upward. So, it is important to grasp the distribution of vinyl greenhouses as much as that of rice fields, dry fields and orchards, but it is difficult to collect the information of wide areas economically and correctly. Remote sensing using satellite imagery is able to obtain data of wide area at the same time, quickly and cost-effectively collect, monitor and analyze information from every object on earth. In this study, in order to analyze the utilization of satellite imagery at detection of agricultural facility, image classification was performed about the agricultural facility, vinyl greenhouse using Formosat-2 satellite imagery. The training set of sea, vegetation, building, bare ground and vinyl greenhouse was set to monitor the agricultural facilities of the object area and the training set for the vinyl greenhouses that are main monitoring object was classified and set again into 3 types according the spectral characteristics. The image classification using 4 kinds of supervise classification methods applied by the same training set were carried out to grasp the image classification method which is effective for monitoring agricultural facilities. And, in order to minimize the misclassification appeared in the classification using the spectral information, the accuracy of classification was intended to be raised by adding texture information. The results of classification were analyzed regarding the accuracy comparing with that of naked-eyed detection. As the results of classification, the method of Mahalanobis distance was shown as more efficient than other methods and the accuracy of classification was higher when adding texture information. Hence the more effective monitoring of agricultural facilities is expected to be available if the characteristics such as texture information including satellite images or spatial pattern are studied in detail.
NASA Technical Reports Server (NTRS)
Spruce, J. P.; Smoot, James; Ellis, Jean; Hilbert, Kent; Swann, Roberta
2012-01-01
This paper discusses the development and implementation of a geospatial data processing method and multi-decadal Landsat time series for computing general coastal U.S. land-use and land-cover (LULC) classifications and change products consisting of seven classes (water, barren, upland herbaceous, non-woody wetland, woody upland, woody wetland, and urban). Use of this approach extends the observational period of the NOAA-generated Coastal Change and Analysis Program (C-CAP) products by almost two decades, assuming the availability of one cloud free Landsat scene from any season for each targeted year. The Mobile Bay region in Alabama was used as a study area to develop, demonstrate, and validate the method that was applied to derive LULC products for nine dates at approximate five year intervals across a 34-year time span, using single dates of data for each classification in which forests were either leaf-on, leaf-off, or mixed senescent conditions. Classifications were computed and refined using decision rules in conjunction with unsupervised classification of Landsat data and C-CAP value-added products. Each classification's overall accuracy was assessed by comparing stratified random locations to available reference data, including higher spatial resolution satellite and aerial imagery, field survey data, and raw Landsat RGBs. Overall classification accuracies ranged from 83 to 91% with overall Kappa statistics ranging from 0.78 to 0.89. The accuracies are comparable to those from similar, generalized LULC products derived from C-CAP data. The Landsat MSS-based LULC product accuracies are similar to those from Landsat TM or ETM+ data. Accurate classifications were computed for all nine dates, yielding effective results regardless of season. This classification method yielded products that were used to compute LULC change products via additive GIS overlay techniques.
Composite ultrasound imaging apparatus and method
Morimoto, Alan K.; Bow, Jr., Wallace J.; Strong, David Scott; Dickey, Fred M.
1998-01-01
An imaging apparatus and method for use in presenting composite two dimensional and three dimensional images from individual ultrasonic frames. A cross-sectional reconstruction is applied by using digital ultrasound frames, transducer orientation and a known center. Motion compensation, rank value filtering, noise suppression and tissue classification are utilized to optimize the composite image.
Composite ultrasound imaging apparatus and method
Morimoto, A.K.; Bow, W.J. Jr.; Strong, D.S.; Dickey, F.M.
1998-09-15
An imaging apparatus and method for use in presenting composite two dimensional and three dimensional images from individual ultrasonic frames. A cross-sectional reconstruction is applied by using digital ultrasound frames, transducer orientation and a known center. Motion compensation, rank value filtering, noise suppression and tissue classification are utilized to optimize the composite image. 37 figs.
Pang, Junbiao; Qin, Lei; Zhang, Chunjie; Zhang, Weigang; Huang, Qingming; Yin, Baocai
2015-12-01
Local coordinate coding (LCC) is a framework to approximate a Lipschitz smooth function by combining linear functions into a nonlinear one. For locally linear classification, LCC requires a coding scheme that heavily determines the nonlinear approximation ability, posing two main challenges: 1) the locality making faraway anchors have smaller influences on current data and 2) the flexibility balancing well between the reconstruction of current data and the locality. In this paper, we address the problem from the theoretical analysis of the simplest local coding schemes, i.e., local Gaussian coding and local student coding, and propose local Laplacian coding (LPC) to achieve the locality and the flexibility. We apply LPC into locally linear classifiers to solve diverse classification tasks. The comparable or exceeded performances of state-of-the-art methods demonstrate the effectiveness of the proposed method.
Clustering performance comparison using K-means and expectation maximization algorithms.
Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
2014-11-14
Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K -means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K -means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
Surface Water Detection Using Fused Synthetic Aperture Radar, Airborne LiDAR and Optical Imagery
NASA Astrophysics Data System (ADS)
Braun, A.; Irwin, K.; Beaulne, D.; Fotopoulos, G.; Lougheed, S. C.
2016-12-01
Each remote sensing technique has its unique set of strengths and weaknesses, but by combining techniques the classification accuracy can be increased. The goal of this project is to underline the strengths and weaknesses of Synthetic Aperture Radar (SAR), LiDAR and optical imagery data and highlight the opportunities where integration of the three data types can increase the accuracy of identifying water in a principally natural landscape. The study area is located at the Queen's University Biological Station, Ontario, Canada. TerraSAR-X (TSX) data was acquired between April and July 2016, consisting of four single polarization (HH) staring spotlight mode backscatter intensity images. Grey-level thresholding is used to extract surface water bodies, before identifying and masking zones of radar shadow and layover by using LiDAR elevation models to estimate the canopy height and applying simple geometry algorithms. The airborne LiDAR survey was conducted in June 2014, resulting in a discrete return dataset with a density of 1 point/m2. Radiometric calibration to correct for range and incidence angle is applied, before classifying the points as water or land based on corrected intensity, elevation, roughness, and intensity density. Panchromatic and multispectral (4-band) imagery from Quickbird was collected in September 2005 at spatial resolutions of 0.6m and 2.5m respectively. Pixel-based classification is applied to identify and distinguish water bodies from land. A classification system which inputs SAR-, LiDAR- and optically-derived water presence models in raster formats is developed to exploit the strengths and weaknesses of each technique. The total percentage of water detected in the sample area for SAR backscatter, LiDAR intensity, and optical imagery was 27%, 19% and 18% respectively. The output matrix of the classification system indicates that in over 72% of the study area all three methods agree on the classification. Analysis was specifically targeted towards areas where the methods disagree, highlighting how each technique should be properly weighted over these areas to increase the classification accuracy of water. The conclusions and techniques developed in this study are applicable to other areas where similar environmental conditions and data availability exist.
Wendel, Jochen; Buttenfield, Barbara P.; Stanislawski, Larry V.
2016-01-01
Knowledge of landscape type can inform cartographic generalization of hydrographic features, because landscape characteristics provide an important geographic context that affects variation in channel geometry, flow pattern, and network configuration. Landscape types are characterized by expansive spatial gradients, lacking abrupt changes between adjacent classes; and as having a limited number of outliers that might confound classification. The US Geological Survey (USGS) is exploring methods to automate generalization of features in the National Hydrography Data set (NHD), to associate specific sequences of processing operations and parameters with specific landscape characteristics, thus obviating manual selection of a unique processing strategy for every NHD watershed unit. A chronology of methods to delineate physiographic regions for the United States is described, including a recent maximum likelihood classification based on seven input variables. This research compares unsupervised and supervised algorithms applied to these seven input variables, to evaluate and possibly refine the recent classification. Evaluation metrics for unsupervised methods include the Davies–Bouldin index, the Silhouette index, and the Dunn index as well as quantization and topographic error metrics. Cross validation and misclassification rate analysis are used to evaluate supervised classification methods. The paper reports the comparative analysis and its impact on the selection of landscape regions. The compared solutions show problems in areas of high landscape diversity. There is some indication that additional input variables, additional classes, or more sophisticated methods can refine the existing classification.
Fast and Robust Segmentation and Classification for Change Detection in Urban Point Clouds
NASA Astrophysics Data System (ADS)
Roynard, X.; Deschaud, J.-E.; Goulette, F.
2016-06-01
Change detection is an important issue in city monitoring to analyse street furniture, road works, car parking, etc. For example, parking surveys are needed but are currently a laborious task involving sending operators in the streets to identify the changes in car locations. In this paper, we propose a method that performs a fast and robust segmentation and classification of urban point clouds, that can be used for change detection. We apply this method to detect the cars, as a particular object class, in order to perform parking surveys automatically. A recently proposed method already addresses the need for fast segmentation and classification of urban point clouds, using elevation images. The interest to work on images is that processing is much faster, proven and robust. However there may be a loss of information in complex 3D cases: for example when objects are one above the other, typically a car under a tree or a pedestrian under a balcony. In this paper we propose a method that retain the three-dimensional information while preserving fast computation times and improving segmentation and classification accuracy. It is based on fast region-growing using an octree, for the segmentation, and specific descriptors with Random-Forest for the classification. Experiments have been performed on large urban point clouds acquired by Mobile Laser Scanning. They show that the method is as fast as the state of the art, and that it gives more robust results in the complex 3D cases.
NASA Astrophysics Data System (ADS)
Iabchoon, Sanwit; Wongsai, Sangdao; Chankon, Kanoksuk
2017-10-01
Land use and land cover (LULC) data are important to monitor and assess environmental change. LULC classification using satellite images is a method widely used on a global and local scale. Especially, urban areas that have various LULC types are important components of the urban landscape and ecosystem. This study aims to classify urban LULC using WorldView-3 (WV-3) very high-spatial resolution satellite imagery and the object-based image analysis method. A decision rules set was applied to classify the WV-3 images in Kathu subdistrict, Phuket province, Thailand. The main steps were as follows: (1) the image was ortho-rectified with ground control points and using the digital elevation model, (2) multiscale image segmentation was applied to divide the image pixel level into image object level, (3) development of the decision ruleset for LULC classification using spectral bands, spectral indices, spatial and contextual information, and (4) accuracy assessment was computed using testing data, which sampled by statistical random sampling. The results show that seven LULC classes (water, vegetation, open space, road, residential, building, and bare soil) were successfully classified with overall classification accuracy of 94.14% and a kappa coefficient of 92.91%.
Overlapped Partitioning for Ensemble Classifiers of P300-Based Brain-Computer Interfaces
Onishi, Akinari; Natsume, Kiyohisa
2014-01-01
A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance. PMID:24695550
Overlapped partitioning for ensemble classifiers of P300-based brain-computer interfaces.
Onishi, Akinari; Natsume, Kiyohisa
2014-01-01
A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance.
Improving condition severity classification with an efficient active learning based framework
Nissim, Nir; Boland, Mary Regina; Tatonetti, Nicholas P.; Elovici, Yuval; Hripcsak, George; Shahar, Yuval; Moskovitch, Robert
2017-01-01
Classification of condition severity can be useful for discriminating among sets of conditions or phenotypes, for example when prioritizing patient care or for other healthcare purposes. Electronic Health Records (EHRs) represent a rich source of labeled information that can be harnessed for severity classification. The labeling of EHRs is expensive and in many cases requires employing professionals with high level of expertise. In this study, we demonstrate the use of Active Learning (AL) techniques to decrease expert labeling efforts. We employ three AL methods and demonstrate their ability to reduce labeling efforts while effectively discriminating condition severity. We incorporate three AL methods into a new framework based on the original CAESAR (Classification Approach for Extracting Severity Automatically from Electronic Health Records) framework to create the Active Learning Enhancement framework (CAESAR-ALE). We applied CAESAR-ALE to a dataset containing 516 conditions of varying severity levels that were manually labeled by seven experts. Our dataset, called the “CAESAR dataset,” was created from the medical records of 1.9 million patients treated at Columbia University Medical Center (CUMC). All three AL methods decreased labelers’ efforts compared to the learning methods applied by the original CAESER framework in which the classifier was trained on the entire set of conditions; depending on the AL strategy used in the current study, the reduction ranged from 48% to 64% that can result in significant savings, both in time and money. As for the PPV (precision) measure, CAESAR-ALE achieved more than 13% absolute improvement in the predictive capabilities of the framework when classifying conditions as severe. These results demonstrate the potential of AL methods to decrease the labeling efforts of medical experts, while increasing accuracy given the same (or even a smaller) number of acquired conditions. We also demonstrated that the methods included in the CAESAR-ALE framework (Exploitation and Combination_XA) are more robust to the use of human labelers with different levels of professional expertise. PMID:27016383
Improving condition severity classification with an efficient active learning based framework.
Nissim, Nir; Boland, Mary Regina; Tatonetti, Nicholas P; Elovici, Yuval; Hripcsak, George; Shahar, Yuval; Moskovitch, Robert
2016-06-01
Classification of condition severity can be useful for discriminating among sets of conditions or phenotypes, for example when prioritizing patient care or for other healthcare purposes. Electronic Health Records (EHRs) represent a rich source of labeled information that can be harnessed for severity classification. The labeling of EHRs is expensive and in many cases requires employing professionals with high level of expertise. In this study, we demonstrate the use of Active Learning (AL) techniques to decrease expert labeling efforts. We employ three AL methods and demonstrate their ability to reduce labeling efforts while effectively discriminating condition severity. We incorporate three AL methods into a new framework based on the original CAESAR (Classification Approach for Extracting Severity Automatically from Electronic Health Records) framework to create the Active Learning Enhancement framework (CAESAR-ALE). We applied CAESAR-ALE to a dataset containing 516 conditions of varying severity levels that were manually labeled by seven experts. Our dataset, called the "CAESAR dataset," was created from the medical records of 1.9 million patients treated at Columbia University Medical Center (CUMC). All three AL methods decreased labelers' efforts compared to the learning methods applied by the original CAESER framework in which the classifier was trained on the entire set of conditions; depending on the AL strategy used in the current study, the reduction ranged from 48% to 64% that can result in significant savings, both in time and money. As for the PPV (precision) measure, CAESAR-ALE achieved more than 13% absolute improvement in the predictive capabilities of the framework when classifying conditions as severe. These results demonstrate the potential of AL methods to decrease the labeling efforts of medical experts, while increasing accuracy given the same (or even a smaller) number of acquired conditions. We also demonstrated that the methods included in the CAESAR-ALE framework (Exploitation and Combination_XA) are more robust to the use of human labelers with different levels of professional expertise. Copyright © 2016 Elsevier Inc. All rights reserved.
Cacha, L A; Parida, S; Dehuri, S; Cho, S-B; Poznanski, R R
2016-12-01
The huge number of voxels in fMRI over time poses a major challenge to for effective analysis. Fast, accurate, and reliable classifiers are required for estimating the decoding accuracy of brain activities. Although machine-learning classifiers seem promising, individual classifiers have their own limitations. To address this limitation, the present paper proposes a method based on the ensemble of neural networks to analyze fMRI data for cognitive state classification for application across multiple subjects. Similarly, the fuzzy integral (FI) approach has been employed as an efficient tool for combining different classifiers. The FI approach led to the development of a classifiers ensemble technique that performs better than any of the single classifier by reducing the misclassification, the bias, and the variance. The proposed method successfully classified the different cognitive states for multiple subjects with high accuracy of classification. Comparison of the performance improvement, while applying ensemble neural networks method, vs. that of the individual neural network strongly points toward the usefulness of the proposed method.
Document image improvement for OCR as a classification problem
NASA Astrophysics Data System (ADS)
Summers, Kristen M.
2003-01-01
In support of the goal of automatically selecting methods of enhancing an image to improve the accuracy of OCR on that image, we consider the problem of determining whether to apply each of a set of methods as a supervised classification problem for machine learning. We characterize each image according to a combination of two sets of measures: a set that are intended to reflect the degree of particular types of noise present in documents in a single font of Roman or similar script and a more general set based on connected component statistics. We consider several potential methods of image improvement, each of which constitutes its own 2-class classification problem, according to whether transforming the image with this method improves the accuracy of OCR. In our experiments, the results varied for the different image transformation methods, but the system made the correct choice in 77% of the cases in which the decision affected the OCR score (in the range [0,1]) by at least .01, and it made the correct choice 64% of the time overall.
Active Learning of Classification Models with Likert-Scale Feedback.
Xue, Yanbing; Hauskrecht, Milos
2017-01-01
Annotation of classification data by humans can be a time-consuming and tedious process. Finding ways of reducing the annotation effort is critical for building the classification models in practice and for applying them to a variety of classification tasks. In this paper, we develop a new active learning framework that combines two strategies to reduce the annotation effort. First, it relies on label uncertainty information obtained from the human in terms of the Likert-scale feedback. Second, it uses active learning to annotate examples with the greatest expected change. We propose a Bayesian approach to calculate the expectation and an incremental SVM solver to reduce the time complexity of the solvers. We show the combination of our active learning strategy and the Likert-scale feedback can learn classification models more rapidly and with a smaller number of labeled instances than methods that rely on either Likert-scale labels or active learning alone.
Active Learning of Classification Models with Likert-Scale Feedback
Xue, Yanbing; Hauskrecht, Milos
2017-01-01
Annotation of classification data by humans can be a time-consuming and tedious process. Finding ways of reducing the annotation effort is critical for building the classification models in practice and for applying them to a variety of classification tasks. In this paper, we develop a new active learning framework that combines two strategies to reduce the annotation effort. First, it relies on label uncertainty information obtained from the human in terms of the Likert-scale feedback. Second, it uses active learning to annotate examples with the greatest expected change. We propose a Bayesian approach to calculate the expectation and an incremental SVM solver to reduce the time complexity of the solvers. We show the combination of our active learning strategy and the Likert-scale feedback can learn classification models more rapidly and with a smaller number of labeled instances than methods that rely on either Likert-scale labels or active learning alone. PMID:28979827
A Hybrid Supervised/Unsupervised Machine Learning Approach to Solar Flare Prediction
NASA Astrophysics Data System (ADS)
Benvenuto, Federico; Piana, Michele; Campi, Cristina; Massone, Anna Maria
2018-01-01
This paper introduces a novel method for flare forecasting, combining prediction accuracy with the ability to identify the most relevant predictive variables. This result is obtained by means of a two-step approach: first, a supervised regularization method for regression, namely, LASSO is applied, where a sparsity-enhancing penalty term allows the identification of the significance with which each data feature contributes to the prediction; then, an unsupervised fuzzy clustering technique for classification, namely, Fuzzy C-Means, is applied, where the regression outcome is partitioned through the minimization of a cost function and without focusing on the optimization of a specific skill score. This approach is therefore hybrid, since it combines supervised and unsupervised learning; realizes classification in an automatic, skill-score-independent way; and provides effective prediction performances even in the case of imbalanced data sets. Its prediction power is verified against NOAA Space Weather Prediction Center data, using as a test set, data in the range between 1996 August and 2010 December and as training set, data in the range between 1988 December and 1996 June. To validate the method, we computed several skill scores typically utilized in flare prediction and compared the values provided by the hybrid approach with the ones provided by several standard (non-hybrid) machine learning methods. The results showed that the hybrid approach performs classification better than all other supervised methods and with an effectiveness comparable to the one of clustering methods; but, in addition, it provides a reliable ranking of the weights with which the data properties contribute to the forecast.
NASA Astrophysics Data System (ADS)
Alevizos, Evangelos; Snellen, Mirjam; Simons, Dick; Siemes, Kerstin; Greinert, Jens
2018-06-01
This study applies three classification methods exploiting the angular dependence of acoustic seafloor backscatter along with high resolution sub-bottom profiling for seafloor sediment characterization in the Eckernförde Bay, Baltic Sea Germany. This area is well suited for acoustic backscatter studies due to its shallowness, its smooth bathymetry and the presence of a wide range of sediment types. Backscatter data were acquired using a Seabeam1180 (180 kHz) multibeam echosounder and sub-bottom profiler data were recorded using a SES-2000 parametric sonar transmitting 6 and 12 kHz. The high density of seafloor soundings allowed extracting backscatter layers for five beam angles over a large part of the surveyed area. A Bayesian probability method was employed for sediment classification based on the backscatter variability at a single incidence angle, whereas Maximum Likelihood Classification (MLC) and Principal Components Analysis (PCA) were applied to the multi-angle layers. The Bayesian approach was used for identifying the optimum number of acoustic classes because cluster validation is carried out prior to class assignment and class outputs are ordinal categorical values. The method is based on the principle that backscatter values from a single incidence angle express a normal distribution for a particular sediment type. The resulting Bayesian classes were well correlated to median grain sizes and the percentage of coarse material. The MLC method uses angular response information from five layers of training areas extracted from the Bayesian classification map. The subsequent PCA analysis is based on the transformation of these five layers into two principal components that comprise most of the data variability. These principal components were clustered in five classes after running an external cluster validation test. In general both methods MLC and PCA, separated the various sediment types effectively, showing good agreement (kappa >0.7) with the Bayesian approach which also correlates well with ground truth data (r2 > 0.7). In addition, sub-bottom data were used in conjunction with the Bayesian classification results to characterize acoustic classes with respect to their geological and stratigraphic interpretation. The joined interpretation of seafloor and sub-seafloor data sets proved to be an efficient approach for a better understanding of seafloor backscatter patchiness and to discriminate acoustically similar classes in different geological/bathymetric settings.
[The physiological classification of human thermal states under high environmental temperatures].
Bobrov, A F; Kuznets, E I
1995-01-01
The paper deals with the physiological classification of human thermal states in a hot environment. A review of the basic systems of classifications of thermal states is given, their main drawbacks are discussed. On the basis of human functional state research in a broad range of environmental temperatures the system of evaluation and classification of human thermal states is proposed. New integral one-dimensional multi-parametric criteria for evaluation are used. For the development of these criteria methods of factor, cluster and canonical correlation analyses are applied. Stochastic nomograms capable of identification of human thermal state for different intensity of influence are given. In this case evaluation of intensity is estimated according to one-dimensional criteria taking into account environmental temperature, physical load and time of man's staying in overheating conditions.
NASA Technical Reports Server (NTRS)
Boyd, R. K.; Brumfield, J. O.; Campbell, W. J.
1984-01-01
Three feature extraction methods, canonical analysis (CA), principal component analysis (PCA), and band selection, have been applied to Thematic Mapper Simulator (TMS) data in order to evaluate the relative performance of the methods. The results obtained show that CA is capable of providing a transformation of TMS data which leads to better classification results than provided by all seven bands, by PCA, or by band selection. A second conclusion drawn from the study is that TMS bands 2, 3, 4, and 7 (thermal) are most important for landcover classification.
[The study of M dwarf spectral classification].
Yi, Zhen-Ping; Pan, Jing-Chang; Luo, A-Li
2013-08-01
As the most common stars in the galaxy, M dwarfs can be used to trace the structure and evolution of the Milky Way. Besides, investigating M dwarfs is important for searching for habitability of extrasolar planets orbiting M dwarfs. Spectral classification of M dwarfs is a fundamental work. The authors used DR7 M dwarf sample of SLOAN to extract important features from the range of 600-900 nm by random forest method. Compared to the features used in Hammer Code, the authors added three new indices. Our test showed that the improved Hammer with new indices is more accurate. Our method has been applied to classify M dwarf spectra of LAMOST.
Extraction and Classification of Emotions for Business Research
NASA Astrophysics Data System (ADS)
Verma, Rajib
The commercial study of emotions has not embraced Internet / social mining yet, even though it has important applications in management. This is surprising since the emotional content is freeform, wide spread, can give a better indication of feelings (for instance with taboo subjects), and is inexpensive compared to other business research methods. A brief framework for applying text mining to this new research domain is shown and classification issues are discussed in an effort to quickly get businessman and researchers to adopt the mining methodology.
Raman spectroscopy detection of platelet for Alzheimer’s disease with predictive probabilities
NASA Astrophysics Data System (ADS)
Wang, L. J.; Du, X. Q.; Du, Z. W.; Yang, Y. Y.; Chen, P.; Tian, Q.; Shang, X. L.; Liu, Z. C.; Yao, X. Q.; Wang, J. Z.; Wang, X. H.; Cheng, Y.; Peng, J.; Shen, A. G.; Hu, J. M.
2014-08-01
Alzheimer’s disease (AD) is a common form of dementia. Early and differential diagnosis of AD has always been an arduous task for the medical expert due to the unapparent early symptoms and the currently imperfect imaging examination methods. Therefore, obtaining reliable markers with clinical diagnostic value in easily assembled samples is worthy and significant. Our previous work with laser Raman spectroscopy (LRS), in which we detected platelet samples of different ages of AD transgenic mice and non-transgenic controls, showed great effect in the diagnosis of AD. In addition, a multilayer perception network (MLP) classification method was adopted to discriminate the spectral data. However, there were disturbances, which were induced by noise from the machines and so on, in the data set; thus the MLP method had to be trained with large-scale data. In this paper, we aim to re-establish the classification models of early and advanced AD and the control group with fewer features, and apply some mechanism of noise reduction to improve the accuracy of models. An adaptive classification method based on the Gaussian process (GP) featured, with predictive probabilities, is proposed, which could tell when a data set is related to some kind of disease. Compared with MLP on the same feature set, GP showed much better performance in the experimental results. What is more, since the spectra of platelets are isolated from AD, GP has good expansibility and can be applied in diagnosis of many other similar diseases, such as Parkinson’s disease (PD). Spectral data of 4 month and 12 month AD platelets, as well as control data, were collected. With predictive probabilities, the proposed GP classification method improved the diagnostic sensitivity to nearly 100%. Samples were also collected from PD platelets as classification and comparison to the 12 month AD. The presented approach and our experiments indicate that utilization of GP with predictive probabilities in platelet LRS detection analysis turns out to be more accurate for early and differential diagnosis of AD and has a wide application prospect.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Olander, Jonathan; Myers, Corey
2013-07-01
Studsviks' Processing Facility Erwin (SPFE) has been treating Low-Level Radioactive Waste using its patented THOR process for over 13 years. Studsvik has been mixing and processing wastes of the same waste classification but different chemical and isotopic characteristics for the full extent of this period as a general matter of operations. Studsvik utilizes the accountability method to track the movement of radionuclides from acceptance of waste, through processing, and finally in the classification of waste for disposal. Recently the NRC has proposed to revise the 1995 Branch Technical Position on Concentration Averaging and Encapsulation (1995 BTP on CA) with additionalmore » clarification (draft BTP on CA). The draft BTP on CA has paved the way for large scale blending of higher activity and lower activity waste to produce a single waste for the purpose of classification. With the onset of blending in the waste treatment industry, there is concern from the public and state regulators as to the robustness of the accountability method and the ability of processors to prevent the inclusion of hot spots in waste. To address these concerns and verify the accountability method as applied by the SPFE, as well as the SPFE's ability to control waste package classification, testing of actual waste packages was performed. Testing consisted of a comprehensive dose rate survey of a container of processed waste. Separately, the waste package was modeled chemically and radiologically. Comparing the observed and theoretical data demonstrated that actual dose rates were lower than, but consistent with, modeled dose rates. Moreover, the distribution of radioactivity confirms that the SPFE can produce a radiologically homogeneous waste form. The results of the study demonstrate: 1) the accountability method as applied by the SPFE is valid and produces expected results; 2) the SPFE can produce a radiologically homogeneous waste; and 3) the SPFE can effectively control the waste package classification. (authors)« less
Prostate segmentation by sparse representation based classification.
Gao, Yaozong; Liao, Shu; Shen, Dinggang
2012-10-01
The segmentation of prostate in CT images is of essential importance to external beam radiotherapy, which is one of the major treatments for prostate cancer nowadays. During the radiotherapy, the prostate is radiated by high-energy x rays from different directions. In order to maximize the dose to the cancer and minimize the dose to the surrounding healthy tissues (e.g., bladder and rectum), the prostate in the new treatment image needs to be accurately localized. Therefore, the effectiveness and efficiency of external beam radiotherapy highly depend on the accurate localization of the prostate. However, due to the low contrast of the prostate with its surrounding tissues (e.g., bladder), the unpredicted prostate motion, and the large appearance variations across different treatment days, it is challenging to segment the prostate in CT images. In this paper, the authors present a novel classification based segmentation method to address these problems. To segment the prostate, the proposed method first uses sparse representation based classification (SRC) to enhance the prostate in CT images by pixel-wise classification, in order to overcome the limitation of poor contrast of the prostate images. Then, based on the classification results, previous segmented prostates of the same patient are used as patient-specific atlases to align onto the current treatment image and the majority voting strategy is finally adopted to segment the prostate. In order to address the limitations of the traditional SRC in pixel-wise classification, especially for the purpose of segmentation, the authors extend SRC from the following four aspects: (1) A discriminant subdictionary learning method is proposed to learn a discriminant and compact representation of training samples for each class so that the discriminant power of SRC can be increased and also SRC can be applied to the large-scale pixel-wise classification. (2) The L1 regularized sparse coding is replaced by the elastic net in order to obtain a smooth and clear prostate boundary in the classification result. (3) Residue-based linear regression is incorporated to improve the classification performance and to extend SRC from hard classification to soft classification. (4) Iterative SRC is proposed by using context information to iteratively refine the classification results. The proposed method has been comprehensively evaluated on a dataset consisting of 330 CT images from 24 patients. The effectiveness of the extended SRC has been validated by comparing it with the traditional SRC based on the proposed four extensions. The experimental results show that our extended SRC can obtain not only more accurate classification results but also smoother and clearer prostate boundary than the traditional SRC. Besides, the comparison with other five state-of-the-art prostate segmentation methods indicates that our method can achieve better performance than other methods under comparison. The authors have proposed a novel prostate segmentation method based on the sparse representation based classification, which can achieve considerably accurate segmentation results in CT prostate segmentation.
Armutlu, Pelin; Ozdemir, Muhittin E; Uney-Yuksektepe, Fadime; Kavakli, I Halil; Turkay, Metin
2008-10-03
A priori analysis of the activity of drugs on the target protein by computational approaches can be useful in narrowing down drug candidates for further experimental tests. Currently, there are a large number of computational methods that predict the activity of drugs on proteins. In this study, we approach the activity prediction problem as a classification problem and, we aim to improve the classification accuracy by introducing an algorithm that combines partial least squares regression with mixed-integer programming based hyper-boxes classification method, where drug molecules are classified as low active or high active regarding their binding activity (IC50 values) on target proteins. We also aim to determine the most significant molecular descriptors for the drug molecules. We first apply our approach by analyzing the activities of widely known inhibitor datasets including Acetylcholinesterase (ACHE), Benzodiazepine Receptor (BZR), Dihydrofolate Reductase (DHFR), Cyclooxygenase-2 (COX-2) with known IC50 values. The results at this stage proved that our approach consistently gives better classification accuracies compared to 63 other reported classification methods such as SVM, Naïve Bayes, where we were able to predict the experimentally determined IC50 values with a worst case accuracy of 96%. To further test applicability of this approach we first created dataset for Cytochrome P450 C17 inhibitors and then predicted their activities with 100% accuracy. Our results indicate that this approach can be utilized to predict the inhibitory effects of inhibitors based on their molecular descriptors. This approach will not only enhance drug discovery process, but also save time and resources committed.
Yang, Ze-Hui; Zheng, Rui; Gao, Yuan; Zhang, Qiang
2016-09-01
With the widespread application of high-throughput technology, numerous meta-analysis methods have been proposed for differential expression profiling across multiple studies. We identified the suitable differentially expressed (DE) genes that contributed to lung adenocarcinoma (ADC) clustering based on seven popular multiple meta-analysis methods. Seven microarray expression profiles of ADC and normal controls were extracted from the ArrayExpress database. The Bioconductor was used to perform the data preliminary preprocessing. Then, DE genes across multiple studies were identified. Hierarchical clustering was applied to compare the classification performance for microarray data samples. The classification efficiency was compared based on accuracy, sensitivity and specificity. Across seven datasets, 573 ADC cases and 222 normal controls were collected. After filtering out unexpressed and noninformative genes, 3688 genes were remained for further analysis. The classification efficiency analysis showed that DE genes identified by sum of ranks method separated ADC from normal controls with the best accuracy, sensitivity and specificity of 0.953, 0.969 and 0.932, respectively. The gene set with the highest classification accuracy mainly participated in the regulation of response to external stimulus (P = 7.97E-04), cyclic nucleotide-mediated signaling (P = 0.01), regulation of cell morphogenesis (P = 0.01) and regulation of cell proliferation (P = 0.01). Evaluation of DE genes identified by different meta-analysis methods in classification efficiency provided a new perspective to the choice of the suitable method in a given application. Varying meta-analysis methods always present varying abilities, so synthetic consideration should be taken when providing meta-analysis methods for particular research. © 2015 John Wiley & Sons Ltd.
Chinese Sentence Classification Based on Convolutional Neural Network
NASA Astrophysics Data System (ADS)
Gu, Chengwei; Wu, Ming; Zhang, Chuang
2017-10-01
Sentence classification is one of the significant issues in Natural Language Processing (NLP). Feature extraction is often regarded as the key point for natural language processing. Traditional ways based on machine learning can not take high level features into consideration, such as Naive Bayesian Model. The neural network for sentence classification can make use of contextual information to achieve greater results in sentence classification tasks. In this paper, we focus on classifying Chinese sentences. And the most important is that we post a novel architecture of Convolutional Neural Network (CNN) to apply on Chinese sentence classification. In particular, most of the previous methods often use softmax classifier for prediction, we embed a linear support vector machine to substitute softmax in the deep neural network model, minimizing a margin-based loss to get a better result. And we use tanh as an activation function, instead of ReLU. The CNN model improve the result of Chinese sentence classification tasks. Experimental results on the Chinese news title database validate the effectiveness of our model.
How a national vegetation classification can help ecological research and management
Franklin, Scott; Comer, Patrick; Evens, Julie; Ezcurra, Exequiel; Faber-Langendoen, Don; Franklin, Janet; Jennings, Michael; Josse, Carmen; Lea, Chris; Loucks, Orie; Muldavin, Esteban; Peet, Robert K.; Ponomarenko, Serguei; Roberts, David G.; Solomeshch, Ayzik; Keeler-Wolf, Todd; Van Kley, James; Weakley, Alan; McKerrow, Alexa; Burke, Marianne; Spurrier, Carol
2015-01-01
The elegance of classification lies in its ability to compile and systematize various terminological conventions and masses of information that are unattainable during typical research projects. Imagine a discipline without standards for collection, analysis, and interpretation; unfortunately, that describes much of 20th-century vegetation ecology. With differing methods, how do we assess community dynamics over decades, much less centuries? How do we compare plant communities from different areas? The need for a widely applied vegetation classification has long been clear. Now imagine a multi-decade effort to assimilate hundreds of disparate vegetation classifications into one common classification for the US. In this letter, we introduce the US National Vegetation Classification (USNVC; www.usnvc.org) as a powerful tool for research and conservation, analogous to the argument made by Schimel and Chadwick (2013) for soils. The USNVC provides a national framework to classify and describe vegetation; here we describe the USNVC and offer brief examples of its efficacy.
NASA Astrophysics Data System (ADS)
Swain, Sushree Diptimayee; Ray, Pravat Kumar; Mohanty, K. B.
2016-06-01
This research paper discover the design of a shunt Passive Power Filter (PPF) in Hybrid Series Active Power Filter (HSAPF) that employs a novel analytic methodology which is superior than FFT analysis. This novel approach consists of the estimation, detection and classification of the signals. The proposed method is applied to estimate, detect and classify the power quality (PQ) disturbance such as harmonics. This proposed work deals with three methods: the harmonic detection through wavelet transform method, the harmonic estimation by Kalman Filter algorithm and harmonic classification by decision tree method. From different type of mother wavelets in wavelet transform method, the db8 is selected as suitable mother wavelet because of its potency on transient response and crouched oscillation at frequency domain. In harmonic compensation process, the detected harmonic is compensated through Hybrid Series Active Power Filter (HSAPF) based on Instantaneous Reactive Power Theory (IRPT). The efficacy of the proposed method is verified in MATLAB/SIMULINK domain and as well as with an experimental set up. The obtained results confirm the superiority of the proposed methodology than FFT analysis. This newly proposed PPF is used to make the conventional HSAPF more robust and stable.
Nelson, G.; Ramsey, Elijah W.; Rangoonwala, A.
2005-01-01
Landsat Thematic Mapper images and collateral data sources were used to classify the land cover of the Mermentau River Basin within the chenier coastal plain and the adjacent uplands of Louisiana, USA. Landcover classes followed that of the National Oceanic and Atmospheric Administration's Coastal Change Analysis Program; however, classification methods needed to be developed to meet these national standards. Our first classification was limited to the Mermentau River Basin (MRB) in southcentral Louisiana, and the years of 1990, 1993, and 1996. To overcome problems due to class spectral inseparable, spatial and spectra continuums, mixed landcovers, and abnormal transitions, we separated the coastal area into regions of commonality and applying masks to specific land mixtures. Over the three years and 14 landcover classes (aggregating the cultivated land and grassland, and water and floating vegetation classes), overall accuracies ranged from 82% to 90%. To enhance landcover change interpretation, three indicators were introduced as Location Stability, Residence stability, and Turnover. Implementing methods substantiated in the multiple date MRB classification, we spatially extended the classification to the entire Louisiana coast and temporally extended the original 1990, 1993, 1996 classifications to 1999 (Figure 1). We also advanced the operational functionality of the classification and increased the credibility of change detection results. Increased operational functionality that resulted in diminished user input was for the most part gained by implementing a classification logic based on forbidden transitions. The logic detected and corrected misclassifications and mostly alleviated the necessity of subregion separation prior to the classification. The new methods provided an improved ability for more timely detection and response to landcover impact. ?? 2005 IEEE.
Task-Driven Dictionary Learning Based on Mutual Information for Medical Image Classification.
Diamant, Idit; Klang, Eyal; Amitai, Michal; Konen, Eli; Goldberger, Jacob; Greenspan, Hayit
2017-06-01
We present a novel variant of the bag-of-visual-words (BoVW) method for automated medical image classification. Our approach improves the BoVW model by learning a task-driven dictionary of the most relevant visual words per task using a mutual information-based criterion. Additionally, we generate relevance maps to visualize and localize the decision of the automatic classification algorithm. These maps demonstrate how the algorithm works and show the spatial layout of the most relevant words. We applied our algorithm to three different tasks: chest x-ray pathology identification (of four pathologies: cardiomegaly, enlarged mediastinum, right consolidation, and left consolidation), liver lesion classification into four categories in computed tomography (CT) images and benign/malignant clusters of microcalcifications (MCs) classification in breast mammograms. Validation was conducted on three datasets: 443 chest x-rays, 118 portal phase CT images of liver lesions, and 260 mammography MCs. The proposed method improves the classical BoVW method for all tested applications. For chest x-ray, area under curve of 0.876 was obtained for enlarged mediastinum identification compared to 0.855 using classical BoVW (with p-value 0.01). For MC classification, a significant improvement of 4% was achieved using our new approach (with p-value = 0.03). For liver lesion classification, an improvement of 6% in sensitivity and 2% in specificity were obtained (with p-value 0.001). We demonstrated that classification based on informative selected set of words results in significant improvement. Our new BoVW approach shows promising results in clinically important domains. Additionally, it can discover relevant parts of images for the task at hand without explicit annotations for training data. This can provide computer-aided support for medical experts in challenging image analysis tasks.
Gilbert, Fabian; Böhm, Dirk; Eden, Lars; Schmalzl, Jonas; Meffert, Rainer H; Köstler, Herbert; Weng, Andreas M; Ziegler, Dirk
2016-08-22
The Goutallier Classification is a semi quantitative classification system to determine the amount of fatty degeneration in rotator cuff muscles. Although initially proposed for axial computer tomography scans it is currently applied to magnet-resonance-imaging-scans. The role for its clinical use is controversial, as the reliability of the classification has been shown to be inconsistent. The purpose of this study was to compare the semi quantitative MRI-based Goutallier Classification applied by 5 different raters to experimental MR spectroscopic quantitative fat measurement in order to determine the correlation between this classification system and the true extent of fatty degeneration shown by spectroscopy. MRI-scans of 42 patients with rotator cuff tears were examined by 5 shoulder surgeons and were graduated according to the MRI-based Goutallier Classification proposed by Fuchs et al. Additionally the fat/water ratio was measured with MR spectroscopy using the experimental SPLASH technique. The semi quantitative grading according to the Goutallier Classification was statistically correlated with the quantitative measured fat/water ratio using Spearman's rank correlation. Statistical analysis of the data revealed only fair correlation of the Goutallier Classification system and the quantitative fat/water ratio with R = 0.35 (p < 0.05). By dichotomizing the scale the correlation was 0.72. The interobserver and intraobserver reliabilities were substantial with R = 0.62 and R = 0.74 (p < 0.01). The correlation between the semi quantitative MRI based Goutallier Classification system and MR spectroscopic fat measurement is weak. As an adequate estimation of fatty degeneration based on standard MRI may not be possible, quantitative methods need to be considered in order to increase diagnostic safety and thus provide patients with ideal care in regard to the amount of fatty degeneration. Spectroscopic MR measurement may increase the accuracy of the Goutallier classification and thus improve the prediction of clinical results after rotator cuff repair. However, these techniques are currently only available in an experimental setting.
Novel Strength Test Battery to Permit Evidence-Based Paralympic Classification
Beckman, Emma M.; Newcombe, Peter; Vanlandewijck, Yves; Connick, Mark J.; Tweedy, Sean M.
2014-01-01
Abstract Ordinal-scale strength assessment methods currently used in Paralympic athletics classification prevent the development of evidence-based classification systems. This study evaluated a battery of 7, ratio-scale, isometric tests with the aim of facilitating the development of evidence-based methods of classification. This study aimed to report sex-specific normal performance ranges, evaluate test–retest reliability, and evaluate the relationship between the measures and body mass. Body mass and strength measures were obtained from 118 participants—63 males and 55 females—ages 23.2 years ± 3.7 (mean ± SD). Seventeen participants completed the battery twice to evaluate test–retest reliability. The body mass–strength relationship was evaluated using Pearson correlations and allometric exponents. Conventional patterns of force production were observed. Reliability was acceptable (mean intraclass correlation = 0.85). Eight measures had moderate significant correlations with body size (r = 0.30–61). Allometric exponents were higher in males than in females (mean 0.99 vs 0.30). Results indicate that this comprehensive and parsimonious battery is an important methodological advance because it has psychometric properties critical for the development of evidence-based classification. Measures were interrelated with body size, indicating further research is required to determine whether raw measures require normalization in order to be validly applied in classification. PMID:25068950
Integrated feature extraction and selection for neuroimage classification
NASA Astrophysics Data System (ADS)
Fan, Yong; Shen, Dinggang
2009-02-01
Feature extraction and selection are of great importance in neuroimage classification for identifying informative features and reducing feature dimensionality, which are generally implemented as two separate steps. This paper presents an integrated feature extraction and selection algorithm with two iterative steps: constrained subspace learning based feature extraction and support vector machine (SVM) based feature selection. The subspace learning based feature extraction focuses on the brain regions with higher possibility of being affected by the disease under study, while the possibility of brain regions being affected by disease is estimated by the SVM based feature selection, in conjunction with SVM classification. This algorithm can not only take into account the inter-correlation among different brain regions, but also overcome the limitation of traditional subspace learning based feature extraction methods. To achieve robust performance and optimal selection of parameters involved in feature extraction, selection, and classification, a bootstrapping strategy is used to generate multiple versions of training and testing sets for parameter optimization, according to the classification performance measured by the area under the ROC (receiver operating characteristic) curve. The integrated feature extraction and selection method is applied to a structural MR image based Alzheimer's disease (AD) study with 98 non-demented and 100 demented subjects. Cross-validation results indicate that the proposed algorithm can improve performance of the traditional subspace learning based classification.
Automated classification of dolphin echolocation click types from the Gulf of Mexico.
Frasier, Kaitlin E; Roch, Marie A; Soldevilla, Melissa S; Wiggins, Sean M; Garrison, Lance P; Hildebrand, John A
2017-12-01
Delphinids produce large numbers of short duration, broadband echolocation clicks which may be useful for species classification in passive acoustic monitoring efforts. A challenge in echolocation click classification is to overcome the many sources of variability to recognize underlying patterns across many detections. An automated unsupervised network-based classification method was developed to simulate the approach a human analyst uses when categorizing click types: Clusters of similar clicks were identified by incorporating multiple click characteristics (spectral shape and inter-click interval distributions) to distinguish within-type from between-type variation, and identify distinct, persistent click types. Once click types were established, an algorithm for classifying novel detections using existing clusters was tested. The automated classification method was applied to a dataset of 52 million clicks detected across five monitoring sites over two years in the Gulf of Mexico (GOM). Seven distinct click types were identified, one of which is known to be associated with an acoustically identifiable delphinid (Risso's dolphin) and six of which are not yet identified. All types occurred at multiple monitoring locations, but the relative occurrence of types varied, particularly between continental shelf and slope locations. Automatically-identified click types from autonomous seafloor recorders without verifiable species identification were compared with clicks detected on sea-surface towed hydrophone arrays in the presence of visually identified delphinid species. These comparisons suggest potential species identities for the animals producing some echolocation click types. The network-based classification method presented here is effective for rapid, unsupervised delphinid click classification across large datasets in which the click types may not be known a priori.
Automated classification of dolphin echolocation click types from the Gulf of Mexico
Roch, Marie A.; Soldevilla, Melissa S.; Wiggins, Sean M.; Garrison, Lance P.; Hildebrand, John A.
2017-01-01
Delphinids produce large numbers of short duration, broadband echolocation clicks which may be useful for species classification in passive acoustic monitoring efforts. A challenge in echolocation click classification is to overcome the many sources of variability to recognize underlying patterns across many detections. An automated unsupervised network-based classification method was developed to simulate the approach a human analyst uses when categorizing click types: Clusters of similar clicks were identified by incorporating multiple click characteristics (spectral shape and inter-click interval distributions) to distinguish within-type from between-type variation, and identify distinct, persistent click types. Once click types were established, an algorithm for classifying novel detections using existing clusters was tested. The automated classification method was applied to a dataset of 52 million clicks detected across five monitoring sites over two years in the Gulf of Mexico (GOM). Seven distinct click types were identified, one of which is known to be associated with an acoustically identifiable delphinid (Risso’s dolphin) and six of which are not yet identified. All types occurred at multiple monitoring locations, but the relative occurrence of types varied, particularly between continental shelf and slope locations. Automatically-identified click types from autonomous seafloor recorders without verifiable species identification were compared with clicks detected on sea-surface towed hydrophone arrays in the presence of visually identified delphinid species. These comparisons suggest potential species identities for the animals producing some echolocation click types. The network-based classification method presented here is effective for rapid, unsupervised delphinid click classification across large datasets in which the click types may not be known a priori. PMID:29216184
Reliability of a New Radiographic Classification for Developmental Dysplasia of the Hip.
Narayanan, Unni; Mulpuri, Kishore; Sankar, Wudbhav N; Clarke, Nicholas M P; Hosalkar, Harish; Price, Charles T
2015-01-01
Existing radiographic classification schemes (eg, Tönnis criteria) for DDH quantify the severity of disease based on the position of the ossific nucleus relative to Hilgenreiner's and Perkin's lines. By definition, this method requires the presence of an ossification centre, which can be delayed in appearance and eccentric in location within the femoral head. A new radiographic classification system has been developed by the International Hip Dysplasia Institute (IHDI), which uses the mid-point of the proximal femoral metaphysis as a reference landmark, and can therefore be applied to children of all ages. The purpose of this study was to compare the reliability of this new method with that of Tönnis, as the first step in establishing its validity and clinical utility. Twenty standardized anteroposterior pelvic radiographs of children with untreated DDH were selected purposefully to capture the spectrum of age (range, 3 to 32 mo) at presentation and disease severity. Each of the hips was classified separately by the IHDI and Tönnis methods by 6 experienced pediatric orthopaedists from the United States, Canada, Mexico, United Kingdom, and by 2 orthopaedic senior residents. The inter-rater reliability was tested using the Intra Class Correlation coefficient (ICC) to measure concordance between raters. All 40 hips were classifiable by the IHDI method by all raters. Ten of the 40 hips could not be classified by the Tönnis method because of the absence of the ossific nucleus on one or both sides. The ICC (95% confidence interval) for the IHDI method for all raters was 0.90 (0.83-0.95) and 0.95 (0.91-0.98) for the right and left hips, respectively. The corresponding ICCs for the Tönnis method were 0.63 (0.46-0.80) and 0.60 (0.43-0.78), respectively. There was no significant difference between the ICCs of the 6 experts and 2 trainees. The IHDI method of classification has excellent inter-rater reliability, both among experts and novices, and is more widely applicable than the Tönnis method as it can be applied even when the ossification centre is absent. Level II (diagnostic).
Di Lonardo, Maria Chiara; Franzese, Maurizio; Costa, Giulia; Gavasci, Renato; Lombardi, Francesco
2016-01-01
This work assessed the quality in terms of solid recovered fuel (SRF) definitions of the dry light flow (until now indicated as refuse derived fuel, RDF), heavy rejects and stabilisation rejects, produced by two mechanical biological treatment plants of Rome (Italy). SRF classification and specifications were evaluated first on the basis of RDF historical characterisation methods and data and then applying the sampling and analytical methods laid down by the recently issued SRF standards. The results showed that the dry light flow presented a worst SRF class in terms of net calorific value applying the new methods compared to that obtained from RDF historical data (4 instead of 3). This lead to incompliance with end of waste criteria established by Italian legislation for SRF use as co-fuel in cement kilns and power plants. Furthermore, the metal contents of the dry light flow obtained applying SRF current methods proved to be considerably higher (although still meeting SRF specifications) compared to those resulting from historical data retrieved with RDF standard methods. These differences were not related to a decrease in the quality of the dry light flow produced in the mechanical-biological treatment plants but rather to the different sampling procedures set by the former RDF and current SRF standards. In particular, the shredding of the sample before quartering established by the latter methods ensures that also the finest waste fractions, characterised by higher moisture and metal contents, are included in the sample to be analysed, therefore affecting the composition and net calorific value of the waste. As for the reject flows, on the basis of their SRF classification and specification parameters, it was found that combined with the dry light flow they may present similar if not the same class codes as the latter alone, thus indicating that these material flows could be also treated in combustion plants instead of landfilled. In conclusion, the introduction of SRF definitions, classification and specification procedures, while not necessarily leading to an upgrade of the waste as co-fuel in cement kilns and power plants, may anyhow provide new possibilities for energy recovery from waste by increasing the types of mechanically treated waste flows that may be thermally treated. Copyright © 2015 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei
2015-02-01
A novel strategy which combines iteratively cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectroscopy of banned food additives, such as Sudan I dye and Rhodamine B in food, Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, using the combination of spectra preprocessing iteratively cubic spline fitting (ICSF) baseline correction with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, the DPLS classification can discriminate the class assignment of unknown banned additives using the information of differences in relative intensities. The results demonstrate that SERS spectroscopy combined with ICSF baseline correction method and exploratory analysis methodology DPLS classification can be potentially used for distinguishing the banned food additives in field of food safety.
Shin, Jaeyoung; Kwon, Jinuk; Im, Chang-Hwan
2018-01-01
The performance of a brain-computer interface (BCI) can be enhanced by simultaneously using two or more modalities to record brain activity, which is generally referred to as a hybrid BCI. To date, many BCI researchers have tried to implement a hybrid BCI system by combining electroencephalography (EEG) and functional near-infrared spectroscopy (NIRS) to improve the overall accuracy of binary classification. However, since hybrid EEG-NIRS BCI, which will be denoted by hBCI in this paper, has not been applied to ternary classification problems, paradigms and classification strategies appropriate for ternary classification using hBCI are not well investigated. Here we propose the use of an hBCI for the classification of three brain activation patterns elicited by mental arithmetic, motor imagery, and idle state, with the aim to elevate the information transfer rate (ITR) of hBCI by increasing the number of classes while minimizing the loss of accuracy. EEG electrodes were placed over the prefrontal cortex and the central cortex, and NIRS optodes were placed only on the forehead. The ternary classification problem was decomposed into three binary classification problems using the "one-versus-one" (OVO) classification strategy to apply the filter-bank common spatial patterns filter to EEG data. A 10 × 10-fold cross validation was performed using shrinkage linear discriminant analysis (sLDA) to evaluate the average classification accuracies for EEG-BCI, NIRS-BCI, and hBCI when the meta-classification method was adopted to enhance classification accuracy. The ternary classification accuracies for EEG-BCI, NIRS-BCI, and hBCI were 76.1 ± 12.8, 64.1 ± 9.7, and 82.2 ± 10.2%, respectively. The classification accuracy of the proposed hBCI was thus significantly higher than those of the other BCIs ( p < 0.005). The average ITR for the proposed hBCI was calculated to be 4.70 ± 1.92 bits/minute, which was 34.3% higher than that reported for a previous binary hBCI study.
Can SLE classification rules be effectively applied to diagnose unclear SLE cases?
Mesa, Annia; Fernandez, Mitch; Wu, Wensong; Narasimhan, Giri; Greidinger, Eric L.; Mills, DeEtta K.
2016-01-01
Summary Objective Develop a novel classification criteria to distinguish between unclear SLE and MCTD cases. Methods A total of 205 variables from 111 SLE and 55 MCTD patients were evaluated to uncover unique molecular and clinical markers for each disease. Binomial logistic regressions (BLR) were performed on currently used SLE and MCTD classification criteria sets to obtain six reduced models with power to discriminate between unclear SLE and MCTD patients which were confirmed by Receiving Operating Characteristic (ROC) curve. Decision trees were employed to delineate novel classification rules to discriminate between unclear SLE and MCTD patients. Results SLE and MCTD patients exhibited contrasting molecular markers and clinical manifestations. Furthermore, reduced models highlighted SLE patients exhibit prevalence of skin rashes and renal disease while MCTD cases show dominance of myositis and muscle weakness. Additionally decision trees analyses revealed a novel classification rule tailored to differentiate unclear SLE and MCTD patients (Lu-vs-M) with an overall accuracy of 88%. Conclusions Validation of our novel proposed classification rule (Lu-vs-M) includes novel contrasting characteristics (calcinosis, CPK elevated and anti-IgM reactivity for U1-70K, U1A and U1C) between SLE and MCTD patients and showed a 33% improvement in distinguishing these disorders when compare to currently used classification criteria sets. Pending additional validation, our novel classification rule is a promising method to distinguish between patients with unclear SLE and MCTD diagnosis. PMID:27353506
NASA Astrophysics Data System (ADS)
Li, Zuhe; Fan, Yangyu; Liu, Weihua; Yu, Zeqi; Wang, Fengqin
2017-01-01
We aim to apply sparse autoencoder-based unsupervised feature learning to emotional semantic analysis for textile images. To tackle the problem of limited training data, we present a cross-domain feature learning scheme for emotional textile image classification using convolutional autoencoders. We further propose a correlation-analysis-based feature selection method for the weights learned by sparse autoencoders to reduce the number of features extracted from large size images. First, we randomly collect image patches on an unlabeled image dataset in the source domain and learn local features with a sparse autoencoder. We then conduct feature selection according to the correlation between different weight vectors corresponding to the autoencoder's hidden units. We finally adopt a convolutional neural network including a pooling layer to obtain global feature activations of textile images in the target domain and send these global feature vectors into logistic regression models for emotional image classification. The cross-domain unsupervised feature learning method achieves 65% to 78% average accuracy in the cross-validation experiments corresponding to eight emotional categories and performs better than conventional methods. Feature selection can reduce the computational cost of global feature extraction by about 50% while improving classification performance.
Kavianpour, Hamidreza; Vasighi, Mahdi
2017-02-01
Nowadays, having knowledge about cellular attributes of proteins has an important role in pharmacy, medical science and molecular biology. These attributes are closely correlated with the function and three-dimensional structure of proteins. Knowledge of protein structural class is used by various methods for better understanding the protein functionality and folding patterns. Computational methods and intelligence systems can have an important role in performing structural classification of proteins. Most of protein sequences are saved in databanks as characters and strings and a numerical representation is essential for applying machine learning methods. In this work, a binary representation of protein sequences is introduced based on reduced amino acids alphabets according to surrounding hydrophobicity index. Many important features which are hidden in these long binary sequences can be clearly displayed through their cellular automata images. The extracted features from these images are used to build a classification model by support vector machine. Comparing to previous studies on the several benchmark datasets, the promising classification rates obtained by tenfold cross-validation imply that the current approach can help in revealing some inherent features deeply hidden in protein sequences and improve the quality of predicting protein structural class.
Trakoolwilaiwan, Thanawin; Behboodi, Bahareh; Lee, Jaeseok; Kim, Kyungsoo; Choi, Ji-Woong
2018-01-01
The aim of this work is to develop an effective brain-computer interface (BCI) method based on functional near-infrared spectroscopy (fNIRS). In order to improve the performance of the BCI system in terms of accuracy, the ability to discriminate features from input signals and proper classification are desired. Previous studies have mainly extracted features from the signal manually, but proper features need to be selected carefully. To avoid performance degradation caused by manual feature selection, we applied convolutional neural networks (CNNs) as the automatic feature extractor and classifier for fNIRS-based BCI. In this study, the hemodynamic responses evoked by performing rest, right-, and left-hand motor execution tasks were measured on eight healthy subjects to compare performances. Our CNN-based method provided improvements in classification accuracy over conventional methods employing the most commonly used features of mean, peak, slope, variance, kurtosis, and skewness, classified by support vector machine (SVM) and artificial neural network (ANN). Specifically, up to 6.49% and 3.33% improvement in classification accuracy was achieved by CNN compared with SVM and ANN, respectively.
Zhang, Yu; Zhou, Guoxu; Jin, Jing; Wang, Xingyu; Cichocki, Andrzej
2015-11-30
Common spatial pattern (CSP) has been most popularly applied to motor-imagery (MI) feature extraction for classification in brain-computer interface (BCI) application. Successful application of CSP depends on the filter band selection to a large degree. However, the most proper band is typically subject-specific and can hardly be determined manually. This study proposes a sparse filter band common spatial pattern (SFBCSP) for optimizing the spatial patterns. SFBCSP estimates CSP features on multiple signals that are filtered from raw EEG data at a set of overlapping bands. The filter bands that result in significant CSP features are then selected in a supervised way by exploiting sparse regression. A support vector machine (SVM) is implemented on the selected features for MI classification. Two public EEG datasets (BCI Competition III dataset IVa and BCI Competition IV IIb) are used to validate the proposed SFBCSP method. Experimental results demonstrate that SFBCSP help improve the classification performance of MI. The optimized spatial patterns by SFBCSP give overall better MI classification accuracy in comparison with several competing methods. The proposed SFBCSP is a potential method for improving the performance of MI-based BCI. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Wu, M. F.; Sun, Z. C.; Yang, B.; Yu, S. S.
2016-11-01
In order to reduce the “salt and pepper” in pixel-based urban land cover classification and expand the application of fusion of multi-source data in the field of urban remote sensing, WorldView-2 imagery and airborne Light Detection and Ranging (LiDAR) data were used to improve the classification of urban land cover. An approach of object- oriented hierarchical classification was proposed in our study. The processing of proposed method consisted of two hierarchies. (1) In the first hierarchy, LiDAR Normalized Digital Surface Model (nDSM) image was segmented to objects. The NDVI, Costal Blue and nDSM thresholds were set for extracting building objects. (2) In the second hierarchy, after removing building objects, WorldView-2 fused imagery was obtained by Haze-ratio-based (HR) fusion, and was segmented. A SVM classifier was applied to generate road/parking lot, vegetation and bare soil objects. (3) Trees and grasslands were split based on an nDSM threshold (2.4 meter). The results showed that compared with pixel-based and non-hierarchical object-oriented approach, proposed method provided a better performance of urban land cover classification, the overall accuracy (OA) and overall kappa (OK) improved up to 92.75% and 0.90. Furthermore, proposed method reduced “salt and pepper” in pixel-based classification, improved the extraction accuracy of buildings based on LiDAR nDSM image segmentation, and reduced the confusion between trees and grasslands through setting nDSM threshold.
Application of the SNoW machine learning paradigm to a set of transportation imaging problems
NASA Astrophysics Data System (ADS)
Paul, Peter; Burry, Aaron M.; Wang, Yuheng; Kozitsky, Vladimir
2012-01-01
Machine learning methods have been successfully applied to image object classification problems where there is clear distinction between classes and where a comprehensive set of training samples and ground truth are readily available. The transportation domain is an area where machine learning methods are particularly applicable, since the classification problems typically have well defined class boundaries and, due to high traffic volumes in most applications, massive roadway data is available. Though these classes tend to be well defined, the particular image noises and variations can be challenging. Another challenge is the extremely high accuracy typically required in most traffic applications. Incorrect assignment of fines or tolls due to imaging mistakes is not acceptable in most applications. For the front seat vehicle occupancy detection problem, classification amounts to determining whether one face (driver only) or two faces (driver + passenger) are detected in the front seat of a vehicle on a roadway. For automatic license plate recognition, the classification problem is a type of optical character recognition problem encompassing multiple class classification. The SNoW machine learning classifier using local SMQT features is shown to be successful in these two transportation imaging applications.
NASA Astrophysics Data System (ADS)
He, Zhi; Liu, Lin
2016-11-01
Empirical mode decomposition (EMD) and its variants have recently been applied for hyperspectral image (HSI) classification due to their ability to extract useful features from the original HSI. However, it remains a challenging task to effectively exploit the spectral-spatial information by the traditional vector or image-based methods. In this paper, a three-dimensional (3D) extension of EMD (3D-EMD) is proposed to naturally treat the HSI as a cube and decompose the HSI into varying oscillations (i.e. 3D intrinsic mode functions (3D-IMFs)). To achieve fast 3D-EMD implementation, 3D Delaunay triangulation (3D-DT) is utilized to determine the distances of extrema, while separable filters are adopted to generate the envelopes. Taking the extracted 3D-IMFs as features of different tasks, robust multitask learning (RMTL) is further proposed for HSI classification. In RMTL, pairs of low-rank and sparse structures are formulated by trace-norm and l1,2 -norm to capture task relatedness and specificity, respectively. Moreover, the optimization problems of RMTL can be efficiently solved by the inexact augmented Lagrangian method (IALM). Compared with several state-of-the-art feature extraction and classification methods, the experimental results conducted on three benchmark data sets demonstrate the superiority of the proposed methods.
Tang, Yunwei; Jing, Linhai; Li, Hui; Liu, Qingjie; Yan, Qi; Li, Xiuxia
2016-01-01
This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large area of this place is covered by shadows in the image, and only a few sampled points derived were useful. In order to identify bamboos based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using the principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k-nearest neighbor (k-NN) method produced the greatest accuracy. A geostatistically-weighted k-NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved 82.65% and 93.10% of the producer’s and user’s accuracies respectively for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboos given limited known samples, and the resulting bamboo distribution facilitates the assessments of the habitats of giant pandas. PMID:27879661
Application of information-retrieval methods to the classification of physical data
NASA Technical Reports Server (NTRS)
Mamotko, Z. N.; Khorolskaya, S. K.; Shatrovskiy, L. I.
1975-01-01
Scientific data received from satellites are characterized as a multi-dimensional time series, whose terms are vector functions of a vector of measurement conditions. Information retrieval methods are used to construct lower dimensional samples on the basis of the condition vector, in order to obtain these data and to construct partial relations. The methods are applied to the joint Soviet-French Arkad project.
Supernova Photometric Lightcurve Classification
NASA Astrophysics Data System (ADS)
Zaidi, Tayeb; Narayan, Gautham
2016-01-01
This is a preliminary report on photometric supernova classification. We first explore the properties of supernova light curves, and attempt to restructure the unevenly sampled and sparse data from assorted datasets to allow for processing and classification. The data was primarily drawn from the Dark Energy Survey (DES) simulated data, created for the Supernova Photometric Classification Challenge. This poster shows a method for producing a non-parametric representation of the light curve data, and applying a Random Forest classifier algorithm to distinguish between supernovae types. We examine the impact of Principal Component Analysis to reduce the dimensionality of the dataset, for future classification work. The classification code will be used in a stage of the ANTARES pipeline, created for use on the Large Synoptic Survey Telescope alert data and other wide-field surveys. The final figure-of-merit for the DES data in the r band was 60% for binary classification (Type I vs II).Zaidi was supported by the NOAO/KPNO Research Experiences for Undergraduates (REU) Program which is funded by the National Science Foundation Research Experiences for Undergraduates Program (AST-1262829).
NASA Technical Reports Server (NTRS)
Glick, B. J.
1985-01-01
Techniques for classifying objects into groups or clases go under many different names including, most commonly, cluster analysis. Mathematically, the general problem is to find a best mapping of objects into an index set consisting of class identifiers. When an a priori grouping of objects exists, the process of deriving the classification rules from samples of classified objects is known as discrimination. When such rules are applied to objects of unknown class, the process is denoted classification. The specific problem addressed involves the group classification of a set of objects that are each associated with a series of measurements (ratio, interval, ordinal, or nominal levels of measurement). Each measurement produces one variable in a multidimensional variable space. Cluster analysis techniques are reviewed and methods for incuding geographic location, distance measures, and spatial pattern (distribution) as parameters in clustering are examined. For the case of patterning, measures of spatial autocorrelation are discussed in terms of the kind of data (nominal, ordinal, or interval scaled) to which they may be applied.
Khanmohammadi, Mohammadreza; Bagheri Garmarudi, Amir; Samani, Simin; Ghasemi, Keyvan; Ashuri, Ahmad
2011-06-01
Attenuated Total Reflectance Fourier Transform Infrared (ATR-FTIR) microspectroscopy was applied for detection of colon cancer according to the spectral features of colon tissues. Supervised classification models can be trained to identify the tissue type based on the spectroscopic fingerprint. A total of 78 colon tissues were used in spectroscopy studies. Major spectral differences were observed in 1,740-900 cm(-1) spectral region. Several chemometric methods such as analysis of variance (ANOVA), cluster analysis (CA) and linear discriminate analysis (LDA) were applied for classification of IR spectra. Utilizing the chemometric techniques, clear and reproducible differences were observed between the spectra of normal and cancer cases, suggesting that infrared microspectroscopy in conjunction with spectral data processing would be useful for diagnostic classification. Using LDA technique, the spectra were classified into cancer and normal tissue classes with an accuracy of 95.8%. The sensitivity and specificity was 100 and 93.1%, respectively.
NASA Technical Reports Server (NTRS)
Blackwell, R. J.
1982-01-01
Remote sensing data analysis of water quality monitoring is evaluated. Data anaysis and image processing techniques are applied to LANDSAT remote sensing data to produce an effective operational tool for lake water quality surveying and monitoring. Digital image processing and analysis techniques were designed, developed, tested, and applied to LANDSAT multispectral scanner (MSS) data and conventional surface acquired data. Utilization of these techniques facilitates the surveying and monitoring of large numbers of lakes in an operational manner. Supervised multispectral classification, when used in conjunction with surface acquired water quality indicators, is used to characterize water body trophic status. Unsupervised multispectral classification, when interpreted by lake scientists familiar with a specific water body, yields classifications of equal validity with supervised methods and in a more cost effective manner. Image data base technology is used to great advantage in characterizing other contributing effects to water quality. These effects include drainage basin configuration, terrain slope, soil, precipitation and land cover characteristics.
User oriented ERTS-1 images. [vegetation identification in Canada through image enhancement
NASA Technical Reports Server (NTRS)
Shlien, S.; Goodenough, D.
1974-01-01
Photographic reproduction of ERTS-1 images are capable of displaying only a portion of the total information available from the multispectral scanner. Methods are being developed to generate ERTS-1 images oriented towards special users such as agriculturists, foresters, and hydrologists by applying image enhancement techniques and interactive statistical classification schemes. Spatial boundaries and linear features can be emphasized and delineated using simple filters. Linear and nonlinear transformations can be applied to the spectral data to emphasize certain ground information. An automatic classification scheme was developed to identify particular ground cover classes such as fallow, grain, rape seed or various vegetation covers. The scheme applies the maximum likelihood decision rule to the spectral information and classifies the ERTS-1 image on a pixel by pixel basis. Preliminary results indicate that the classifier has limited success in distinguishing crops, but is well adapted for identifying different types of vegetation.
Probabilistic topic modeling for the analysis and classification of genomic sequences
2015-01-01
Background Studies on genomic sequences for classification and taxonomic identification have a leading role in the biomedical field and in the analysis of biodiversity. These studies are focusing on the so-called barcode genes, representing a well defined region of the whole genome. Recently, alignment-free techniques are gaining more importance because they are able to overcome the drawbacks of sequence alignment techniques. In this paper a new alignment-free method for DNA sequences clustering and classification is proposed. The method is based on k-mers representation and text mining techniques. Methods The presented method is based on Probabilistic Topic Modeling, a statistical technique originally proposed for text documents. Probabilistic topic models are able to find in a document corpus the topics (recurrent themes) characterizing classes of documents. This technique, applied on DNA sequences representing the documents, exploits the frequency of fixed-length k-mers and builds a generative model for a training group of sequences. This generative model, obtained through the Latent Dirichlet Allocation (LDA) algorithm, is then used to classify a large set of genomic sequences. Results and conclusions We performed classification of over 7000 16S DNA barcode sequences taken from Ribosomal Database Project (RDP) repository, training probabilistic topic models. The proposed method is compared to the RDP tool and Support Vector Machine (SVM) classification algorithm in a extensive set of trials using both complete sequences and short sequence snippets (from 400 bp to 25 bp). Our method reaches very similar results to RDP classifier and SVM for complete sequences. The most interesting results are obtained when short sequence snippets are considered. In these conditions the proposed method outperforms RDP and SVM with ultra short sequences and it exhibits a smooth decrease of performance, at every taxonomic level, when the sequence length is decreased. PMID:25916734
NASA Astrophysics Data System (ADS)
Albert, L.; Rottensteiner, F.; Heipke, C.
2015-08-01
Land cover and land use exhibit strong contextual dependencies. We propose a novel approach for the simultaneous classification of land cover and land use, where semantic and spatial context is considered. The image sites for land cover and land use classification form a hierarchy consisting of two layers: a land cover layer and a land use layer. We apply Conditional Random Fields (CRF) at both layers. The layers differ with respect to the image entities corresponding to the nodes, the employed features and the classes to be distinguished. In the land cover layer, the nodes represent super-pixels; in the land use layer, the nodes correspond to objects from a geospatial database. Both CRFs model spatial dependencies between neighbouring image sites. The complex semantic relations between land cover and land use are integrated in the classification process by using contextual features. We propose a new iterative inference procedure for the simultaneous classification of land cover and land use, in which the two classification tasks mutually influence each other. This helps to improve the classification accuracy for certain classes. The main idea of this approach is that semantic context helps to refine the class predictions, which, in turn, leads to more expressive context information. Thus, potentially wrong decisions can be reversed at later stages. The approach is designed for input data based on aerial images. Experiments are carried out on a test site to evaluate the performance of the proposed method. We show the effectiveness of the iterative inference procedure and demonstrate that a smaller size of the super-pixels has a positive influence on the classification result.
NASA Astrophysics Data System (ADS)
Fang, Leyuan; Wang, Chong; Li, Shutao; Yan, Jun; Chen, Xiangdong; Rabbani, Hossein
2017-11-01
We present an automatic method, termed as the principal component analysis network with composite kernel (PCANet-CK), for the classification of three-dimensional (3-D) retinal optical coherence tomography (OCT) images. Specifically, the proposed PCANet-CK method first utilizes the PCANet to automatically learn features from each B-scan of the 3-D retinal OCT images. Then, multiple kernels are separately applied to a set of very important features of the B-scans and these kernels are fused together, which can jointly exploit the correlations among features of the 3-D OCT images. Finally, the fused (composite) kernel is incorporated into an extreme learning machine for the OCT image classification. We tested our proposed algorithm on two real 3-D spectral domain OCT (SD-OCT) datasets (of normal subjects and subjects with the macular edema and age-related macular degeneration), which demonstrated its effectiveness.
Research on cardiovascular disease prediction based on distance metric learning
NASA Astrophysics Data System (ADS)
Ni, Zhuang; Liu, Kui; Kang, Guixia
2018-04-01
Distance metric learning algorithm has been widely applied to medical diagnosis and exhibited its strengths in classification problems. The k-nearest neighbour (KNN) is an efficient method which treats each feature equally. The large margin nearest neighbour classification (LMNN) improves the accuracy of KNN by learning a global distance metric, which did not consider the locality of data distributions. In this paper, we propose a new distance metric algorithm adopting cosine metric and LMNN named COS-SUBLMNN which takes more care about local feature of data to overcome the shortage of LMNN and improve the classification accuracy. The proposed methodology is verified on CVDs patient vector derived from real-world medical data. The Experimental results show that our method provides higher accuracy than KNN and LMNN did, which demonstrates the effectiveness of the Risk predictive model of CVDs based on COS-SUBLMNN.
Classification of microbial α-amylases for food manufacturing using proteinase digestion.
Akiyama, Takumi; Yamazaki, Takeshi; Tada, Atsuko; Ito, Yusai; Otsuki, Noriko; Akiyama, Hiroshi
2014-09-01
Enzymes produced by microorganisms and plants are used as food additives to aid the processing of foods. Identification of the origin of these enzyme products is important for their proper use. Proteinase digestion of α-amylase products, followed by high performance liquid chromatography (HPLC) analysis, was applied to α-amylase from the mold Aspergillus species, the bacteria Bacillus species, and the actinomycetes Saccharomonospora species. Eighteen commercial products of α-amylase were digested with trypsin and endoproteinase Lys-C and HPLC analyzed. For some proteinase/sample combinations, the area of the intact α-amylase peak decreased and new peaks were detected after digestion. The presence and retention times of the novel peaks were used to group the products. The results from this method, called the proteinase digestion-HPLC method, allowed the classification of the α-amylase products into 10 groups, whereas the results from sodium dodecyl sulfate polyacrylamide gel electrophoresis allowed their classification into seven groups.
Characterisation of Feature Points in Eye Fundus Images
NASA Astrophysics Data System (ADS)
Calvo, D.; Ortega, M.; Penedo, M. G.; Rouco, J.
The retinal vessel tree adds decisive knowledge in the diagnosis of numerous opthalmologic pathologies such as hypertension or diabetes. One of the problems in the analysis of the retinal vessel tree is the lack of information in terms of vessels depth as the image acquisition usually leads to a 2D image. This situation provokes a scenario where two different vessels coinciding in a point could be interpreted as a vessel forking into a bifurcation. That is why, for traking and labelling the retinal vascular tree, bifurcations and crossovers of vessels are considered feature points. In this work a novel method for these retinal vessel tree feature points detection and classification is introduced. The method applies image techniques such as filters or thinning to obtain the adequate structure to detect the points and sets a classification of these points studying its environment. The methodology is tested using a standard database and the results show high classification capabilities.
Transient classification in LIGO data using difference boosting neural network
NASA Astrophysics Data System (ADS)
Mukund, N.; Abraham, S.; Kandhasamy, S.; Mitra, S.; Philip, N. S.
2017-05-01
Detection and classification of transients in data from gravitational wave detectors are crucial for efficient searches for true astrophysical events and identification of noise sources. We present a hybrid method for classification of short duration transients seen in gravitational wave data using both supervised and unsupervised machine learning techniques. To train the classifiers, we use the relative wavelet energy and the corresponding entropy obtained by applying one-dimensional wavelet decomposition on the data. The prediction accuracy of the trained classifier on nine simulated classes of gravitational wave transients and also LIGO's sixth science run hardware injections are reported. Targeted searches for a couple of known classes of nonastrophysical signals in the first observational run of Advanced LIGO data are also presented. The ability to accurately identify transient classes using minimal training samples makes the proposed method a useful tool for LIGO detector characterization as well as searches for short duration gravitational wave signals.
A measure of association for ordered categorical data in population-based studies
Nelson, Kerrie P; Edwards, Don
2016-01-01
Ordinal classification scales are commonly used to define a patient’s disease status in screening and diagnostic tests such as mammography. Challenges arise in agreement studies when evaluating the association between many raters’ classifications of patients’ disease or health status when an ordered categorical scale is used. In this paper, we describe a population-based approach and chance-corrected measure of association to evaluate the strength of relationship between multiple raters’ ordinal classifications where any number of raters can be accommodated. In contrast to Shrout and Fleiss’ intraclass correlation coefficient, the proposed measure of association is invariant with respect to changes in disease prevalence. We demonstrate how unique characteristics of individual raters can be explored using random effects. Simulation studies are conducted to demonstrate the properties of the proposed method under varying assumptions. The methods are applied to two large-scale agreement studies of breast cancer screening and prostate cancer severity. PMID:27184590
Topic Identification and Categorization of Public Information in Community-Based Social Media
NASA Astrophysics Data System (ADS)
Kusumawardani, RP; Basri, MH
2017-01-01
This paper presents a work on a semi-supervised method for topic identification and classification of short texts in the social media, and its application on tweets containing dialogues in a large community of dwellers in a city, written mostly in Indonesian. These dialogues comprise a wealth of information about the city, shared in real-time. We found that despite the high irregularity of the language used, and the scarcity of suitable linguistic resources, a meaningful identification of topics could be performed by clustering the tweets using the K-Means algorithm. The resulting clusters are found to be robust enough to be the basis of a classification. On three grouping schemes derived from the clusters, we get accuracy of 95.52%, 95.51%, and 96.7 using linear SVMs, reflecting the applicability of applying this method for generating topic identification and classification on such data.
E-Nose Vapor Identification Based on Dempster-Shafer Fusion of Multiple Classifiers
NASA Technical Reports Server (NTRS)
Li, Winston; Leung, Henry; Kwan, Chiman; Linnell, Bruce R.
2005-01-01
Electronic nose (e-nose) vapor identification is an efficient approach to monitor air contaminants in space stations and shuttles in order to ensure the health and safety of astronauts. Data preprocessing (measurement denoising and feature extraction) and pattern classification are important components of an e-nose system. In this paper, a wavelet-based denoising method is applied to filter the noisy sensor measurements. Transient-state features are then extracted from the denoised sensor measurements, and are used to train multiple classifiers such as multi-layer perceptions (MLP), support vector machines (SVM), k nearest neighbor (KNN), and Parzen classifier. The Dempster-Shafer (DS) technique is used at the end to fuse the results of the multiple classifiers to get the final classification. Experimental analysis based on real vapor data shows that the wavelet denoising method can remove both random noise and outliers successfully, and the classification rate can be improved by using classifier fusion.
NASA Astrophysics Data System (ADS)
Wutsqa, D. U.; Marwah, M.
2017-06-01
In this paper, we consider spatial operation median filter to reduce the noise in the cervical images yielded by colposcopy tool. The backpropagation neural network (BPNN) model is applied to the colposcopy images to classify cervical cancer. The classification process requires an image extraction by using a gray level co-occurrence matrix (GLCM) method to obtain image features that are used as inputs of BPNN model. The advantage of noise reduction is evaluated by comparing the performances of BPNN models with and without spatial operation median filter. The experimental result shows that the spatial operation median filter can improve the accuracy of the BPNN model for cervical cancer classification.
NASA Astrophysics Data System (ADS)
Mlynarczuk, Mariusz; Skiba, Marta
2017-06-01
The correct and consistent identification of the petrographic properties of coal is an important issue for researchers in the fields of mining and geology. As part of the study described in this paper, investigations concerning the application of artificial intelligence methods for the identification of the aforementioned characteristics were carried out. The methods in question were used to identify the maceral groups of coal, i.e. vitrinite, inertinite, and liptinite. Additionally, an attempt was made to identify some non-organic minerals. The analyses were performed using pattern recognition techniques (NN, kNN), as well as artificial neural network techniques (a multilayer perceptron - MLP). The classification process was carried out using microscopy images of polished sections of coals. A multidimensional feature space was defined, which made it possible to classify the discussed structures automatically, based on the methods of pattern recognition and algorithms of the artificial neural networks. Also, from the study we assessed the impact of the parameters for which the applied methods proved effective upon the final outcome of the classification procedure. The result of the analyses was a high percentage (over 97%) of correct classifications of maceral groups and mineral components. The paper discusses also an attempt to analyze particular macerals of the inertinite group. It was demonstrated that using artificial neural networks to this end makes it possible to classify the macerals properly in over 91% of cases. Thus, it was proved that artificial intelligence methods can be successfully applied for the identification of selected petrographic features of coal.
NASA Astrophysics Data System (ADS)
Muszynski, G.; Kashinath, K.; Wehner, M. F.; Prabhat, M.; Kurlin, V.
2017-12-01
We investigate novel approaches to detecting, classifying and characterizing extreme weather events, such as atmospheric rivers (ARs), in large high-dimensional climate datasets. ARs are narrow filaments of concentrated water vapour in the atmosphere that bring much of the precipitation in many mid-latitude regions. The precipitation associated with ARs is also responsible for major flooding events in many coastal regions of the world, including the west coast of the United States and western Europe. In this study we combine ideas from Topological Data Analysis (TDA) with Machine Learning (ML) for detecting, classifying and characterizing extreme weather events, like ARs. TDA is a new field that sits at the interface between topology and computer science, that studies "shape" - hidden topological structure - in raw data. It has been applied successfully in many areas of applied sciences, including complex networks, signal processing and image recognition. Using TDA we provide ARs with a shape characteristic as a new feature descriptor for the task of AR classification. In particular, we track the change in topology in precipitable water (integrated water vapour) fields using the Union-Find algorithm. We use the generated feature descriptors with ML classifiers to establish reliability and classification performance of our approach. We utilize the parallel toolkit for extreme climate events analysis (TECA: Petascale Pattern Recognition for Climate Science, Prabhat et al., Computer Analysis of Images and Patterns, 2015) for comparison (it is assumed that events identified by TECA is ground truth). Preliminary results indicate that our approach brings new insight into the study of ARs and provides quantitative information about the relevance of topological feature descriptors in analyses of a large climate datasets. We illustrate this method on climate model output and NCEP reanalysis datasets. Further, our method outperforms existing methods on detection and classification of ARs. This work illustrates that TDA combined with ML may provide a uniquely powerful approach for detection, classification and characterization of extreme weather phenomena.
NASA Astrophysics Data System (ADS)
Wen, Hongwei; Liu, Yue; Wang, Jieqiong; Zhang, Jishui; Peng, Yun; He, Huiguang
2016-03-01
Tourette syndrome (TS) is a developmental neuropsychiatric disorder with the cardinal symptoms of motor and vocal tics which emerges in early childhood and fluctuates in severity in later years. To date, the neural basis of TS is not fully understood yet and TS has a long-term prognosis that is difficult to accurately estimate. Few studies have looked at the potential of using diffusion tensor imaging (DTI) in conjunction with machine learning algorithms in order to automate the classification of healthy children and TS children. Here we apply Tract-Based Spatial Statistics (TBSS) method to 44 TS children and 48 age and gender matched healthy children in order to extract the diffusion values from each voxel in the white matter (WM) skeleton, and a feature selection algorithm (ReliefF) was used to select the most salient voxels for subsequent classification with support vector machine (SVM). We use a nested cross validation to yield an unbiased assessment of the classification method and prevent overestimation. The accuracy (88.04%), sensitivity (88.64%) and specificity (87.50%) were achieved in our method as peak performance of the SVM classifier was achieved using the axial diffusion (AD) metric, demonstrating the potential of a joint TBSS and SVM pipeline for fast, objective classification of healthy and TS children. These results support that our methods may be useful for the early identification of subjects with TS, and hold promise for predicting prognosis and treatment outcome for individuals with TS.
Evaluation of normalization methods for cDNA microarray data by k-NN classification
Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J
2005-01-01
Background Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Results Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Conclusion Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics. PMID:16045803
Evaluation of normalization methods for cDNA microarray data by k-NN classification.
Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J
2005-07-26
Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics.
Lynn Hedt, Bethany; van Leth, Frank; Zignol, Matteo; Cobelens, Frank; van Gemert, Wayne; Viet Nhung, Nguyen; Lyepshina, Svitlana; Egwaga, Saidi; Cohen, Ted
2012-01-01
Background Current methodology for multidrug-resistant TB (MDR TB) surveys endorsed by the World Health Organization provides estimates of MDR TB prevalence among new cases at the national level. On the aggregate, local variation in the burden of MDR TB may be masked. This paper investigates the utility of applying lot quality-assurance sampling to identify geographic heterogeneity in the proportion of new cases with multidrug resistance. Methods We simulated the performance of lot quality-assurance sampling by applying these classification-based approaches to data collected in the most recent TB drug-resistance surveys in Ukraine, Vietnam, and Tanzania. We explored three classification systems—two-way static, three-way static, and three-way truncated sequential sampling—at two sets of thresholds: low MDR TB = 2%, high MDR TB = 10%, and low MDR TB = 5%, high MDR TB = 20%. Results The lot quality-assurance sampling systems identified local variability in the prevalence of multidrug resistance in both high-resistance (Ukraine) and low-resistance settings (Vietnam). In Tanzania, prevalence was uniformly low, and the lot quality-assurance sampling approach did not reveal variability. The three-way classification systems provide additional information, but sample sizes may not be obtainable in some settings. New rapid drug-sensitivity testing methods may allow truncated sequential sampling designs and early stopping within static designs, producing even greater efficiency gains. Conclusions Lot quality-assurance sampling study designs may offer an efficient approach for collecting critical information on local variability in the burden of multidrug-resistant TB. Before this methodology is adopted, programs must determine appropriate classification thresholds, the most useful classification system, and appropriate weighting if unbiased national estimates are also desired. PMID:22249242
Kernel Wiener filter and its application to pattern recognition.
Yoshino, Hirokazu; Dong, Chen; Washizawa, Yoshikazu; Yamashita, Yukihiko
2010-11-01
The Wiener filter (WF) is widely used for inverse problems. From an observed signal, it provides the best estimated signal with respect to the squared error averaged over the original and the observed signals among linear operators. The kernel WF (KWF), extended directly from WF, has a problem that an additive noise has to be handled by samples. Since the computational complexity of kernel methods depends on the number of samples, a huge computational cost is necessary for the case. By using the first-order approximation of kernel functions, we realize KWF that can handle such a noise not by samples but as a random variable. We also propose the error estimation method for kernel filters by using the approximations. In order to show the advantages of the proposed methods, we conducted the experiments to denoise images and estimate errors. We also apply KWF to classification since KWF can provide an approximated result of the maximum a posteriori classifier that provides the best recognition accuracy. The noise term in the criterion can be used for the classification in the presence of noise or a new regularization to suppress changes in the input space, whereas the ordinary regularization for the kernel method suppresses changes in the feature space. In order to show the advantages of the proposed methods, we conducted experiments of binary and multiclass classifications and classification in the presence of noise.
D Object Classification Based on Thermal and Visible Imagery in Urban Area
NASA Astrophysics Data System (ADS)
Hasani, H.; Samadzadegan, F.
2015-12-01
The spatial distribution of land cover in the urban area especially 3D objects (buildings and trees) is a fundamental dataset for urban planning, ecological research, disaster management, etc. According to recent advances in sensor technologies, several types of remotely sensed data are available from the same area. Data fusion has been widely investigated for integrating different source of data in classification of urban area. Thermal infrared imagery (TIR) contains information on emitted radiation and has unique radiometric properties. However, due to coarse spatial resolution of thermal data, its application has been restricted in urban areas. On the other hand, visible image (VIS) has high spatial resolution and information in visible spectrum. Consequently, there is a complementary relation between thermal and visible imagery in classification of urban area. This paper evaluates the potential of aerial thermal hyperspectral and visible imagery fusion in classification of urban area. In the pre-processing step, thermal imagery is resampled to the spatial resolution of visible image. Then feature level fusion is applied to construct hybrid feature space include visible bands, thermal hyperspectral bands, spatial and texture features and moreover Principle Component Analysis (PCA) transformation is applied to extract PCs. Due to high dimensionality of feature space, dimension reduction method is performed. Finally, Support Vector Machines (SVMs) classify the reduced hybrid feature space. The obtained results show using thermal imagery along with visible imagery, improved the classification accuracy up to 8% respect to visible image classification.
Siddique, Juned; Ruhnke, Gregory W.; Flores, Andrea; Prochaska, Micah T.; Paesch, Elizabeth; Meltzer, David O.; Whelan, Chad T.
2015-01-01
Background Lower gastrointestinal bleeding (LGIB) is a common cause of acute hospitalization. Currently, there is no accepted standard for identifying patients with LGIB in hospital administrative data. The objective of this study was to develop and validate a set of classification algorithms that use hospital administrative data to identify LGIB. Methods Our sample consists of patients admitted between July 1, 2001 and June 30, 2003 (derivation cohort) and July 1, 2003 and June 30, 2005 (validation cohort) to the general medicine inpatient service of the University of Chicago Hospital, a large urban academic medical center. Confirmed cases of LGIB in both cohorts were determined by reviewing the charts of those patients who had at least 1 of 36 principal or secondary International Classification of Diseases, Ninth revision, Clinical Modification (ICD-9-CM) diagnosis codes associated with LGIB. Classification trees were used on the data of the derivation cohort to develop a set of decision rules for identifying patients with LGIB. These rules were then applied to the validation cohort to assess their performance. Results Three classification algorithms were identified and validated: a high specificity rule with 80.1% sensitivity and 95.8% specificity, a rule that balances sensitivity and specificity (87.8% sensitivity, 90.9% specificity), and a high sensitivity rule with 100% sensitivity and 91.0% specificity. Conclusion These classification algorithms can be used in future studies to evaluate resource utilization and assess outcomes associated with LGIB without the use of chart review. PMID:26406318
Deep Recurrent Neural Networks for Supernovae Classification
NASA Astrophysics Data System (ADS)
Charnock, Tom; Moss, Adam
2017-03-01
We apply deep recurrent neural networks, which are capable of learning complex sequential information, to classify supernovae (code available at https://github.com/adammoss/supernovae). The observational time and filter fluxes are used as inputs to the network, but since the inputs are agnostic, additional data such as host galaxy information can also be included. Using the Supernovae Photometric Classification Challenge (SPCC) data, we find that deep networks are capable of learning about light curves, however the performance of the network is highly sensitive to the amount of training data. For a training size of 50% of the representational SPCC data set (around 104 supernovae) we obtain a type-Ia versus non-type-Ia classification accuracy of 94.7%, an area under the Receiver Operating Characteristic curve AUC of 0.986 and an SPCC figure-of-merit F 1 = 0.64. When using only the data for the early-epoch challenge defined by the SPCC, we achieve a classification accuracy of 93.1%, AUC of 0.977, and F 1 = 0.58, results almost as good as with the whole light curve. By employing bidirectional neural networks, we can acquire impressive classification results between supernovae types I, II and III at an accuracy of 90.4% and AUC of 0.974. We also apply a pre-trained model to obtain classification probabilities as a function of time and show that it can give early indications of supernovae type. Our method is competitive with existing algorithms and has applications for future large-scale photometric surveys.
A rule-based automatic sleep staging method.
Liang, Sheng-Fu; Kuo, Chin-En; Hu, Yu-Han; Cheng, Yu-Shian
2012-03-30
In this paper, a rule-based automatic sleep staging method was proposed. Twelve features including temporal and spectrum analyses of the EEG, EOG, and EMG signals were utilized. Normalization was applied to each feature to eliminating individual differences. A hierarchical decision tree with fourteen rules was constructed for sleep stage classification. Finally, a smoothing process considering the temporal contextual information was applied for the continuity. The overall agreement and kappa coefficient of the proposed method applied to the all night polysomnography (PSG) of seventeen healthy subjects compared with the manual scorings by R&K rules can reach 86.68% and 0.79, respectively. This method can integrate with portable PSG system for sleep evaluation at-home in the near future. Copyright © 2012 Elsevier B.V. All rights reserved.
Improved Sparse Multi-Class SVM and Its Application for Gene Selection in Cancer Classification
Huang, Lingkang; Zhang, Hao Helen; Zeng, Zhao-Bang; Bushel, Pierre R.
2013-01-01
Background Microarray techniques provide promising tools for cancer diagnosis using gene expression profiles. However, molecular diagnosis based on high-throughput platforms presents great challenges due to the overwhelming number of variables versus the small sample size and the complex nature of multi-type tumors. Support vector machines (SVMs) have shown superior performance in cancer classification due to their ability to handle high dimensional low sample size data. The multi-class SVM algorithm of Crammer and Singer provides a natural framework for multi-class learning. Despite its effective performance, the procedure utilizes all variables without selection. In this paper, we propose to improve the procedure by imposing shrinkage penalties in learning to enforce solution sparsity. Results The original multi-class SVM of Crammer and Singer is effective for multi-class classification but does not conduct variable selection. We improved the method by introducing soft-thresholding type penalties to incorporate variable selection into multi-class classification for high dimensional data. The new methods were applied to simulated data and two cancer gene expression data sets. The results demonstrate that the new methods can select a small number of genes for building accurate multi-class classification rules. Furthermore, the important genes selected by the methods overlap significantly, suggesting general agreement among different variable selection schemes. Conclusions High accuracy and sparsity make the new methods attractive for cancer diagnostics with gene expression data and defining targets of therapeutic intervention. Availability: The source MATLAB code are available from http://math.arizona.edu/~hzhang/software.html. PMID:23966761
Classification algorithm of lung lobe for lung disease cases based on multislice CT images
NASA Astrophysics Data System (ADS)
Matsuhiro, M.; Kawata, Y.; Niki, N.; Nakano, Y.; Mishima, M.; Ohmatsu, H.; Tsuchida, T.; Eguchi, K.; Kaneko, M.; Moriyama, N.
2011-03-01
With the development of multi-slice CT technology, to obtain an accurate 3D image of lung field in a short time is possible. To support that, a lot of image processing methods need to be developed. In clinical setting for diagnosis of lung cancer, it is important to study and analyse lung structure. Therefore, classification of lung lobe provides useful information for lung cancer analysis. In this report, we describe algorithm which classify lungs into lung lobes for lung disease cases from multi-slice CT images. The classification algorithm of lung lobes is efficiently carried out using information of lung blood vessel, bronchus, and interlobar fissure. Applying the classification algorithms to multi-slice CT images of 20 normal cases and 5 lung disease cases, we demonstrate the usefulness of the proposed algorithms.
[A New Distance Metric between Different Stellar Spectra: the Residual Distribution Distance].
Liu, Jie; Pan, Jing-chang; Luo, A-li; Wei, Peng; Liu, Meng
2015-12-01
Distance metric is an important issue for the spectroscopic survey data processing, which defines a calculation method of the distance between two different spectra. Based on this, the classification, clustering, parameter measurement and outlier data mining of spectral data can be carried out. Therefore, the distance measurement method has some effect on the performance of the classification, clustering, parameter measurement and outlier data mining. With the development of large-scale stellar spectral sky surveys, how to define more efficient distance metric on stellar spectra has become a very important issue in the spectral data processing. Based on this problem and fully considering of the characteristics and data features of the stellar spectra, a new distance measurement method of stellar spectra named Residual Distribution Distance is proposed. While using this method to measure the distance, the two spectra are firstly scaled and then the standard deviation of the residual is used the distance. Different from the traditional distance metric calculation methods of stellar spectra, when used to calculate the distance between stellar spectra, this method normalize the two spectra to the same scale, and then calculate the residual corresponding to the same wavelength, and the standard error of the residual spectrum is used as the distance measure. The distance measurement method can be used for stellar classification, clustering and stellar atmospheric physical parameters measurement and so on. This paper takes stellar subcategory classification as an example to test the distance measure method. The results show that the distance defined by the proposed method is more effective to describe the gap between different types of spectra in the classification than other methods, which can be well applied in other related applications. At the same time, this paper also studies the effect of the signal to noise ratio (SNR) on the performance of the proposed method. The result show that the distance is affected by the SNR. The smaller the signal-to-noise ratio is, the greater impact is on the distance; While SNR is larger than 10, the signal-to-noise ratio has little effect on the performance for the classification.
NASA Astrophysics Data System (ADS)
Wu, Yu; Zheng, Lijuan; Xie, Donghai; Zhong, Ruofei
2017-07-01
In this study, the extended morphological attribute profiles (EAPs) and independent component analysis (ICA) were combined for feature extraction of high-resolution multispectral satellite remote sensing images and the regularized least squares (RLS) approach with the radial basis function (RBF) kernel was further applied for the classification. Based on the major two independent components, the geometrical features were extracted using the EAPs method. In this study, three morphological attributes were calculated and extracted for each independent component, including area, standard deviation, and moment of inertia. The extracted geometrical features classified results using RLS approach and the commonly used LIB-SVM library of support vector machines method. The Worldview-3 and Chinese GF-2 multispectral images were tested, and the results showed that the features extracted by EAPs and ICA can effectively improve the accuracy of the high-resolution multispectral image classification, 2% larger than EAPs and principal component analysis (PCA) method, and 6% larger than APs and original high-resolution multispectral data. Moreover, it is also suggested that both the GURLS and LIB-SVM libraries are well suited for the multispectral remote sensing image classification. The GURLS library is easy to be used with automatic parameter selection but its computation time may be larger than the LIB-SVM library. This study would be helpful for the classification application of high-resolution multispectral satellite remote sensing images.
Patch-based Convolutional Neural Network for Whole Slide Tissue Image Classification
Hou, Le; Samaras, Dimitris; Kurc, Tahsin M.; Gao, Yi; Davis, James E.; Saltz, Joel H.
2016-01-01
Convolutional Neural Networks (CNN) are state-of-the-art models for many image classification tasks. However, to recognize cancer subtypes automatically, training a CNN on gigapixel resolution Whole Slide Tissue Images (WSI) is currently computationally impossible. The differentiation of cancer subtypes is based on cellular-level visual features observed on image patch scale. Therefore, we argue that in this situation, training a patch-level classifier on image patches will perform better than or similar to an image-level classifier. The challenge becomes how to intelligently combine patch-level classification results and model the fact that not all patches will be discriminative. We propose to train a decision fusion model to aggregate patch-level predictions given by patch-level CNNs, which to the best of our knowledge has not been shown before. Furthermore, we formulate a novel Expectation-Maximization (EM) based method that automatically locates discriminative patches robustly by utilizing the spatial relationships of patches. We apply our method to the classification of glioma and non-small-cell lung carcinoma cases into subtypes. The classification accuracy of our method is similar to the inter-observer agreement between pathologists. Although it is impossible to train CNNs on WSIs, we experimentally demonstrate using a comparable non-cancer dataset of smaller images that a patch-based CNN can outperform an image-based CNN. PMID:27795661
Waveform fitting and geometry analysis for full-waveform lidar feature extraction
NASA Astrophysics Data System (ADS)
Tsai, Fuan; Lai, Jhe-Syuan; Cheng, Yi-Hsiu
2016-10-01
This paper presents a systematic approach that integrates spline curve fitting and geometry analysis to extract full-waveform LiDAR features for land-cover classification. The cubic smoothing spline algorithm is used to fit the waveform curve of the received LiDAR signals. After that, the local peak locations of the waveform curve are detected using a second derivative method. According to the detected local peak locations, commonly used full-waveform features such as full width at half maximum (FWHM) and amplitude can then be obtained. In addition, the number of peaks, time difference between the first and last peaks, and the average amplitude are also considered as features of LiDAR waveforms with multiple returns. Based on the waveform geometry, dynamic time-warping (DTW) is applied to measure the waveform similarity. The sum of the absolute amplitude differences that remain after time-warping can be used as a similarity feature in a classification procedure. An airborne full-waveform LiDAR data set was used to test the performance of the developed feature extraction method for land-cover classification. Experimental results indicate that the developed spline curve- fitting algorithm and geometry analysis can extract helpful full-waveform LiDAR features to produce better land-cover classification than conventional LiDAR data and feature extraction methods. In particular, the multiple-return features and the dynamic time-warping index can improve the classification results significantly.
NASA Astrophysics Data System (ADS)
Wilson, Machelle; Ustin, Susan L.; Rocke, David
2003-03-01
Remote sensing technologies with high spatial and spectral resolution show a great deal of promise in addressing critical environmental monitoring issues, but the ability to analyze and interpret the data lags behind the technology. Robust analytical methods are required before the wealth of data available through remote sensing can be applied to a wide range of environmental problems for which remote detection is the best method. In this study we compare the classification effectiveness of two relatively new techniques on data consisting of leaf-level reflectance from plants that have been exposed to varying levels of heavy metal toxicity. If these methodologies work well on leaf-level data, then there is some hope that they will also work well on data from airborne and space-borne platforms. The classification methods compared were support vector machine classification of exposed and non-exposed plants based on the reflectance data, and partial east squares compression of the reflectance data followed by classification using logistic discrimination (PLS/LD). PLS/LD was performed in two ways. We used the continuous concentration data as the response during compression, and then used the binary response required during logistic discrimination. We also used a binary response during compression followed by logistic discrimination. The statistics we used to compare the effectiveness of the methodologies was the leave-one-out cross validation estimate of the prediction error.
Falgreen, Steffen; Ellern Bilgrau, Anders; Brøndum, Rasmus Froberg; Hjort Jakobsen, Lasse; Have, Jonas; Lindblad Nielsen, Kasper; El-Galaly, Tarec Christoffer; Bødker, Julie Støve; Schmitz, Alexander; H Young, Ken; Johnsen, Hans Erik; Dybkær, Karen; Bøgsted, Martin
2016-01-01
Dozens of omics based cancer classification systems have been introduced with prognostic, diagnostic, and predictive capabilities. However, they often employ complex algorithms and are only applicable on whole cohorts of patients, making them difficult to apply in a personalized clinical setting. This prompted us to create hemaClass.org, an online web application providing an easy interface to one-by-one RMA normalization of microarrays and subsequent risk classifications of diffuse large B-cell lymphoma (DLBCL) into cell-of-origin and chemotherapeutic sensitivity classes. Classification results for one-by-one array pre-processing with and without a laboratory specific RMA reference dataset were compared to cohort based classifiers in 4 publicly available datasets. Classifications showed high agreement between one-by-one and whole cohort pre-processsed data when a laboratory specific reference set was supplied. The website is essentially the R-package hemaClass accompanied by a Shiny web application. The well-documented package can be used to run the website locally or to use the developed methods programmatically. The website and R-package is relevant for biological and clinical lymphoma researchers using affymetrix U-133 Plus 2 arrays, as it provides reliable and swift methods for calculation of disease subclasses. The proposed one-by-one pre-processing method is relevant for all researchers using microarrays.
NASA Astrophysics Data System (ADS)
Benaouda, D.; Wadge, G.; Whitmarsh, R. B.; Rothwell, R. G.; MacLeod, C.
1999-02-01
In boreholes with partial or no core recovery, interpretations of lithology in the remainder of the hole are routinely attempted using data from downhole geophysical sensors. We present a practical neural net-based technique that greatly enhances lithological interpretation in holes with partial core recovery by using downhole data to train classifiers to give a global classification scheme for those parts of the borehole for which no core was retrieved. We describe the system and its underlying methods of data exploration, selection and classification, and present a typical example of the system in use. Although the technique is equally applicable to oil industry boreholes, we apply it here to an Ocean Drilling Program (ODP) borehole (Hole 792E, Izu-Bonin forearc, a mixture of volcaniclastic sandstones, conglomerates and claystones). The quantitative benefits of quality-control measures and different subsampling strategies are shown. Direct comparisons between a number of discriminant analysis methods and the use of neural networks with back-propagation of error are presented. The neural networks perform better than the discriminant analysis techniques both in terms of performance rates with test data sets (2-3 per cent better) and in qualitative correlation with non-depth-matched core. We illustrate with the Hole 792E data how vital it is to have a system that permits the number and membership of training classes to be changed as analysis proceeds. The initial classification for Hole 792E evolved from a five-class to a three-class and then to a four-class scheme with resultant classification performance rates for the back-propagation neural network method of 83, 84 and 93 per cent respectively.
Guo, Hao; Zhang, Fan; Chen, Junjie; Xu, Yong; Xiang, Jie
2017-01-01
Exploring functional interactions among various brain regions is helpful for understanding the pathological underpinnings of neurological disorders. Brain networks provide an important representation of those functional interactions, and thus are widely applied in the diagnosis and classification of neurodegenerative diseases. Many mental disorders involve a sharp decline in cognitive ability as a major symptom, which can be caused by abnormal connectivity patterns among several brain regions. However, conventional functional connectivity networks are usually constructed based on pairwise correlations among different brain regions. This approach ignores higher-order relationships, and cannot effectively characterize the high-order interactions of many brain regions working together. Recent neuroscience research suggests that higher-order relationships between brain regions are important for brain network analysis. Hyper-networks have been proposed that can effectively represent the interactions among brain regions. However, this method extracts the local properties of brain regions as features, but ignores the global topology information, which affects the evaluation of network topology and reduces the performance of the classifier. This problem can be compensated by a subgraph feature-based method, but it is not sensitive to change in a single brain region. Considering that both of these feature extraction methods result in the loss of information, we propose a novel machine learning classification method that combines multiple features of a hyper-network based on functional magnetic resonance imaging in Alzheimer's disease. The method combines the brain region features and subgraph features, and then uses a multi-kernel SVM for classification. This retains not only the global topological information, but also the sensitivity to change in a single brain region. To certify the proposed method, 28 normal control subjects and 38 Alzheimer's disease patients were selected to participate in an experiment. The proposed method achieved satisfactory classification accuracy, with an average of 91.60%. The abnormal brain regions included the bilateral precuneus, right parahippocampal gyrus\\hippocampus, right posterior cingulate gyrus, and other regions that are known to be important in Alzheimer's disease. Machine learning classification combining multiple features of a hyper-network of functional magnetic resonance imaging data in Alzheimer's disease obtains better classification performance. PMID:29209156
Elastic SCAD as a novel penalization method for SVM classification tasks in high-dimensional data.
Becker, Natalia; Toedt, Grischa; Lichter, Peter; Benner, Axel
2011-05-09
Classification and variable selection play an important role in knowledge discovery in high-dimensional data. Although Support Vector Machine (SVM) algorithms are among the most powerful classification and prediction methods with a wide range of scientific applications, the SVM does not include automatic feature selection and therefore a number of feature selection procedures have been developed. Regularisation approaches extend SVM to a feature selection method in a flexible way using penalty functions like LASSO, SCAD and Elastic Net.We propose a novel penalty function for SVM classification tasks, Elastic SCAD, a combination of SCAD and ridge penalties which overcomes the limitations of each penalty alone.Since SVM models are extremely sensitive to the choice of tuning parameters, we adopted an interval search algorithm, which in comparison to a fixed grid search finds rapidly and more precisely a global optimal solution. Feature selection methods with combined penalties (Elastic Net and Elastic SCAD SVMs) are more robust to a change of the model complexity than methods using single penalties. Our simulation study showed that Elastic SCAD SVM outperformed LASSO (L1) and SCAD SVMs. Moreover, Elastic SCAD SVM provided sparser classifiers in terms of median number of features selected than Elastic Net SVM and often better predicted than Elastic Net in terms of misclassification error.Finally, we applied the penalization methods described above on four publicly available breast cancer data sets. Elastic SCAD SVM was the only method providing robust classifiers in sparse and non-sparse situations. The proposed Elastic SCAD SVM algorithm provides the advantages of the SCAD penalty and at the same time avoids sparsity limitations for non-sparse data. We were first to demonstrate that the integration of the interval search algorithm and penalized SVM classification techniques provides fast solutions on the optimization of tuning parameters.The penalized SVM classification algorithms as well as fixed grid and interval search for finding appropriate tuning parameters were implemented in our freely available R package 'penalizedSVM'.We conclude that the Elastic SCAD SVM is a flexible and robust tool for classification and feature selection tasks for high-dimensional data such as microarray data sets.
Elastic SCAD as a novel penalization method for SVM classification tasks in high-dimensional data
2011-01-01
Background Classification and variable selection play an important role in knowledge discovery in high-dimensional data. Although Support Vector Machine (SVM) algorithms are among the most powerful classification and prediction methods with a wide range of scientific applications, the SVM does not include automatic feature selection and therefore a number of feature selection procedures have been developed. Regularisation approaches extend SVM to a feature selection method in a flexible way using penalty functions like LASSO, SCAD and Elastic Net. We propose a novel penalty function for SVM classification tasks, Elastic SCAD, a combination of SCAD and ridge penalties which overcomes the limitations of each penalty alone. Since SVM models are extremely sensitive to the choice of tuning parameters, we adopted an interval search algorithm, which in comparison to a fixed grid search finds rapidly and more precisely a global optimal solution. Results Feature selection methods with combined penalties (Elastic Net and Elastic SCAD SVMs) are more robust to a change of the model complexity than methods using single penalties. Our simulation study showed that Elastic SCAD SVM outperformed LASSO (L1) and SCAD SVMs. Moreover, Elastic SCAD SVM provided sparser classifiers in terms of median number of features selected than Elastic Net SVM and often better predicted than Elastic Net in terms of misclassification error. Finally, we applied the penalization methods described above on four publicly available breast cancer data sets. Elastic SCAD SVM was the only method providing robust classifiers in sparse and non-sparse situations. Conclusions The proposed Elastic SCAD SVM algorithm provides the advantages of the SCAD penalty and at the same time avoids sparsity limitations for non-sparse data. We were first to demonstrate that the integration of the interval search algorithm and penalized SVM classification techniques provides fast solutions on the optimization of tuning parameters. The penalized SVM classification algorithms as well as fixed grid and interval search for finding appropriate tuning parameters were implemented in our freely available R package 'penalizedSVM'. We conclude that the Elastic SCAD SVM is a flexible and robust tool for classification and feature selection tasks for high-dimensional data such as microarray data sets. PMID:21554689
Inferring Human Activity Recognition with Ambient Sound on Wireless Sensor Nodes.
Salomons, Etto L; Havinga, Paul J M; van Leeuwen, Henk
2016-09-27
A wireless sensor network that consists of nodes with a sound sensor can be used to obtain context awareness in home environments. However, the limited processing power of wireless nodes offers a challenge when extracting features from the signal, and subsequently, classifying the source. Although multiple papers can be found on different methods of sound classification, none of these are aimed at limited hardware or take the efficiency of the algorithms into account. In this paper, we compare and evaluate several classification methods on a real sensor platform using different feature types and classifiers, in order to find an approach that results in a good classifier that can run on limited hardware. To be as realistic as possible, we trained our classifiers using sound waves from many different sources. We conclude that despite the fact that the classifiers are often of low quality due to the highly restricted hardware resources, sufficient performance can be achieved when (1) the window length for our classifiers is increased, and (2) if we apply a two-step approach that uses a refined classification after a global classification has been performed.
Automatic Estimation of Osteoporotic Fracture Cases by Using Ensemble Learning Approaches.
Kilic, Niyazi; Hosgormez, Erkan
2016-03-01
Ensemble learning methods are one of the most powerful tools for the pattern classification problems. In this paper, the effects of ensemble learning methods and some physical bone densitometry parameters on osteoporotic fracture detection were investigated. Six feature set models were constructed including different physical parameters and they fed into the ensemble classifiers as input features. As ensemble learning techniques, bagging, gradient boosting and random subspace (RSM) were used. Instance based learning (IBk) and random forest (RF) classifiers applied to six feature set models. The patients were classified into three groups such as osteoporosis, osteopenia and control (healthy), using ensemble classifiers. Total classification accuracy and f-measure were also used to evaluate diagnostic performance of the proposed ensemble classification system. The classification accuracy has reached to 98.85 % by the combination of model 6 (five BMD + five T-score values) using RSM-RF classifier. The findings of this paper suggest that the patients will be able to be warned before a bone fracture occurred, by just examining some physical parameters that can easily be measured without invasive operations.
An alternative respiratory sounds classification system utilizing artificial neural networks.
Oweis, Rami J; Abdulhay, Enas W; Khayal, Amer; Awad, Areen
2015-01-01
Computerized lung sound analysis involves recording lung sound via an electronic device, followed by computer analysis and classification based on specific signal characteristics as non-linearity and nonstationarity caused by air turbulence. An automatic analysis is necessary to avoid dependence on expert skills. This work revolves around exploiting autocorrelation in the feature extraction stage. All process stages were implemented in MATLAB. The classification process was performed comparatively using both artificial neural networks (ANNs) and adaptive neuro-fuzzy inference systems (ANFIS) toolboxes. The methods have been applied to 10 different respiratory sounds for classification. The ANN was superior to the ANFIS system and returned superior performance parameters. Its accuracy, specificity, and sensitivity were 98.6%, 100%, and 97.8%, respectively. The obtained parameters showed superiority to many recent approaches. The promising proposed method is an efficient fast tool for the intended purpose as manifested in the performance parameters, specifically, accuracy, specificity, and sensitivity. Furthermore, it may be added that utilizing the autocorrelation function in the feature extraction in such applications results in enhanced performance and avoids undesired computation complexities compared to other techniques.
Using reconstructed IVUS images for coronary plaque classification.
Caballero, Karla L; Barajas, Joel; Pujol, Oriol; Rodriguez, Oriol; Radeva, Petia
2007-01-01
Coronary plaque rupture is one of the principal causes of sudden death in western societies. Reliable diagnostic of the different plaque types are of great interest for the medical community the predicting their evolution and applying an effective treatment. To achieve this, a tissue classification must be performed. Intravascular Ultrasound (IVUS) represents a technique to explore the vessel walls and to observe its histological properties. In this paper, a method to reconstruct IVUS images from the raw Radio Frequency (RF) data coming from ultrasound catheter is proposed. This framework offers a normalization scheme to compare accurately different patient studies. The automatic tissue classification is based on texture analysis and Adapting Boosting (Adaboost) learning technique combined with Error Correcting Output Codes (ECOC). In this study, 9 in-vivo cases are reconstructed with 7 different parameter set. This method improves the classification rate based on images, yielding a 91% of well-detected tissue using the best parameter set. It also reduces the inter-patient variability compared with the analysis of DICOM images, which are obtained from the commercial equipment.
NASA Astrophysics Data System (ADS)
Hu, Yan-Yan; Li, Dong-Sheng
2016-01-01
The hyperspectral images(HSI) consist of many closely spaced bands carrying the most object information. While due to its high dimensionality and high volume nature, it is hard to get satisfactory classification performance. In order to reduce HSI data dimensionality preparation for high classification accuracy, it is proposed to combine a band selection method of artificial immune systems (AIS) with a hybrid kernels support vector machine (SVM-HK) algorithm. In fact, after comparing different kernels for hyperspectral analysis, the approach mixed radial basis function kernel (RBF-K) with sigmoid kernel (Sig-K) and applied the optimized hybrid kernels in SVM classifiers. Then the SVM-HK algorithm used to induce the bands selection of an improved version of AIS. The AIS was composed of clonal selection and elite antibody mutation, including evaluation process with optional index factor (OIF). Experimental classification performance was on a San Diego Naval Base acquired by AVIRIS, the HRS dataset shows that the method is able to efficiently achieve bands redundancy removal while outperforming the traditional SVM classifier.
Lee, Seokho; Shin, Hyejin; Lee, Sang Han
2016-12-01
Alzheimer's disease (AD) is usually diagnosed by clinicians through cognitive and functional performance test with a potential risk of misdiagnosis. Since the progression of AD is known to cause structural changes in the corpus callosum (CC), the CC thickness can be used as a functional covariate in AD classification problem for a diagnosis. However, misclassified class labels negatively impact the classification performance. Motivated by AD-CC association studies, we propose a logistic regression for functional data classification that is robust to misdiagnosis or label noise. Specifically, our logistic regression model is constructed by adopting individual intercepts to functional logistic regression model. This approach enables to indicate which observations are possibly mislabeled and also lead to a robust and efficient classifier. An effective algorithm using MM algorithm provides simple closed-form update formulas. We test our method using synthetic datasets to demonstrate its superiority over an existing method, and apply it to differentiating patients with AD from healthy normals based on CC from MRI. © 2016, The International Biometric Society.
NASA Astrophysics Data System (ADS)
Su, Lihong
In remote sensing communities, support vector machine (SVM) learning has recently received increasing attention. SVM learning usually requires large memory and enormous amounts of computation time on large training sets. According to SVM algorithms, the SVM classification decision function is fully determined by support vectors, which compose a subset of the training sets. In this regard, a solution to optimize SVM learning is to efficiently reduce training sets. In this paper, a data reduction method based on agglomerative hierarchical clustering is proposed to obtain smaller training sets for SVM learning. Using a multiple angle remote sensing dataset of a semi-arid region, the effectiveness of the proposed method is evaluated by classification experiments with a series of reduced training sets. The experiments show that there is no loss of SVM accuracy when the original training set is reduced to 34% using the proposed approach. Maximum likelihood classification (MLC) also is applied on the reduced training sets. The results show that MLC can also maintain the classification accuracy. This implies that the most informative data instances can be retained by this approach.
CNN for breaking text-based CAPTCHA with noise
NASA Astrophysics Data System (ADS)
Liu, Kaixuan; Zhang, Rong; Qing, Ke
2017-07-01
A CAPTCHA ("Completely Automated Public Turing test to tell Computers and Human Apart") system is a program that most humans can pass but current computer programs could hardly pass. As the most common type of CAPTCHAs , text-based CAPTCHA has been widely used in different websites to defense network bots. In order to breaking textbased CAPTCHA, in this paper, two trained CNN models are connected for the segmentation and classification of CAPTCHA images. Then base on these two models, we apply sliding window segmentation and voting classification methods realize an end-to-end CAPTCHA breaking system with high success rate. The experiment results show that our method is robust and effective in breaking text-based CAPTCHA with noise.
Deep neural network and noise classification-based speech enhancement
NASA Astrophysics Data System (ADS)
Shi, Wenhua; Zhang, Xiongwei; Zou, Xia; Han, Wei
2017-07-01
In this paper, a speech enhancement method using noise classification and Deep Neural Network (DNN) was proposed. Gaussian mixture model (GMM) was employed to determine the noise type in speech-absent frames. DNN was used to model the relationship between noisy observation and clean speech. Once the noise type was determined, the corresponding DNN model was applied to enhance the noisy speech. GMM was trained with mel-frequency cepstrum coefficients (MFCC) and the parameters were estimated with an iterative expectation-maximization (EM) algorithm. Noise type was updated by spectrum entropy-based voice activity detection (VAD). Experimental results demonstrate that the proposed method could achieve better objective speech quality and smaller distortion under stationary and non-stationary conditions.
Machine learning in APOGEE. Unsupervised spectral classification with K-means
NASA Astrophysics Data System (ADS)
Garcia-Dias, Rafael; Allende Prieto, Carlos; Sánchez Almeida, Jorge; Ordovás-Pascual, Ignacio
2018-05-01
Context. The volume of data generated by astronomical surveys is growing rapidly. Traditional analysis techniques in spectroscopy either demand intensive human interaction or are computationally expensive. In this scenario, machine learning, and unsupervised clustering algorithms in particular, offer interesting alternatives. The Apache Point Observatory Galactic Evolution Experiment (APOGEE) offers a vast data set of near-infrared stellar spectra, which is perfect for testing such alternatives. Aims: Our research applies an unsupervised classification scheme based on K-means to the massive APOGEE data set. We explore whether the data are amenable to classification into discrete classes. Methods: We apply the K-means algorithm to 153 847 high resolution spectra (R ≈ 22 500). We discuss the main virtues and weaknesses of the algorithm, as well as our choice of parameters. Results: We show that a classification based on normalised spectra captures the variations in stellar atmospheric parameters, chemical abundances, and rotational velocity, among other factors. The algorithm is able to separate the bulge and halo populations, and distinguish dwarfs, sub-giants, RC, and RGB stars. However, a discrete classification in flux space does not result in a neat organisation in the parameters' space. Furthermore, the lack of obvious groups in flux space causes the results to be fairly sensitive to the initialisation, and disrupts the efficiency of commonly-used methods to select the optimal number of clusters. Our classification is publicly available, including extensive online material associated with the APOGEE Data Release 12 (DR12). Conclusions: Our description of the APOGEE database can help greatly with the identification of specific types of targets for various applications. We find a lack of obvious groups in flux space, and identify limitations of the K-means algorithm in dealing with this kind of data. Full Tables B.1-B.4 are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/612/A98
McAllister, Patrick; Zheng, Huiru; Bond, Raymond; Moorhead, Anne
2018-04-01
Obesity is increasing worldwide and can cause many chronic conditions such as type-2 diabetes, heart disease, sleep apnea, and some cancers. Monitoring dietary intake through food logging is a key method to maintain a healthy lifestyle to prevent and manage obesity. Computer vision methods have been applied to food logging to automate image classification for monitoring dietary intake. In this work we applied pretrained ResNet-152 and GoogleNet convolutional neural networks (CNNs), initially trained using ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset with MatConvNet package, to extract features from food image datasets; Food 5K, Food-11, RawFooT-DB, and Food-101. Deep features were extracted from CNNs and used to train machine learning classifiers including artificial neural network (ANN), support vector machine (SVM), Random Forest, and Naive Bayes. Results show that using ResNet-152 deep features with SVM with RBF kernel can accurately detect food items with 99.4% accuracy using Food-5K validation food image dataset and 98.8% with Food-5K evaluation dataset using ANN, SVM-RBF, and Random Forest classifiers. Trained with ResNet-152 features, ANN can achieve 91.34%, 99.28% when applied to Food-11 and RawFooT-DB food image datasets respectively and SVM with RBF kernel can achieve 64.98% with Food-101 image dataset. From this research it is clear that using deep CNN features can be used efficiently for diverse food item image classification. The work presented in this research shows that pretrained ResNet-152 features provide sufficient generalisation power when applied to a range of food image classification tasks. Copyright © 2018 Elsevier Ltd. All rights reserved.
Data preprocessing methods of FT-NIR spectral data for the classification cooking oil
NASA Astrophysics Data System (ADS)
Ruah, Mas Ezatul Nadia Mohd; Rasaruddin, Nor Fazila; Fong, Sim Siong; Jaafar, Mohd Zuli
2014-12-01
This recent work describes the data pre-processing method of FT-NIR spectroscopy datasets of cooking oil and its quality parameters with chemometrics method. Pre-processing of near-infrared (NIR) spectral data has become an integral part of chemometrics modelling. Hence, this work is dedicated to investigate the utility and effectiveness of pre-processing algorithms namely row scaling, column scaling and single scaling process with Standard Normal Variate (SNV). The combinations of these scaling methods have impact on exploratory analysis and classification via Principle Component Analysis plot (PCA). The samples were divided into palm oil and non-palm cooking oil. The classification model was build using FT-NIR cooking oil spectra datasets in absorbance mode at the range of 4000cm-1-14000cm-1. Savitzky Golay derivative was applied before developing the classification model. Then, the data was separated into two sets which were training set and test set by using Duplex method. The number of each class was kept equal to 2/3 of the class that has the minimum number of sample. Then, the sample was employed t-statistic as variable selection method in order to select which variable is significant towards the classification models. The evaluation of data pre-processing were looking at value of modified silhouette width (mSW), PCA and also Percentage Correctly Classified (%CC). The results show that different data processing strategies resulting to substantial amount of model performances quality. The effects of several data pre-processing i.e. row scaling, column standardisation and single scaling process with Standard Normal Variate indicated by mSW and %CC. At two PCs model, all five classifier gave high %CC except Quadratic Distance Analysis.
Integration of Network Topological and Connectivity Properties for Neuroimaging Classification
Jie, Biao; Gao, Wei; Wang, Qian; Wee, Chong-Yaw
2014-01-01
Rapid advances in neuroimaging techniques have provided an efficient and noninvasive way for exploring the structural and functional connectivity of the human brain. Quantitative measurement of abnormality of brain connectivity in patients with neurodegenerative diseases, such as mild cognitive impairment (MCI) and Alzheimer’s disease (AD), have also been widely reported, especially at a group level. Recently, machine learning techniques have been applied to the study of AD and MCI, i.e., to identify the individuals with AD/MCI from the healthy controls (HCs). However, most existing methods focus on using only a single property of a connectivity network, although multiple network properties, such as local connectivity and global topological properties, can potentially be used. In this paper, by employing multikernel based approach, we propose a novel connectivity based framework to integrate multiple properties of connectivity network for improving the classification performance. Specifically, two different types of kernels (i.e., vector-based kernel and graph kernel) are used to quantify two different yet complementary properties of the network, i.e., local connectivity and global topological properties. Then, multikernel learning (MKL) technique is adopted to fuse these heterogeneous kernels for neuroimaging classification. We test the performance of our proposed method on two different data sets. First, we test it on the functional connectivity networks of 12 MCI and 25 HC subjects. The results show that our method achieves significant performance improvement over those using only one type of network property. Specifically, our method achieves a classification accuracy of 91.9%, which is 10.8% better than those by single network-property-based methods. Then, we test our method for gender classification on a large set of functional connectivity networks with 133 infants scanned at birth, 1 year, and 2 years, also demonstrating very promising results. PMID:24108708
Kumar, Shiu; Mamun, Kabir; Sharma, Alok
2017-12-01
Classification of electroencephalography (EEG) signals for motor imagery based brain computer interface (MI-BCI) is an exigent task and common spatial pattern (CSP) has been extensively explored for this purpose. In this work, we focused on developing a new framework for classification of EEG signals for MI-BCI. We propose a single band CSP framework for MI-BCI that utilizes the concept of tangent space mapping (TSM) in the manifold of covariance matrices. The proposed method is named CSP-TSM. Spatial filtering is performed on the bandpass filtered MI EEG signal. Riemannian tangent space is utilized for extracting features from the spatial filtered signal. The TSM features are then fused with the CSP variance based features and feature selection is performed using Lasso. Linear discriminant analysis (LDA) is then applied to the selected features and finally classification is done using support vector machine (SVM) classifier. The proposed framework gives improved performance for MI EEG signal classification in comparison with several competing methods. Experiments conducted shows that the proposed framework reduces the overall classification error rate for MI-BCI by 3.16%, 5.10% and 1.70% (for BCI Competition III dataset IVa, BCI Competition IV Dataset I and BCI Competition IV Dataset IIb, respectively) compared to the conventional CSP method under the same experimental settings. The proposed CSP-TSM method produces promising results when compared with several competing methods in this paper. In addition, the computational complexity is less compared to that of TSM method. Our proposed CSP-TSM framework can be potentially used for developing improved MI-BCI systems. Copyright © 2017 Elsevier Ltd. All rights reserved.
Improved Fuzzy K-Nearest Neighbor Using Modified Particle Swarm Optimization
NASA Astrophysics Data System (ADS)
Jamaluddin; Siringoringo, Rimbun
2017-12-01
Fuzzy k-Nearest Neighbor (FkNN) is one of the most powerful classification methods. The presence of fuzzy concepts in this method successfully improves its performance on almost all classification issues. The main drawbackof FKNN is that it is difficult to determine the parameters. These parameters are the number of neighbors (k) and fuzzy strength (m). Both parameters are very sensitive. This makes it difficult to determine the values of ‘m’ and ‘k’, thus making FKNN difficult to control because no theories or guides can deduce how proper ‘m’ and ‘k’ should be. This study uses Modified Particle Swarm Optimization (MPSO) to determine the best value of ‘k’ and ‘m’. MPSO is focused on the Constriction Factor Method. Constriction Factor Method is an improvement of PSO in order to avoid local circumstances optima. The model proposed in this study was tested on the German Credit Dataset. The test of the data/The data test has been standardized by UCI Machine Learning Repository which is widely applied to classification problems. The application of MPSO to the determination of FKNN parameters is expected to increase the value of classification performance. Based on the experiments that have been done indicating that the model offered in this research results in a better classification performance compared to the Fk-NN model only. The model offered in this study has an accuracy rate of 81%, while. With using Fk-NN model, it has the accuracy of 70%. At the end is done comparison of research model superiority with 2 other classification models;such as Naive Bayes and Decision Tree. This research model has a better performance level, where Naive Bayes has accuracy 75%, and the decision tree model has 70%
NASA Astrophysics Data System (ADS)
Jobin, Benoît; Labrecque, Sandra; Grenier, Marcelle; Falardeau, Gilles
2008-01-01
The traditional method of identifying wildlife habitat distribution over large regions consists of pixel-based classification of satellite images into a suite of habitat classes used to select suitable habitat patches. Object-based classification is a new method that can achieve the same objective based on the segmentation of spectral bands of the image creating homogeneous polygons with regard to spatial or spectral characteristics. The segmentation algorithm does not solely rely on the single pixel value, but also on shape, texture, and pixel spatial continuity. The object-based classification is a knowledge base process where an interpretation key is developed using ground control points and objects are assigned to specific classes according to threshold values of determined spectral and/or spatial attributes. We developed a model using the eCognition software to identify suitable habitats for the Grasshopper Sparrow, a rare and declining species found in southwestern Québec. The model was developed in a region with known breeding sites and applied on other images covering adjacent regions where potential breeding habitats may be present. We were successful in locating potential habitats in areas where dairy farming prevailed but failed in an adjacent region covered by a distinct Landsat scene and dominated by annual crops. We discuss the added value of this method, such as the possibility to use the contextual information associated to objects and the ability to eliminate unsuitable areas in the segmentation and land cover classification processes, as well as technical and logistical constraints. A series of recommendations on the use of this method and on conservation issues of Grasshopper Sparrow habitat is also provided.
Common component classification: what can we learn from machine learning?
Anderson, Ariana; Labus, Jennifer S; Vianna, Eduardo P; Mayer, Emeran A; Cohen, Mark S
2011-05-15
Machine learning methods have been applied to classifying fMRI scans by studying locations in the brain that exhibit temporal intensity variation between groups, frequently reporting classification accuracy of 90% or better. Although empirical results are quite favorable, one might doubt the ability of classification methods to withstand changes in task ordering and the reproducibility of activation patterns over runs, and question how much of the classification machines' power is due to artifactual noise versus genuine neurological signal. To examine the true strength and power of machine learning classifiers we create and then deconstruct a classifier to examine its sensitivity to physiological noise, task reordering, and across-scan classification ability. The models are trained and tested both within and across runs to assess stability and reproducibility across conditions. We demonstrate the use of independent components analysis for both feature extraction and artifact removal and show that removal of such artifacts can reduce predictive accuracy even when data has been cleaned in the preprocessing stages. We demonstrate how mistakes in the feature selection process can cause the cross-validation error seen in publication to be a biased estimate of the testing error seen in practice and measure this bias by purposefully making flawed models. We discuss other ways to introduce bias and the statistical assumptions lying behind the data and model themselves. Finally we discuss the complications in drawing inference from the smaller sample sizes typically seen in fMRI studies, the effects of small or unbalanced samples on the Type 1 and Type 2 error rates, and how publication bias can give a false confidence of the power of such methods. Collectively this work identifies challenges specific to fMRI classification and methods affecting the stability of models. Copyright © 2010 Elsevier Inc. All rights reserved.
Domínguez, Rocio Berenice; Moreno-Barón, Laura; Muñoz, Roberto; Gutiérrez, Juan Manuel
2014-01-01
This paper describes a new method based on a voltammetric electronic tongue (ET) for the recognition of distinctive features in coffee samples. An ET was directly applied to different samples from the main Mexican coffee regions without any pretreatment before the analysis. The resulting electrochemical information was modeled with two different mathematical tools, namely Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM). Growing conditions (i.e., organic or non-organic practices and altitude of crops) were considered for a first classification. LDA results showed an average discrimination rate of 88% ± 6.53% while SVM successfully accomplished an overall accuracy of 96.4% ± 3.50% for the same task. A second classification based on geographical origin of samples was carried out. Results showed an overall accuracy of 87.5% ± 7.79% for LDA and a superior performance of 97.5% ± 3.22% for SVM. Given the complexity of coffee samples, the high accuracy percentages achieved by ET coupled with SVM in both classification problems suggested a potential applicability of ET in the assessment of selected coffee features with a simpler and faster methodology along with a null sample pretreatment. In addition, the proposed method can be applied to authentication assessment while improving cost, time and accuracy of the general procedure. PMID:25254303
Domínguez, Rocio Berenice; Moreno-Barón, Laura; Muñoz, Roberto; Gutiérrez, Juan Manuel
2014-09-24
This paper describes a new method based on a voltammetric electronic tongue (ET) for the recognition of distinctive features in coffee samples. An ET was directly applied to different samples from the main Mexican coffee regions without any pretreatment before the analysis. The resulting electrochemical information was modeled with two different mathematical tools, namely Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM). Growing conditions (i.e., organic or non-organic practices and altitude of crops) were considered for a first classification. LDA results showed an average discrimination rate of 88% ± 6.53% while SVM successfully accomplished an overall accuracy of 96.4% ± 3.50% for the same task. A second classification based on geographical origin of samples was carried out. Results showed an overall accuracy of 87.5% ± 7.79% for LDA and a superior performance of 97.5% ± 3.22% for SVM. Given the complexity of coffee samples, the high accuracy percentages achieved by ET coupled with SVM in both classification problems suggested a potential applicability of ET in the assessment of selected coffee features with a simpler and faster methodology along with a null sample pretreatment. In addition, the proposed method can be applied to authentication assessment while improving cost, time and accuracy of the general procedure.
Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan
2016-01-01
A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network's initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data.
NASA Astrophysics Data System (ADS)
Gautam, Nitin
The main objectives of this thesis are to develop a robust statistical method for the classification of ocean precipitation based on physical properties to which the SSM/I is sensitive and to examine how these properties vary globally and seasonally. A two step approach is adopted for the classification of oceanic precipitation classes from multispectral SSM/I data: (1)we subjectively define precipitation classes using a priori information about the precipitating system and its possible distinct signature on SSM/I data such as scattering by ice particles aloft in the precipitating cloud, emission by liquid rain water below freezing level, the difference of polarization at 19 GHz-an indirect measure of optical depth, etc.; (2)we then develop an objective classification scheme which is found to reproduce the subjective classification with high accuracy. This hybrid strategy allows us to use the characteristics of the data to define and encode classes and helps retain the physical interpretation of classes. The classification methods based on k-nearest neighbor and neural network are developed to objectively classify six precipitation classes. It is found that the classification method based neural network yields high accuracy for all precipitation classes. An inversion method based on minimum variance approach was used to retrieve gross microphysical properties of these precipitation classes such as column integrated liquid water path, column integrated ice water path, and column integrated min water path. This classification method is then applied to 2 years (1991-92) of SSM/I data to examine and document the seasonal and global distribution of precipitation frequency corresponding to each of these objectively defined six classes. The characteristics of the distribution are found to be consistent with assumptions used in defining these six precipitation classes and also with well known climatological patterns of precipitation regions. The seasonal and global distribution of these six classes is also compared with the earlier results obtained from Comprehensive Ocean Atmosphere Data Sets (COADS). It is found that the gross pattern of the distributions obtained from SSM/I and COADS data match remarkably well with each other.
Proceedings of the Third Annual Symposium on Mathematical Pattern Recognition and Image Analysis
NASA Technical Reports Server (NTRS)
Guseman, L. F., Jr.
1985-01-01
Topics addressed include: multivariate spline method; normal mixture analysis applied to remote sensing; image data analysis; classifications in spatially correlated environments; probability density functions; graphical nonparametric methods; subpixel registration analysis; hypothesis integration in image understanding systems; rectification of satellite scanner imagery; spatial variation in remotely sensed images; smooth multidimensional interpolation; and optimal frequency domain textural edge detection filters.
Scale invariant texture descriptors for classifying celiac disease
Hegenbart, Sebastian; Uhl, Andreas; Vécsei, Andreas; Wimmer, Georg
2013-01-01
Scale invariant texture recognition methods are applied for the computer assisted diagnosis of celiac disease. In particular, emphasis is given to techniques enhancing the scale invariance of multi-scale and multi-orientation wavelet transforms and methods based on fractal analysis. After fine-tuning to specific properties of our celiac disease imagery database, which consists of endoscopic images of the duodenum, some scale invariant (and often even viewpoint invariant) methods provide classification results improving the current state of the art. However, not each of the investigated scale invariant methods is applicable successfully to our dataset. Therefore, the scale invariance of the employed approaches is explicitly assessed and it is found that many of the analyzed methods are not as scale invariant as they theoretically should be. Results imply that scale invariance is not a key-feature required for successful classification of our celiac disease dataset. PMID:23481171
DeepPap: Deep Convolutional Networks for Cervical Cell Classification.
Zhang, Ling; Le Lu; Nogues, Isabella; Summers, Ronald M; Liu, Shaoxiong; Yao, Jianhua
2017-11-01
Automation-assisted cervical screening via Pap smear or liquid-based cytology (LBC) is a highly effective cell imaging based cancer detection tool, where cells are partitioned into "abnormal" and "normal" categories. However, the success of most traditional classification methods relies on the presence of accurate cell segmentations. Despite sixty years of research in this field, accurate segmentation remains a challenge in the presence of cell clusters and pathologies. Moreover, previous classification methods are only built upon the extraction of hand-crafted features, such as morphology and texture. This paper addresses these limitations by proposing a method to directly classify cervical cells-without prior segmentation-based on deep features, using convolutional neural networks (ConvNets). First, the ConvNet is pretrained on a natural image dataset. It is subsequently fine-tuned on a cervical cell dataset consisting of adaptively resampled image patches coarsely centered on the nuclei. In the testing phase, aggregation is used to average the prediction scores of a similar set of image patches. The proposed method is evaluated on both Pap smear and LBC datasets. Results show that our method outperforms previous algorithms in classification accuracy (98.3%), area under the curve (0.99) values, and especially specificity (98.3%), when applied to the Herlev benchmark Pap smear dataset and evaluated using five-fold cross validation. Similar superior performances are also achieved on the HEMLBC (H&E stained manual LBC) dataset. Our method is promising for the development of automation-assisted reading systems in primary cervical screening.
46 CFR 160.062-4 - Inspections and tests.
Code of Federal Regulations, 2010 CFR
2010-10-01
... approved hydraulic releases are manufactured or reconditioned to observe production methods and to conduct... place of manufacture by the marine inspector. (b) Classification of tests. The sampling, inspections... shall be tested by applying buoyant loads of its designed capacity to its spring-tensioned gripe as...
Bhaganagarapu, Kaushik; Jackson, Graeme D; Abbott, David F
2013-01-01
An enduring issue with data-driven analysis and filtering methods is the interpretation of results. To assist, we present an automatic method for identification of artifact in independent components (ICs) derived from functional MRI (fMRI). The method was designed with the following features: does not require temporal information about an fMRI paradigm; does not require the user to train the algorithm; requires only the fMRI images (additional acquisition of anatomical imaging not required); is able to identify a high proportion of artifact-related ICs without removing components that are likely to be of neuronal origin; can be applied to resting-state fMRI; is automated, requiring minimal or no human intervention. We applied the method to a MELODIC probabilistic ICA of resting-state functional connectivity data acquired in 50 healthy control subjects, and compared the results to a blinded expert manual classification. The method identified between 26 and 72% of the components as artifact (mean 55%). About 0.3% of components identified as artifact were discordant with the manual classification; retrospective examination of these ICs suggested the automated method had correctly identified these as artifact. We have developed an effective automated method which removes a substantial number of unwanted noisy components in ICA analyses of resting-state fMRI data. Source code of our implementation of the method is available.
An Automated Method for Identifying Artifact in Independent Component Analysis of Resting-State fMRI
Bhaganagarapu, Kaushik; Jackson, Graeme D.; Abbott, David F.
2013-01-01
An enduring issue with data-driven analysis and filtering methods is the interpretation of results. To assist, we present an automatic method for identification of artifact in independent components (ICs) derived from functional MRI (fMRI). The method was designed with the following features: does not require temporal information about an fMRI paradigm; does not require the user to train the algorithm; requires only the fMRI images (additional acquisition of anatomical imaging not required); is able to identify a high proportion of artifact-related ICs without removing components that are likely to be of neuronal origin; can be applied to resting-state fMRI; is automated, requiring minimal or no human intervention. We applied the method to a MELODIC probabilistic ICA of resting-state functional connectivity data acquired in 50 healthy control subjects, and compared the results to a blinded expert manual classification. The method identified between 26 and 72% of the components as artifact (mean 55%). About 0.3% of components identified as artifact were discordant with the manual classification; retrospective examination of these ICs suggested the automated method had correctly identified these as artifact. We have developed an effective automated method which removes a substantial number of unwanted noisy components in ICA analyses of resting-state fMRI data. Source code of our implementation of the method is available. PMID:23847511
NASA Astrophysics Data System (ADS)
Shvelidze, T. D.; Malyuto, V. D.
Quantitative spectral classification of F, G and K stars with the 70-cm telescope of the Ambastumani Astrophysical Observatory in areas of the main meridional section of the Galaxy, and for which proper motion data are available, has been performed. Fundamental parameters have been obtained for 333 stars in four areas. Space densities of stars of different spectral types, the stellar luminosity function and the relationships between the kinematics and metallicity of stars have been studied. The results have confirmed and completed the conclusions made on the basis of some previous spectroscopic and photometric surveys. Many plates have been obtained for other important directions in the sky: the Kapteyn areas, the Galactic anticentre and the main meridional section of the Galaxy. The data can be treated with the same quantitative method applied here. This method may also be applied to other available and future spectroscopic data of similar resolution, notably that obtained with large format CCD detectors on Schmidt-type telescopes.
A study and evaluation of image analysis techniques applied to remotely sensed data
NASA Technical Reports Server (NTRS)
Atkinson, R. J.; Dasarathy, B. V.; Lybanon, M.; Ramapriyan, H. K.
1976-01-01
An analysis of phenomena causing nonlinearities in the transformation from Landsat multispectral scanner coordinates to ground coordinates is presented. Experimental results comparing rms errors at ground control points indicated a slight improvement when a nonlinear (8-parameter) transformation was used instead of an affine (6-parameter) transformation. Using a preliminary ground truth map of a test site in Alabama covering the Mobile Bay area and six Landsat images of the same scene, several classification methods were assessed. A methodology was developed for automatic change detection using classification/cluster maps. A coding scheme was employed for generation of change depiction maps indicating specific types of changes. Inter- and intraseasonal data of the Mobile Bay test area were compared to illustrate the method. A beginning was made in the study of data compression by applying a Karhunen-Loeve transform technique to a small section of the test data set. The second part of the report provides a formal documentation of the several programs developed for the analysis and assessments presented.
Brain tissue segmentation in 4D CT using voxel classification
NASA Astrophysics Data System (ADS)
van den Boom, R.; Oei, M. T. H.; Lafebre, S.; Oostveen, L. J.; Meijer, F. J. A.; Steens, S. C. A.; Prokop, M.; van Ginneken, B.; Manniesing, R.
2012-02-01
A method is proposed to segment anatomical regions of the brain from 4D computer tomography (CT) patient data. The method consists of a three step voxel classification scheme, each step focusing on structures that are increasingly difficult to segment. The first step classifies air and bone, the second step classifies vessels and the third step classifies white matter, gray matter and cerebrospinal fluid. As features the time averaged intensity value and the temporal intensity change value were used. In each step, a k-Nearest-Neighbor classifier was used to classify the voxels. Training data was obtained by placing regions of interest in reconstructed 3D image data. The method has been applied to ten 4D CT cerebral patient data. A leave-one-out experiment showed consistent and accurate segmentation results.
Hayes, Timothy; Usami, Satoshi; Jacobucci, Ross; McArdle, John J
2015-12-01
In this article, we describe a recent development in the analysis of attrition: using classification and regression trees (CART) and random forest methods to generate inverse sampling weights. These flexible machine learning techniques have the potential to capture complex nonlinear, interactive selection models, yet to our knowledge, their performance in the missing data analysis context has never been evaluated. To assess the potential benefits of these methods, we compare their performance with commonly employed multiple imputation and complete case techniques in 2 simulations. These initial results suggest that weights computed from pruned CART analyses performed well in terms of both bias and efficiency when compared with other methods. We discuss the implications of these findings for applied researchers. (c) 2015 APA, all rights reserved).
Hayes, Timothy; Usami, Satoshi; Jacobucci, Ross; McArdle, John J.
2016-01-01
In this article, we describe a recent development in the analysis of attrition: using classification and regression trees (CART) and random forest methods to generate inverse sampling weights. These flexible machine learning techniques have the potential to capture complex nonlinear, interactive selection models, yet to our knowledge, their performance in the missing data analysis context has never been evaluated. To assess the potential benefits of these methods, we compare their performance with commonly employed multiple imputation and complete case techniques in 2 simulations. These initial results suggest that weights computed from pruned CART analyses performed well in terms of both bias and efficiency when compared with other methods. We discuss the implications of these findings for applied researchers. PMID:26389526
Long, Zhuqing; Jing, Bin; Yan, Huagang; Dong, Jianxin; Liu, Han; Mo, Xiao; Han, Ying; Li, Haiyun
2016-09-07
Mild cognitive impairment (MCI) represents a transitional state between normal aging and Alzheimer's disease (AD). Non-invasive diagnostic methods are desirable to identify MCI for early therapeutic interventions. In this study, we proposed a support vector machine (SVM)-based method to discriminate between MCI patients and normal controls (NCs) using multi-level characteristics of magnetic resonance imaging (MRI). This method adopted a radial basis function (RBF) as the kernel function, and a grid search method to optimize the two parameters of SVM. The calculated characteristics, i.e., the Hurst exponent (HE), amplitude of low-frequency fluctuations (ALFF), regional homogeneity (ReHo) and gray matter density (GMD), were adopted as the classification features. A leave-one-out cross-validation (LOOCV) was used to evaluate the classification performance of the method. Applying the proposed method to the experimental data from 29 MCI patients and 33 healthy subjects, we achieved a classification accuracy of up to 96.77%, with a sensitivity of 93.10% and a specificity of 100%, and the area under the curve (AUC) yielded up to 0.97. Furthermore, the most discriminative features for classification were found to predominantly involve default-mode regions, such as hippocampus (HIP), parahippocampal gyrus (PHG), posterior cingulate gyrus (PCG) and middle frontal gyrus (MFG), and subcortical regions such as lentiform nucleus (LN) and amygdala (AMYG). Therefore, our method is promising in distinguishing MCI patients from NCs and may be useful for the diagnosis of MCI. Copyright © 2016 IBRO. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Liu, Haijian; Wu, Changshan
2018-06-01
Crown-level tree species classification is a challenging task due to the spectral similarity among different tree species. Shadow, underlying objects, and other materials within a crown may decrease the purity of extracted crown spectra and further reduce classification accuracy. To address this problem, an innovative pixel-weighting approach was developed for tree species classification at the crown level. The method utilized high density discrete LiDAR data for individual tree delineation and Airborne Imaging Spectrometer for Applications (AISA) hyperspectral imagery for pure crown-scale spectra extraction. Specifically, three steps were included: 1) individual tree identification using LiDAR data, 2) pixel-weighted representative crown spectra calculation using hyperspectral imagery, with which pixel-based illuminated-leaf fractions estimated using a linear spectral mixture analysis (LSMA) were employed as weighted factors, and 3) representative spectra based tree species classification was performed through applying a support vector machine (SVM) approach. Analysis of results suggests that the developed pixel-weighting approach (OA = 82.12%, Kc = 0.74) performed better than treetop-based (OA = 70.86%, Kc = 0.58) and pixel-majority methods (OA = 72.26, Kc = 0.62) in terms of classification accuracy. McNemar tests indicated the differences in accuracy between pixel-weighting and treetop-based approaches as well as that between pixel-weighting and pixel-majority approaches were statistically significant.
Vesicular stomatitis forecasting based on Google Trends
Lu, Yi; Zhou, GuangYa; Chen, Qin
2018-01-01
Background Vesicular stomatitis (VS) is an important viral disease of livestock. The main feature of VS is irregular blisters that occur on the lips, tongue, oral mucosa, hoof crown and nipple. Humans can also be infected with vesicular stomatitis and develop meningitis. This study analyses 2014 American VS outbreaks in order to accurately predict vesicular stomatitis outbreak trends. Methods American VS outbreaks data were collected from OIE. The data for VS keywords were obtained by inputting 24 disease-related keywords into Google Trends. After calculating the Pearson and Spearman correlation coefficients, it was found that there was a relationship between outbreaks and keywords derived from Google Trends. Finally, the predicted model was constructed based on qualitative classification and quantitative regression. Results For the regression model, the Pearson correlation coefficients between the predicted outbreaks and actual outbreaks are 0.953 and 0.948, respectively. For the qualitative classification model, we constructed five classification predictive models and chose the best classification predictive model as the result. The results showed, SN (sensitivity), SP (specificity) and ACC (prediction accuracy) values of the best classification predictive model are 78.52%,72.5% and 77.14%, respectively. Conclusion This study applied Google search data to construct a qualitative classification model and a quantitative regression model. The results show that the method is effective and that these two models obtain more accurate forecast. PMID:29385198
NASA Astrophysics Data System (ADS)
Hale Topaloğlu, Raziye; Sertel, Elif; Musaoğlu, Nebiye
2016-06-01
This study aims to compare classification accuracies of land cover/use maps created from Sentinel-2 and Landsat-8 data. Istanbul metropolitan city of Turkey, with a population of around 14 million, having different landscape characteristics was selected as study area. Water, forest, agricultural areas, grasslands, transport network, urban, airport- industrial units and barren land- mine land cover/use classes adapted from CORINE nomenclature were used as main land cover/use classes to identify. To fulfil the aims of this research, recently acquired dated 08/02/2016 Sentinel-2 and dated 22/02/2016 Landsat-8 images of Istanbul were obtained and image pre-processing steps like atmospheric and geometric correction were employed. Both Sentinel-2 and Landsat-8 images were resampled to 30m pixel size after geometric correction and similar spectral bands for both satellites were selected to create a similar base for these multi-sensor data. Maximum Likelihood (MLC) and Support Vector Machine (SVM) supervised classification methods were applied to both data sets to accurately identify eight different land cover/ use classes. Error matrix was created using same reference points for Sentinel-2 and Landsat-8 classifications. After the classification accuracy, results were compared to find out the best approach to create current land cover/use map of the region. The results of MLC and SVM classification methods were compared for both images.
NASA Astrophysics Data System (ADS)
Krappe, Sebastian; Wittenberg, Thomas; Haferlach, Torsten; Münzenmayer, Christian
2016-03-01
The morphological differentiation of bone marrow is fundamental for the diagnosis of leukemia. Currently, the counting and classification of the different types of bone marrow cells is done manually under the use of bright field microscopy. This is a time-consuming, subjective, tedious and error-prone process. Furthermore, repeated examinations of a slide may yield intra- and inter-observer variances. For that reason a computer assisted diagnosis system for bone marrow differentiation is pursued. In this work we focus (a) on a new method for the separation of nucleus and plasma parts and (b) on a knowledge-based hierarchical tree classifier for the differentiation of bone marrow cells in 16 different classes. Classification trees are easily interpretable and understandable and provide a classification together with an explanation. Using classification trees, expert knowledge (i.e. knowledge about similar classes and cell lines in the tree model of hematopoiesis) is integrated in the structure of the tree. The proposed segmentation method is evaluated with more than 10,000 manually segmented cells. For the evaluation of the proposed hierarchical classifier more than 140,000 automatically segmented bone marrow cells are used. Future automated solutions for the morphological analysis of bone marrow smears could potentially apply such an approach for the pre-classification of bone marrow cells and thereby shortening the examination time.
Change classification in SAR time series: a functional approach
NASA Astrophysics Data System (ADS)
Boldt, Markus; Thiele, Antje; Schulz, Karsten; Hinz, Stefan
2017-10-01
Change detection represents a broad field of research in SAR remote sensing, consisting of many different approaches. Besides the simple recognition of change areas, the analysis of type, category or class of the change areas is at least as important for creating a comprehensive result. Conventional strategies for change classification are based on supervised or unsupervised landuse / landcover classifications. The main drawback of such approaches is that the quality of the classification result directly depends on the selection of training and reference data. Additionally, supervised processing methods require an experienced operator who capably selects the training samples. This training step is not necessary when using unsupervised strategies, but nevertheless meaningful reference data must be available for identifying the resulting classes. Consequently, an experienced operator is indispensable. In this study, an innovative concept for the classification of changes in SAR time series data is proposed. Regarding the drawbacks of traditional strategies given above, it copes without using any training data. Moreover, the method can be applied by an operator, who does not have detailed knowledge about the available scenery yet. This knowledge is provided by the algorithm. The final step of the procedure, which main aspect is given by the iterative optimization of an initial class scheme with respect to the categorized change objects, is represented by the classification of these objects to the finally resulting classes. This assignment step is subject of this paper.
NASA Astrophysics Data System (ADS)
Navratil, Peter; Wilps, Hans
2013-01-01
Three different object-based image classification techniques are applied to high-resolution satellite data for the mapping of the habitats of Asian migratory locust (Locusta migratoria migratoria) in the southern Aral Sea basin, Uzbekistan. A set of panchromatic and multispectral Système Pour l'Observation de la Terre-5 satellite images was spectrally enhanced by normalized difference vegetation index and tasseled cap transformation and segmented into image objects, which were then classified by three different classification approaches: a rule-based hierarchical fuzzy threshold (HFT) classification method was compared to a supervised nearest neighbor classifier and classification tree analysis by the quick, unbiased, efficient statistical trees algorithm. Special emphasis was laid on the discrimination of locust feeding and breeding habitats due to the significance of this discrimination for practical locust control. Field data on vegetation and land cover, collected at the time of satellite image acquisition, was used to evaluate classification accuracy. The results show that a robust HFT classifier outperformed the two automated procedures by 13% overall accuracy. The classification method allowed a reliable discrimination of locust feeding and breeding habitats, which is of significant importance for the application of the resulting data for an economically and environmentally sound control of locust pests because exact spatial knowledge on the habitat types allows a more effective surveying and use of pesticides.
22 CFR 9.4 - Original classification.
Code of Federal Regulations, 2014 CFR
2014-04-01
... 22 Foreign Relations 1 2014-04-01 2014-04-01 false Original classification. 9.4 Section 9.4... classification. (a) Definition. Original classification is the initial determination that certain information... classification. (b) Classification levels. (1) Top Secret shall be applied to information the unauthorized...
22 CFR 9.4 - Original classification.
Code of Federal Regulations, 2013 CFR
2013-04-01
... 22 Foreign Relations 1 2013-04-01 2013-04-01 false Original classification. 9.4 Section 9.4... classification. (a) Definition. Original classification is the initial determination that certain information... classification. (b) Classification levels. (1) Top Secret shall be applied to information the unauthorized...
22 CFR 9.4 - Original classification.
Code of Federal Regulations, 2012 CFR
2012-04-01
... 22 Foreign Relations 1 2012-04-01 2012-04-01 false Original classification. 9.4 Section 9.4... classification. (a) Definition. Original classification is the initial determination that certain information... classification. (b) Classification levels. (1) Top Secret shall be applied to information the unauthorized...
22 CFR 9.4 - Original classification.
Code of Federal Regulations, 2011 CFR
2011-04-01
... 22 Foreign Relations 1 2011-04-01 2011-04-01 false Original classification. 9.4 Section 9.4... classification. (a) Definition. Original classification is the initial determination that certain information... classification. (b) Classification levels. (1) Top Secret shall be applied to information the unauthorized...
Gastric precancerous diseases classification using CNN with a concise model.
Zhang, Xu; Hu, Weiling; Chen, Fei; Liu, Jiquan; Yang, Yuanhang; Wang, Liangjing; Duan, Huilong; Si, Jianmin
2017-01-01
Gastric precancerous diseases (GPD) may deteriorate into early gastric cancer if misdiagnosed, so it is important to help doctors recognize GPD accurately and quickly. In this paper, we realize the classification of 3-class GPD, namely, polyp, erosion, and ulcer using convolutional neural networks (CNN) with a concise model called the Gastric Precancerous Disease Network (GPDNet). GPDNet introduces fire modules from SqueezeNet to reduce the model size and parameters about 10 times while improving speed for quick classification. To maintain classification accuracy with fewer parameters, we propose an innovative method called iterative reinforced learning (IRL). After training GPDNet from scratch, we apply IRL to fine-tune the parameters whose values are close to 0, and then we take the modified model as a pretrained model for the next training. The result shows that IRL can improve the accuracy about 9% after 6 iterations. The final classification accuracy of our GPDNet was 88.90%, which is promising for clinical GPD recognition.
Training echo state networks for rotation-invariant bone marrow cell classification.
Kainz, Philipp; Burgsteiner, Harald; Asslaber, Martin; Ahammer, Helmut
2017-01-01
The main principle of diagnostic pathology is the reliable interpretation of individual cells in context of the tissue architecture. Especially a confident examination of bone marrow specimen is dependent on a valid classification of myeloid cells. In this work, we propose a novel rotation-invariant learning scheme for multi-class echo state networks (ESNs), which achieves very high performance in automated bone marrow cell classification. Based on representing static images as temporal sequence of rotations, we show how ESNs robustly recognize cells of arbitrary rotations by taking advantage of their short-term memory capacity. The performance of our approach is compared to a classification random forest that learns rotation-invariance in a conventional way by exhaustively training on multiple rotations of individual samples. The methods were evaluated on a human bone marrow image database consisting of granulopoietic and erythropoietic cells in different maturation stages. Our ESN approach to cell classification does not rely on segmentation of cells or manual feature extraction and can therefore directly be applied to image data.
Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.
Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi
2013-01-01
The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces fuzzy support vector machine which is a learning algorithm based on combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than the conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from fuzzy support vector machine helps extracting biological knowledge from microarray data. Fuzzy support vector machine as a new classification model with high generalization power, robustness, and good interpretability seems to be a promising tool for gene expression microarray classification.
A fingerprint classification algorithm based on combination of local and global information
NASA Astrophysics Data System (ADS)
Liu, Chongjin; Fu, Xiang; Bian, Junjie; Feng, Jufu
2011-12-01
Fingerprint recognition is one of the most important technologies in biometric identification and has been wildly applied in commercial and forensic areas. Fingerprint classification, as the fundamental procedure in fingerprint recognition, can sharply decrease the quantity for fingerprint matching and improve the efficiency of fingerprint recognition. Most fingerprint classification algorithms are based on the number and position of singular points. Because the singular points detecting method only considers the local information commonly, the classification algorithms are sensitive to noise. In this paper, we propose a novel fingerprint classification algorithm combining the local and global information of fingerprint. Firstly we use local information to detect singular points and measure their quality considering orientation structure and image texture in adjacent areas. Furthermore the global orientation model is adopted to measure the reliability of singular points group. Finally the local quality and global reliability is weighted to classify fingerprint. Experiments demonstrate the accuracy and effectivity of our algorithm especially for the poor quality fingerprint images.
Computational approaches for the classification of seed storage proteins.
Radhika, V; Rao, V Sree Hari
2015-07-01
Seed storage proteins comprise a major part of the protein content of the seed and have an important role on the quality of the seed. These storage proteins are important because they determine the total protein content and have an effect on the nutritional quality and functional properties for food processing. Transgenic plants are being used to develop improved lines for incorporation into plant breeding programs and the nutrient composition of seeds is a major target of molecular breeding programs. Hence, classification of these proteins is crucial for the development of superior varieties with improved nutritional quality. In this study we have applied machine learning algorithms for classification of seed storage proteins. We have presented an algorithm based on nearest neighbor approach for classification of seed storage proteins and compared its performance with decision tree J48, multilayer perceptron neural (MLP) network and support vector machine (SVM) libSVM. The model based on our algorithm has been able to give higher classification accuracy in comparison to the other methods.
NASA Astrophysics Data System (ADS)
Felgaer, Pablo; Britos, Paola; García-Martínez, Ramón
A Bayesian network is a directed acyclic graph in which each node represents a variable and each arc a probabilistic dependency; they are used to provide: a compact form to represent the knowledge and flexible methods of reasoning. Obtaining it from data is a learning process that is divided in two steps: structural learning and parametric learning. In this paper we define an automatic learning method that optimizes the Bayesian networks applied to classification, using a hybrid method of learning that combines the advantages of the induction techniques of the decision trees (TDIDT-C4.5) with those of the Bayesian networks. The resulting method is applied to prediction in health domain.
NASA Astrophysics Data System (ADS)
Bigdeli, Behnaz; Pahlavani, Parham
2017-01-01
Interpretation of synthetic aperture radar (SAR) data processing is difficult because the geometry and spectral range of SAR are different from optical imagery. Consequently, SAR imaging can be a complementary data to multispectral (MS) optical remote sensing techniques because it does not depend on solar illumination and weather conditions. This study presents a multisensor fusion of SAR and MS data based on the use of classification and regression tree (CART) and support vector machine (SVM) through a decision fusion system. First, different feature extraction strategies were applied on SAR and MS data to produce more spectral and textural information. To overcome the redundancy and correlation between features, an intrinsic dimension estimation method based on noise-whitened Harsanyi, Farrand, and Chang determines the proper dimension of the features. Then, principal component analysis and independent component analysis were utilized on stacked feature space of two data. Afterward, SVM and CART classified each reduced feature space. Finally, a fusion strategy was utilized to fuse the classification results. To show the effectiveness of the proposed methodology, single classification on each data was compared to the obtained results. A coregistered Radarsat-2 and WorldView-2 data set from San Francisco, USA, was available to examine the effectiveness of the proposed method. The results show that combinations of SAR data with optical sensor based on the proposed methodology improve the classification results for most of the classes. The proposed fusion method provided approximately 93.24% and 95.44% for two different areas of the data.
Automated artery-venous classification of retinal blood vessels based on structural mapping method
NASA Astrophysics Data System (ADS)
Joshi, Vinayak S.; Garvin, Mona K.; Reinhardt, Joseph M.; Abramoff, Michael D.
2012-03-01
Retinal blood vessels show morphologic modifications in response to various retinopathies. However, the specific responses exhibited by arteries and veins may provide a precise diagnostic information, i.e., a diabetic retinopathy may be detected more accurately with the venous dilatation instead of average vessel dilatation. In order to analyze the vessel type specific morphologic modifications, the classification of a vessel network into arteries and veins is required. We previously described a method for identification and separation of retinal vessel trees; i.e. structural mapping. Therefore, we propose the artery-venous classification based on structural mapping and identification of color properties prominent to the vessel types. The mean and standard deviation of each of green channel intensity and hue channel intensity are analyzed in a region of interest around each centerline pixel of a vessel. Using the vector of color properties extracted from each centerline pixel, it is classified into one of the two clusters (artery and vein), obtained by the fuzzy-C-means clustering. According to the proportion of clustered centerline pixels in a particular vessel, and utilizing the artery-venous crossing property of retinal vessels, each vessel is assigned a label of an artery or a vein. The classification results are compared with the manually annotated ground truth (gold standard). We applied the proposed method to a dataset of 15 retinal color fundus images resulting in an accuracy of 88.28% correctly classified vessel pixels. The automated classification results match well with the gold standard suggesting its potential in artery-venous classification and the respective morphology analysis.
NASA Astrophysics Data System (ADS)
Porto, C. D. N.; Costa Filho, C. F. F.; Macedo, M. M. G.; Gutierrez, M. A.; Costa, M. G. F.
2017-03-01
Studies in intravascular optical coherence tomography (IV-OCT) have demonstrated the importance of coronary bifurcation regions in intravascular medical imaging analysis, as plaques are more likely to accumulate in this region leading to coronary disease. A typical IV-OCT pullback acquires hundreds of frames, thus developing an automated tool to classify the OCT frames as bifurcation or non-bifurcation can be an important step to speed up OCT pullbacks analysis and assist automated methods for atherosclerotic plaque quantification. In this work, we evaluate the performance of two state-of-the-art classifiers, SVM and Neural Networks in the bifurcation classification task. The study included IV-OCT frames from 9 patients. In order to improve classification performance, we trained and tested the SVM with different parameters by means of a grid search and different stop criteria were applied to the Neural Network classifier: mean square error, early stop and regularization. Different sets of features were tested, using feature selection techniques: PCA, LDA and scalar feature selection with correlation. Training and test were performed in sets with a maximum of 1460 OCT frames. We quantified our results in terms of false positive rate, true positive rate, accuracy, specificity, precision, false alarm, f-measure and area under ROC curve. Neural networks obtained the best classification accuracy, 98.83%, overcoming the results found in literature. Our methods appear to offer a robust and reliable automated classification of OCT frames that might assist physicians indicating potential frames to analyze. Methods for improving neural networks generalization have increased the classification performance.
Lati, Ran N; Filin, Sagi; Aly, Radi; Lande, Tal; Levin, Ilan; Eizenberg, Hanan
2014-07-01
Weed/crop classification is considered the main problem in developing precise weed-management methodologies, because both crops and weeds share similar hues. Great effort has been invested in the development of classification models, most based on expensive sensors and complicated algorithms. However, satisfactory results are not consistently obtained due to imaging conditions in the field. We report on an innovative approach that combines advances in genetic engineering and robust image-processing methods to detect weeds and distinguish them from crop plants by manipulating the crop's leaf color. We demonstrate this on genetically modified tomato (germplasm AN-113) which expresses a purple leaf color. An autonomous weed/crop classification is performed using an invariant-hue transformation that is applied to images acquired by a standard consumer camera (visible wavelength) and handles variations in illumination intensities. The integration of these methodologies is simple and effective, and classification results were accurate and stable under a wide range of imaging conditions. Using this approach, we simplify the most complicated stage in image-based weed/crop classification models. © 2013 Society of Chemical Industry.
Liu, Wenbo; Li, Ming; Yi, Li
2016-08-01
The atypical face scanning patterns in individuals with Autism Spectrum Disorder (ASD) has been repeatedly discovered by previous research. The present study examined whether their face scanning patterns could be potentially useful to identify children with ASD by adopting the machine learning algorithm for the classification purpose. Particularly, we applied the machine learning method to analyze an eye movement dataset from a face recognition task [Yi et al., 2016], to classify children with and without ASD. We evaluated the performance of our model in terms of its accuracy, sensitivity, and specificity of classifying ASD. Results indicated promising evidence for applying the machine learning algorithm based on the face scanning patterns to identify children with ASD, with a maximum classification accuracy of 88.51%. Nevertheless, our study is still preliminary with some constraints that may apply in the clinical practice. Future research should shed light on further valuation of our method and contribute to the development of a multitask and multimodel approach to aid the process of early detection and diagnosis of ASD. Autism Res 2016, 9: 888-898. © 2016 International Society for Autism Research, Wiley Periodicals, Inc. © 2016 International Society for Autism Research, Wiley Periodicals, Inc.
2011-01-01
Background In view of the long term discussion on the appropriateness of the dengue classification into dengue fever (DF), dengue haemorrhagic fever (DHF) and dengue shock syndrome (DSS), the World Health Organization (WHO) has outlined in its new global dengue guidelines a revised classification into levels of severity: dengue fever with an intermediary group of "dengue fever with warning sings", and severe dengue. The objective of this paper was to compare the two classification systems regarding applicability in clinical practice and surveillance, as well as user-friendliness and acceptance by health staff. Methods A mix of quantitative (prospective and retrospective review of medical charts by expert reviewers, formal staff interviews), semi-quantitative (open questions in staff interviews) and qualitative methods (focus group discussions) were used in 18 countries. Quality control of data collected was undertaken by external monitors. Results The applicability of the DF/DHF/DSS classification was limited, even when strict DHF criteria were not applied (13.7% of dengue cases could not be classified using the DF/DHF/DSS classification by experienced reviewers, compared to only 1.6% with the revised classification). The fact that some severe dengue cases could not be classified in the DF/DHF/DSS system was of particular concern. Both acceptance and perceived user-friendliness of the revised system were high, particularly in relation to triage and case management. The applicability of the revised classification to retrospective data sets (of importance for dengue surveillance) was also favourable. However, the need for training, dissemination and further research on the warning signs was highlighted. Conclusions The revised dengue classification has a high potential for facilitating dengue case management and surveillance. PMID:21510901
Model selection for anomaly detection
NASA Astrophysics Data System (ADS)
Burnaev, E.; Erofeev, P.; Smolyakov, D.
2015-12-01
Anomaly detection based on one-class classification algorithms is broadly used in many applied domains like image processing (e.g. detection of whether a patient is "cancerous" or "healthy" from mammography image), network intrusion detection, etc. Performance of an anomaly detection algorithm crucially depends on a kernel, used to measure similarity in a feature space. The standard approaches (e.g. cross-validation) for kernel selection, used in two-class classification problems, can not be used directly due to the specific nature of a data (absence of a second, abnormal, class data). In this paper we generalize several kernel selection methods from binary-class case to the case of one-class classification and perform extensive comparison of these approaches using both synthetic and real-world data.
A Survey on Sentiment Classification in Face Recognition
NASA Astrophysics Data System (ADS)
Qian, Jingyu
2018-01-01
Face recognition has been an important topic for both industry and academia for a long time. K-means clustering, autoencoder, and convolutional neural network, each representing a design idea for face recognition method, are three popular algorithms to deal with face recognition problems. It is worthwhile to summarize and compare these three different algorithms. This paper will focus on one specific face recognition problem-sentiment classification from images. Three different algorithms for sentiment classification problems will be summarized, including k-means clustering, autoencoder, and convolutional neural network. An experiment with the application of these algorithms on a specific dataset of human faces will be conducted to illustrate how these algorithms are applied and their accuracy. Finally, the three algorithms are compared based on the accuracy result.
Yang, Lingjian; Ainali, Chrysanthi; Tsoka, Sophia; Papageorgiou, Lazaros G
2014-12-05
Applying machine learning methods on microarray gene expression profiles for disease classification problems is a popular method to derive biomarkers, i.e. sets of genes that can predict disease state or outcome. Traditional approaches where expression of genes were treated independently suffer from low prediction accuracy and difficulty of biological interpretation. Current research efforts focus on integrating information on protein interactions through biochemical pathway datasets with expression profiles to propose pathway-based classifiers that can enhance disease diagnosis and prognosis. As most of the pathway activity inference methods in literature are either unsupervised or applied on two-class datasets, there is good scope to address such limitations by proposing novel methodologies. A supervised multiclass pathway activity inference method using optimisation techniques is reported. For each pathway expression dataset, patterns of its constituent genes are summarised into one composite feature, termed pathway activity, and a novel mathematical programming model is proposed to infer this feature as a weighted linear summation of expression of its constituent genes. Gene weights are determined by the optimisation model, in a way that the resulting pathway activity has the optimal discriminative power with regards to disease phenotypes. Classification is then performed on the resulting low-dimensional pathway activity profile. The model was evaluated through a variety of published gene expression profiles that cover different types of disease. We show that not only does it improve classification accuracy, but it can also perform well in multiclass disease datasets, a limitation of other approaches from the literature. Desirable features of the model include the ability to control the maximum number of genes that may participate in determining pathway activity, which may be pre-specified by the user. Overall, this work highlights the potential of building pathway-based multi-phenotype classifiers for accurate disease diagnosis and prognosis problems.
19 CFR 152.16 - Judicial changes in classification.
Code of Federal Regulations, 2010 CFR
2010-04-01
... OF THE TREASURY (CONTINUED) CLASSIFICATION AND APPRAISEMENT OF MERCHANDISE Classification § 152.16 Judicial changes in classification. The following procedures apply to changes in classification made by... 19 Customs Duties 2 2010-04-01 2010-04-01 false Judicial changes in classification. 152.16 Section...
NASA Astrophysics Data System (ADS)
Prochazka, D.; Mazura, M.; Samek, O.; Rebrošová, K.; Pořízka, P.; Klus, J.; Prochazková, P.; Novotný, J.; Novotný, K.; Kaiser, J.
2018-01-01
In this work, we investigate the impact of data provided by complementary laser-based spectroscopic methods on multivariate classification accuracy. Discrimination and classification of five Staphylococcus bacterial strains and one strain of Escherichia coli is presented. The technique that we used for measurements is a combination of Raman spectroscopy and Laser-Induced Breakdown Spectroscopy (LIBS). Obtained spectroscopic data were then processed using Multivariate Data Analysis algorithms. Principal Components Analysis (PCA) was selected as the most suitable technique for visualization of bacterial strains data. To classify the bacterial strains, we used Neural Networks, namely a supervised version of Kohonen's self-organizing maps (SOM). We were processing results in three different ways - separately from LIBS measurements, from Raman measurements, and we also merged data from both mentioned methods. The three types of results were then compared. By applying the PCA to Raman spectroscopy data, we observed that two bacterial strains were fully distinguished from the rest of the data set. In the case of LIBS data, three bacterial strains were fully discriminated. Using a combination of data from both methods, we achieved the complete discrimination of all bacterial strains. All the data were classified with a high success rate using SOM algorithm. The most accurate classification was obtained using a combination of data from both techniques. The classification accuracy varied, depending on specific samples and techniques. As for LIBS, the classification accuracy ranged from 45% to 100%, as for Raman Spectroscopy from 50% to 100% and in case of merged data, all samples were classified correctly. Based on the results of the experiments presented in this work, we can assume that the combination of Raman spectroscopy and LIBS significantly enhances discrimination and classification accuracy of bacterial species and strains. The reason is the complementarity in obtained chemical information while using these two methods.
NASA Astrophysics Data System (ADS)
Pereira, A. A.; Gironas, J. A.; Passalacqua, P.; Mejia, A.; Niemann, J. D.
2017-12-01
Previous work has shown that lithological, tectonic and climatic processes have a major influence in shaping the geomorphology of river networks. Accordingly, quantitative classification methods have been developed to identify and characterize network types (dendritic, parallel, pinnate, rectangular and trellis) based solely on the self-affinity of their planform properties, computed from available Digital Elevation Model (DEM) data. In contrast, this research aim is to include both horizontal and vertical properties to evaluate a quantitative classification method for river networks. We include vertical properties to consider the unique surficial conditions (e.g., large and steep height drops, volcanic activity, and complexity of stream networks) of the Andes Mountains. Furthermore, the goal of the research is also to explain the implications and possible relations between the hydro-geomorphological properties and climatic conditions. The classification method is applied to 42 basins in the southern Andes in Chile, ranging in size from 208 Km2 to 8,000 Km2. The planform metrics include the incremental drainage area, stream course irregularity and junction angles, while the vertical metrics include the hypsometric curve and the slope-area relationship. We introduce new network structures (Brush, Funnel and Low Sinuosity Rectangular), possibly unique to the Andes, that can be quantitatively differentiated from previous networks identified in other geographic regions. Then, this research evaluates the effect that excluding different Strahler order streams has on the horizontal properties and therefore in the classification. We found that climatic conditions are not only linked to horizontal parameters, but also to vertical ones, finding significant correlation between climatic variables (average near-surface temperature and rainfall) and vertical measures (parameters associated with the hypsometric curve and slope-area relation). The proposed classification shows differences among basins previously classified as the same type, which are not noticeable in their horizontal properties and helps reduce misclassifications within the old clusters. Additional hydro-geomorphological metrics are to be considered in the classification method to improve the effectiveness of it.
Invasive species change detection using artificial neural networks and CASI hyperspectral imagery
USDA-ARS?s Scientific Manuscript database
For monitoring and controlling the extent and intensity of an invasive species, a direct multi-date image classification method was applied in invasive species (saltcedar) change detection in the study area of Lovelock, Nevada. With multi-date Compact Airborne Spectrographic Imager (CASI) hyperspec...
NASA Astrophysics Data System (ADS)
Movia, A.; Beinat, A.; Crosilla, F.
2015-04-01
The recognition of vegetation by the analysis of very high resolution (VHR) aerial images provides meaningful information about environmental features; nevertheless, VHR images frequently contain shadows that generate significant problems for the classification of the image components and for the extraction of the needed information. The aim of this research is to classify, from VHR aerial images, vegetation involved in the balance process of the environmental biochemical cycle, and to discriminate it with respect to urban and agricultural features. Three classification algorithms have been experimented in order to better recognize vegetation, and compared to NDVI index; unfortunately all these methods are conditioned by the presence of shadows on the images. Literature presents several algorithms to detect and remove shadows in the scene: most of them are based on the RGB to HSI transformations. In this work some of them have been implemented and compared with one based on RGB bands. Successively, in order to remove shadows and restore brightness on the images, some innovative algorithms, based on Procrustes theory, have been implemented and applied. Among these, we evaluate the capability of the so called "not-centered oblique Procrustes" and "anisotropic Procrustes" methods to efficiently restore brightness with respect to a linear correlation correction based on the Cholesky decomposition. Some experimental results obtained by different classification methods after shadows removal carried out with the innovative algorithms are presented and discussed.
Application of quantum-behaved particle swarm optimization to motor imagery EEG classification.
Hsu, Wei-Yen
2013-12-01
In this study, we propose a recognition system for single-trial analysis of motor imagery (MI) electroencephalogram (EEG) data. Applying event-related brain potential (ERP) data acquired from the sensorimotor cortices, the system chiefly consists of automatic artifact elimination, feature extraction, feature selection and classification. In addition to the use of independent component analysis, a similarity measure is proposed to further remove the electrooculographic (EOG) artifacts automatically. Several potential features, such as wavelet-fractal features, are then extracted for subsequent classification. Next, quantum-behaved particle swarm optimization (QPSO) is used to select features from the feature combination. Finally, selected sub-features are classified by support vector machine (SVM). Compared with without artifact elimination, feature selection using a genetic algorithm (GA) and feature classification with Fisher's linear discriminant (FLD) on MI data from two data sets for eight subjects, the results indicate that the proposed method is promising in brain-computer interface (BCI) applications.
NASA Technical Reports Server (NTRS)
Langley, P. G.
1981-01-01
A method of relating different classifications at each stage of a multistage, multiresource inventory using remotely sensed imagery is discussed. A class transformation matrix allowing the conversion of a set of proportions at one stage, to a set of proportions at the subsequent stage through use of a linear model, is described. The technique was tested by applying it to Kershaw County, South Carolina. Unsupervised LANDSAT spectral classifications were correlated with interpretations of land use aerial photography, the correlations employed to estimate land use classifications using the linear model, and the land use proportions used to stratify current annual increment (CAI) field plot data to obtain a total CAI for the county. The estimate differed by 1% from the published figure for land use. Potential sediment loss and a variety of land use classifications were also obtained.
HYBRID NEURAL NETWORK AND SUPPORT VECTOR MACHINE METHOD FOR OPTIMIZATION
NASA Technical Reports Server (NTRS)
Rai, Man Mohan (Inventor)
2005-01-01
System and method for optimization of a design associated with a response function, using a hybrid neural net and support vector machine (NN/SVM) analysis to minimize or maximize an objective function, optionally subject to one or more constraints. As a first example, the NN/SVM analysis is applied iteratively to design of an aerodynamic component, such as an airfoil shape, where the objective function measures deviation from a target pressure distribution on the perimeter of the aerodynamic component. As a second example, the NN/SVM analysis is applied to data classification of a sequence of data points in a multidimensional space. The NN/SVM analysis is also applied to data regression.
Hybrid Neural Network and Support Vector Machine Method for Optimization
NASA Technical Reports Server (NTRS)
Rai, Man Mohan (Inventor)
2007-01-01
System and method for optimization of a design associated with a response function, using a hybrid neural net and support vector machine (NN/SVM) analysis to minimize or maximize an objective function, optionally subject to one or more constraints. As a first example, the NN/SVM analysis is applied iteratively to design of an aerodynamic component, such as an airfoil shape, where the objective function measures deviation from a target pressure distribution on the perimeter of the aerodynamic component. As a second example, the NN/SVM analysis is applied to data classification of a sequence of data points in a multidimensional space. The NN/SVM analysis is also applied to data regression.
Reduction of Topographic Effect for Curve Number Estimated from Remotely Sensed Imagery
NASA Astrophysics Data System (ADS)
Zhang, Wen-Yan; Lin, Chao-Yuan
2016-04-01
The Soil Conservation Service Curve Number (SCS-CN) method is commonly used in hydrology to estimate direct runoff volume. The CN is the empirical parameter which corresponding to land use/land cover, hydrologic soil group and antecedent soil moisture condition. In large watersheds with complex topography, satellite remote sensing is the appropriate approach to acquire the land use change information. However, the topographic effect have been usually found in the remotely sensed imageries and resulted in land use classification. This research selected summer and winter scenes of Landsat-5 TM during 2008 to classified land use in Chen-You-Lan Watershed, Taiwan. The b-correction, the empirical topographic correction method, was applied to Landsat-5 TM data. Land use were categorized using K-mean classification into 4 groups i.e. forest, grassland, agriculture and river. Accuracy assessment of image classification was performed with national land use map. The results showed that after topographic correction, the overall accuracy of classification was increased from 68.0% to 74.5%. The average CN estimated from remotely sensed imagery decreased from 48.69 to 45.35 where the average CN estimated from national LULC map was 44.11. Therefore, the topographic correction method was recommended to normalize the topographic effect from the satellite remote sensing data before estimating the CN.
Parametric Time-Frequency Analysis and Its Applications in Music Classification
NASA Astrophysics Data System (ADS)
Shen, Ying; Li, Xiaoli; Ma, Ngok-Wah; Krishnan, Sridhar
2010-12-01
Analysis of nonstationary signals, such as music signals, is a challenging task. The purpose of this study is to explore an efficient and powerful technique to analyze and classify music signals in higher frequency range (44.1 kHz). The pursuit methods are good tools for this purpose, but they aimed at representing the signals rather than classifying them as in Y. Paragakin et al., 2009. Among the pursuit methods, matching pursuit (MP), an adaptive true nonstationary time-frequency signal analysis tool, is applied for music classification. First, MP decomposes the sample signals into time-frequency functions or atoms. Atom parameters are then analyzed and manipulated, and discriminant features are extracted from atom parameters. Besides the parameters obtained using MP, an additional feature, central energy, is also derived. Linear discriminant analysis and the leave-one-out method are used to evaluate the classification accuracy rate for different feature sets. The study is one of the very few works that analyze atoms statistically and extract discriminant features directly from the parameters. From our experiments, it is evident that the MP algorithm with the Gabor dictionary decomposes nonstationary signals, such as music signals, into atoms in which the parameters contain strong discriminant information sufficient for accurate and efficient signal classifications.
Chen, Lili; Hao, Yaru
2017-01-01
Preterm birth (PTB) is the leading cause of perinatal mortality and long-term morbidity, which results in significant health and economic problems. The early detection of PTB has great significance for its prevention. The electrohysterogram (EHG) related to uterine contraction is a noninvasive, real-time, and automatic novel technology which can be used to detect, diagnose, or predict PTB. This paper presents a method for feature extraction and classification of EHG between pregnancy and labour group, based on Hilbert-Huang transform (HHT) and extreme learning machine (ELM). For each sample, each channel was decomposed into a set of intrinsic mode functions (IMFs) using empirical mode decomposition (EMD). Then, the Hilbert transform was applied to IMF to obtain analytic function. The maximum amplitude of analytic function was extracted as feature. The identification model was constructed based on ELM. Experimental results reveal that the best classification performance of the proposed method can reach an accuracy of 88.00%, a sensitivity of 91.30%, and a specificity of 85.19%. The area under receiver operating characteristic (ROC) curve is 0.88. Finally, experimental results indicate that the method developed in this work could be effective in the classification of EHG between pregnancy and labour group.
Zemp, Roland; Tanadini, Matteo; Plüss, Stefan; Schnüriger, Karin; Singh, Navrag B; Taylor, William R; Lorenzetti, Silvio
2016-01-01
Occupational musculoskeletal disorders, particularly chronic low back pain (LBP), are ubiquitous due to prolonged static sitting or nonergonomic sitting positions. Therefore, the aim of this study was to develop an instrumented chair with force and acceleration sensors to determine the accuracy of automatically identifying the user's sitting position by applying five different machine learning methods (Support Vector Machines, Multinomial Regression, Boosting, Neural Networks, and Random Forest). Forty-one subjects were requested to sit four times in seven different prescribed sitting positions (total 1148 samples). Sixteen force sensor values and the backrest angle were used as the explanatory variables (features) for the classification. The different classification methods were compared by means of a Leave-One-Out cross-validation approach. The best performance was achieved using the Random Forest classification algorithm, producing a mean classification accuracy of 90.9% for subjects with which the algorithm was not familiar. The classification accuracy varied between 81% and 98% for the seven different sitting positions. The present study showed the possibility of accurately classifying different sitting positions by means of the introduced instrumented office chair combined with machine learning analyses. The use of such novel approaches for the accurate assessment of chair usage could offer insights into the relationships between sitting position, sitting behaviour, and the occurrence of musculoskeletal disorders.
Peng, Xiang; King, Irwin
2008-01-01
The Biased Minimax Probability Machine (BMPM) constructs a classifier which deals with the imbalanced learning tasks. It provides a worst-case bound on the probability of misclassification of future data points based on reliable estimates of means and covariance matrices of the classes from the training data samples, and achieves promising performance. In this paper, we develop a novel yet critical extension training algorithm for BMPM that is based on Second-Order Cone Programming (SOCP). Moreover, we apply the biased classification model to medical diagnosis problems to demonstrate its usefulness. By removing some crucial assumptions in the original solution to this model, we make the new method more accurate and robust. We outline the theoretical derivatives of the biased classification model, and reformulate it into an SOCP problem which could be efficiently solved with global optima guarantee. We evaluate our proposed SOCP-based BMPM (BMPMSOCP) scheme in comparison with traditional solutions on medical diagnosis tasks where the objectives are to focus on improving the sensitivity (the accuracy of the more important class, say "ill" samples) instead of the overall accuracy of the classification. Empirical results have shown that our method is more effective and robust to handle imbalanced classification problems than traditional classification approaches, and the original Fractional Programming-based BMPM (BMPMFP).
Adaptive sleep-wake discrimination for wearable devices.
Karlen, Walter; Floreano, Dario
2011-04-01
Sleep/wake classification systems that rely on physiological signals suffer from intersubject differences that make accurate classification with a single, subject-independent model difficult. To overcome the limitations of intersubject variability, we suggest a novel online adaptation technique that updates the sleep/wake classifier in real time. The objective of the present study was to evaluate the performance of a newly developed adaptive classification algorithm that was embedded on a wearable sleep/wake classification system called SleePic. The algorithm processed ECG and respiratory effort signals for the classification task and applied behavioral measurements (obtained from accelerometer and press-button data) for the automatic adaptation task. When trained as a subject-independent classifier algorithm, the SleePic device was only able to correctly classify 74.94 ± 6.76% of the human-rated sleep/wake data. By using the suggested automatic adaptation method, the mean classification accuracy could be significantly improved to 92.98 ± 3.19%. A subject-independent classifier based on activity data only showed a comparable accuracy of 90.44 ± 3.57%. We demonstrated that subject-independent models used for online sleep-wake classification can successfully be adapted to previously unseen subjects without the intervention of human experts or off-line calibration.
Rank preserving sparse learning for Kinect based scene classification.
Tao, Dapeng; Jin, Lianwen; Yang, Zhao; Li, Xuelong
2013-10-01
With the rapid development of the RGB-D sensors and the promptly growing population of the low-cost Microsoft Kinect sensor, scene classification, which is a hard, yet important, problem in computer vision, has gained a resurgence of interest recently. That is because the depth of information provided by the Kinect sensor opens an effective and innovative way for scene classification. In this paper, we propose a new scheme for scene classification, which applies locality-constrained linear coding (LLC) to local SIFT features for representing the RGB-D samples and classifies scenes through the cooperation between a new rank preserving sparse learning (RPSL) based dimension reduction and a simple classification method. RPSL considers four aspects: 1) it preserves the rank order information of the within-class samples in a local patch; 2) it maximizes the margin between the between-class samples on the local patch; 3) the L1-norm penalty is introduced to obtain the parsimony property; and 4) it models the classification error minimization by utilizing the least-squares error minimization. Experiments are conducted on the NYU Depth V1 dataset and demonstrate the robustness and effectiveness of RPSL for scene classification.
Bennet, Jaison; Ganaprakasam, Chilambuchelvan Arul; Arputharaj, Kannan
2014-01-01
Cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability in olden days. The recent arrival of DNA microarray technology has led to the concurrent monitoring of thousands of gene expressions in a single chip which stimulates the progress in cancer classification. In this paper, we have proposed a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role and a feature selection technique which combines discrete wavelet transform (DWT) and moving window technique (MWT) is used. The performance of the proposed method is compared with the conventional classifiers like support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. This paper serves as an automated system for the classification of cancer and can be applied by doctors in real cases which serve as a boon to the medical community. This work further reduces the misclassification of cancers which is highly not allowed in cancer detection.
Differentiation of Glioblastoma and Lymphoma Using Feature Extraction and Support Vector Machine.
Yang, Zhangjing; Feng, Piaopiao; Wen, Tian; Wan, Minghua; Hong, Xunning
2017-01-01
Differentiation of glioblastoma multiformes (GBMs) and lymphomas using multi-sequence magnetic resonance imaging (MRI) is an important task that is valuable for treatment planning. However, this task is a challenge because GBMs and lymphomas may have a similar appearance in MRI images. This similarity may lead to misclassification and could affect the treatment results. In this paper, we propose a semi-automatic method based on multi-sequence MRI to differentiate these two types of brain tumors. Our method consists of three steps: 1) the key slice is selected from 3D MRIs and region of interests (ROIs) are drawn around the tumor region; 2) different features are extracted based on prior clinical knowledge and validated using a t-test; and 3) features that are helpful for classification are used to build an original feature vector and a support vector machine is applied to perform classification. In total, 58 GBM cases and 37 lymphoma cases are used to validate our method. A leave-one-out crossvalidation strategy is adopted in our experiments. The global accuracy of our method was determined as 96.84%, which indicates that our method is effective for the differentiation of GBM and lymphoma and can be applied in clinical diagnosis. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
O'Neill, William; Penn, Richard; Werner, Michael; Thomas, Justin
2015-06-01
Estimation of stochastic process models from data is a common application of time series analysis methods. Such system identification processes are often cast as hypothesis testing exercises whose intent is to estimate model parameters and test them for statistical significance. Ordinary least squares (OLS) regression and the Levenberg-Marquardt algorithm (LMA) have proven invaluable computational tools for models being described by non-homogeneous, linear, stationary, ordinary differential equations. In this paper we extend stochastic model identification to linear, stationary, partial differential equations in two independent variables (2D) and show that OLS and LMA apply equally well to these systems. The method employs an original nonparametric statistic as a test for the significance of estimated parameters. We show gray scale and color images are special cases of 2D systems satisfying a particular autoregressive partial difference equation which estimates an analogous partial differential equation. Several applications to medical image modeling and classification illustrate the method by correctly classifying demented and normal OLS models of axial magnetic resonance brain scans according to subject Mini Mental State Exam (MMSE) scores. Comparison with 13 image classifiers from the literature indicates our classifier is at least 14 times faster than any of them and has a classification accuracy better than all but one. Our modeling method applies to any linear, stationary, partial differential equation and the method is readily extended to 3D whole-organ systems. Further, in addition to being a robust image classifier, estimated image models offer insights into which parameters carry the most diagnostic image information and thereby suggest finer divisions could be made within a class. Image models can be estimated in milliseconds which translate to whole-organ models in seconds; such runtimes could make real-time medicine and surgery modeling possible.
Towards a robust framework for catchment classification
NASA Astrophysics Data System (ADS)
Deshmukh, A.; Samal, A.; Singh, R.
2017-12-01
Classification of catchments based on various measures of similarity has emerged as an important technique to understand regional scale hydrologic behavior. Classification of catchment characteristics and/or streamflow response has been used reveal which characteristics are more likely to explain the observed variability of hydrologic response. However, numerous algorithms for supervised or unsupervised classification are available, making it hard to identify the algorithm most suitable for the dataset at hand. Consequently, existing catchment classification studies vary significantly in the classification algorithms employed with no previous attempt at understanding the degree of uncertainty in classification due to this algorithmic choice. This hinders the generalizability of interpretations related to hydrologic behavior. Our goal is to develop a protocol that can be followed while classifying hydrologic datasets. We focus on a classification framework for unsupervised classification and provide a step-by-step classification procedure. The steps include testing the clusterabiltiy of original dataset prior to classification, feature selection, validation of clustered data, and quantification of similarity of two clusterings. We test several commonly available methods within this framework to understand the level of similarity of classification results across algorithms. We apply the proposed framework on recently developed datasets for India to analyze to what extent catchment properties can explain observed catchment response. Our testing dataset includes watershed characteristics for over 200 watersheds which comprise of both natural (physio-climatic) characteristics and socio-economic characteristics. This framework allows us to understand the controls on observed hydrologic variability across India.
Pastusiak, J; Zakrzewski, J
1988-11-01
Specific biocybernetic approach to the problem of the blood supply determination of paradontium tissues by means of thermometric methods has been presented in the paper. The compartment models of the measuring procedure have been given. Dilutodynamic methology and classification has been applied. Such an approach enables to select appropriate biophysical parameters describing the state of blood supply of paradontium tissues and optimal design of transducers and measuring methods.
Predicting β-Turns in Protein Using Kernel Logistic Regression
Elbashir, Murtada Khalafallah; Sheng, Yu; Wang, Jianxin; Wu, FangXiang; Li, Min
2013-01-01
A β-turn is a secondary protein structure type that plays a significant role in protein configuration and function. On average 25% of amino acids in protein structures are located in β-turns. It is very important to develope an accurate and efficient method for β-turns prediction. Most of the current successful β-turns prediction methods use support vector machines (SVMs) or neural networks (NNs). The kernel logistic regression (KLR) is a powerful classification technique that has been applied successfully in many classification problems. However, it is often not found in β-turns classification, mainly because it is computationally expensive. In this paper, we used KLR to obtain sparse β-turns prediction in short evolution time. Secondary structure information and position-specific scoring matrices (PSSMs) are utilized as input features. We achieved Q total of 80.7% and MCC of 50% on BT426 dataset. These results show that KLR method with the right algorithm can yield performance equivalent to or even better than NNs and SVMs in β-turns prediction. In addition, KLR yields probabilistic outcome and has a well-defined extension to multiclass case. PMID:23509793
Predicting β-turns in protein using kernel logistic regression.
Elbashir, Murtada Khalafallah; Sheng, Yu; Wang, Jianxin; Wu, Fangxiang; Li, Min
2013-01-01
A β-turn is a secondary protein structure type that plays a significant role in protein configuration and function. On average 25% of amino acids in protein structures are located in β-turns. It is very important to develope an accurate and efficient method for β-turns prediction. Most of the current successful β-turns prediction methods use support vector machines (SVMs) or neural networks (NNs). The kernel logistic regression (KLR) is a powerful classification technique that has been applied successfully in many classification problems. However, it is often not found in β-turns classification, mainly because it is computationally expensive. In this paper, we used KLR to obtain sparse β-turns prediction in short evolution time. Secondary structure information and position-specific scoring matrices (PSSMs) are utilized as input features. We achieved Q total of 80.7% and MCC of 50% on BT426 dataset. These results show that KLR method with the right algorithm can yield performance equivalent to or even better than NNs and SVMs in β-turns prediction. In addition, KLR yields probabilistic outcome and has a well-defined extension to multiclass case.
Cai, Suxian; Yang, Shanshan; Zheng, Fang; Lu, Meng; Wu, Yunfeng; Krishnan, Sridhar
2013-01-01
Analysis of knee joint vibration (VAG) signals can provide quantitative indices for detection of knee joint pathology at an early stage. In addition to the statistical features developed in the related previous studies, we extracted two separable features, that is, the number of atoms derived from the wavelet matching pursuit decomposition and the number of significant signal turns detected with the fixed threshold in the time domain. To perform a better classification over the data set of 89 VAG signals, we applied a novel classifier fusion system based on the dynamic weighted fusion (DWF) method to ameliorate the classification performance. For comparison, a single leastsquares support vector machine (LS-SVM) and the Bagging ensemble were used for the classification task as well. The results in terms of overall accuracy in percentage and area under the receiver operating characteristic curve obtained with the DWF-based classifier fusion method reached 88.76% and 0.9515, respectively, which demonstrated the effectiveness and superiority of the DWF method with two distinct features for the VAG signal analysis. PMID:23573175
Bevilacqua, M; Ciarapica, F E; Giacchetta, G
2008-07-01
This work is an attempt to apply classification tree methods to data regarding accidents in a medium-sized refinery, so as to identify the important relationships between the variables, which can be considered as decision-making rules when adopting any measures for improvement. The results obtained using the CART (Classification And Regression Trees) method proved to be the most precise and, in general, they are encouraging concerning the use of tree diagrams as preliminary explorative techniques for the assessment of the ergonomic, management and operational parameters which influence high accident risk situations. The Occupational Injury analysis carried out in this paper was planned as a dynamic process and can be repeated systematically. The CART technique, which considers a very wide set of objective and predictive variables, shows new cause-effect correlations in occupational safety which had never been previously described, highlighting possible injury risk groups and supporting decision-making in these areas. The use of classification trees must not, however, be seen as an attempt to supplant other techniques, but as a complementary method which can be integrated into traditional types of analysis.
He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei
2015-02-25
A novel strategy which combines iteratively cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectroscopy of banned food additives, such as Sudan I dye and Rhodamine B in food, Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, using the combination of spectra preprocessing iteratively cubic spline fitting (ICSF) baseline correction with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, the DPLS classification can discriminate the class assignment of unknown banned additives using the information of differences in relative intensities. The results demonstrate that SERS spectroscopy combined with ICSF baseline correction method and exploratory analysis methodology DPLS classification can be potentially used for distinguishing the banned food additives in field of food safety. Copyright © 2014 Elsevier B.V. All rights reserved.
Liu, Aiming; Liu, Quan; Ai, Qingsong; Xie, Yi; Chen, Anqi
2017-01-01
Motor Imagery (MI) electroencephalography (EEG) is widely studied for its non-invasiveness, easy availability, portability, and high temporal resolution. As for MI EEG signal processing, the high dimensions of features represent a research challenge. It is necessary to eliminate redundant features, which not only create an additional overhead of managing the space complexity, but also might include outliers, thereby reducing classification accuracy. The firefly algorithm (FA) can adaptively select the best subset of features, and improve classification accuracy. However, the FA is easily entrapped in a local optimum. To solve this problem, this paper proposes a method of combining the firefly algorithm and learning automata (LA) to optimize feature selection for motor imagery EEG. We employed a method of combining common spatial pattern (CSP) and local characteristic-scale decomposition (LCD) algorithms to obtain a high dimensional feature set, and classified it by using the spectral regression discriminant analysis (SRDA) classifier. Both the fourth brain–computer interface competition data and real-time data acquired in our designed experiments were used to verify the validation of the proposed method. Compared with genetic and adaptive weight particle swarm optimization algorithms, the experimental results show that our proposed method effectively eliminates redundant features, and improves the classification accuracy of MI EEG signals. In addition, a real-time brain–computer interface system was implemented to verify the feasibility of our proposed methods being applied in practical brain–computer interface systems. PMID:29117100
Liu, Aiming; Chen, Kun; Liu, Quan; Ai, Qingsong; Xie, Yi; Chen, Anqi
2017-11-08
Motor Imagery (MI) electroencephalography (EEG) is widely studied for its non-invasiveness, easy availability, portability, and high temporal resolution. As for MI EEG signal processing, the high dimensions of features represent a research challenge. It is necessary to eliminate redundant features, which not only create an additional overhead of managing the space complexity, but also might include outliers, thereby reducing classification accuracy. The firefly algorithm (FA) can adaptively select the best subset of features, and improve classification accuracy. However, the FA is easily entrapped in a local optimum. To solve this problem, this paper proposes a method of combining the firefly algorithm and learning automata (LA) to optimize feature selection for motor imagery EEG. We employed a method of combining common spatial pattern (CSP) and local characteristic-scale decomposition (LCD) algorithms to obtain a high dimensional feature set, and classified it by using the spectral regression discriminant analysis (SRDA) classifier. Both the fourth brain-computer interface competition data and real-time data acquired in our designed experiments were used to verify the validation of the proposed method. Compared with genetic and adaptive weight particle swarm optimization algorithms, the experimental results show that our proposed method effectively eliminates redundant features, and improves the classification accuracy of MI EEG signals. In addition, a real-time brain-computer interface system was implemented to verify the feasibility of our proposed methods being applied in practical brain-computer interface systems.
REVIEW ARTICLE: Spectrophotometric applications of digital signal processing
NASA Astrophysics Data System (ADS)
Morawski, Roman Z.
2006-09-01
Spectrophotometry is more and more often the method of choice not only in analysis of (bio)chemical substances, but also in the identification of physical properties of various objects and their classification. The applications of spectrophotometry include such diversified tasks as monitoring of optical telecommunications links, assessment of eating quality of food, forensic classification of papers, biometric identification of individuals, detection of insect infestation of seeds and classification of textiles. In all those applications, large numbers of data, generated by spectrophotometers, are processed by various digital means in order to extract measurement information. The main objective of this paper is to review the state-of-the-art methodology for digital signal processing (DSP) when applied to data provided by spectrophotometric transducers and spectrophotometers. First, a general methodology of DSP applications in spectrophotometry, based on DSP-oriented models of spectrophotometric data, is outlined. Then, the most important classes of DSP methods for processing spectrophotometric data—the methods for DSP-aided calibration of spectrophotometric instrumentation, the methods for the estimation of spectra on the basis of spectrophotometric data, the methods for the estimation of spectrum-related measurands on the basis of spectrophotometric data—are presented. Finally, the methods for preprocessing and postprocessing of spectrophotometric data are overviewed. Throughout the review, the applications of DSP are illustrated with numerous examples related to broadly understood spectrophotometry.
Song, Weiran; Wang, Hui; Maguire, Paul; Nibouche, Omar
2018-06-07
Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time. Copyright © 2018 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Zhang, Yanling; Dorans, Neil J.; Matthews-López, Joy L.
2005-01-01
Statistical procedures for detecting differential item functioning (DIF) are often used as an initial step to screen items for construct irrelevant variance. This research applies a DIF dissection method and a two-way classification scheme to SAT Reasoning Test™ verbal section data and explores the effects of deleting sizable DIF items on reported…
NASA Technical Reports Server (NTRS)
Kim, Hakil; Swain, Philip H.
1990-01-01
An axiomatic approach to intervalued (IV) probabilities is presented, where the IV probability is defined by a pair of set-theoretic functions which satisfy some pre-specified axioms. On the basis of this approach representation of statistical evidence and combination of multiple bodies of evidence are emphasized. Although IV probabilities provide an innovative means for the representation and combination of evidential information, they make the decision process rather complicated. It entails more intelligent strategies for making decisions. The development of decision rules over IV probabilities is discussed from the viewpoint of statistical pattern recognition. The proposed method, so called evidential reasoning method, is applied to the ground-cover classification of a multisource data set consisting of Multispectral Scanner (MSS) data, Synthetic Aperture Radar (SAR) data, and digital terrain data such as elevation, slope, and aspect. By treating the data sources separately, the method is able to capture both parametric and nonparametric information and to combine them. Then the method is applied to two separate cases of classifying multiband data obtained by a single sensor. In each case a set of multiple sources is obtained by dividing the dimensionally huge data into smaller and more manageable pieces based on the global statistical correlation information. By a divide-and-combine process, the method is able to utilize more features than the conventional maximum likelihood method.
Hennebert, Pierre; van der Sloot, Hans A; Rebischung, Flore; Weltens, Reinhilde; Geerts, Lieve; Hjelmar, Ole
2014-10-01
Hazard classification of waste is a necessity, but the hazard properties (named "H" and soon "HP") are still not all defined in a practical and operational manner at EU level. Following discussion of subsequent draft proposals from the Commission there is still no final decision. Methods to implement the proposals have recently been proposed: tests methods for physical risks, test batteries for aquatic and terrestrial ecotoxicity, an analytical package for exhaustive determination of organic substances and mineral elements, surrogate methods for the speciation of mineral elements in mineral substances in waste, and calculation methods for human toxicity and ecotoxicity with M factors. In this paper the different proposed methods have been applied to a large assortment of solid and liquid wastes (>100). Data for 45 wastes - documented with extensive chemical analysis and flammability test - were assessed in terms of the different HP criteria and results were compared to LoW for lack of an independent classification. For most waste streams the classification matches with the designation provided in the LoW. This indicates that the criteria used by LoW are similar to the HP limit values. This data set showed HP 14 'Ecotoxic chronic' is the most discriminating HP. All wastes classified as acute ecotoxic are also chronic ecotoxic and the assessment of acute ecotoxicity separately is therefore not needed. The high number of HP 14 classified wastes is due to the very low limit values when stringent M factors are applied to total concentrations (worst case method). With M factor set to 1 the classification method is not sufficiently discriminating between hazardous and non-hazardous materials. The second most frequent hazard is HP 7 'Carcinogenic'. The third most frequent hazard is HP 10 'Toxic for reproduction' and the fourth most frequent hazard is HP 4 "Irritant - skin irritation and eye damage". In a stepwise approach, it seems relevant to assess HP 14 first, then, if the waste is not classified as hazardous, to assess subsequently HP 7, HP 10 and HP 4, and then if still not classified as hazardous, to assess the remaining properties. The elements triggering the HP 14 classification in order of importance are Zn, Cu, Pb, Cr, Cd and Hg. Progress in the speciation of Zn and Cu is essential for HP 14. Organics were quantified by the proposed method (AFNOR XP X30-489) and need no speciation. Organics can contribute significantly to intrinsic toxicity in many waste materials, but they are only of minor importance for the assessment of HP 14 as the metal concentrations are the main HP 14 classifiers. Organic compounds are however responsible for other toxicological characteristics (hormone disturbance, genotoxicity, reprotoxicity…) and shall be taken into account when the waste is not HP 14 classified. Copyright © 2014 Elsevier Ltd. All rights reserved.
Classification of wheat: Badhwar profile similarity technique
NASA Technical Reports Server (NTRS)
Austin, W. W.
1980-01-01
The Badwar profile similarity classification technique used successfully for classification of corn was applied to spring wheat classifications. The software programs and the procedures used to generate full-scene classifications are presented, and numerical results of the acreage estimations are given.
Diffuse reflectance startigraphy - a new method in the study of loess (?)
NASA Astrophysics Data System (ADS)
József, Szeberényi; Balázs, Bradák; Klaudia, Kiss; József, Kovács; György, Varga; Réka, Balázs; Viczián, István
2017-04-01
The different varieties of loess (and intercalated paleosol layers) together constitute one of the most widespread terrestrial sediments, which was deposited, altered, and redeposited in the course of the changing climatic conditions of the Pleistocene. To reveal more information about Pleistocene climate cycles and/or environments the detailed lithostratigraphical subdivision and classification of the loess variations and paleosols are necessary. Beside the numerous method such as various field measurements, semi-quantitative tests and laboratory investigations, diffuse reflectance spectroscopy (DRS) is one of the well applied method on loess/paleosol sequences. Generally, DRS has been used to separate the detrital and pedogenic mineral component of the loess sections by the hematite/goethite ratio. DRS also has been applied as a joint method of various environmental magnetic investigations such as magnetic susceptibility- and isothermal remanent magnetization measurements. In our study the so-called "diffuse reflectance stratigraphy method" were developed. At First, complex mathematical method was applied to compare the results of the spectral reflectance measurements. One of the most preferred multivariate methods is cluster analysis. Its scope is to group and compare the loess variations and paleosol based on the similarity and common properties of their reflectance curves. In the Second, beside the basic subdivision of the profiles by the different reflectance curves of the layers, the most characteristic wavelength section of the reflectance curve was determined. This sections played the most important role during the classification of the different materials of the section. The reflectance value of individual samples, belonged to the characteristic wavelength were depicted in the function of depth and well correlated with other proxies like grain size distribution and magnetic susceptibility data. The results of the correlation showed the significance of the "combined reflectance stratigraphy" as a stratigraphical method and as an environmental proxy also.
Deep learning approach to bacterial colony classification.
Zieliński, Bartosz; Plichta, Anna; Misztal, Krzysztof; Spurek, Przemysław; Brzychczy-Włoch, Monika; Ochońska, Dorota
2017-01-01
In microbiology it is diagnostically useful to recognize various genera and species of bacteria. It can be achieved using computer-aided methods, which make the recognition processes more automatic and thus significantly reduce the time necessary for the classification. Moreover, in case of diagnostic uncertainty (the misleading similarity in shape or structure of bacterial cells), such methods can minimize the risk of incorrect recognition. In this article, we apply the state of the art method for texture analysis to classify genera and species of bacteria. This method uses deep Convolutional Neural Networks to obtain image descriptors, which are then encoded and classified with Support Vector Machine or Random Forest. To evaluate this approach and to make it comparable with other approaches, we provide a new dataset of images. DIBaS dataset (Digital Image of Bacterial Species) contains 660 images with 33 different genera and species of bacteria.
Isik, Nimet
2016-04-01
Multi-element electrostatic aperture lens systems are widely used to control electron or charged particle beams in many scientific instruments. By means of applied voltages, these lens systems can be operated for different purposes. In this context, numerous methods have been performed to calculate focal properties of these lenses. In this study, an artificial neural network (ANN) classification method is utilized to determine the focused/unfocused charged particle beam in the image point as a function of lens voltages for multi-element electrostatic aperture lenses. A data set for training and testing of ANN is taken from the SIMION 8.1 simulation program, which is a well known and proven accuracy program in charged particle optics. Mean squared error results of this study indicate that the ANN classification method provides notable performance characteristics for electrostatic aperture zoom lenses.
Comparison of Different EHG Feature Selection Methods for the Detection of Preterm Labor
Alamedine, D.; Khalil, M.; Marque, C.
2013-01-01
Numerous types of linear and nonlinear features have been extracted from the electrohysterogram (EHG) in order to classify labor and pregnancy contractions. As a result, the number of available features is now very large. The goal of this study is to reduce the number of features by selecting only the relevant ones which are useful for solving the classification problem. This paper presents three methods for feature subset selection that can be applied to choose the best subsets for classifying labor and pregnancy contractions: an algorithm using the Jeffrey divergence (JD) distance, a sequential forward selection (SFS) algorithm, and a binary particle swarm optimization (BPSO) algorithm. The two last methods are based on a classifier and were tested with three types of classifiers. These methods have allowed us to identify common features which are relevant for contraction classification. PMID:24454536
SkICAT: A cataloging and analysis tool for wide field imaging surveys
NASA Technical Reports Server (NTRS)
Weir, N.; Fayyad, U. M.; Djorgovski, S. G.; Roden, J.
1992-01-01
We describe an integrated system, SkICAT (Sky Image Cataloging and Analysis Tool), for the automated reduction and analysis of the Palomar Observatory-ST ScI Digitized Sky Survey. The Survey will consist of the complete digitization of the photographic Second Palomar Observatory Sky Survey (POSS-II) in three bands, comprising nearly three Terabytes of pixel data. SkICAT applies a combination of existing packages, including FOCAS for basic image detection and measurement and SAS for database management, as well as custom software, to the task of managing this wealth of data. One of the most novel aspects of the system is its method of object classification. Using state-of-theart machine learning classification techniques (GID3* and O-BTree), we have developed a powerful method for automatically distinguishing point sources from non-point sources and artifacts, achieving comparably accurate discrimination a full magnitude fainter than in previous Schmidt plate surveys. The learning algorithms produce decision trees for classification by examining instances of objects classified by eye on both plate and higher quality CCD data. The same techniques will be applied to perform higher-level object classification (e.g., of galaxy morphology) in the near future. Another key feature of the system is the facility to integrate the catalogs from multiple plates (and portions thereof) to construct a single catalog of uniform calibration and quality down to the faintest limits of the survey. SkICAT also provides a variety of data analysis and exploration tools for the scientific utilization of the resulting catalogs. We include initial results of applying this system to measure the counts and distribution of galaxies in two bands down to Bj is approximately 21 mag over an approximate 70 square degree multi-plate field from POSS-II. SkICAT is constructed in a modular and general fashion and should be readily adaptable to other large-scale imaging surveys.
NASA Astrophysics Data System (ADS)
Aufaristama, Muhammad; Hölbling, Daniel; Höskuldsson, Ármann; Jónsdóttir, Ingibjörg
2017-04-01
The Krafla volcanic system is part of the Icelandic North Volcanic Zone (NVZ). During Holocene, two eruptive events occurred in Krafla, 1724-1729 and 1975-1984. The last eruptive episode (1975-1984), known as the "Krafla Fires", resulted in nine volcanic eruption episodes. The total area covered by the lavas from this eruptive episode is 36 km2 and the volume is about 0.25-0.3 km3. Lava morphology is related to the characteristics of the surface morphology of a lava flow after solidification. The typical morphology of lava can be used as primary basis for the classification of lava flows when rheological properties cannot be directly observed during emplacement, and also for better understanding the behavior of lava flow models. Although mapping of lava flows in the field is relatively accurate such traditional methods are time consuming, especially when the lava covers large areas such as it is the case in Krafla. Semi-automatic mapping methods that make use of satellite remote sensing data allow for an efficient and fast mapping of lava morphology. In this study, two semi-automatic methods for lava morphology classification are presented and compared using Landsat 8 (30 m spatial resolution) and SPOT-5 (10 m spatial resolution) satellite images. For assessing the classification accuracy, the results from semi-automatic mapping were compared to the respective results from visual interpretation. On the one hand, the Spectral Angle Mapper (SAM) classification method was used. With this method an image is classified according to the spectral similarity between the image reflectance spectrums and the reference reflectance spectra. SAM successfully produced detailed lava surface morphology maps. However, the pixel-based approach partly leads to a salt-and-pepper effect. On the other hand, we applied the Random Forest (RF) classification method within an object-based image analysis (OBIA) framework. This statistical classifier uses a randomly selected subset of training samples to produce multiple decision trees. For final classification of pixels or - in the present case - image objects, the average of the class assignments probability predicted by the different decision trees is used. While the resulting OBIA classification of lava morphology types shows a high coincidence with the reference data, the approach is sensitive to the segmentation-derived image objects that constitute the base units for classification. Both semi-automatic methods produce reasonable results in the Krafla lava field, even if the identification of different pahoehoe and aa types of lava appeared to be difficult. The use of satellite remote sensing data shows a high potential for fast and efficient classification of lava morphology, particularly over large and inaccessible areas.
Wen, Tingxi; Zhang, Zhongnan; Qiu, Ming; Zeng, Ming; Luo, Weizhen
2017-01-01
The computer mouse is an important human-computer interaction device. But patients with physical finger disability are unable to operate this device. Surface EMG (sEMG) can be monitored by electrodes on the skin surface and is a reflection of the neuromuscular activities. Therefore, we can control limbs auxiliary equipment by utilizing sEMG classification in order to help the physically disabled patients to operate the mouse. To develop a new a method to extract sEMG generated by finger motion and apply novel features to classify sEMG. A window-based data acquisition method was presented to extract signal samples from sEMG electordes. Afterwards, a two-dimensional matrix image based feature extraction method, which differs from the classical methods based on time domain or frequency domain, was employed to transform signal samples to feature maps used for classification. In the experiments, sEMG data samples produced by the index and middle fingers at the click of a mouse button were separately acquired. Then, characteristics of the samples were analyzed to generate a feature map for each sample. Finally, the machine learning classification algorithms (SVM, KNN, RBF-NN) were employed to classify these feature maps on a GPU. The study demonstrated that all classifiers can identify and classify sEMG samples effectively. In particular, the accuracy of the SVM classifier reached up to 100%. The signal separation method is a convenient, efficient and quick method, which can effectively extract the sEMG samples produced by fingers. In addition, unlike the classical methods, the new method enables to extract features by enlarging sample signals' energy appropriately. The classical machine learning classifiers all performed well by using these features.
NASA Astrophysics Data System (ADS)
Ye, Su; Chen, Dongmei; Yu, Jie
2016-04-01
In remote sensing, conventional supervised change-detection methods usually require effective training data for multiple change types. This paper introduces a more flexible and efficient procedure that seeks to identify only the changes that users are interested in, here after referred to as "targeted change detection". Based on a one-class classifier "Support Vector Domain Description (SVDD)", a novel algorithm named "Three-layer SVDD Fusion (TLSF)" is developed specially for targeted change detection. The proposed algorithm combines one-class classification generated from change vector maps, as well as before- and after-change images in order to get a more reliable detecting result. In addition, this paper introduces a detailed workflow for implementing this algorithm. This workflow has been applied to two case studies with different practical monitoring objectives: urban expansion and forest fire assessment. The experiment results of these two case studies show that the overall accuracy of our proposed algorithm is superior (Kappa statistics are 86.3% and 87.8% for Case 1 and 2, respectively), compared to applying SVDD to change vector analysis and post-classification comparison.
Empirical Specification of Utility Functions.
ERIC Educational Resources Information Center
Mellenbergh, Gideon J.
Decision theory can be applied to four types of decision situations in education and psychology: (1) selection; (2) placement; (3) classification; and (4) mastery. For the application of the theory, a utility function must be specified. Usually the utility function is chosen on a priori grounds. In this paper methods for the empirical assessment…
The Long-Term Sustainability of Different Item Response Theory Scaling Methods
ERIC Educational Resources Information Center
Keller, Lisa A.; Keller, Robert R.
2011-01-01
This article investigates the accuracy of examinee classification into performance categories and the estimation of the theta parameter for several item response theory (IRT) scaling techniques when applied to six administrations of a test. Previous research has investigated only two administrations; however, many testing programs equate tests…
USDA-ARS?s Scientific Manuscript database
Support Vector Machine (SVM) was used in the Genetic Algorithms (GA) process to select and classify a subset of hyperspectral image bands. The method was applied to fluorescence hyperspectral data for the detection of aflatoxin contamination in Aspergillus flavus infected single corn kernels. In the...
NASA Astrophysics Data System (ADS)
Yamashita, Yasuo; Arimura, Hidetaka; Yoshiura, Takashi; Tokunaga, Chiaki; Magome, Taiki; Monji, Akira; Noguchi, Tomoyuki; Toyofuku, Fukai; Oki, Masafumi; Nakamura, Yasuhiko; Honda, Hiroshi
2010-03-01
Arterial spin labeling (ASL) is one of promising non-invasive magnetic resonance (MR) imaging techniques for diagnosis of Alzheimer's disease (AD) by measuring cerebral blood flow (CBF). The aim of this study was to develop a computer-aided classification system for AD patients based on CBFs measured by the ASL technique. The average CBFs in cortical regions were determined as functional image features based on the CBF map image, which was non-linearly transformed to a Talairach brain atlas by using a free-form deformation. An artificial neural network (ANN) was trained with the CBF functional features in 10 cortical regions, and was employed for distinguishing patients with AD from control subjects. For evaluation of the method, we applied the proposed method to 20 cases including ten AD patients and ten control subjects, who were scanned a 3.0-Tesla MR unit. As a result, the area under the receiver operating characteristic curve obtained by the proposed method was 0.893 based on a leave-one-out-by-case test in identification of AD cases among 20 cases. The proposed method would be feasible for classification of patients with AD.
Transfer Learning for Class Imbalance Problems with Inadequate Data.
Al-Stouhi, Samir; Reddy, Chandan K
2016-07-01
A fundamental problem in data mining is to effectively build robust classifiers in the presence of skewed data distributions. Class imbalance classifiers are trained specifically for skewed distribution datasets. Existing methods assume an ample supply of training examples as a fundamental prerequisite for constructing an effective classifier. However, when sufficient data is not readily available, the development of a representative classification algorithm becomes even more difficult due to the unequal distribution between classes. We provide a unified framework that will potentially take advantage of auxiliary data using a transfer learning mechanism and simultaneously build a robust classifier to tackle this imbalance issue in the presence of few training samples in a particular target domain of interest. Transfer learning methods use auxiliary data to augment learning when training examples are not sufficient and in this paper we will develop a method that is optimized to simultaneously augment the training data and induce balance into skewed datasets. We propose a novel boosting based instance-transfer classifier with a label-dependent update mechanism that simultaneously compensates for class imbalance and incorporates samples from an auxiliary domain to improve classification. We provide theoretical and empirical validation of our method and apply to healthcare and text classification applications.
Men, Hong; Shi, Yan; Fu, Songlin; Jiao, Yanan; Qiao, Yu; Liu, Jingjing
2017-01-01
Multi-sensor data fusion can provide more comprehensive and more accurate analysis results. However, it also brings some redundant information, which is an important issue with respect to finding a feature-mining method for intuitive and efficient analysis. This paper demonstrates a feature-mining method based on variable accumulation to find the best expression form and variables’ behavior affecting beer flavor. First, e-tongue and e-nose were used to gather the taste and olfactory information of beer, respectively. Second, principal component analysis (PCA), genetic algorithm-partial least squares (GA-PLS), and variable importance of projection (VIP) scores were applied to select feature variables of the original fusion set. Finally, the classification models based on support vector machine (SVM), random forests (RF), and extreme learning machine (ELM) were established to evaluate the efficiency of the feature-mining method. The result shows that the feature-mining method based on variable accumulation obtains the main feature affecting beer flavor information, and the best classification performance for the SVM, RF, and ELM models with 96.67%, 94.44%, and 98.33% prediction accuracy, respectively. PMID:28753917
Sparse kernel methods for high-dimensional survival data.
Evers, Ludger; Messow, Claudia-Martina
2008-07-15
Sparse kernel methods like support vector machines (SVM) have been applied with great success to classification and (standard) regression settings. Existing support vector classification and regression techniques however are not suitable for partly censored survival data, which are typically analysed using Cox's proportional hazards model. As the partial likelihood of the proportional hazards model only depends on the covariates through inner products, it can be 'kernelized'. The kernelized proportional hazards model however yields a solution that is dense, i.e. the solution depends on all observations. One of the key features of an SVM is that it yields a sparse solution, depending only on a small fraction of the training data. We propose two methods. One is based on a geometric idea, where-akin to support vector classification-the margin between the failed observation and the observations currently at risk is maximised. The other approach is based on obtaining a sparse model by adding observations one after another akin to the Import Vector Machine (IVM). Data examples studied suggest that both methods can outperform competing approaches. Software is available under the GNU Public License as an R package and can be obtained from the first author's website http://www.maths.bris.ac.uk/~maxle/software.html.
Chen, C L; Kaber, D B; Dempsey, P G
2000-06-01
A new and improved method to feedforward neural network (FNN) development for application to data classification problems, such as the prediction of levels of low-back disorder (LBD) risk associated with industrial jobs, is presented. Background on FNN development for data classification is provided along with discussions of previous research and neighborhood (local) solution search methods for hard combinatorial problems. An analytical study is presented which compared prediction accuracy of a FNN based on an error-back propagation (EBP) algorithm with the accuracy of a FNN developed by considering results of local solution search (simulated annealing) for classifying industrial jobs as posing low or high risk for LBDs. The comparison demonstrated superior performance of the FNN generated using the new method. The architecture of this FNN included fewer input (predictor) variables and hidden neurons than the FNN developed based on the EBP algorithm. Independent variable selection methods and the phenomenon of 'overfitting' in FNN (and statistical model) generation for data classification are discussed. The results are supportive of the use of the new approach to FNN development for applications to musculoskeletal disorders and risk forecasting in other domains.
Classification of conductance traces with recurrent neural networks
NASA Astrophysics Data System (ADS)
Lauritzen, Kasper P.; Magyarkuti, András; Balogh, Zoltán; Halbritter, András; Solomon, Gemma C.
2018-02-01
We present a new automated method for structural classification of the traces obtained in break junction experiments. Using recurrent neural networks trained on the traces of minimal cross-sectional area in molecular dynamics simulations, we successfully separate the traces into two classes: point contact or nanowire. This is done without any assumptions about the expected features of each class. The trained neural network is applied to experimental break junction conductance traces, and it separates the classes as well as the previously used experimental methods. The effect of using partial conductance traces is explored, and we show that the method performs equally well using full or partial traces (as long as the trace just prior to breaking is included). When only the initial part of the trace is included, the results are still better than random chance. Finally, we show that the neural network classification method can be used to classify experimental conductance traces without using simulated results for training, but instead training the network on a few representative experimental traces. This offers a tool to recognize some characteristic motifs of the traces, which can be hard to find by simple data selection algorithms.
Jackman, Patrick; Sun, Da-Wen; Allen, Paul; Valous, Nektarios A; Mendoza, Fernando; Ward, Paddy
2010-04-01
A method to discriminate between various grades of pork and turkey ham was developed using colour and wavelet texture features. Image analysis methods originally developed for predicting the palatability of beef were applied to rapidly identify the ham grade. With high quality digital images of 50-94 slices per ham it was possible to identify the greyscale that best expressed the differences between the various ham grades. The best 10 discriminating image features were then found with a genetic algorithm. Using the best 10 image features, simple linear discriminant analysis models produced 100% correct classifications for both pork and turkey on both calibration and validation sets. 2009 Elsevier Ltd. All rights reserved.
Micro-Raman spectroscopy of natural and synthetic indigo samples.
Vandenabeele, Peter; Moens, Luc
2003-02-01
In this work indigo samples from three different sources are studied by using Raman spectroscopy: the synthetic pigment and pigments from the woad (Isatis tinctoria) and the indigo plant (Indigofera tinctoria). 21 samples were obtained from 8 suppliers; for each sample 5 Raman spectra were recorded and used for further chemometrical analysis. Principal components analysis (PCA) was performed as data reduction method before applying hierarchical cluster analysis. Linear discriminant analysis (LDA) was implemented as a non-hierarchical supervised pattern recognition method to build a classification model. In order to avoid broad-shaped interferences from the fluorescence background, the influence of 1st and 2nd derivatives on the classification was studied by using cross-validation. Although chemically identical, it is shown that Raman spectroscopy in combination with suitable chemometric methods has the potential to discriminate between synthetic and natural indigo samples.
Novel Intersection Type Recognition for Autonomous Vehicles Using a Multi-Layer Laser Scanner.
An, Jhonghyun; Choi, Baehoon; Sim, Kwee-Bo; Kim, Euntai
2016-07-20
There are several types of intersections such as merge-roads, diverge-roads, plus-shape intersections and two types of T-shape junctions in urban roads. When an autonomous vehicle encounters new intersections, it is crucial to recognize the types of intersections for safe navigation. In this paper, a novel intersection type recognition method is proposed for an autonomous vehicle using a multi-layer laser scanner. The proposed method consists of two steps: (1) static local coordinate occupancy grid map (SLOGM) building and (2) intersection classification. In the first step, the SLOGM is built relative to the local coordinate using the dynamic binary Bayes filter. In the second step, the SLOGM is used as an attribute for the classification. The proposed method is applied to a real-world environment and its validity is demonstrated through experimentation.
Novel Intersection Type Recognition for Autonomous Vehicles Using a Multi-Layer Laser Scanner
An, Jhonghyun; Choi, Baehoon; Sim, Kwee-Bo; Kim, Euntai
2016-01-01
There are several types of intersections such as merge-roads, diverge-roads, plus-shape intersections and two types of T-shape junctions in urban roads. When an autonomous vehicle encounters new intersections, it is crucial to recognize the types of intersections for safe navigation. In this paper, a novel intersection type recognition method is proposed for an autonomous vehicle using a multi-layer laser scanner. The proposed method consists of two steps: (1) static local coordinate occupancy grid map (SLOGM) building and (2) intersection classification. In the first step, the SLOGM is built relative to the local coordinate using the dynamic binary Bayes filter. In the second step, the SLOGM is used as an attribute for the classification. The proposed method is applied to a real-world environment and its validity is demonstrated through experimentation. PMID:27447640
Virtual Sensor of Surface Electromyography in a New Extensive Fault-Tolerant Classification System.
de Moura, Karina de O A; Balbinot, Alexandre
2018-05-01
A few prosthetic control systems in the scientific literature obtain pattern recognition algorithms adapted to changes that occur in the myoelectric signal over time and, frequently, such systems are not natural and intuitive. These are some of the several challenges for myoelectric prostheses for everyday use. The concept of the virtual sensor, which has as its fundamental objective to estimate unavailable measures based on other available measures, is being used in other fields of research. The virtual sensor technique applied to surface electromyography can help to minimize these problems, typically related to the degradation of the myoelectric signal that usually leads to a decrease in the classification accuracy of the movements characterized by computational intelligent systems. This paper presents a virtual sensor in a new extensive fault-tolerant classification system to maintain the classification accuracy after the occurrence of the following contaminants: ECG interference, electrode displacement, movement artifacts, power line interference, and saturation. The Time-Varying Autoregressive Moving Average (TVARMA) and Time-Varying Kalman filter (TVK) models are compared to define the most robust model for the virtual sensor. Results of movement classification were presented comparing the usual classification techniques with the method of the degraded signal replacement and classifier retraining. The experimental results were evaluated for these five noise types in 16 surface electromyography (sEMG) channel degradation case studies. The proposed system without using classifier retraining techniques recovered of mean classification accuracy was of 4% to 38% for electrode displacement, movement artifacts, and saturation noise. The best mean classification considering all signal contaminants and channel combinations evaluated was the classification using the retraining method, replacing the degraded channel by the virtual sensor TVARMA model. This method recovered the classification accuracy after the degradations, reaching an average of 5.7% below the classification of the clean signal, that is the signal without the contaminants or the original signal. Moreover, the proposed intelligent technique minimizes the impact of the motion classification caused by signal contamination related to degrading events over time. There are improvements in the virtual sensor model and in the algorithm optimization that need further development to provide an increase the clinical application of myoelectric prostheses but already presents robust results to enable research with virtual sensors on biological signs with stochastic behavior.
Virtual Sensor of Surface Electromyography in a New Extensive Fault-Tolerant Classification System
Balbinot, Alexandre
2018-01-01
A few prosthetic control systems in the scientific literature obtain pattern recognition algorithms adapted to changes that occur in the myoelectric signal over time and, frequently, such systems are not natural and intuitive. These are some of the several challenges for myoelectric prostheses for everyday use. The concept of the virtual sensor, which has as its fundamental objective to estimate unavailable measures based on other available measures, is being used in other fields of research. The virtual sensor technique applied to surface electromyography can help to minimize these problems, typically related to the degradation of the myoelectric signal that usually leads to a decrease in the classification accuracy of the movements characterized by computational intelligent systems. This paper presents a virtual sensor in a new extensive fault-tolerant classification system to maintain the classification accuracy after the occurrence of the following contaminants: ECG interference, electrode displacement, movement artifacts, power line interference, and saturation. The Time-Varying Autoregressive Moving Average (TVARMA) and Time-Varying Kalman filter (TVK) models are compared to define the most robust model for the virtual sensor. Results of movement classification were presented comparing the usual classification techniques with the method of the degraded signal replacement and classifier retraining. The experimental results were evaluated for these five noise types in 16 surface electromyography (sEMG) channel degradation case studies. The proposed system without using classifier retraining techniques recovered of mean classification accuracy was of 4% to 38% for electrode displacement, movement artifacts, and saturation noise. The best mean classification considering all signal contaminants and channel combinations evaluated was the classification using the retraining method, replacing the degraded channel by the virtual sensor TVARMA model. This method recovered the classification accuracy after the degradations, reaching an average of 5.7% below the classification of the clean signal, that is the signal without the contaminants or the original signal. Moreover, the proposed intelligent technique minimizes the impact of the motion classification caused by signal contamination related to degrading events over time. There are improvements in the virtual sensor model and in the algorithm optimization that need further development to provide an increase the clinical application of myoelectric prostheses but already presents robust results to enable research with virtual sensors on biological signs with stochastic behavior. PMID:29723994
NASA Astrophysics Data System (ADS)
Jevtić, Dubravka R.; Avramov Ivić, Milka L.; Reljin, Irini S.; Reljin, Branimir D.; Plavec, Goran I.; Petrović, Slobodan D.; Mijin, Dušan Ž.
2014-06-01
The automated, computer-aided method for differentiation and classification of malignant (M) from benign (B) cases, by analyzing the UV/VIS spectra of pleural effusions is described. It was shown that by two independent objective features, the maximum of Katz fractal dimension (KFDmax) and the area under normalized UV/VIS absorbance curve (Area), highly reliable M-B classification is possible. In the Area-KFDmax space M and B samples are linearly separable permitting thus the use of linear support vector machine as a classification tool. By analyzing 104 samples of UV/VIS spectra of pleural effusions (88 M and 16 B) collected from patients at the Clinic for Lung Diseases and Tuberculosis, Military Medical Academy in Belgrade, the accuracy of 95.45% for M cases and 100% for B cases are obtained by using the proposed method. It was shown that by applying some modifications, which are suggested in the paper, the accuracy of 100% for M cases can be reached.
Liu, Zhenli; Liu, Yuanyan; Liu, Chunsheng; Song, Zhiqian; Li, Qing; Zha, Qinglin; Lu, Cheng; Wang, Chun; Ning, Zhangchi; Zhang, Yuxin; Tian, Cheng; Lu, Aiping
2013-07-12
Rhodiola plants are used as a natural remedy in the western world and as a traditional herbal medicine in China, and are valued for their ability to enhance human resistance to stress or fatigue and to promote longevity. Due to the morphological similarities among different species, the identification of the genus remains somewhat controversial, which may affect their safety and effectiveness in clinical use. In this paper, 47 Rhodiola samples of seven species were collected from thirteen local provinces of China. They were identified by their morphological characteristics and genetic and phytochemical taxonomies. Eight bioactive chemotaxonomic markers from four chemical classes (phenylpropanoids, phenylethanol derivatives, flavonoids and phenolic acids) were determined to evaluate and distinguish the chemotaxonomy of Rhodiola samples using an HPLC-DAD/UV method. Hierarchical cluster analysis (HCA) and principal component analysis (PCA) were applied to compare the two classification methods between genetic and phytochemical taxonomy. The established chemotaxonomic classification could be effectively used for Rhodiola species identification.
Fang, Leyuan; Wang, Chong; Li, Shutao; Yan, Jun; Chen, Xiangdong; Rabbani, Hossein
2017-11-01
We present an automatic method, termed as the principal component analysis network with composite kernel (PCANet-CK), for the classification of three-dimensional (3-D) retinal optical coherence tomography (OCT) images. Specifically, the proposed PCANet-CK method first utilizes the PCANet to automatically learn features from each B-scan of the 3-D retinal OCT images. Then, multiple kernels are separately applied to a set of very important features of the B-scans and these kernels are fused together, which can jointly exploit the correlations among features of the 3-D OCT images. Finally, the fused (composite) kernel is incorporated into an extreme learning machine for the OCT image classification. We tested our proposed algorithm on two real 3-D spectral domain OCT (SD-OCT) datasets (of normal subjects and subjects with the macular edema and age-related macular degeneration), which demonstrated its effectiveness. (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).
A New Data Mining Scheme Using Artificial Neural Networks
Kamruzzaman, S. M.; Jehad Sarkar, A. M.
2011-01-01
Classification is one of the data mining problems receiving enormous attention in the database community. Although artificial neural networks (ANNs) have been successfully applied in a wide range of machine learning applications, they are however often regarded as black boxes, i.e., their predictions cannot be explained. To enhance the explanation of ANNs, a novel algorithm to extract symbolic rules from ANNs has been proposed in this paper. ANN methods have not been effectively utilized for data mining tasks because how the classifications were made is not explicitly stated as symbolic rules that are suitable for verification or interpretation by human experts. With the proposed approach, concise symbolic rules with high accuracy, that are easily explainable, can be extracted from the trained ANNs. Extracted rules are comparable with other methods in terms of number of rules, average number of conditions for a rule, and the accuracy. The effectiveness of the proposed approach is clearly demonstrated by the experimental results on a set of benchmark data mining classification problems. PMID:22163866