Tu, Li-ping; Chen, Jing-bo; Hu, Xiao-juan; Zhang, Zhi-feng
2016-01-01
Background and Goal. The application of digital image processing techniques and machine learning methods to tongue image classification in Traditional Chinese Medicine (TCM) has been widely studied. However, the outcomes are difficult to generalize because of the lack of color reproducibility and image standardization. Our study explores tongue color classification with a standardized tongue image acquisition process and color correction. Methods. Three TCM experts were chosen to identify the selected tongue pictures, taken by the TDA-1 tongue imaging device in TIFF format and corrected with an ICC profile. We then compared the mean L*a*b* values of the different tongue colors and evaluated tongue color classification by machine learning methods. Results. The L*a*b* values of the five tongue colors are statistically different. The random forest method performed better than SVM in classification, and the SMOTE algorithm increased classification accuracy by resolving the imbalance among the color samples. Conclusions. Given standardized tongue acquisition and color reproduction, preliminary objectification of tongue color classification in Traditional Chinese Medicine (TCM) is feasible. PMID:28050555
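The SMOTE oversampling step mentioned in this abstract can be sketched in a few lines. This is a minimal illustration of the technique, not the authors' code; the L*a*b* values for the minority tongue-color class below are hypothetical.

```python
import numpy as np

def smote(X, n_new, k=3, rng=None):
    """Minimal SMOTE: synthesize n_new minority samples by interpolating
    between each sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(rng)
    n = len(X)
    # pairwise distances within the minority class
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]          # k nearest neighbours per sample
    synth = []
    for _ in range(n_new):
        i = rng.integers(n)                    # pick a minority sample
        j = nn[i, rng.integers(k)]             # pick one of its neighbours
        gap = rng.random()                     # interpolation factor in [0, 1)
        synth.append(X[i] + gap * (X[j] - X[i]))
    return np.vstack(synth)

# Toy minority class in L*a*b* space (hypothetical values)
minority = np.array([[62.0, 18.0, 9.0],
                     [60.5, 20.1, 8.2],
                     [63.2, 17.4, 10.1],
                     [61.1, 19.0, 9.5]])
new = smote(minority, n_new=6, k=2, rng=0)
print(new.shape)   # → (6, 3)
```

Each synthetic sample lies on the segment between a minority sample and one of its nearest minority neighbours, which is the core of SMOTE's class-balancing idea.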
Qi, Zhen; Tu, Li-Ping; Chen, Jing-Bo; Hu, Xiao-Juan; Xu, Jia-Tuo; Zhang, Zhi-Feng
2016-01-01
Background and Goal. The application of digital image processing techniques and machine learning methods to tongue image classification in Traditional Chinese Medicine (TCM) has been widely studied. However, the outcomes are difficult to generalize because of the lack of color reproducibility and image standardization. Our study explores tongue color classification with a standardized tongue image acquisition process and color correction. Methods. Three TCM experts were chosen to identify the selected tongue pictures, taken by the TDA-1 tongue imaging device in TIFF format and corrected with an ICC profile. We then compared the mean L*a*b* values of the different tongue colors and evaluated tongue color classification by machine learning methods. Results. The L*a*b* values of the five tongue colors are statistically different. The random forest method performed better than SVM in classification, and the SMOTE algorithm increased classification accuracy by resolving the imbalance among the color samples. Conclusions. Given standardized tongue acquisition and color reproduction, preliminary objectification of tongue color classification in Traditional Chinese Medicine (TCM) is feasible.
Huo, Guanying
2017-01-01
As a typical deep-learning model, Convolutional Neural Networks (CNNs) can be exploited to automatically extract features from images using a hierarchical structure inspired by the mammalian visual system. For image classification tasks, traditional CNN models employ the softmax function for classification. However, owing to the limited capacity of the softmax function, traditional CNN models have some shortcomings in image classification. To deal with this problem, a new method combining Biomimetic Pattern Recognition (BPR) with CNNs is proposed for image classification. BPR performs class recognition by a union of geometrical cover sets in a high-dimensional feature space and therefore can overcome some disadvantages of traditional pattern recognition. The proposed method is evaluated on three famous image classification benchmarks: MNIST, AR, and CIFAR-10. The classification accuracies of the proposed method on the three datasets are 99.01%, 98.40%, and 87.11%, respectively, which are higher than those of the other four comparison methods in most cases. PMID:28316614
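The "union of geometrical cover sets" idea behind BPR can be illustrated with a toy hypersphere-cover classifier. The fixed radius and sample-centred spheres below are illustrative assumptions, not the paper's actual construction:

```python
import numpy as np

class SphereCover:
    """Toy BPR-style classifier: each class is covered by hyperspheres
    centred on its training samples; a query is assigned to the class
    whose cover it is closest to (distance 0 if the query falls inside
    any sphere of that class; ties go to the first class checked)."""
    def __init__(self, radius=1.0):
        self.radius = radius
        self.covers = {}          # label -> (n_samples, n_features) centres

    def fit(self, X, y):
        for label in np.unique(y):
            self.covers[label] = X[y == label]
        return self

    def predict(self, X):
        preds = []
        for x in X:
            best, best_d = None, np.inf
            for label, centres in self.covers.items():
                # distance from x to the nearest sphere surface (>= 0)
                d = max(0.0, np.min(np.linalg.norm(centres - x, axis=1)) - self.radius)
                if d < best_d:
                    best, best_d = label, d
            preds.append(best)
        return np.array(preds)

clf = SphereCover(radius=1.0).fit(
    np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]), np.array([0, 0, 1]))
print(clf.predict(np.array([[0.2, 0.1], [5.2, 5.1]])))  # → [0 1]
```

Unlike a softmax decision, a query far from every cover can be flagged as "unknown" by thresholding the residual distance, which is one of the advantages BPR claims over conventional classifiers.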
[Construction of biopharmaceutics classification system of Chinese materia medica].
Liu, Yang; Wei, Li; Dong, Ling; Zhu, Mei-Ling; Tang, Ming-Min; Zhang, Lei
2014-12-01
Based on the multicomponent nature of traditional Chinese medicine and drawing on the concepts, methods, and techniques of the biopharmaceutics classification system (BCS) used for chemical drugs, this study proposes a scientific framework for a biopharmaceutics classification system of Chinese materia medica (CMMBCS). Using comparative analysis at the multicomponent level together with a holistic CMMBCS assessment of the whole medicine, the study constructs the methodological process while setting forth the academic reasoning and analyzing the underlying theory. The basic role of this system is to reveal the interactions and related absorption mechanisms of the multiple components in traditional Chinese medicine. It also provides new ideas and methods for improving the quality of Chinese materia medica and for new drug research and development.
Land Cover Classification in a Complex Urban-Rural Landscape with Quickbird Imagery
Moran, Emilio Federico.
2010-01-01
High spatial resolution images have been increasingly used for urban land use/cover classification, but the high spectral variation within the same land cover, the spectral confusion among different land covers, and the shadow problem often lead to poor classification performance with traditional per-pixel spectral-based classification methods. This paper explores approaches to improving urban land cover classification with Quickbird imagery. Traditional per-pixel spectral-based supervised classification, the incorporation of textural and multispectral images, a spectral-spatial classifier, and segmentation-based classification are examined in a relatively new developing urban landscape, Lucas do Rio Verde in Mato Grosso State, Brazil. This research shows that the use of spatial information during the image classification procedure, either through the integrated use of textural and spectral images or through a segmentation-based classification method, can significantly improve land cover classification performance. PMID:21643433
Classification of Clouds in Satellite Imagery Using Adaptive Fuzzy Sparse Representation.
Jin, Wei; Gong, Fei; Zeng, Xingbin; Fu, Randi
2016-12-16
Automatic cloud detection and classification using satellite cloud imagery have various meteorological applications such as weather forecasting and climate monitoring, and cloud pattern analysis has recently become a research hotspot. Because satellites sense clouds remotely from space, and different cloud types often overlap and transform into one another, satellite cloud imagery is inherently fuzzy and uncertain. Satellite observation is also susceptible to noise, while traditional cloud classification methods are sensitive to noise and outliers, so it is hard for them to achieve reliable results. To deal with these problems, a satellite cloud classification method using adaptive fuzzy sparse representation-based classification (AFSRC) is proposed. First, by defining adaptive parameters related to the attenuation rate and critical membership, an improved fuzzy membership is introduced to accommodate the fuzziness and uncertainty of satellite cloud imagery; second, through an effective combination of the improved fuzzy membership function and sparse representation-based classification (SRC), the atoms in the training dictionary are optimized; finally, an adaptive fuzzy sparse representation classifier for cloud classification is proposed. Experimental results on FY-2G satellite cloud images show that the proposed method not only improves the accuracy of cloud classification but also has strong stability and adaptability with high computational efficiency.
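The abstract does not spell out the improved fuzzy membership function, so the following is a hypothetical sketch of the idea: distances to class centres are mapped through a decaying exponential controlled by an attenuation rate, and memberships below a critical fraction of the maximum are suppressed before renormalisation. Both parameters are assumptions for illustration.

```python
import numpy as np

def fuzzy_membership(x, centers, atten=1.0, critical=0.5):
    """Hypothetical fuzzy membership: distance to each class centre is
    mapped through a decaying exponential (attenuation rate `atten`);
    memberships below `critical` of the maximum are suppressed, then the
    vector is renormalised to sum to 1."""
    d = np.linalg.norm(centers - x, axis=1)
    m = np.exp(-atten * d)                 # closer centre -> larger membership
    m[m < critical * m.max()] = 0.0        # suppress weak, noisy memberships
    return m / m.sum()

centres = np.array([[0.0, 0.0], [10.0, 0.0]])
print(fuzzy_membership(np.array([1.0, 0.0]), centres))  # → [1. 0.]
```

Suppressing weak memberships is one plausible way to make the representation robust to the noise and outliers the paper highlights.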
Android malware detection based on evolutionary super-network
NASA Astrophysics Data System (ADS)
Yan, Haisheng; Peng, Lingling
2018-04-01
In this paper, an Android malware detection method based on an evolutionary super-network is proposed in order to improve the precision of Android malware detection. The chi-square statistic is used to select features based on an analysis of Android permissions, and Boolean weighting is used to calculate feature weights. The processed feature vectors form the training and test sets; a hyperedge replacement strategy is used to train the super-network classification model, which then classifies the test feature vectors, and the result is compared with traditional classification algorithms. The results show that the proposed detection method is close to or better than traditional classification algorithms, making it an effective means of Android malware detection.
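The chi-square feature selection step can be sketched for Boolean (permission) features against a binary malware/benign label. This is the generic 2x2 chi-square statistic, not the paper's implementation; the toy data are hypothetical.

```python
def chi_square(feature, labels):
    """Chi-square statistic for a Boolean feature against a binary class
    label, computed from the 2x2 contingency table (no continuity
    correction). Larger values indicate stronger feature/class association."""
    n = len(feature)
    # observed counts: rows = feature 0/1, cols = class 0/1
    obs = [[0, 0], [0, 0]]
    for f, c in zip(feature, labels):
        obs[f][c] += 1
    row = [sum(obs[0]), sum(obs[1])]
    col = [obs[0][0] + obs[1][0], obs[0][1] + obs[1][1]]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            exp = row[i] * col[j] / n
            if exp:
                chi2 += (obs[i][j] - exp) ** 2 / exp
    return chi2

# permission present/absent vs. malware/benign (toy data)
print(chi_square([1, 1, 1, 0, 0, 0], [1, 1, 1, 0, 0, 0]))  # → 6.0
```

Permissions would then be ranked by this statistic and the top-scoring ones kept as the Boolean-weighted feature vector.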
The classification and application of toxic Chinese materia medica.
Liu, Xinmin; Wang, Qiong; Song, Guangqing; Zhang, Guangping; Ye, Zuguang; Williamson, Elizabeth M
2014-03-01
Many important drugs in the Chinese materia medica (CMM) are known to be toxic, and it has long been recognized in classical Chinese medical theory that toxicity can arise directly from the components of a single CMM or may be induced by an interaction between combined CMM. Traditional Chinese Medicine presents a unique set of pharmaceutical theories that include particular methods for processing, combining and decocting, and these techniques contribute to reducing toxicity as well as enhancing efficacy. The current classification of toxic CMM drugs, traditional methods for processing toxic CMM and the prohibited use of certain combinations, is based on traditional experience and ancient texts and monographs, but accumulating evidence increasingly supports their use to eliminate or reduce toxicity. Modern methods are now being used to evaluate the safety of CMM; however, a new system for describing the toxicity of Chinese herbal medicines may need to be established to take into account those herbs whose toxicity is delayed or otherwise hidden, and which have not been incorporated into the traditional classification. This review explains the existing classification and justifies it where appropriate, using experimental results often originally published in Chinese and previously not available outside China. Copyright © 2013 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Wan, Yi
2011-06-01
Chinese wines can be classified or graded from their micrographs. Micrographs of Chinese wines show floccules, sticks, and granules of varying shape and size; different wines have different microstructures and micrographs, so we study the classification of Chinese wines based on micrographs. The shape and structure of the wine particles in the microstructure are the most important features for recognition and classification of wines, so we introduce a feature extraction method that efficiently describes the structure and region shape of a micrograph. First, the micrographs are enhanced using total variation denoising and segmented using a modified Otsu's method based on the Rayleigh distribution. Then features are extracted with the proposed method based on area, perimeter, and traditional shape features; 26 features of eight kinds are selected. Finally, a Chinese wine classification system based on micrographs, using the combined shape and structure features and a BP neural network, is presented. We compare the recognition results for different choices of features (traditional shape features versus the proposed features). The experimental results show that a better classification rate is achieved using the combined features proposed in this paper.
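The area- and perimeter-based shape features can be illustrated on a binary particle mask. The perimeter approximation below (counting 4-neighbour boundary transitions) and the circularity measure are simplifications for illustration, not necessarily the measures used in the paper.

```python
import numpy as np

def shape_features(mask):
    """Area, perimeter and circularity of a binary particle mask.
    Perimeter is approximated by counting 4-neighbour boundary
    transitions, which is coarse but adequate for relative comparison."""
    mask = mask.astype(bool)
    area = int(mask.sum())
    padded = np.pad(mask, 1)
    perim = 0
    for shift in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        rolled = np.roll(padded, shift, axis=(0, 1))
        perim += int(np.logical_and(padded, ~rolled).sum())
    # circularity = 1 for a perfect disc, smaller for elongated particles
    circularity = 4 * np.pi * area / perim ** 2 if perim else 0.0
    return area, perim, circularity

mask = np.zeros((5, 5), dtype=int)
mask[1:4, 1:4] = 1                      # a 3x3 square "particle"
area, perim, circ = shape_features(mask)
print(area, perim)                      # → 9 12
```

Per-particle features like these, pooled over all particles in a micrograph, would form part of the 26-dimensional feature vector fed to the BP neural network.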
Kassian, Alexei
2015-01-01
A lexicostatistical classification is proposed for 20 languages and dialects of the Lezgian group of the North Caucasian family, based on meticulously compiled 110-item wordlists published as part of the Global Lexicostatistical Database project. The lexical data were analyzed with the principal phylogenetic methods, both distance-based and character-based: Starling neighbor joining (StarlingNJ), neighbor joining (NJ), unweighted pair group method with arithmetic mean (UPGMA), Bayesian Markov chain Monte Carlo (MCMC), and unweighted maximum parsimony (UMP). Cognation indexes within the input matrix were marked by two different algorithms: the traditional etymological approach and phonetic similarity, i.e., the automatic method of consonant classes (Levenshtein distances). For several reasons (above all, the high lexicographic quality of the wordlists and the consensus about Lezgian phylogeny among Caucasologists), the Lezgian database is an ideal testing ground for the appraisal of phylogenetic methods. For the etymology-based input matrix, all the phylogenetic methods, with the possible exception of UMP, yielded trees sufficiently compatible with each other to generate a consensus phylogenetic tree of the Lezgian lects. The obtained consensus tree agrees with the traditional expert classification as well as with some previously proposed formal classifications of this linguistic group. Contrary to theoretical expectations, the UMP method suggested the least plausible tree of all. In the case of the phonetic similarity-based input matrix, the distance-based methods (StarlingNJ, NJ, UPGMA) produced trees rather close to the consensus etymology-based tree and the traditional expert classification, whereas the character-based methods (Bayesian MCMC, UMP) yielded less likely topologies. PMID:25719456
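The consonant-class (Levenshtein) approach to automatic cognate marking can be sketched as follows. The consonant-class map below is a greatly simplified stand-in for the Dolgopolsky-style classes actually used in lexicostatistics; the example words are hypothetical.

```python
def levenshtein(a, b):
    """Classic dynamic-programming Levenshtein (edit) distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

# Greatly simplified consonant-class map (illustrative only).
CLASSES = {'p': 'P', 'b': 'P', 'f': 'P',
           't': 'T', 'd': 'T',
           'k': 'K', 'g': 'K', 'q': 'K',
           's': 'S', 'z': 'S',
           'm': 'M', 'n': 'N', 'r': 'R', 'l': 'R'}

def consonant_classes(word):
    """Strip vowels and map remaining consonants to their class symbols."""
    return ''.join(CLASSES.get(ch, '') for ch in word)

# Words whose consonants fall into the same classes score distance 0,
# i.e., they are treated as potential cognates.
print(levenshtein(consonant_classes('tabak'), consonant_classes('tapag')))  # → 0
```

A distance matrix built from such scores is what the distance-based methods (StarlingNJ, NJ, UPGMA) take as input.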
A Classification of Remote Sensing Image Based on Improved Compound Kernels of Svm
NASA Astrophysics Data System (ADS)
Zhao, Jianing; Gao, Wanlin; Liu, Zili; Mou, Guifen; Lu, Lin; Yu, Lina
The accuracy of remote sensing (RS) classification based on SVM, which is grounded in statistical learning theory, is high even with a small number of training samples, which makes SVM methods well suited to RS classification. The traditional RS classification method combines visual interpretation with computer classification; the SVM-based method improves accuracy considerably while saving much of the labor and time otherwise spent interpreting images and collecting training samples. Kernel functions play an important part in the SVM algorithm. The proposed method uses an improved compound kernel function and therefore achieves higher classification accuracy on RS images; moreover, the compound kernel improves the generalization and learning ability of the classifier.
A New Direction of Cancer Classification: Positive Effect of Low-Ranking MicroRNAs.
Li, Feifei; Piao, Minghao; Piao, Yongjun; Li, Meijing; Ryu, Keun Ho
2014-10-01
Many studies based on microRNA (miRNA) expression profiles have shown a new aspect of cancer classification. Because one characteristic of miRNA expression data is high dimensionality, feature selection methods have been used for dimensionality reduction. These methods have had one shortcoming so far: they only consider the case where the feature-to-class relation is 1:1 or n:1. However, because one miRNA may influence more than one type of cancer, such miRNAs tend to be ranked low by traditional feature selection methods and are usually removed. Given the limited number of miRNAs, low-ranking miRNAs are also important for cancer classification. We considered both high- and low-ranking features to cover all cases (1:1, n:1, 1:n, and m:n) in cancer classification. First, we used correlation-based feature selection to select the high-ranking miRNAs, and applied support vector machine, Bayes network, decision tree, k-nearest-neighbor, and logistic classifiers for cancer classification. Then, we used the chi-square test, information gain, gain ratio, and Pearson's correlation feature selection methods to build the m:n feature subset, and used the selected miRNAs for cancer classification. The low-ranking miRNA expression profiles achieved higher classification accuracy than using only the high-ranking miRNAs from traditional feature selection. Our results demonstrate the positive effect of low-ranking miRNAs in cancer classification via the m:n feature subset.
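The information-gain ranking used for building the m:n feature subset can be sketched for discrete features. This is the textbook definition (class entropy minus feature-conditional entropy), not the authors' exact pipeline.

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Information gain of a discrete feature with respect to the class
    labels: H(class) minus the feature-conditional entropy."""
    n = len(labels)
    cond = 0.0
    for v in set(feature):
        subset = [l for f, l in zip(feature, labels) if f == v]
        cond += len(subset) / n * entropy(subset)
    return entropy(labels) - cond

# A (discretised) miRNA that perfectly separates two cancer types
print(information_gain([0, 0, 1, 1], [0, 0, 1, 1]))  # → 1.0
```

Ranking all (discretised) miRNAs by this score, then deliberately retaining some low-ranking ones as well, is the spirit of the m:n subset construction described above.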
Using Pseudozoids to Teach Classification and Phylogeny to Middle School Students
ERIC Educational Resources Information Center
Freidenberg, Rolfe Jr.; Kelly, Martin G.
2004-01-01
This research compared the outcomes of teaching middle school students two different methods of classification and phylogeny. Two classes were randomly selected and taught using traditional methods of instruction. Three classes were taught using the "Pseudozoid" approach, where students learned to classify, develop and read dichotomous keys, and…
Objected-oriented remote sensing image classification method based on geographic ontology model
NASA Astrophysics Data System (ADS)
Chu, Z.; Liu, Z. J.; Gu, H. Y.
2016-11-01
Nowadays, with the development of high-resolution remote sensing imagery and the wide application of laser point cloud data, object-oriented remote sensing classification based on the characteristic knowledge of multi-source spatial data has become an important trend in remote sensing image classification, gradually replacing the traditional approach of optimizing classification results by improving the algorithm alone. This paper puts forward a remote sensing image classification method that uses the characteristic knowledge of multi-source spatial data to build a geographic-ontology semantic network model, and carries out an object-oriented classification experiment on urban features. The experiment uses the Protégé software developed by Stanford University and the intelligent image analysis software eCognition as platforms, with hyperspectral imagery and Lidar data acquired by flight over DaFeng City, JiangSu, as the main data sources. First, feature knowledge and related spectral indices are obtained from the hyperspectral image; second, the Lidar data are used to generate an nDSM (Normalized Digital Surface Model), providing elevation information; last, the image feature knowledge, spectral indices, and elevation information are combined to build the geographic-ontology semantic network model that performs the urban feature classification. The experimental results show that this method achieves significantly higher classification accuracy than traditional classification algorithms, especially for building classification.
The method not only exploits the advantages of multi-source spatial data such as remote sensing imagery and Lidar data, but also integrates knowledge from these sources and applies it to remote sensing image classification, providing an effective approach for object-oriented remote sensing image classification in the future.
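The nDSM step described above (object height above ground, derived from Lidar) reduces to a raster subtraction of the terrain model from the surface model. The toy rasters and the 2 m building-height rule below are illustrative assumptions.

```python
import numpy as np

def normalized_dsm(dsm, dtm):
    """nDSM = DSM - DTM: object height above ground, clipped at zero."""
    return np.clip(dsm - dtm, 0.0, None)

# Toy 3x3 rasters (metres): a 10 m building on flat 5 m terrain
dsm = np.array([[5.0, 15.0, 5.0],
                [5.0, 15.0, 5.0],
                [5.0,  5.0, 5.0]])
dtm = np.full((3, 3), 5.0)
ndsm = normalized_dsm(dsm, dtm)
building_mask = ndsm > 2.0          # simple height rule for "building"
print(int(building_mask.sum()))     # → 2
```

In the ontology model, a height attribute like this is what lets rules separate buildings from spectrally similar ground-level surfaces such as roads.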
Development of a template for the classification of traditional medical knowledge in Korea.
Kim, Sungha; Kim, Boyoung; Mun, Sujeong; Park, Jeong Hwan; Kim, Min-Kyeoung; Choi, Sunmi; Lee, Sanghun
2016-02-03
Traditional Medical Knowledge (TMK) is a form of Traditional Knowledge associated with medicine that is handed down orally or in written form. There are efforts to document TMK and to build databases that conserve traditional medicine and facilitate future research to validate traditional use. Despite these efforts, there is no widely accepted template, in a data file format, that is specific to TMK and at the same time helpful for understanding and organizing it. We aimed to develop a template to classify TMK. First, we reviewed books, articles, and health-related classification systems, and used focus group discussion to establish the definition, scope, and constituents of TMK. Second, we developed an initial version of the template to classify TMK and applied it to TMK data. Third, we revised the template based on the results of the initial version and input from experts, and applied it to the data. The constituents of the resulting template are summary, properties, tools/ingredients, indication/preparation/application, and international standard classification. We applied the International Patent Classification, the International Classification of Diseases (Korean version), and the Classification of Korean Traditional Knowledge Resources to provide legal protection of TMK and facilitate academic research. The template provides standard terms for ingredients, preparation, administration route, and procedure method to assess safety and efficacy. This is the first template specialized for arranging and classifying TMK. It should play an important role in preserving TMK and protecting intellectual property, and TMK data classified with the template could serve as preliminary data for screening potential candidates for new pharmaceuticals. Copyright © 2015 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
A spectrum fractal feature classification algorithm for agriculture crops with hyper spectrum image
NASA Astrophysics Data System (ADS)
Su, Junying
2011-11-01
A fractal dimension feature analysis method in the spectral domain is proposed for agricultural crop classification from hyperspectral images. First, a fractal dimension calculation algorithm in the spectral domain is presented, together with a fast fractal dimension calculation based on the step measurement method. Second, the hyperspectral image classification algorithm and flowchart based on spectral-domain fractal dimension feature analysis are presented. Finally, the proposed method and SAM (spectral angle mapper) are compared experimentally on the FCL1 hyperspectral image set of agricultural crops. The experimental results show that the proposed method obtains better classification results than traditional SAM feature analysis, since it can fully use the spectral information of the hyperspectral image to achieve precision classification of agricultural crops.
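The SAM baseline mentioned above classifies a pixel by the angle between its spectrum and each reference spectrum; a minimal sketch (the reference spectra below are toy values):

```python
import numpy as np

def spectral_angle(s, r):
    """Spectral Angle Mapper: angle (radians) between a pixel spectrum s
    and a reference spectrum r; smaller angle = better match. The angle
    is insensitive to overall illumination scaling of the spectrum."""
    cos = np.dot(s, r) / (np.linalg.norm(s) * np.linalg.norm(r))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def sam_classify(pixel, references):
    """Assign the pixel to the reference class with the smallest angle."""
    angles = [spectral_angle(pixel, r) for r in references]
    return int(np.argmin(angles))

refs = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
print(sam_classify(np.array([0.9, 0.1, 0.0]), refs))  # → 0
```

The proposed fractal-dimension feature replaces this angle comparison with a descriptor of the spectrum's shape complexity, which is where the claimed accuracy gain comes from.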
Prostate segmentation by sparse representation based classification
Gao, Yaozong; Liao, Shu; Shen, Dinggang
2012-01-01
Purpose: The segmentation of prostate in CT images is of essential importance to external beam radiotherapy, which is one of the major treatments for prostate cancer nowadays. During the radiotherapy, the prostate is radiated by high-energy x rays from different directions. In order to maximize the dose to the cancer and minimize the dose to the surrounding healthy tissues (e.g., bladder and rectum), the prostate in the new treatment image needs to be accurately localized. Therefore, the effectiveness and efficiency of external beam radiotherapy highly depend on the accurate localization of the prostate. However, due to the low contrast of the prostate with its surrounding tissues (e.g., bladder), the unpredicted prostate motion, and the large appearance variations across different treatment days, it is challenging to segment the prostate in CT images. In this paper, the authors present a novel classification based segmentation method to address these problems. Methods: To segment the prostate, the proposed method first uses sparse representation based classification (SRC) to enhance the prostate in CT images by pixel-wise classification, in order to overcome the limitation of poor contrast of the prostate images. Then, based on the classification results, previous segmented prostates of the same patient are used as patient-specific atlases to align onto the current treatment image and the majority voting strategy is finally adopted to segment the prostate. In order to address the limitations of the traditional SRC in pixel-wise classification, especially for the purpose of segmentation, the authors extend SRC from the following four aspects: (1) A discriminant subdictionary learning method is proposed to learn a discriminant and compact representation of training samples for each class so that the discriminant power of SRC can be increased and also SRC can be applied to the large-scale pixel-wise classification. 
(2) The L1 regularized sparse coding is replaced by the elastic net in order to obtain a smooth and clear prostate boundary in the classification result. (3) Residue-based linear regression is incorporated to improve the classification performance and to extend SRC from hard classification to soft classification. (4) Iterative SRC is proposed by using context information to iteratively refine the classification results. Results: The proposed method has been comprehensively evaluated on a dataset consisting of 330 CT images from 24 patients. The effectiveness of the extended SRC has been validated by comparing it with the traditional SRC based on the proposed four extensions. The experimental results show that our extended SRC can obtain not only more accurate classification results but also smoother and clearer prostate boundary than the traditional SRC. Besides, the comparison with other five state-of-the-art prostate segmentation methods indicates that our method can achieve better performance than other methods under comparison. Conclusions: The authors have proposed a novel prostate segmentation method based on the sparse representation based classification, which can achieve considerably accurate segmentation results in CT prostate segmentation. PMID:23039673
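The residue-based classification rule at the heart of SRC can be sketched as follows. For brevity this uses plain per-class least squares instead of the l1-regularised (or elastic-net) sparse coding that the paper extends, so it is an illustration of the decision rule, not of the authors' method; the subdictionaries below are toy values.

```python
import numpy as np

def src_residual_classify(x, dicts):
    """Residue-based classification in the spirit of SRC: code the sample
    against each class subdictionary (here with ordinary least squares
    rather than sparse coding, for brevity) and pick the class with the
    smallest reconstruction residual."""
    residuals = []
    for D in dicts:                       # D: (n_features, n_atoms)
        coef, *_ = np.linalg.lstsq(D, x, rcond=None)
        residuals.append(np.linalg.norm(x - D @ coef))
    return int(np.argmin(residuals))

D0 = np.array([[1.0], [0.0], [0.0]])   # toy subdictionary for class 0
D1 = np.array([[0.0], [1.0], [0.0]])   # toy subdictionary for class 1
print(src_residual_classify(np.array([1.0, 0.1, 0.0]), [D0, D1]))  # → 0
```

Extension (3) above turns the hard argmin into a soft score by regressing on the residuals, and extension (4) iterates this rule with context features.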
NASA Astrophysics Data System (ADS)
Liu, Tao; Abd-Elrahman, Amr
2018-05-01
Deep convolutional neural networks (DCNNs) require massive training datasets to unlock their image classification power, while collecting training samples for remote sensing applications is usually an expensive process. When a DCNN is simply combined with traditional object-based image analysis (OBIA) for classification of Unmanned Aerial System (UAS) orthoimagery, its power may be undermined if the number of training samples is relatively small. This research develops a novel OBIA classification approach that takes advantage of the DCNN by enriching the training dataset automatically using multi-view data. Specifically, this study introduces a Multi-view Object-based classification using Deep convolutional neural network (MODe) method to process UAS images for land cover classification. MODe conducts the classification on multi-view UAS images instead of directly on the orthoimage, and obtains the final results via a voting procedure. Ten-fold cross-validation shows the mean overall classification accuracy increasing substantially from 65.32% when the DCNN was applied to the orthoimage, to 82.08% when MODe was implemented. This study also compared the performance of support vector machine (SVM) and random forest (RF) classifiers with the DCNN under both the traditional OBIA and the proposed multi-view OBIA frameworks. The results indicate that the accuracy advantage of the DCNN over traditional classifiers is more pronounced under the proposed multi-view OBIA framework than under the traditional OBIA framework.
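The final fusion step of MODe reduces, per object, to a majority vote over the labels predicted in each view. A minimal sketch with hypothetical class labels:

```python
from collections import Counter

def multiview_vote(view_predictions):
    """Fuse per-view labels for one object by majority vote; among tied
    counts, the label encountered first in the input wins."""
    return Counter(view_predictions).most_common(1)[0][0]

# An object seen in five overlapping UAS views (hypothetical labels)
print(multiview_vote(['grass', 'grass', 'road', 'grass', 'tree']))  # → grass
```

Because each view offers a slightly different perspective (and effectively an augmented training sample), the vote tends to cancel out per-view errors, which is consistent with the accuracy jump reported above.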
NASA Astrophysics Data System (ADS)
Li, Bocong; Huang, Qingmei; Lu, Yan; Chen, Songhe; Liang, Rong; Wang, Zhaoping
Objective tongue color analysis is an important research topic for tongue diagnosis in Traditional Chinese Medicine. This paper reports research based on the clinical process of diagnosing tongue color. The color data in RGB color space were first transformed into CIELAB color space, and the color gamut of the displayed tongue was obtained. Then a numerical method of tongue color classification based on Traditional Chinese Medicine categories (for example: light white tongue, light red tongue, red tongue) was developed. The conclusion is that this method can describe and classify tongue color close to human visual judgment and may be applied in clinical diagnosis.
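The RGB-to-CIELAB transformation underlying this analysis can be sketched with the standard sRGB/D65 formulas. This is a generic conversion for illustration; the device-calibrated conversion used in the paper may differ.

```python
import numpy as np

def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (D65 white point)."""
    c = np.asarray(rgb, dtype=float) / 255.0
    # inverse sRGB gamma (linearisation)
    lin = np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)
    # linear RGB -> XYZ (standard sRGB/D65 matrix)
    M = np.array([[0.4124564, 0.3575761, 0.1804375],
                  [0.2126729, 0.7151522, 0.0721750],
                  [0.0193339, 0.1191920, 0.9503041]])
    xyz = M @ lin
    white = np.array([0.95047, 1.0, 1.08883])      # D65 reference white
    t = xyz / white
    # piecewise cube-root function of the CIELAB definition
    f = np.where(t > (6 / 29) ** 3, np.cbrt(t), t / (3 * (6 / 29) ** 2) + 4 / 29)
    L = 116 * f[1] - 16
    a = 500 * (f[0] - f[1])
    b = 200 * (f[1] - f[2])
    return L, a, b

print(srgb_to_lab((255, 255, 255)))   # ≈ (100, 0, 0)
```

Working in CIELAB is what makes the numeric color-difference thresholds between tongue categories approximately perceptually uniform.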
Prostate segmentation by sparse representation based classification.
Gao, Yaozong; Liao, Shu; Shen, Dinggang
2012-10-01
The segmentation of prostate in CT images is of essential importance to external beam radiotherapy, which is one of the major treatments for prostate cancer nowadays. During the radiotherapy, the prostate is radiated by high-energy x rays from different directions. In order to maximize the dose to the cancer and minimize the dose to the surrounding healthy tissues (e.g., bladder and rectum), the prostate in the new treatment image needs to be accurately localized. Therefore, the effectiveness and efficiency of external beam radiotherapy highly depend on the accurate localization of the prostate. However, due to the low contrast of the prostate with its surrounding tissues (e.g., bladder), the unpredicted prostate motion, and the large appearance variations across different treatment days, it is challenging to segment the prostate in CT images. In this paper, the authors present a novel classification based segmentation method to address these problems. To segment the prostate, the proposed method first uses sparse representation based classification (SRC) to enhance the prostate in CT images by pixel-wise classification, in order to overcome the limitation of poor contrast of the prostate images. Then, based on the classification results, previous segmented prostates of the same patient are used as patient-specific atlases to align onto the current treatment image and the majority voting strategy is finally adopted to segment the prostate. In order to address the limitations of the traditional SRC in pixel-wise classification, especially for the purpose of segmentation, the authors extend SRC from the following four aspects: (1) A discriminant subdictionary learning method is proposed to learn a discriminant and compact representation of training samples for each class so that the discriminant power of SRC can be increased and also SRC can be applied to the large-scale pixel-wise classification. 
(2) The L1-regularized sparse coding is replaced by the elastic net in order to obtain a smooth and clear prostate boundary in the classification result. (3) Residue-based linear regression is incorporated to improve classification performance and to extend SRC from hard to soft classification. (4) Iterative SRC is proposed, using context information to iteratively refine the classification results. The proposed method has been comprehensively evaluated on a dataset consisting of 330 CT images from 24 patients. The effectiveness of the extended SRC has been validated by comparing it with traditional SRC on the proposed four extensions. The experimental results show that the extended SRC obtains not only more accurate classification results but also a smoother and clearer prostate boundary than traditional SRC. In addition, comparison with five other state-of-the-art prostate segmentation methods indicates that the proposed method achieves better performance than the methods under comparison. The authors have proposed a novel prostate segmentation method based on sparse representation based classification, which achieves considerably accurate segmentation results in CT prostate segmentation.
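The elastic-net coding and residual-based soft labeling of extensions (2) and (3) can be sketched as follows; the dictionary sizes, penalty settings, two-class setup, and the function name `src_soft_label` are illustrative assumptions, not the authors' actual configuration.

```python
# Hedged sketch of SRC with an elastic-net coder: code a pixel's feature
# vector against each class sub-dictionary, and turn residuals into soft
# (probability-like) scores. All sizes and penalties are toy values.
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
n_feat, n_atoms = 32, 20

# One sub-dictionary of patch features per class (prostate vs. background).
dicts = {c: rng.normal(size=(n_feat, n_atoms)) for c in ("prostate", "background")}

def src_soft_label(x, dicts, alpha=0.01, l1_ratio=0.5):
    """Code x against each sub-dictionary; smaller residual -> higher score."""
    residuals = {}
    for c, D in dicts.items():
        coder = ElasticNet(alpha=alpha, l1_ratio=l1_ratio,
                           fit_intercept=False, max_iter=5000)
        coder.fit(D, x)                                  # sparse + smooth code
        residuals[c] = np.linalg.norm(x - D @ coder.coef_)
    r = np.array(list(residuals.values()))
    scores = np.exp(-r) / np.exp(-r).sum()               # residual -> soft label
    return dict(zip(residuals, scores))

x = dicts["prostate"] @ rng.normal(size=n_atoms) * 0.1   # sample near class 1
scores = src_soft_label(x, dicts)
```

A real pipeline would iterate this classification with context features, as in extension (4).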
Behavior Based Social Dimensions Extraction for Multi-Label Classification
Li, Le; Xu, Junyi; Xiao, Weidong; Ge, Bin
2016-01-01
Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on community detection algorithms to extract the latent social dimensions, perform poorly when community detection fails. In this paper, we propose a novel behavior-based social dimensions extraction method to improve classification performance in multi-label heterogeneous networks. In our method, nodes' behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes' connection behaviors with different communities can be extracted accurately and applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method obtains satisfactory classification results, compared with other state-of-the-art methods, using fewer social dimensions. PMID:27049849
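The LDA-based extraction idea can be sketched as follows: each node's neighbour list is treated as a "document", and the inferred topic mixture serves as that node's latent social dimensions. The toy block-structured graph, sizes, and single-label classifier are invented placeholders, not the paper's datasets or multi-label setup.

```python
# Hedged sketch: LDA topic mixtures over nodes' connection "bags" act as
# latent social dimensions, which then feed a downstream classifier.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_nodes, n_comm = 60, 3

# Toy adjacency with 3 planted communities (dense blocks, sparse elsewhere).
block = rng.integers(0, n_comm, size=n_nodes)
A = (rng.random((n_nodes, n_nodes)) <
     np.where(block[:, None] == block[None, :], 0.4, 0.02)).astype(int)
np.fill_diagonal(A, 0)

# Node x neighbour "bag of connections" -> topic mixtures = social dimensions.
lda = LatentDirichletAllocation(n_components=n_comm, random_state=0)
dims = lda.fit_transform(A)                # shape (n_nodes, n_comm), rows sum to 1

# Use the dimensions as features for (here, single-label) classification.
clf = LogisticRegression(max_iter=1000).fit(dims, block)
acc = clf.score(dims, block)
```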
The information extraction of Gannan citrus orchard based on the GF-1 remote sensing image
NASA Astrophysics Data System (ADS)
Wang, S.; Chen, Y. L.
2017-02-01
Gannan has the largest orange production in China and occupies an important position worldwide. Extracting citrus orchards quickly and effectively is of great significance for pathogen defense, fruit production, and industrial planning. Traditional pixel-based spectral extraction of citrus orchards has low classification accuracy and can hardly avoid the "salt-and-pepper" phenomenon. Under the influence of noise, the problem of different objects sharing the same spectrum is severe. Taking the citrus planting area of Xunwu County, Ganzhou, as the study area, and addressing the low accuracy of traditional pixel-based classification, a decision tree classification method based on an object-oriented rule set is proposed. First, multi-scale segmentation is performed on the GF-1 remote sensing image data of the study area. Sample objects are then selected for statistical analysis of spectral and geometric features. Finally, combining these with the concept of decision tree classification, empirical thresholds on single bands, NDVI, band combinations, and object geometry are applied hierarchically to extract information for the study area, implementing multi-scale segmentation with hierarchical decision tree classification. The classification results are verified with a confusion matrix, and the overall Kappa index is 87.91%.
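One level of the hierarchical rule set can be sketched as an NDVI band-math rule applied per segmented object; the thresholds, class names, and the function `classify_object` are illustrative assumptions, not the paper's tuned values.

```python
# Hedged sketch of a decision-tree level: an NDVI window separates citrus
# orchard objects from non-vegetation and (hypothetically) denser forest.
import numpy as np

def ndvi(red, nir):
    red, nir = np.asarray(red, float), np.asarray(nir, float)
    return (nir - red) / (nir + red + 1e-9)

def classify_object(mean_red, mean_nir, ndvi_low=0.3, ndvi_high=0.6):
    """Toy rule: objects with moderate NDVI are labelled citrus orchard."""
    v = ndvi(mean_red, mean_nir)
    if v < ndvi_low:
        return "non-vegetation"
    elif v <= ndvi_high:
        return "citrus orchard"       # moderate NDVI: managed orchard canopy
    return "dense natural forest"     # hypothetical high-NDVI class

label = classify_object(mean_red=0.12, mean_nir=0.35)
```

A real rule set would chain several such thresholds (band ratios, geometry) hierarchically.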
A novel fruit shape classification method based on multi-scale analysis
NASA Astrophysics Data System (ADS)
Gui, Jiangsheng; Ying, Yibin; Rao, Xiuqin
2005-11-01
Shape is one of the major concerns in automated inspection and sorting of fruits and remains a difficult problem. In this research, we propose the multi-scale energy distribution (MSED) for object shape description, exploring the relationship between an object's shape and the energy distribution of its boundary across scales for shape extraction. MSED captures not only the main energy, which represents primary shape information at lower scales, but also subordinate energy, which represents local shape information at higher scales. It thus provides a natural tool for multi-resolution representation and can be used as a feature for shape classification. We address the three main processing steps in MSED-based shape classification: 1) image preprocessing and citrus shape extraction, 2) shape resampling and shape feature normalization, and 3) energy decomposition by wavelet transform and classification by a BP neural network. For shape resampling, 256 boundary pixels are resampled from a cubic-spline approximation of the original boundary in order to obtain uniform raw data. A probability function is defined and an effective start-point selection method is given by maximizing the expectation, which overcomes the inconvenience of traditional methods and yields rotation invariance. Experimental results show good performance for clearly normal citrus and seriously abnormal fruit, with a classification rate above 91.2%. The overall correct classification rate is 89.77%, and our method is more effective than the traditional method. These results can meet the requirements of fruit grading.
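The resampling and multi-scale energy steps can be sketched as follows; a plain Haar wavelet implemented directly in numpy stands in for the paper's wavelet, the cubic-spline resampling is replaced by linear interpolation, and all sizes are illustrative assumptions.

```python
# Hedged sketch of the MSED idea: resample a closed boundary to 256 points,
# take its centroid-distance signature, and compute per-scale detail
# energies with a simple Haar wavelet transform.
import numpy as np

def boundary_signature(xs, ys, n=256):
    """Resample a closed boundary to n points; return centroid distances."""
    t = np.linspace(0, 1, len(xs), endpoint=False)
    tn = np.linspace(0, 1, n, endpoint=False)
    x = np.interp(tn, t, xs, period=1)
    y = np.interp(tn, t, ys, period=1)
    return np.hypot(x - x.mean(), y - y.mean())

def haar_energies(sig, levels=4):
    """Energy of Haar detail coefficients at each scale."""
    a, energies = sig.astype(float), []
    for _ in range(levels):
        d = (a[0::2] - a[1::2]) / np.sqrt(2)   # detail coefficients
        a = (a[0::2] + a[1::2]) / np.sqrt(2)   # approximation
        energies.append(float(np.sum(d ** 2)))
    return energies

theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
# A circle (normal fruit) vs. an 8-lobed boundary (abnormal shape).
circle = haar_energies(boundary_signature(np.cos(theta), np.sin(theta)))
r = 1 + 0.2 * np.cos(8 * theta)
bumpy = haar_energies(boundary_signature(r * np.cos(theta), r * np.sin(theta)))
```

The per-scale energies would then be normalized and fed to the BP network as features.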
Teaching Methods, Intelligence, and Gender Factors in Pupil Achievement on a Classification Task
ERIC Educational Resources Information Center
Ryman, Don
1977-01-01
Reports on twelve-year-old students instructed in Nuffield Project and in "traditional" classrooms. A division of the subjects into two groups based on intelligence revealed significant differences in classification ability. Interaction effects were also observed. (CP)
Object Manifold Alignment for Multi-Temporal High Resolution Remote Sensing Images Classification
NASA Astrophysics Data System (ADS)
Gao, G.; Zhang, M.; Gu, Y.
2017-05-01
Multi-temporal remote sensing image classification is very useful for monitoring land cover change. Traditional approaches in this field mainly face limited labelled samples and spectral drift of image information. As spatial resolution improves, "salt-and-pepper" noise appears and classification results are affected when pixel-wise classification algorithms, which ignore the spatial relationships among pixels, are applied to high-resolution satellite images. To classify multi-temporal high-resolution images under limited labelled samples, spectral drift, and the "salt-and-pepper" problem, an object-based manifold alignment method is proposed. First, each multi-temporal multispectral image is segmented into superpixels by simple linear iterative clustering (SLIC). Second, features computed from the superpixels are assembled into vectors. Third, a majority-voting manifold alignment method aimed at the high-resolution problem is proposed, mapping the vector data into an alignment space. Finally, all data in the alignment space are classified with the KNN method. Multi-temporal images from different areas and from the same area are both considered in this paper. In the experiments, two groups of multi-temporal HR images collected by the China GF1 and GF2 satellites are used for performance evaluation. Experimental results indicate that the proposed method not only significantly outperforms traditional domain adaptation methods in classification accuracy but also effectively overcomes the "salt-and-pepper" problem.
Advances in Patient Classification for Traditional Chinese Medicine: A Machine Learning Perspective
Zhao, Changbo; Li, Guo-Zheng; Wang, Chengjun; Niu, Jinling
2015-01-01
As a complementary and alternative medicine in the medical field, traditional Chinese medicine (TCM) has drawn great attention both domestically and overseas. In practice, TCM provides a quite distinct methodology for patient diagnosis and treatment compared to Western medicine (WM). Syndrome (ZHENG, or pattern) is differentiated by a set of symptoms and signs examined from an individual by four main diagnostic methods: inspection, auscultation and olfaction, interrogation, and palpation; these reflect the pathological and physiological changes of disease occurrence and development. Patient classification divides patients into several classes based on different criteria. In this paper, from the machine learning perspective, the patient classification issue is surveyed on three major aspects of TCM: sign classification, syndrome differentiation, and disease classification. Considering the different diagnostic data analyzed by different computational methods, we present an overview of four subfields of TCM diagnosis. For each subfield, we design a rectangular reference list with applications in the horizontal direction and machine learning algorithms in the longitudinal direction. According to the current development of objective TCM diagnosis for patient classification, a discussion of the research issues around machine learning techniques with applications to TCM diagnosis is given to facilitate further research on TCM patient classification. PMID:26246834
Griffiths, Jason I.; Fronhofer, Emanuel A.; Garnier, Aurélie; Seymour, Mathew; Altermatt, Florian; Petchey, Owen L.
2017-01-01
The development of video-based monitoring methods allows for rapid, dynamic and accurate monitoring of individuals or communities compared to slower traditional methods, with far-reaching ecological and evolutionary applications. Large amounts of data are generated using video-based methods, which can be effectively processed using machine learning (ML) algorithms into meaningful ecological information. ML uses user-defined classes (e.g. species), derived from a subset (i.e. training data) of video-observed quantitative features (e.g. phenotypic variation), to infer classes in subsequent observations. However, phenotypic variation often changes due to environmental conditions, which may lead to poor classification if environmentally induced variation in phenotypes is not accounted for. Here we describe a framework for classifying species under changing environmental conditions based on random forest classification. A sliding-window approach was developed that restricts the temporal and environmental range of the training data to improve the classification. We tested our approach by applying the classification framework to experimental data. The experiment used a set of six ciliate species to monitor changes in community structure and behavior over hundreds of generations, in dozens of species combinations and across a temperature gradient. Differences in biotic and abiotic conditions caused simplistic classification approaches to be unsuccessful. In contrast, the sliding-window approach allowed classification to be highly successful, as phenotypic differences driven by environmental change could be captured by the classifier. Importantly, classification using the random forest algorithm showed comparable success when validated against traditional, slower, manual identification. Our framework allows for reliable classification in dynamic environments, and may help to improve strategies for long-term monitoring of species in changing environments.
Our classification pipeline can be applied in fields assessing species community dynamics, such as eco-toxicology, ecology and evolutionary ecology. PMID:28472193
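The sliding-window idea above can be sketched as follows: the random forest is trained only on samples whose condition (here, time) lies near that of the sample being classified, so environmentally driven trait drift stays within the window. The drifting trait, window width, and two-species setup are toy assumptions, not the study's data.

```python
# Hedged sketch of sliding-window random forest classification under drift.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
n = 600
time = np.sort(rng.uniform(0, 100, n))
species = rng.integers(0, 2, n)
# Phenotype drifts with time, so one global classifier would struggle.
trait = species + 0.05 * time + rng.normal(0, 0.3, n)
X = np.column_stack([trait])

def window_predict(t_query, x_query, half_width=10.0):
    """Train only on samples within +/- half_width of the query's time."""
    mask = np.abs(time - t_query) <= half_width
    rf = RandomForestClassifier(n_estimators=50, random_state=0)
    rf.fit(X[mask], species[mask])
    return int(rf.predict([[x_query]])[0])
```

In the windowed training set the two species remain separable even though their trait ranges overlap globally.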
Wang, Guizhou; Liu, Jianbo; He, Guojin
2013-01-01
This paper presents a new classification method for high-spatial-resolution remote sensing images based on a strategic mechanism of spatial mapping and reclassification. The proposed method includes four steps. First, the multispectral image is classified by a traditional pixel-based classification method (support vector machine). Second, the panchromatic image is subdivided by watershed segmentation. Third, the pixel-based multispectral image classification result is mapped to the panchromatic segmentation result based on a spatial mapping mechanism and the area dominant principle. During the mapping process, an area proportion threshold is set, and the regional property is defined as unclassified if the maximum area proportion does not surpass the threshold. Finally, unclassified regions are reclassified based on spectral information using the minimum distance to mean algorithm. Experimental results show that the classification method for high-spatial-resolution remote sensing images based on the spatial mapping mechanism and reclassification strategy can make use of both panchromatic and multispectral information, integrate the pixel- and object-based classification methods, and improve classification accuracy. PMID:24453808
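The spatial-mapping step with the area-dominant principle can be sketched as follows: each segment takes the pixel-level class that dominates it unless the dominant class's area share falls below a threshold, in which case the segment is left unclassified for later reclassification. The threshold value and toy label maps are illustrative assumptions.

```python
# Hedged sketch of mapping pixel-based classes onto watershed segments
# under the area-dominant principle with an area-proportion threshold.
import numpy as np

UNCLASSIFIED = -1

def map_labels(pixel_classes, segments, area_threshold=0.5):
    out = np.empty(int(segments.max()) + 1, dtype=int)
    for seg in np.unique(segments):
        classes, counts = np.unique(pixel_classes[segments == seg],
                                    return_counts=True)
        share = counts.max() / counts.sum()        # dominant class's area share
        out[seg] = classes[counts.argmax()] if share >= area_threshold else UNCLASSIFIED
    return out                                     # per-segment class by segment id

pixel_classes = np.array([[0, 0, 1],
                          [0, 1, 1],
                          [2, 1, 2]])
segments = np.array([[0, 0, 1],
                     [0, 1, 1],
                     [2, 2, 2]])
seg_class = map_labels(pixel_classes, segments)
```

Segments marked `UNCLASSIFIED` would then be reclassified spectrally (minimum distance to mean in the paper).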
van Gemert, Jan C; Veenman, Cor J; Smeulders, Arnold W M; Geusebroek, Jan-Mark
2010-07-01
This paper studies automatic image classification by modeling soft assignment in the popular codebook model. The codebook model describes an image as a bag of discrete visual words selected from a vocabulary, where the frequency distributions of visual words in an image allow classification. One inherent component of the codebook model is the assignment of discrete visual words to continuous image features. Despite the clear mismatch of this hard assignment with the nature of continuous features, the approach has been successfully applied for some years. In this paper, we investigate four types of soft assignment of visual words to image features. We demonstrate that explicitly modeling visual word assignment ambiguity improves classification performance compared to the hard assignment of the traditional codebook model. The traditional codebook model is compared against our method for five well-known data sets: 15 natural scenes, Caltech-101, Caltech-256, and Pascal VOC 2007/2008. We demonstrate that large codebook vocabulary sizes completely deteriorate the performance of the traditional model, whereas the proposed model performs consistently. Moreover, we show that our method profits in high-dimensional feature spaces and reaps higher benefits when increasing the number of image categories.
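One of the soft-assignment variants (kernel-based codeword uncertainty) can be sketched as follows: instead of incrementing a single histogram bin per feature, each feature spreads Gaussian-kernel weight over all visual words. The codebook, bandwidth, and features are toy assumptions.

```python
# Hedged sketch of soft assignment in the codebook model: a Gaussian kernel
# turns hard nearest-word counting into a weighted vote over all words.
import numpy as np

def soft_histogram(features, codebook, sigma=1.0):
    # Squared distances, shape (n_features, n_words).
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)   # each feature's vote sums to 1
    return w.sum(axis=0)

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
features = np.array([[0.4, 0.6],        # ambiguous between words 0 and 1
                     [4.1, 3.9]])       # clearly word 2
h = soft_histogram(features, codebook)
```

The ambiguous feature contributes roughly half a count to each nearby word, which hard assignment cannot express.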
John A. Scrivani; Randolph H. Wynne; Christine E. Blinn; Rebecca F. Musy
2001-01-01
Two methods of training data collection for automated image classification were tested in Virginia as part of a larger effort to develop an objective, repeatable, and low-cost method to provide forest area classification from satellite imagery. The derived forest area estimates were compared to estimates derived from a traditional photo-interpreted, double sample. One...
Classification of Odours for Mobile Robots Using an Ensemble of Linear Classifiers
NASA Astrophysics Data System (ADS)
Trincavelli, Marco; Coradeschi, Silvia; Loutfi, Amy
2009-05-01
This paper investigates the classification of odours using an electronic nose mounted on a mobile robot. The samples are collected as the robot explores the environment. Under such conditions, the sensor response differs from typical three phase sampling processes. In this paper, we focus particularly on the classification problem and how it is influenced by the movement of the robot. To cope with these influences, an algorithm consisting of an ensemble of classifiers is presented. Experimental results show that this algorithm increases classification performance compared to other traditional classification methods.
Integrated feature extraction and selection for neuroimage classification
NASA Astrophysics Data System (ADS)
Fan, Yong; Shen, Dinggang
2009-02-01
Feature extraction and selection are of great importance in neuroimage classification for identifying informative features and reducing feature dimensionality, which are generally implemented as two separate steps. This paper presents an integrated feature extraction and selection algorithm with two iterative steps: constrained subspace learning based feature extraction and support vector machine (SVM) based feature selection. The subspace learning based feature extraction focuses on the brain regions with higher possibility of being affected by the disease under study, while the possibility of brain regions being affected by disease is estimated by the SVM based feature selection, in conjunction with SVM classification. This algorithm can not only take into account the inter-correlation among different brain regions, but also overcome the limitation of traditional subspace learning based feature extraction methods. To achieve robust performance and optimal selection of parameters involved in feature extraction, selection, and classification, a bootstrapping strategy is used to generate multiple versions of training and testing sets for parameter optimization, according to the classification performance measured by the area under the ROC (receiver operating characteristic) curve. The integrated feature extraction and selection method is applied to a structural MR image based Alzheimer's disease (AD) study with 98 non-demented and 100 demented subjects. Cross-validation results indicate that the proposed algorithm can improve performance of the traditional subspace learning based classification.
Mu, Guangyu; Liu, Ying; Wang, Limin
2015-01-01
The spatial pooling method, such as spatial pyramid matching (SPM), is crucial in the bag-of-features model used in image classification. SPM partitions the image into a set of regular grids and assumes that the spatial layout of all visual words obeys a uniform distribution over these regular grids. In practice, however, different visual words should obey different spatial layout distributions. To improve on SPM, we develop a novel spatial pooling method, namely spatial distribution pooling (SDP). The proposed SDP method uses an extended Gaussian mixture model to estimate the spatial layout distributions of the visual vocabulary. For each visual word type, SDP can generate a set of flexible grids rather than the regular grids of traditional SPM. Furthermore, we can compute grid weights for visual word tokens according to their spatial coordinates. The experimental results demonstrate that SDP outperforms traditional spatial pooling methods and is competitive with state-of-the-art classification accuracy on several challenging image datasets.
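SDP's core step can be sketched as follows: for one visual-word type, a Gaussian mixture is fitted to the spatial coordinates of its tokens, and the posterior over components acts as soft "flexible grid" weights in place of SPM's fixed quadrants. The coordinates and the two-component choice are toy assumptions.

```python
# Hedged sketch of spatial distribution pooling for a single visual word.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Token coordinates of one word, clustered around two image regions.
coords = np.vstack([rng.normal([0.25, 0.25], 0.05, (40, 2)),
                    rng.normal([0.75, 0.70], 0.05, (40, 2))])

gmm = GaussianMixture(n_components=2, random_state=0).fit(coords)
weights = gmm.predict_proba(coords)   # (n_tokens, n_flexible_grids)

# Pooled descriptor for this word: weighted token counts per flexible grid.
pooled = weights.sum(axis=0)
```

Concatenating such pooled vectors over all word types would replace SPM's pyramid histogram.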
Yin, Chang Shik; Ko, Seong-Gyu
2014-01-01
Objectives. Korean medicine, an integrated allopathic and traditional medicine, has developed unique characteristics and has been active in contributing to evidence-based medicine. Recent developments in Korean medicine have not been as well disseminated as traditional Chinese medicine. This introduction to recent developments in Korean medicine will draw attention to, and facilitate, the advancement of evidence-based complementary alternative medicine (CAM). Methods and Results. The history of and recent developments in Korean medicine as evidence-based medicine are explored through discussions on the development of a national standard classification of diseases and study reports, ranging from basic research to newly developed clinical therapies. A national standard classification of diseases has been developed and revised serially into an integrated classification of Western allopathic and traditional holistic medicine disease entities. Standard disease classifications offer a starting point for the reliable gathering of evidence and provide a representative example of the unique status of evidence-based Korean medicine as an integration of Western allopathic medicine and traditional holistic medicine. Conclusions. Recent developments in evidence-based Korean medicine show a unique development in evidence-based medicine, adopting both Western allopathic and holistic traditional medicine. It is expected that Korean medicine will continue to be an important contributor to evidence-based medicine, encompassing conventional and complementary approaches.
NASA Astrophysics Data System (ADS)
Starkey, Andrew; Usman Ahmad, Aliyu; Hamdoun, Hassan
2017-10-01
This paper investigates the application of a novel classification method, the Feature Weighted Self Organizing Map (FWSOM), which analyses the topology of a converged standard Self Organizing Map (SOM) to automatically guide the selection of important inputs during training, improving the classification of data with redundant inputs. It is examined against two traditional approaches, neural networks and Support Vector Machines (SVM), for the classification of EEG data as presented in previous work. In particular, the novel method identifies the features that are important for classification automatically, and these important features can then be used to improve the diagnostic ability of any of the above methods. The paper presents the results and shows how the automated identification successfully found the important features in the dataset, improving the classification results for all methods apart from linear discriminant methods, which cannot separate the underlying nonlinear relationship in the data. In addition to achieving higher classification accuracy, the FWSOM gives insights into which features are important in the classification of each class (left- and right-hand movements), and these are corroborated by previously published work in this area.
Forest site classification for cultural plant harvest by tribal weavers can inform management
S. Hummel; F.K. Lake
2015-01-01
Do qualitative classifications of ecological conditions for harvesting culturally important forest plants correspond to quantitative differences among sites? To address this question, we blended scientific methods (SEK) and traditional ecological knowledge (TEK) to identify conditions on sites considered good, marginal, or poor for harvesting the leaves of a plant (...
Comparisons and Selections of Features and Classifiers for Short Text Classification
NASA Astrophysics Data System (ADS)
Wang, Ye; Zhou, Zhi; Jin, Shan; Liu, Debin; Lu, Mi
2017-10-01
Short text is considerably different from traditional long text documents due to its shortness and conciseness, which hinders the application of conventional machine learning and data mining algorithms to short text classification. Following traditional artificial intelligence methods, we divide short text classification into three steps, namely preprocessing, feature selection and classifier comparison. In this paper, we illustrate step-by-step how we approach our goals. Specifically, in feature selection, we compared the performance and robustness of four methods: one-hot encoding, tf-idf weighting, word2vec and paragraph2vec; in the classification part, we deliberately chose and compared Naive Bayes, Logistic Regression, Support Vector Machine, K-nearest Neighbor and Decision Tree as our classifiers. We then compared and analysed the classifiers horizontally against each other and vertically across feature selections. Regarding the datasets, we crawled more than 400,000 short text files from the Shanghai and Shenzhen Stock Exchanges and manually labeled them into two classes, the big and the small; there are eight labels in the big class and 59 labels in the small class.
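The feature/classifier comparison loop can be sketched on a toy corpus as follows: tf-idf features feed several of the compared classifiers. The four example texts and the "big"/"small" disclosure labels are invented placeholders, not the crawled exchange data.

```python
# Hedged sketch of the tf-idf feature + classifier comparison step.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

texts = ["annual report of major asset restructuring",
         "notice of major shareholder meeting results",
         "daily trading volume summary note",
         "routine trading halt short notice"]
labels = ["big", "big", "small", "small"]

X = TfidfVectorizer().fit_transform(texts)
results = {}
for name, clf in [("NB", MultinomialNB()),
                  ("LogReg", LogisticRegression(max_iter=1000)),
                  ("SVM", LinearSVC())]:
    results[name] = clf.fit(X, labels).score(X, labels)   # training accuracy
```

A real comparison would also swap in word2vec/paragraph2vec features and use held-out evaluation.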
A novel application of deep learning for single-lead ECG classification.
Mathews, Sherin M; Kambhamettu, Chandra; Barner, Kenneth E
2018-06-04
Detecting and classifying cardiac arrhythmias is critical to the diagnosis of patients with cardiac abnormalities. In this paper, a novel approach based on deep learning methodology is proposed for the classification of single-lead electrocardiogram (ECG) signals. We demonstrate the application of the Restricted Boltzmann Machine (RBM) and deep belief networks (DBN) for ECG classification following detection of ventricular and supraventricular heartbeats using single-lead ECG. The effectiveness of the proposed algorithm is illustrated using real ECG signals from the widely used MIT-BIH database. Simulation results demonstrate that, with a suitable choice of parameters, RBM and DBN can achieve high average recognition accuracies for ventricular ectopic beats (93.63%) and supraventricular ectopic beats (95.57%) at a low sampling rate of 114 Hz. Experimental results indicate that classifiers built in this deep learning-based framework achieved state-of-the-art performance at lower sampling rates and with simpler features than traditional methods. Further, features extracted at a sampling rate of 114 Hz, combined with deep learning, provided enough discriminatory power for the classification task. Thus, our proposed deep neural network algorithm demonstrates that deep learning-based methods offer accurate ECG classification and could potentially be extended to other physiological signal classifications, such as arterial blood pressure (ABP), nerve conduction (EMG), and heart rate variability (HRV) studies.
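An RBM-based classifier in the spirit of this paper can be sketched as follows: a Bernoulli RBM learns features from beat windows, and a logistic layer classifies them. The synthetic "beats", shapes, and learning settings are illustrative assumptions; real MIT-BIH preprocessing and the full DBN stack are omitted.

```python
# Hedged sketch: RBM feature learning + logistic classification on toy beats.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(4)
n, width = 200, 40
# Synthetic "beats": class 1 has an extra bump mid-window.
base = np.exp(-((np.arange(width) - 10) ** 2) / 8.0)
bump = np.exp(-((np.arange(width) - 25) ** 2) / 8.0)
y = rng.integers(0, 2, n)
Xb = base + np.outer(y, bump) + rng.normal(0, 0.05, (n, width))
Xb = (Xb - Xb.min()) / (Xb.max() - Xb.min())   # RBM expects [0, 1] inputs

model = Pipeline([("rbm", BernoulliRBM(n_components=16, learning_rate=0.05,
                                       n_iter=20, random_state=0)),
                  ("clf", LogisticRegression(max_iter=1000))])
acc = model.fit(Xb, y).score(Xb, y)
```

Stacking several RBMs before the logistic layer would approximate the paper's DBN.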
NASA Astrophysics Data System (ADS)
Teffahi, Hanane; Yao, Hongxun; Belabid, Nasreddine; Chaib, Souleyman
2018-02-01
Very high spatial resolution satellite images have recently been widely used in image classification, which has become a challenging task in the remote sensing field. Due to a number of limitations, such as the redundancy of features and the high dimensionality of the data, different classification methods have been proposed for remote sensing image classification, particularly methods using feature extraction techniques. This paper proposes a simple, efficient method that exploits the capability of extended multi-attribute profiles (EMAP) with a sparse autoencoder (SAE) for remote sensing image classification. The proposed method classifies various remote sensing datasets, including hyperspectral and multispectral images, by extracting spatial and spectral features through the combination of EMAP and SAE, linked to a kernel support vector machine (SVM) for classification. Experiments on the hyperspectral "Houston" dataset and the multispectral "Washington DC" dataset show that this new scheme achieves better feature-learning performance than primitive features, traditional classifiers, and an ordinary autoencoder, and has strong potential to achieve higher classification accuracy in a short running time.
Data Field Modeling and Spectral-Spatial Feature Fusion for Hyperspectral Data Classification.
Liu, Da; Li, Jianxun
2016-12-16
Classification is a significant subject in hyperspectral remote sensing image processing. This study proposes a spectral-spatial feature fusion algorithm for the classification of hyperspectral images (HSI). Unlike existing spectral-spatial classification methods, the influences and interactions of the surroundings on each measured pixel were taken into consideration in this paper. Data field theory was employed as the mathematical realization of the field theory concept in physics, and both the spectral and spatial domains of HSI were considered as data fields. Therefore, the inherent dependency of interacting pixels was modeled. Using data field modeling, spatial and spectral features were transformed into a unified radiation form and further fused into a new feature by using a linear model. In contrast to the current spectral-spatial classification methods, which usually simply stack spectral and spatial features together, the proposed method builds the inner connection between the spectral and spatial features, and explores the hidden information that contributed to classification. Therefore, new information is included for classification. The final classification result was obtained using a random forest (RF) classifier. The proposed method was tested with the University of Pavia and Indian Pines, two well-known standard hyperspectral datasets. The experimental results demonstrate that the proposed method has higher classification accuracies than those obtained by the traditional approaches.
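The data-field idea can be sketched as follows: each pixel's feature "radiates" influence that decays with spatial distance, and the spectral and spatial potentials are fused linearly into one feature. The Gaussian potential form, bandwidth, and fusion weight are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of a data-field potential with linear spectral-spatial fusion.
import numpy as np

def potential(values, coords, query, sigma=1.5):
    """Distance-decayed sum of per-pixel values felt at a query location."""
    d2 = ((coords - query) ** 2).sum(axis=1)
    return float((values * np.exp(-d2 / (2 * sigma ** 2))).sum())

rng = np.random.default_rng(5)
coords = np.array([(r, c) for r in range(5) for c in range(5)], float)
spectral = rng.random(25)   # e.g. a spectral measure per pixel
spatial = rng.random(25)    # e.g. a texture measure per pixel

query = np.array([2.0, 2.0])
alpha = 0.6                 # linear fusion weight (assumed)
fused = alpha * potential(spectral, coords, query) \
        + (1 - alpha) * potential(spatial, coords, query)
```

The fused value per pixel would then join the feature vector handed to the random forest.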
Taghanaki, Saeid Asgari; Kawahara, Jeremy; Miles, Brandon; Hamarneh, Ghassan
2017-07-01
Feature reduction is an essential stage in computer aided breast cancer diagnosis systems. Multilayer neural networks can be trained to extract relevant features by encoding high-dimensional data into low-dimensional codes. Optimizing traditional auto-encoders works well only if the initial weights are close to a proper solution. They are also trained to only reduce the mean squared reconstruction error (MRE) between the encoder inputs and the decoder outputs, but do not address the classification error. The goal of the current work is to test the hypothesis that extending traditional auto-encoders (which only minimize reconstruction error) to multi-objective optimization for finding Pareto-optimal solutions provides more discriminative features that will improve classification performance when compared to single-objective and other multi-objective approaches (i.e. scalarized and sequential). In this paper, we introduce a novel multi-objective optimization of deep auto-encoder networks, in which the auto-encoder optimizes two objectives: MRE and mean classification error (MCE) for Pareto-optimal solutions, rather than just MRE. These two objectives are optimized simultaneously by a non-dominated sorting genetic algorithm. We tested our method on 949 X-ray mammograms categorized into 12 classes. The results show that the features identified by the proposed algorithm allow a classification accuracy of up to 98.45%, demonstrating favourable accuracy over the results of state-of-the-art methods reported in the literature. We conclude that adding the classification objective to the traditional auto-encoder objective and optimizing for finding Pareto-optimal solutions, using evolutionary multi-objective optimization, results in producing more discriminative features.
Mao, Xue Gang; Du, Zi Han; Liu, Jia Qian; Chen, Shu Xin; Hou, Ji Yu
2018-01-01
Traditional field investigation and manual interpretation cannot satisfy the need for forest gap extraction at the regional scale. High spatial resolution remote sensing images make regional forest gap extraction possible. In this study, we used an object-oriented classification method to segment and classify forest gaps based on QuickBird high-resolution optical remote sensing imagery of the Jiangle National Forestry Farm in Fujian Province. In the object-oriented classification process, 10 scales (10-100, with a step length of 10) were adopted to segment the QuickBird image, and the intersection area of the reference object (RAor) and the intersection area of the segmented object (RAos) were adopted to evaluate the segmentation result at each scale. For the segmentation result at each scale, 16 spectral characteristics and a support vector machine (SVM) classifier were further used to classify forest gaps, non-forest gaps, and others. The results showed that the optimal segmentation scale was 40, where RAor was equal to RAos. The difference between the maximum and minimum accuracy across segmentation scales was 22%. At the optimal scale, the overall classification accuracy was 88% (Kappa=0.82) with the SVM classifier. Combining high-resolution remote sensing imagery with the object-oriented classification method can replace traditional field investigation and manual interpretation for identifying and classifying forest gaps at the regional scale.
Visual attention based bag-of-words model for image classification
NASA Astrophysics Data System (ADS)
Wang, Qiwei; Wan, Shouhong; Yue, Lihua; Wang, Che
2014-04-01
Bag-of-words is a classical method for image classification. The core problems are which visual words to select and how to count their frequencies. In this paper, we propose a visual attention based bag-of-words model (VABOW model) for the image classification task. The VABOW model uses a visual attention method to generate a saliency map, which serves as a weighting matrix when accumulating the frequencies of the visual words. In addition, the VABOW model combines shape, color, and texture cues and uses L1-regularized logistic regression to select the most relevant and most efficient features. We compare our approach with the traditional bag-of-words method on two datasets, and the results show that our VABOW model outperforms the state-of-the-art methods for image classification.
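The saliency-weighted counting step can be sketched as below, where a per-descriptor saliency value scales that descriptor's vote in the word histogram (an illustrative reconstruction, not the authors' implementation):

```python
import numpy as np

def saliency_weighted_bow(word_ids, saliency, vocab_size):
    """Accumulate each visual word's count weighted by the saliency of the
    image location its descriptor came from, then L1-normalize."""
    hist = np.zeros(vocab_size)
    for w, s in zip(word_ids, saliency):
        hist[w] += s
    total = hist.sum()
    return hist / total if total > 0 else hist
```

With uniform saliency this reduces to the ordinary bag-of-words histogram; higher saliency at an object region boosts the words found there.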
Deep Convolutional Neural Networks for Classifying Body Constitution Based on Face Image.
Huan, Er-Yang; Wen, Gui-Hua; Zhang, Shi-Jun; Li, Dan-Yang; Hu, Yang; Chang, Tian-Yuan; Wang, Qing; Huang, Bing-Lin
2017-01-01
Body constitution classification is the basis and core content of constitution research in traditional Chinese medicine. Its aim is to extract the underlying regularities from complex constitution phenomena and ultimately build a constitution classification system. Traditional identification methods, such as questionnaires, are inefficient and of low accuracy. This paper proposes a body constitution recognition algorithm based on a deep convolutional neural network, which can classify individual constitution types from face images. The proposed model first uses the convolutional neural network to extract features from the face image and then combines the extracted features with color features. Finally, the fused features are fed into a Softmax classifier to obtain the classification result. Comparison experiments show that the proposed algorithm achieves an accuracy of 65.29% on constitution classification, and its performance was accepted by practitioners of Chinese medicine.
Yuan, Yuan; Lin, Jianzhe; Wang, Qi
2016-12-01
Hyperspectral image (HSI) classification is a crucial issue in remote sensing. Accurate classification benefits a large number of applications such as land use analysis and marine resource utilization. But high data correlation brings difficulty to reliable classification, especially for HSI with abundant spectral information. Furthermore, traditional methods often fail to account for the spatial coherency of HSI, which also limits classification performance. To address these inherent obstacles, a novel spectral-spatial classification scheme is proposed in this paper. The proposed method centers on multitask joint sparse representation (MJSR) and a stepwise Markov random field framework, which are the two main contributions of this work. First, MJSR not only reduces spectral redundancy but also retains the necessary spectral correlation during classification. Second, the stepwise optimization further exploits spatial correlation, significantly enhancing classification accuracy and robustness. On several universal quality evaluation indexes, experimental results on the Indian Pines and Pavia University datasets demonstrate the superiority of our method over state-of-the-art competitors.
Optimized extreme learning machine for urban land cover classification using hyperspectral imagery
NASA Astrophysics Data System (ADS)
Su, Hongjun; Tian, Shufang; Cai, Yue; Sheng, Yehua; Chen, Chen; Najafian, Maryam
2017-12-01
This work presents a new urban land cover classification framework using the firefly algorithm (FA) optimized extreme learning machine (ELM). FA is adopted to optimize the regularization coefficient C and Gaussian kernel σ for kernel ELM. Additionally, effectiveness of spectral features derived from an FA-based band selection algorithm is studied for the proposed classification task. Three sets of hyperspectral databases were recorded using different sensors, namely HYDICE, HyMap, and AVIRIS. Our study shows that the proposed method outperforms traditional classification algorithms such as SVM and reduces computational cost significantly.
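The kernel ELM at the core of this framework has a closed-form training solution, β = (I/C + K)⁻¹T, and its two hyperparameters C and σ are exactly what the firefly algorithm searches over. A minimal sketch under that assumption (the FA search loop itself is omitted; names are ours):

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def kelm_train(X, T, C, sigma):
    """Closed-form kernel ELM output weights: beta = (I/C + K)^-1 T,
    with T the one-hot target matrix."""
    K = rbf_kernel(X, X, sigma)
    return np.linalg.solve(np.eye(len(X)) / C + K, T)

def kelm_predict(X_train, beta, X_new, sigma):
    """Class scores for new samples; argmax gives the predicted label."""
    return rbf_kernel(X_new, X_train, sigma) @ beta
```

The FA would evaluate candidate (C, σ) pairs by cross-validated accuracy of this model and keep the brightest firefly.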
Method of Grassland Information Extraction Based on Multi-Level Segmentation and Cart Model
NASA Astrophysics Data System (ADS)
Qiao, Y.; Chen, T.; He, J.; Wen, Q.; Liu, F.; Wang, Z.
2018-04-01
It is difficult to extract grassland accurately by traditional classification methods, such as supervised methods based on pixels or objects. This paper proposes a new method combining multi-level segmentation with a CART (classification and regression tree) model. The multi-level segmentation, which combines multi-resolution segmentation and spectral difference segmentation, avoids the over- and under-segmentation seen in single-mode segmentation. The CART model was established from spectral characteristics and texture features mined from training sample data. Xilinhaote City in the Inner Mongolia Autonomous Region was chosen as the typical study area, and the proposed method was verified using visual interpretation results as an approximate ground truth, together with a comparison against the nearest neighbor supervised classification method. The experimental results showed that the overall classification accuracy and the Kappa coefficient of the proposed method were 95% and 0.9, respectively, versus 80% and 0.56 for the nearest neighbor supervised classification method. This suggests that the proposed method is more accurate than nearest neighbor supervised classification. The experiments demonstrated that the proposed method is an effective way to extract grassland information: it sharpens the boundaries of grassland classes and avoids the restriction of grassland distribution scale. The method is also applicable to grassland information extraction in other regions with complicated spatial features, where it can effectively avoid interference from woodland, arable land, and water bodies.
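A CART model grows its tree by greedy impurity-minimizing splits on the spectral and texture features. A minimal sketch of the Gini split criterion on a single feature (illustrative only; the full model recurses on both children):

```python
def gini(labels):
    """Gini impurity of a label multiset: 1 - sum of squared class fractions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for c in labels:
        counts[c] = counts.get(c, 0) + 1
    return 1.0 - sum((k / n) ** 2 for k in counts.values())

def best_split(values, labels):
    """Best threshold on one feature, scored by the size-weighted Gini
    impurity of the two resulting children. Returns (score, threshold)."""
    order = sorted(zip(values, labels))
    best = (float("inf"), None)
    for i in range(1, len(order)):
        left = [l for _, l in order[:i]]
        right = [l for _, l in order[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(order)
        thr = (order[i - 1][0] + order[i][0]) / 2
        best = min(best, (score, thr))
    return best
```

A perfectly separating feature yields a weighted impurity of zero, which is why distinctive texture features help the grassland/non-grassland split.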
Adaptive phase k-means algorithm for waveform classification
NASA Astrophysics Data System (ADS)
Song, Chengyun; Liu, Zhining; Wang, Yaojun; Xu, Feng; Li, Xingming; Hu, Guangmin
2018-01-01
Waveform classification is a powerful technique for seismic facies analysis that describes the heterogeneity and compartments within a reservoir. Horizon interpretation is a critical step in waveform classification. However, the horizon often produces inconsistent waveform phase, and thus results in unsatisfactory classification. To alleviate this problem, an adaptive phase waveform classification method called adaptive phase k-means is introduced in this paper. Our method improves the traditional k-means algorithm with an adaptive phase distance as the waveform similarity measure. The proposed distance is a measure with variable phase as it moves from sample to sample along the traces. Model traces are also updated with the best phase interference in the iterative process. Therefore, our method is robust to phase variations caused by the interpretation horizon. We tested the effectiveness of our algorithm by applying it to synthetic and real data. The satisfactory results reveal that the proposed method tolerates a certain degree of waveform phase variation and is a good tool for seismic facies analysis.
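A phase-adaptive distance of the kind described can be sketched by rotating one trace's phase through its analytic signal and taking the minimum Euclidean distance over a grid of phase shifts (our illustrative reading, not the authors' exact formulation):

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT construction of the Hilbert transform
    (even-length input assumed for simplicity)."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1
    h[1:n // 2] = 2
    h[n // 2] = 1
    return np.fft.ifft(X * h)

def phase_adaptive_distance(x, y, n_phases=36):
    """Minimum Euclidean distance between x and phase-rotated versions of y,
    searched over a uniform grid of rotation angles."""
    ya = analytic_signal(y)
    best = np.inf
    for phi in np.linspace(0, 2 * np.pi, n_phases, endpoint=False):
        yr = np.real(ya * np.exp(1j * phi))
        best = min(best, np.linalg.norm(x - yr))
    return best
```

Two waveforms that differ only by a constant phase shift (e.g., a sine and a cosine) then have near-zero distance, which is exactly the tolerance to horizon-induced phase inconsistency the method targets.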
Peng, Xiang; King, Irwin
2008-01-01
The Biased Minimax Probability Machine (BMPM) constructs a classifier for imbalanced learning tasks. It provides a worst-case bound on the probability of misclassification of future data points based on reliable estimates of the means and covariance matrices of the classes from the training data, and achieves promising performance. In this paper, we develop a novel and critical extension of the training algorithm for BMPM based on Second-Order Cone Programming (SOCP). Moreover, we apply the biased classification model to medical diagnosis problems to demonstrate its usefulness. By removing some crucial assumptions in the original solution to this model, we make the new method more accurate and robust. We outline the theoretical derivation of the biased classification model and reformulate it into an SOCP problem that can be solved efficiently with a global optimality guarantee. We evaluate our proposed SOCP-based BMPM (BMPMSOCP) scheme against traditional solutions on medical diagnosis tasks where the objective is to improve sensitivity (the accuracy of the more important class, e.g., "ill" samples) rather than the overall classification accuracy. Empirical results show that our method handles imbalanced classification problems more effectively and robustly than traditional classification approaches and than the original Fractional Programming-based BMPM (BMPMFP).
NASA Astrophysics Data System (ADS)
Jobin, Benoît; Labrecque, Sandra; Grenier, Marcelle; Falardeau, Gilles
2008-01-01
The traditional method of identifying wildlife habitat distribution over large regions consists of pixel-based classification of satellite images into a suite of habitat classes used to select suitable habitat patches. Object-based classification is a new method that can achieve the same objective based on the segmentation of spectral bands of the image creating homogeneous polygons with regard to spatial or spectral characteristics. The segmentation algorithm does not solely rely on the single pixel value, but also on shape, texture, and pixel spatial continuity. The object-based classification is a knowledge base process where an interpretation key is developed using ground control points and objects are assigned to specific classes according to threshold values of determined spectral and/or spatial attributes. We developed a model using the eCognition software to identify suitable habitats for the Grasshopper Sparrow, a rare and declining species found in southwestern Québec. The model was developed in a region with known breeding sites and applied on other images covering adjacent regions where potential breeding habitats may be present. We were successful in locating potential habitats in areas where dairy farming prevailed but failed in an adjacent region covered by a distinct Landsat scene and dominated by annual crops. We discuss the added value of this method, such as the possibility to use the contextual information associated to objects and the ability to eliminate unsuitable areas in the segmentation and land cover classification processes, as well as technical and logistical constraints. A series of recommendations on the use of this method and on conservation issues of Grasshopper Sparrow habitat is also provided.
ERIC Educational Resources Information Center
Smith, J. McCree
Three methods for the preparation of maintenance budgets are discussed--(1) a traditional method, inconclusive and obsolete, based on gross square footage, (2) the formula approach method based on building classification (wood-frame, masonry-wood, masonry-concrete) with maintenance cost factors for each type plus custodial service rates by type of…
Li, Tao; Su, Chen
2018-06-02
Rhodiola is an increasingly widely used traditional Tibetan and traditional Chinese medicine in China. The profiles of its bioactive compounds differ markedly among species, which makes accurate identification of authentic Rhodiola species crucial for ensuring its safe clinical application. In this paper, a nondestructive, rapid, and efficient method for classifying Rhodiola was developed using Fourier transform near-infrared (FT-NIR) spectroscopy combined with chemometric analysis. A total of 160 batches of raw spectra were obtained by FT-NIR from four Rhodiola species: Rhodiola crenulata, Rhodiola fastigiata, Rhodiola kirilowii, and Rhodiola brevipetiolata. After excluding outliers, the performances of 3 sample-dividing methods, 12 spectral preprocessing methods, 2 wavelength selection methods, and 2 modeling methods were compared. The results indicated that the best combination for authenticity identification was FT-NIR with sample set partitioning based on joint x-y distances (SPXY), standard normal variate transformation (SNV) + Norris-Williams (NW) + 2nd derivative, competitive adaptive reweighted sampling (CARS), and a kernel extreme learning machine (KELM). The accuracy (ACCU), sensitivity (SENS), and specificity (SPEC) of the optimal model were all 1, showing that this combination of FT-NIR and chemometric methods had the best authenticity identification performance. The classification performance of the partial least squares discriminant analysis (PLS-DA) model was slightly lower than that of the KELM model, with ACCU = 0.97, SENS = 0.93, and SPEC = 0.98.
It can be concluded that FT-NIR combined with chemometrics analysis has great potential in authenticity identification and classification of Rhodiola, which can provide a valuable reference for the safety and effectiveness of clinical application of Rhodiola. Copyright © 2018 Elsevier B.V. All rights reserved.
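Of the preprocessing methods compared above, standard normal variate (SNV) transformation is simple to state: each spectrum is centered and scaled by its own mean and standard deviation, removing multiplicative scatter effects. A minimal sketch:

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row)
    individually by its own mean and standard deviation."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std
```

After SNV, every spectrum has zero mean and unit standard deviation, so two spectra that differ only by baseline offset and gain become identical.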
[Application of Delphi method in traditional Chinese medicine clinical research].
Bi, Ying-fei; Mao, Jing-yuan
2012-03-01
In recent years, the Delphi method has been widely applied in traditional Chinese medicine (TCM) clinical research. This article analyzed the current application of the Delphi method in TCM clinical research and discussed problems in the choice of evaluation methods, the classification of observation indexes, and the selection of survey items. On this basis, the author analyzed methods for questionnaire design, selection of experts, evaluation of observation indexes, and selection of survey items, and summarized the steps for applying the Delphi method in TCM clinical research.
Steganalysis using logistic regression
NASA Astrophysics Data System (ADS)
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-art 686-dimensional SPAM feature set, in three image sets.
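The practical advantage claimed for LR, class probabilities rather than a bare label, can be seen in a minimal gradient-descent sketch (illustrative only; production steganalysis would use a kernelised or regularized solver on the 686-dimensional SPAM features):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_lr(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the logistic loss.
    Returns a weight vector with the bias folded in as the last entry."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = sigmoid(Xb @ w)
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(X, w):
    """Estimated probability of the stego class, not just a hard decision."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return sigmoid(Xb @ w)
```

The probability output is what an SVM decision value lacks: it can be thresholded at different operating points or fed directly into a multiclass softmax extension.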
Application of a neural network for reflectance spectrum classification
NASA Astrophysics Data System (ADS)
Yang, Gefei; Gartley, Michael
2017-05-01
Traditional reflectance spectrum classification algorithms compare spectra across the electromagnetic spectrum, anywhere from the ultraviolet to the thermal infrared regions, analyzing reflectance on a pixel-by-pixel basis. Inspired by the high performance that convolutional neural networks (CNNs) have demonstrated in image classification, we applied a neural network to analyze directional reflectance pattern images. Using bidirectional reflectance distribution function (BRDF) data, we reformulate the 4-dimensional data into 2 dimensions, namely incident direction × reflected direction × channels. Meanwhile, RIT's micro-DIRSIG model is utilized to simulate additional training samples to improve the robustness of neural network training. Unlike traditional classification, which uses hand-designed feature extraction with a trainable classifier, neural networks stack several layers to learn a feature hierarchy from pixels to classifier, with all layers trained jointly. Hence, our approach of utilizing angular features differs from traditional methods that utilize spatial features. Although training typically has a large computational cost, simple classifiers work well when subsequently using neural-network-generated features. Currently, the most popular neural networks, such as VGG, GoogLeNet, and AlexNet, are trained on RGB spatial image data. Our approach aims to build a neural network based on the directional reflectance spectrum to provide understanding from another perspective. At the end of this paper, we compare several classifiers and analyze the trade-offs among neural network parameters.
[Classification of local anesthesia methods].
Petricas, A Zh; Medvedev, D V; Olkhovskaya, E B
The traditional classification of dental local anesthesia methods must be modified. In this paper we show that the vascular mechanism is the leading component of spongy injection. It is necessary to take into account the high effectiveness and relative safety of spongy anesthesia, as well as its versatility, ease of implementation, and growing prevalence in the world. The essence of the proposed modification is to distinguish diffusive methods (including surface, infiltration, and conduction anesthesia) from vascular-diffusive methods (including intraosseous, intraligamentary, intraseptal, and intrapulpal anesthesia). For the last four methods the common term «spongy (intraosseous) anesthesia» may be used.
Determination of maize hardness by biospeckle and fuzzy granularity.
Weber, Christian; Dai Pra, Ana L; Passoni, Lucía I; Rabal, Héctor J; Trivi, Marcelo; Poggio Aguerre, Guillermo J
2014-09-01
In recent years there has been renewed interest in the development of novel grain classification methods that could complement traditional empirical tests. A speckle pattern occurs when a laser beam illuminates an optically rough surface; when the object is biologically active the pattern flickers, a phenomenon called biospeckle. In this work, we use laser biospeckle to classify maize (Zea mays L.) kernel hardness. Grains of three types of maize were cut and illuminated by a laser, and series of images were registered, stored, and processed. The results were compared with those obtained by the floating test. The laser speckle technique was effective in discriminating the grains based on the presence of floury or vitreous endosperm and could be considered a feasible alternative to traditional floating methods. Moreover, the assay showed higher discrimination capability than traditional tests. It could be potentially useful for maize classification and for increasing the efficiency of dry-milling corn processing.
Reliability studies of diagnostic methods in Indian traditional Ayurveda medicine: An overview
Kurande, Vrinda Hitendra; Waagepetersen, Rasmus; Toft, Egon; Prasad, Ramjee
2013-01-01
Recently, a need to develop supportive new scientific evidence for contemporary Ayurveda has emerged. One of the research objectives is an assessment of the reliability of diagnoses and treatment. Reliability is a quantitative measure of consistency. It is a crucial issue in classification (such as prakriti classification), method development (pulse diagnosis), quality assurance for diagnosis and treatment and in the conduct of clinical studies. Several reliability studies are conducted in western medicine. The investigation of the reliability of traditional Chinese, Japanese and Sasang medicine diagnoses is in the formative stage. However, reliability studies in Ayurveda are in the preliminary stage. In this paper, examples are provided to illustrate relevant concepts of reliability studies of diagnostic methods and their implication in practice, education, and training. An introduction to reliability estimates and different study designs and statistical analysis is given for future studies in Ayurveda. PMID:23930037
Jiang, Shun-Yuan; Sun, Hong-Bing; Sun, Hui; Ma, Yu-Ying; Chen, Hong-Yu; Zhu, Wen-Tao; Zhou, Yi
2016-03-01
This paper aims to explore a comprehensive assessment method combining traditional Chinese medicinal material specifications with quantitative quality indicators. Seventy-six samples of Notopterygii Rhizoma et Radix were collected on the market and at producing areas. Traditional commercial specifications were described and assigned, and 10 chemical components and volatile oils were determined for each sample. Cluster analysis, Fisher discriminant analysis, and correspondence analysis were used to establish the relationship between the traditional qualitative commercial specifications and the quantitative chemical indices, for comprehensive evaluation of medicinal material quality and quantitative classification of commercial grade and quality grade. A herb quality index (HQI) incorporating traditional commercial specifications and chemical components was established for quantitative grade classification, and corresponding discriminant functions were derived for precise determination of the quality grade and sub-grade of Notopterygii Rhizoma et Radix. The results showed that notopterol, isoimperatorin, and volatile oil were the major components for determining chemical quality, and their dividing values were specified for every grade and sub-grade of the commercial materials. Accordingly, the essential relationship between traditional medicinal indicators, qualitative commercial specifications, and quantitative chemical composition indicators can be examined by K-means clustering, Fisher discriminant analysis, and correspondence analysis, providing a new method for comprehensive quantitative evaluation of traditional Chinese medicine quality that integrates traditional commodity specifications with modern quantitative chemical indices. Copyright© by the Chinese Pharmaceutical Association.
A hierarchical classification method for finger knuckle print recognition
NASA Astrophysics Data System (ADS)
Kong, Tao; Yang, Gongping; Yang, Lu
2014-12-01
The finger knuckle print has recently been seen as an effective biometric. In this paper, we propose a hierarchical classification method for finger knuckle print recognition, which is rooted in traditional score-level fusion methods. In the proposed method, we first take the Gabor feature as the basic feature for finger knuckle print recognition, and a new decision rule is defined based on a predefined threshold. Finally, the minor feature, the speeded-up robust feature (SURF), is applied for those users who cannot be recognized by the basic feature. Extensive experiments are performed to evaluate the proposed method, and experimental results show that it achieves promising performance.
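The hierarchical decision rule can be sketched as a two-stage cascade: the basic (Gabor) score decides confident cases outright, and only uncertain cases pay the extra cost of the secondary (SURF) comparison. Thresholds and names here are illustrative, not the authors' values:

```python
def hierarchical_match(basic_score, minor_score_fn, t_accept, t_reject):
    """Two-stage decision. Accept or reject on the basic feature score when
    it is confident; otherwise fall back to the minor-feature score, which
    is computed lazily (only for uncertain users)."""
    if basic_score >= t_accept:
        return True
    if basic_score <= t_reject:
        return False
    # uncertain region: consult the secondary feature
    return minor_score_fn() >= t_accept
```

Because `minor_score_fn` is only invoked in the uncertain band, most comparisons run at the cost of the basic feature alone.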
A Wavelet Polarization Decomposition Net Model for Polarimetric SAR Image Classification
NASA Astrophysics Data System (ADS)
He, Chu; Ou, Dan; Yang, Teng; Wu, Kun; Liao, Mingsheng; Chen, Erxue
2014-11-01
In this paper, a deep model based on wavelet texture is proposed for Polarimetric Synthetic Aperture Radar (PolSAR) image classification, inspired by recent successes of deep learning. Our model is designed to learn powerful and informative representations that improve generalization on complex scene classification tasks. Given the influence of speckle noise in PolSAR images, wavelet polarization decomposition is first applied to obtain basic, discriminative texture features, which are then fed into a Deep Neural Network (DNN) to compose multi-layer, higher-level representations. We demonstrate that the model produces a powerful representation that captures information otherwise untraceable in PolSAR images and achieves promising results on the SAR image dataset in comparison with other traditional SAR image classification methods.
Supervised DNA Barcodes species classification: analysis, comparisons and results
2014-01-01
Background Specific fragments, coming from short portions of DNA (e.g., mitochondrial, nuclear, and plastid sequences), have been defined as DNA Barcode and can be used as markers for organisms of the main life kingdoms. Species classification with DNA Barcode sequences has been proven effective on different organisms. Indeed, specific gene regions have been identified as Barcode: COI in animals, rbcL and matK in plants, and ITS in fungi. The classification problem assigns an unknown specimen to a known species by analyzing its Barcode. This task has to be supported with reliable methods and algorithms. Methods In this work the efficacy of supervised machine learning methods to classify species with DNA Barcode sequences is shown. The Weka software suite, which includes a collection of supervised classification methods, is adopted to address the task of DNA Barcode analysis. Classifier families are tested on synthetic and empirical datasets belonging to the animal, fungus, and plant kingdoms. In particular, the function-based method Support Vector Machines (SVM), the rule-based RIPPER, the decision tree C4.5, and the Naïve Bayes method are considered. Additionally, the classification results are compared with respect to ad-hoc and well-established DNA Barcode classification methods. Results A software that converts the DNA Barcode FASTA sequences to the Weka format is released, to adapt different input formats and to allow the execution of the classification procedure. The analysis of results on synthetic and real datasets shows that SVM and Naïve Bayes outperform on average the other considered classifiers, although they do not provide a human interpretable classification model. Rule-based methods have slightly inferior classification performances, but deliver the species specific positions and nucleotide assignments. 
On synthetic data, the supervised machine learning methods obtain superior classification performance with respect to the traditional DNA Barcode classification methods; on empirical data, their performance is comparable to the other methods. Conclusions The classification analysis shows that supervised machine learning methods are promising candidates for successfully handling the DNA Barcode species classification problem, achieving excellent performance. In conclusion, a powerful tool to perform species identification is now available to the DNA Barcoding community. PMID:24721333
ERIC Educational Resources Information Center
Cooper, Linda
1997-01-01
Discusses the problems encountered by elementary school children in retrieving information from a library catalog, either the traditional card catalog or an OPAC (online public access catalog). An alternative system of classification using colors and symbols is described that was developed in the Common School (Amherst, Massachusetts). (Author/LRW)
Classification of radiolarian images with hand-crafted and deep features
NASA Astrophysics Data System (ADS)
Keçeli, Ali Seydi; Kaya, Aydın; Keçeli, Seda Uzunçimen
2017-12-01
Radiolarians are planktonic protozoa and important biostratigraphic and paleoenvironmental indicators for paleogeographic reconstructions. Radiolarian paleontology remains a low-cost and one of the most convenient ways to date deep ocean sediments. Traditional methods for identifying radiolarians are time-consuming and cannot scale to the granularity or scope necessary for large-scale studies; automated image classification would allow these analyses to be performed promptly. In this study, a method for automatic radiolarian image classification is proposed on scanning electron microscope (SEM) images to ease species identification of fossilized radiolarians. The proposed method uses both hand-crafted features, such as invariant moments, wavelet moments, Gabor features, and basic morphological features, and deep features obtained from a pre-trained convolutional neural network (CNN). Feature selection is applied to the deep features to reduce their high dimensionality. Classification outcomes are analyzed to compare hand-crafted features, deep features, and their combinations. Results show that the deep features obtained from a pre-trained CNN are more discriminative than the hand-crafted ones. Additionally, feature selection reduces the computational cost of the classification algorithms and has no negative effect on classification accuracy.
Epileptic seizure detection in EEG signal using machine learning techniques.
Jaiswal, Abeg Kumar; Banka, Haider
2018-03-01
Epilepsy is a well-known nervous system disorder characterized by seizures. Electroencephalograms (EEGs), which capture brain neural activity, can be used to detect epilepsy. Traditional methods for analyzing an EEG signal for epileptic seizure detection are time-consuming. Recently, several automated seizure detection frameworks using machine learning techniques have been proposed to replace these traditional methods. The two basic steps in machine learning are feature extraction and classification. Feature extraction reduces the input pattern space by keeping informative features, and the classifier assigns the appropriate class label. In this paper, we propose two effective approaches involving subpattern-based PCA (SpPCA) and cross-subpattern correlation-based PCA (SubXPCA) with a Support Vector Machine (SVM) for automated seizure detection in EEG signals. Feature extraction was performed using SpPCA and SubXPCA. Both techniques explore the subpattern correlation of EEG signals, which helps in the decision-making process. An SVM trained with a radial basis function kernel is used to classify seizure and non-seizure EEG signals. All experiments were carried out on the benchmark epilepsy EEG dataset, which consists of 500 EEG signals recorded under different scenarios. Seven different experimental cases for classification were conducted, and classification accuracy was evaluated using tenfold cross-validation. The classification results of the proposed approaches are compared with those of existing techniques in the literature to establish the claim.
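Subpattern-based PCA (SpPCA) partitions each signal into subpatterns and runs PCA on each subpattern position separately before concatenating the projections. A minimal sketch via SVD (illustrative only; SubXPCA additionally models cross-subpattern correlation):

```python
import numpy as np

def sppca_features(signals, n_sub, k):
    """Subpattern-based PCA: split each signal (row) into n_sub equal
    subpatterns, run PCA per subpattern position across all samples,
    keep k components each, and concatenate the projections."""
    signals = np.asarray(signals, dtype=float)
    n, d = signals.shape
    assert d % n_sub == 0, "signal length must divide evenly into subpatterns"
    parts = np.split(signals, n_sub, axis=1)
    feats = []
    for P in parts:
        Pc = P - P.mean(axis=0)          # center this subpattern position
        _, _, Vt = np.linalg.svd(Pc, full_matrices=False)
        feats.append(Pc @ Vt[:k].T)      # project onto top-k components
    return np.hstack(feats)
```

The output dimensionality is n_sub × k, typically far smaller than the raw EEG epoch length, which is the feature-space reduction the abstract describes.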
Chung, Sukhoon; Rhee, Hyunsill; Suh, Yongmoo
2010-01-01
Objectives This study sought to find answers to the following questions: 1) Can we predict whether a patient will revisit a healthcare center? 2) Can we anticipate diseases of patients who revisit the center? Methods For the first question, we applied 5 classification algorithms (decision tree, artificial neural network, logistic regression, Bayesian networks, and Naïve Bayes) and the stacking-bagging method for building classification models. To solve the second question, we performed sequential pattern analysis. Results We determined: 1) In general, the most influential variables which impact whether a patient of a public healthcare center will revisit it or not are personal burden, insurance bill, period of prescription, age, systolic pressure, name of disease, and postal code. 2) The best plain classification model is dependent on the dataset. 3) Based on average of classification accuracy, the proposed stacking-bagging method outperformed all traditional classification models and our sequential pattern analysis revealed 16 sequential patterns. Conclusions Classification models and sequential patterns can help public healthcare centers plan and implement healthcare service programs and businesses that are more appropriate to local residents, encouraging them to revisit public health centers. PMID:21818426
Deep classification hashing for person re-identification
NASA Astrophysics Data System (ADS)
Wang, Jiabao; Li, Yang; Zhang, Xiancai; Miao, Zhuang; Tao, Gang
2018-04-01
With the spread of public surveillance, person re-identification is becoming more and more important. Large-scale databases call for efficient computation and storage, and hashing is one of the most important techniques for meeting these demands. In this paper, we propose a new deep classification hashing network that introduces a new binary appropriation layer into traditional ImageNet-pretrained CNN models. It outputs binary-appropriate features, which can be easily quantized into binary hash codes for Hamming similarity comparison. Experiments show that our deep hashing method outperforms state-of-the-art methods on the public CUHK03 and Market1501 datasets.
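The hashing step the abstract describes, quantizing real-valued features into binary codes and comparing them by Hamming distance, can be sketched as follows. The sign-thresholding rule and the 64-bit code length are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def binarize(features):
    """Quantize real-valued, 'binary-appropriate' features into a hash
    code by thresholding at zero (sign quantization)."""
    return (features > 0).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(1)
query = rng.standard_normal(64)              # toy 64-d query feature
gallery = rng.standard_normal((5, 64))       # toy gallery of 5 identities
q = binarize(query)
codes = binarize(gallery)
dists = [hamming(q, c) for c in codes]
best = int(np.argmin(dists))                 # closest identity in Hamming space
```

In a real system the codes would be packed into machine words so Hamming distance reduces to XOR plus popcount, which is what makes hashing attractive at scale.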
NASA Astrophysics Data System (ADS)
Qu, Haicheng; Liang, Xuejian; Liang, Shichao; Liu, Wanjun
2018-01-01
Many methods of hyperspectral image classification have been proposed recently, and the convolutional neural network (CNN) achieves outstanding performance. However, spectral-spatial classification with a CNN requires an excessively large model, tremendous computation, and a complex network, and a CNN is generally unable to use the noisy bands caused by water-vapor absorption. A dimensionality-varied CNN (DV-CNN) is proposed to address these issues. There are four stages in DV-CNN, and the dimensionalities of the spectral-spatial feature maps vary with the stages. DV-CNN can reduce computation and simplify the structure of the network. All feature maps are processed by more kernels in higher stages to extract more precise features. DV-CNN also improves classification accuracy and enhances robustness to water-vapor absorption bands. The experiments are performed on the Indian Pines and Pavia University scenes. The classification performance of DV-CNN is compared with state-of-the-art methods, including CNN variants, traditional methods, and other deep learning methods. A performance analysis of DV-CNN itself is also carried out. The experimental results demonstrate that DV-CNN outperforms state-of-the-art methods for spectral-spatial classification and is also robust to water-vapor absorption bands. Moreover, reasonable parameter selection is effective in improving classification accuracy.
Decision Tree Repository and Rule Set Based Mingjiang River Estuarine Wetlands Classification
NASA Astrophysics Data System (ADS)
Zhang, W.; Li, X.; Xiao, W.
2018-05-01
Increasing urbanization and industrialization have led to wetland losses in the estuarine area of the Mingjiang River over the past three decades. Increasing attention has been given to producing wetland inventories using remote sensing and GIS technology. Due to inconsistent training sites and training samples, traditional pixel-based image classification methods cannot achieve comparable results across different organizations. Meanwhile, object-oriented image classification shows great potential to solve this problem, and Landsat moderate-resolution remote sensing images are widely used to fulfill this requirement. First, standardized atmospheric correction and spectrally high-fidelity texture feature enhancement were conducted before implementing the object-oriented wetland classification method in eCognition. Second, we performed the multi-scale segmentation procedure, taking the scale, hue, shape, compactness and smoothness of the image into account to obtain appropriate parameters; using a region-merging algorithm starting from the single-pixel level, the optimal texture segmentation scale for the different feature types was confirmed. The segmented objects were then used as classification units to calculate spectral information such as the mean, maximum, minimum, brightness and normalized values. The area, length, tightness and shape rule of the image objects' spatial features, and texture features such as the mean, variance and entropy of the image objects, were used as classification features of the training samples. Based on reference images and field-survey sampling points, typical training samples were selected uniformly and randomly for each ground-object type. The value ranges of the spectral, texture and spatial characteristics of each feature type in each feature layer were used to create the decision tree repository.
Finally, with the help of high-resolution reference images, a random sampling method was used to conduct the field investigation, achieving an overall accuracy of 90.31 % with a Kappa coefficient of 0.88. The classification method based on decision-tree thresholds and the rule set developed from the repository outperforms the traditional methodology. Our decision tree repository and rule-set-based object-oriented classification technique is an effective method for producing comparable and consistent wetland datasets.
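A decision-tree rule set of the kind stored in such a repository is, in essence, a cascade of threshold tests on per-object features. The sketch below is a toy illustration: the feature names, thresholds, and class labels are hypothetical and are not the study's actual rules.

```python
def classify_object(obj):
    """Toy threshold rule set in the spirit of a decision-tree
    repository. All feature names, thresholds, and labels here are
    hypothetical illustrations, not values from the study."""
    if obj["ndvi"] > 0.4:                    # strong vegetation signal
        return "vegetated wetland"
    if obj["brightness"] < 60 and obj["entropy"] < 2.0:
        return "open water"                  # dark and texturally smooth
    if obj["shape_index"] > 2.5:             # long, narrow object
        return "tidal channel"
    return "mudflat"

print(classify_object({"ndvi": 0.1, "brightness": 40,
                       "entropy": 1.2, "shape_index": 1.0}))  # open water
```

Each rule consumes the per-object spectral, spatial, and texture attributes computed after segmentation, which is why the segmentation stage must come first.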
Retargeted Least Squares Regression Algorithm.
Zhang, Xu-Yao; Wang, Lingfeng; Xiang, Shiming; Liu, Cheng-Lin
2015-09-01
This brief presents a framework of retargeted least squares regression (ReLSR) for multicategory classification. The core idea is to learn the regression targets directly from the data rather than using the traditional zero-one matrix as regression targets. The learned target matrix can guarantee a large-margin constraint for the requirement of correct classification of each data point. Compared with traditional least squares regression (LSR) and a recently proposed discriminative LSR model, ReLSR is much more accurate in measuring the classification error of the regression model. Furthermore, ReLSR is a single and compact model, so there is no need to train multiple independent binary (two-class) machines. The convex optimization problem of ReLSR is solved elegantly and efficiently with an alternating procedure comprising regression and retargeting as substeps. Experimental evaluation over a range of databases demonstrates the validity of our method.
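For contrast with ReLSR, the conventional LSR baseline it improves upon, ridge-regularized regression onto a zero-one (one-hot) target matrix with prediction by arg-max, can be written in closed form. This sketch shows only that baseline on toy data; the retargeting substep that ReLSR adds is omitted.

```python
import numpy as np

def lsr_train(X, y, n_classes, lam=1e-2):
    """Ridge-regularized least squares regression onto one-hot targets:
    the conventional zero-one-matrix baseline that ReLSR replaces with
    learned targets. Returns the weight matrix W."""
    n, d = X.shape
    T = np.eye(n_classes)[y]                 # zero-one target matrix
    W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ T)
    return W

def lsr_predict(X, W):
    return np.argmax(X @ W, axis=1)          # arg-max over class scores

# toy two-class data, well separated along the first coordinate
rng = np.random.default_rng(0)
X0 = rng.standard_normal((50, 5)) + np.array([3, 0, 0, 0, 0])
X1 = rng.standard_normal((50, 5)) - np.array([3, 0, 0, 0, 0])
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)
W = lsr_train(X, y, n_classes=2)
acc = np.mean(lsr_predict(X, W) == y)
```

ReLSR's alternating procedure would interleave this regression step with a retargeting step that moves the targets to enforce a margin between the true-class score and the others.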
Sharma, Harshita; Zerbe, Norman; Klempert, Iris; Hellwich, Olaf; Hufnagl, Peter
2017-11-01
Deep learning using convolutional neural networks is an actively emerging field in histological image analysis. This study explores deep learning methods for computer-aided classification in H&E stained histopathological whole slide images of gastric carcinoma. An introductory convolutional neural network architecture is proposed for two computerized applications, namely, cancer classification based on immunohistochemical response and necrosis detection based on the existence of tumor necrosis in the tissue. The classification performance of the developed deep learning approach is quantitatively compared with traditional image analysis methods in digital histopathology that require prior computation of handcrafted features, such as statistical measures from gray-level co-occurrence matrices, Gabor filter-bank responses, LBP histograms, gray histograms, HSV histograms and RGB histograms, followed by random forest machine learning. Additionally, the widely known AlexNet deep convolutional framework is comparatively analyzed for the corresponding classification problems. The proposed convolutional neural network architecture reports favorable results, with an overall classification accuracy of 0.6990 for cancer classification and 0.8144 for necrosis detection.
EXhype: A tool for mineral classification using hyperspectral data
NASA Astrophysics Data System (ADS)
Adep, Ramesh Nityanand; shetty, Amba; Ramesh, H.
2017-02-01
Various supervised classification algorithms have been developed to classify earth surface features using hyperspectral data. Each algorithm is modelled on different human expertise. However, the performance of conventional algorithms is not satisfactory for mapping minerals in particular, in view of their typical spectral responses. This study introduces a new expert system named 'EXhype (Expert system for hyperspectral data classification)' to map minerals. The system incorporates human expertise at several stages of its implementation: (i) to deal with intra-class variation; (ii) to identify absorption features; (iii) to discriminate spectra by considering absorption features, non-absorption features and full-spectrum comparison; and (iv) to take a final decision based on learning and by emphasizing the most important features. It is developed using a knowledge base consisting of an Optimal Spectral Library, the Segmented Upper Hull method, the Spectral Angle Mapper (SAM) and an Artificial Neural Network. The performance of EXhype is compared with the traditional and most commonly used SAM algorithm using Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data acquired over Cuprite, Nevada, USA. A virtual verification method is used to collect sample information for accuracy assessment. Further, a modified accuracy assessment method is used to obtain realistic user's accuracies in cases where only a limited set of desired classes is considered for classification. With the modified accuracy assessment method, SAM and EXhype yield overall accuracies of 60.35% and 90.75% and kappa coefficients of 0.51 and 0.89, respectively. It was also found that the virtual verification method allows the use of the desired stratified random sampling method and eliminates the difficulties associated with it. The experimental results show that EXhype not only produces better accuracy than traditional SAM but also correctly classifies the minerals, and it is proficient in avoiding misclassification between target classes when applied to minerals.
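The SAM comparison that both methods build on reduces to measuring the angle between a pixel spectrum and each library spectrum and picking the smallest. A minimal sketch, in which the two reference spectra are made-up four-band examples rather than real library entries:

```python
import numpy as np

def spectral_angle(s, r):
    """Spectral angle (radians) between a pixel spectrum s and a
    reference spectrum r: arccos of their normalized dot product."""
    cos = np.dot(s, r) / (np.linalg.norm(s) * np.linalg.norm(r))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def sam_classify(pixel, library):
    """Assign the library class whose reference spectrum subtends the
    smallest angle with the pixel spectrum."""
    return min(library, key=lambda name: spectral_angle(pixel, library[name]))

# hypothetical 4-band reference spectra, for illustration only
lib = {"alunite":   np.array([0.2, 0.5, 0.8, 0.4]),
       "kaolinite": np.array([0.6, 0.3, 0.2, 0.7])}
print(sam_classify(np.array([0.21, 0.48, 0.79, 0.42]), lib))  # alunite
```

Because the angle ignores the magnitude of the spectrum, SAM is insensitive to overall illumination differences, which is part of why it is the standard baseline here.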
Feature ranking and rank aggregation for automatic sleep stage classification: a comparative study.
Najdi, Shirin; Gharbali, Ali Abdollahi; Fonseca, José Manuel
2017-08-18
Nowadays, sleep quality is one of the most important measures of healthy life, especially considering the huge number of sleep-related disorders. Identifying sleep stages using polysomnographic (PSG) signals is the traditional way of assessing sleep quality. However, the manual process of sleep stage classification is time-consuming, subjective and costly. Therefore, in order to improve the accuracy and efficiency of sleep stage classification, researchers have been trying to develop automatic classification algorithms. Automatic sleep stage classification mainly consists of three steps: pre-processing, feature extraction and classification. Since classification accuracy is deeply affected by the extracted features, a poor feature vector will adversely affect the classifier and eventually lead to low classification accuracy. Therefore, special attention should be given to the feature extraction and selection process. In this paper, the performance of seven feature selection methods, as well as two feature rank aggregation methods, was compared. Pz-Oz EEG, horizontal EOG and submental chin EMG recordings of 22 healthy males and females were used. A comprehensive set of 49 features was extracted from these recordings. The extracted features are among the most common and effective features used in sleep stage classification, drawn from temporal, spectral, entropy-based and nonlinear categories. The feature selection methods were evaluated and compared using three criteria: classification accuracy, stability, and similarity. Simulation results show that MRMR-MID achieves the highest classification performance, while the Fisher method provides the most stable ranking. In our simulations, the performance of the aggregation methods was average, although they are known to generate more stable results and better accuracy. The Borda and RRA rank aggregation methods could not significantly outperform the conventional feature ranking methods.
Among the conventional methods, some performed slightly better than others, although the choice of a suitable technique depends on the user's computational-complexity and accuracy requirements.
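Of the two aggregation methods compared, Borda count is the simpler: each ranker awards points by position and the totals define the aggregate order. A small sketch with hypothetical feature names:

```python
def borda_aggregate(rankings):
    """Borda-count rank aggregation: each ranker awards (m - position)
    points to every feature; features are re-ranked by total points.
    rankings: list of lists, each an ordering of the same m feature
    names, best first."""
    m = len(rankings[0])
    scores = {feat: 0 for feat in rankings[0]}
    for ranking in rankings:
        for pos, feat in enumerate(ranking):
            scores[feat] += m - pos
    return sorted(scores, key=scores.get, reverse=True)

# three hypothetical rankers ordering four sleep-staging features
r1 = ["entropy", "delta_power", "emg_var", "eog_corr"]
r2 = ["delta_power", "entropy", "eog_corr", "emg_var"]
r3 = ["entropy", "eog_corr", "delta_power", "emg_var"]
print(borda_aggregate([r1, r2, r3]))
```

Here entropy collects 4 + 3 + 4 = 11 points and tops the aggregate ranking even though one ranker placed it second, which is the smoothing effect aggregation is meant to provide.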
NASA Astrophysics Data System (ADS)
Jawak, Shridhar D.; Jadhav, Ajay; Luis, Alvarinho J.
2016-05-01
Supraglacial debris was mapped in the Schirmacher Oasis, east Antarctica, using WorldView-2 (WV-2) high-resolution optical remote sensing data consisting of 8-band calibrated, Gram-Schmidt (GS)-sharpened and atmospherically corrected WV-2 imagery. This study is a preliminary attempt to develop an object-oriented rule set to extract supraglacial debris in the Antarctic region from 8-spectral-band imagery. Supraglacial debris was manually digitized from the satellite imagery to generate ground reference data. Several trials were performed using a few existing traditional pixel-based classification techniques and color-texture-based object-oriented classification methods to extract supraglacial debris over a small domain of the study area. Multi-level segmentation and attributes such as scale, shape, size and compactness, along with spectral information from the data, were used for developing the rule set. A quantitative error analysis was carried out against the manually digitized reference data to test the practicability of our approach relative to the traditional pixel-based methods. Our results indicate that the OBIA-based approach for extracting supraglacial debris (overall accuracy: 93%) performed better than all the traditional pixel-based methods (overall accuracy: 80-85%). The present attempt provides a comprehensively improved method for semiautomatic feature extraction in the supraglacial environment and a new direction in cryospheric research.
Jamali, Jamshid; Ayatollahi, Seyyed Mohammad Taghi
2015-01-01
Background: Nurses constitute the majority of health care providers. Their mental health can affect the quality of services and patients’ satisfaction. The General Health Questionnaire (GHQ-12) is a general screening tool used to detect mental disorders. The scoring method and thresholds for this questionnaire are debatable, and the cut-off points can vary from sample to sample. This study was conducted to estimate the prevalence of mental disorders among Iranian nurses using the GHQ-12 and also to compare Latent Class Analysis (LCA) and K-means clustering with the traditional scoring method. Methodology: A cross-sectional study was carried out in the Fars and Bushehr provinces of southern Iran in 2014. Participants were 771 Iranian nurses, who filled out the GHQ-12 questionnaire. The traditional scoring method, LCA and K-means were used to estimate the prevalence of mental disorder among Iranian nurses. Cohen’s kappa statistic was applied to assess the agreement of LCA and K-means with the traditional scoring method of the GHQ-12. Results: The proportions of nurses with mental disorder identified by the scoring method, LCA and K-means were 36.3% (n=280), 32.2% (n=248), and 26.5% (n=204), respectively. LCA and logistic regression revealed that the prevalence of mental disorder in females was significantly higher than in males. Conclusion: Mental disorder in nurses was at a medium level compared to other people living in Iran. There was little difference between the prevalences of mental disorder estimated by the scoring method, K-means and LCA. Given the advantages of LCA over K-means and the differing results of the scoring method, we suggest LCA for the classification of Iranian nurses according to their mental health outcomes using the GHQ-12 questionnaire. PMID:26622202
Jamali, Jamshid; Ayatollahi, Seyyed Mohammad Taghi
2015-10-01
Nurses constitute the majority of health care providers. Their mental health can affect the quality of services and patients' satisfaction. The General Health Questionnaire (GHQ-12) is a general screening tool used to detect mental disorders. The scoring method and thresholds for this questionnaire are debatable, and the cut-off points can vary from sample to sample. This study was conducted to estimate the prevalence of mental disorders among Iranian nurses using the GHQ-12 and also to compare Latent Class Analysis (LCA) and K-means clustering with the traditional scoring method. A cross-sectional study was carried out in the Fars and Bushehr provinces of southern Iran in 2014. Participants were 771 Iranian nurses, who filled out the GHQ-12 questionnaire. The traditional scoring method, LCA and K-means were used to estimate the prevalence of mental disorder among Iranian nurses. Cohen's kappa statistic was applied to assess the agreement of LCA and K-means with the traditional scoring method of the GHQ-12. The proportions of nurses with mental disorder identified by the scoring method, LCA and K-means were 36.3% (n=280), 32.2% (n=248), and 26.5% (n=204), respectively. LCA and logistic regression revealed that the prevalence of mental disorder in females was significantly higher than in males. Mental disorder in nurses was at a medium level compared to other people living in Iran. There was little difference between the prevalences of mental disorder estimated by the scoring method, K-means and LCA. Given the advantages of LCA over K-means and the differing results of the scoring method, we suggest LCA for the classification of Iranian nurses according to their mental health outcomes using the GHQ-12 questionnaire.
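The agreement measure used here, Cohen's kappa, corrects raw agreement between two labelings for the agreement expected by chance. A minimal sketch for two binary labelings; the example labels are made up, not study data.

```python
def cohens_kappa(a, b):
    """Cohen's kappa between two binary labelings (e.g., LCA vs. the
    traditional GHQ-12 scoring): observed agreement corrected for the
    agreement expected by chance."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n           # observed agreement
    pa = (sum(a) / n) * (sum(b) / n) + \
         ((n - sum(a)) / n) * ((n - sum(b)) / n)         # chance agreement
    return (po - pa) / (1 - pa)

# toy labels: 1 = classified as having a mental disorder
scoring = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
lca     = [1, 0, 0, 0, 1, 0, 0, 0, 1, 1]
print(round(cohens_kappa(scoring, lca), 3))  # 0.583
```

Here raw agreement is 0.8, but since both methods flag 40% of cases, 0.52 of that agreement is expected by chance, leaving kappa at 0.28 / 0.48 ≈ 0.583.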
Job and Work Evaluation: A Literature Review.
ERIC Educational Resources Information Center
Heneman, Robert L.
2003-01-01
Describes advantages and disadvantages of work evaluation methods: ranking, market pricing, banding, classification, single-factor, competency, point-factor, and factor comparison. Compares work evaluation perspectives: traditional, realist, market advocate, strategist, organizational development, social reality, contingency theory, competency,…
Design of Clinical Support Systems Using Integrated Genetic Algorithm and Support Vector Machine
NASA Astrophysics Data System (ADS)
Chen, Yung-Fu; Huang, Yung-Fa; Jiang, Xiaoyi; Hsu, Yuan-Nian; Lin, Hsuan-Hung
A clinical decision support system (CDSS) provides knowledge and specific information to clinicians to enhance diagnostic efficiency and improve healthcare quality. An appropriate CDSS can greatly improve patient safety, improve healthcare quality, and increase cost-effectiveness. The support vector machine (SVM) is believed to be superior to traditional statistical and neural network classifiers. However, it is critical to determine a suitable combination of SVM parameters for good classification performance. A genetic algorithm (GA) can find an optimal solution within an acceptable time, and is faster than a greedy algorithm with an exhaustive search strategy. By taking advantage of the GA's ability to quickly select salient features and adjust SVM parameters, a method using an integrated GA and SVM (IGS), which differs from the traditional approach of using the GA for feature selection and the SVM for classification, was used to design CDSSs for prediction of successful ventilation weaning, diagnosis of patients with severe obstructive sleep apnea, and discrimination of different cell types from Pap smears. The results show that IGS is better than methods using the SVM alone or a linear discriminator.
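The GA side of this integration can be sketched with a toy fitness function. The code below evolves binary feature masks and scores each by the training accuracy of a nearest-centroid classifier, a deliberately simple stand-in for the SVM used in the paper; the population size, mutation rate, and data are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def fitness(mask, X, y):
    """Training accuracy of a nearest-centroid classifier restricted to
    the features selected by the binary mask (stand-in for the SVM)."""
    if not mask.any():
        return 0.0
    Xm = X[:, mask]
    c0, c1 = Xm[y == 0].mean(0), Xm[y == 1].mean(0)
    pred = (np.linalg.norm(Xm - c1, axis=1) <
            np.linalg.norm(Xm - c0, axis=1)).astype(int)
    return float(np.mean(pred == y))

def ga_select(X, y, pop=20, gens=15, p_mut=0.1):
    """Minimal GA: tournament selection, one-point crossover,
    bit-flip mutation, with the best mask ever seen kept as elite."""
    d = X.shape[1]
    population = rng.random((pop, d)) < 0.5
    best_mask, best_fit = None, -1.0
    for _ in range(gens):
        fits = np.array([fitness(m, X, y) for m in population])
        i = int(np.argmax(fits))
        if fits[i] > best_fit:               # keep the best-so-far
            best_fit, best_mask = fits[i], population[i].copy()
        def tourney():                       # size-2 tournament selection
            a, b = rng.integers(pop), rng.integers(pop)
            return population[a] if fits[a] >= fits[b] else population[b]
        children = []
        for _ in range(pop):
            pa, pb = tourney(), tourney()
            cut = rng.integers(1, d)         # one-point crossover
            child = np.concatenate([pa[:cut], pb[cut:]])
            children.append(child ^ (rng.random(d) < p_mut))  # mutation
        population = np.array(children)
    return best_mask, best_fit

# toy data: only the first two of eight features are informative
X = rng.standard_normal((80, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
mask, fit = ga_select(X, y)
```

The IGS approach of the paper additionally encodes the SVM's kernel parameters in the same chromosome, so feature selection and parameter tuning are searched jointly.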
Bayesian Redshift Classification of Emission-line Galaxies with Photometric Equivalent Widths
NASA Astrophysics Data System (ADS)
Leung, Andrew S.; Acquaviva, Viviana; Gawiser, Eric; Ciardullo, Robin; Komatsu, Eiichiro; Malz, A. I.; Zeimann, Gregory R.; Bridge, Joanna S.; Drory, Niv; Feldmeier, John J.; Finkelstein, Steven L.; Gebhardt, Karl; Gronwall, Caryl; Hagen, Alex; Hill, Gary J.; Schneider, Donald P.
2017-07-01
We present a Bayesian approach to the redshift classification of emission-line galaxies when only a single emission line is detected spectroscopically. We consider the case of surveys for high-redshift Lyα-emitting galaxies (LAEs), which have traditionally been classified via an inferred rest-frame equivalent width (EW) W_Lyα greater than 20 Å. Our Bayesian method relies on known prior probabilities in measured emission-line luminosity functions and EW distributions for the galaxy populations, and returns the probability that an object in question is an LAE given the characteristics observed. This approach will be directly relevant for the Hobby-Eberly Telescope Dark Energy Experiment (HETDEX), which seeks to classify ~10^6 emission-line galaxies into LAEs and low-redshift [O II] emitters. For a simulated HETDEX catalog with realistic measurement noise, our Bayesian method recovers 86% of LAEs missed by the traditional W_Lyα > 20 Å cutoff over 2 < z < 3, outperforming the EW cut in both contamination and incompleteness. This is due to the method's ability to trade off between the two types of binary classification error by adjusting the stringency of the probability requirement for classifying an observed object as an LAE. In our simulations of HETDEX, this method reduces the uncertainty in cosmological distance measurements by 14% with respect to the EW cut, equivalent to recovering 29% more cosmological information. Rather than using binary object labels, this method enables the use of classification probabilities in large-scale structure analyses. It can be applied to narrowband emission-line surveys as well as upcoming large spectroscopic surveys including Euclid and WFIRST.
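At its core, the classification rule is Bayes' theorem applied to two hypotheses: the posterior probability that a detection is an LAE follows from the class-conditional likelihoods (supplied in the paper by luminosity-function and EW priors) and the prior LAE fraction. The numbers below are purely illustrative, not values from the paper.

```python
def p_lae(like_lae, like_oii, prior_lae):
    """Posterior probability that a single-line detection is an LAE,
    given class-conditional likelihoods of the observed line properties
    under the LAE and [O II] hypotheses, and the prior LAE fraction."""
    num = like_lae * prior_lae
    return num / (num + like_oii * (1 - prior_lae))

# illustrative likelihoods and prior fraction
post = p_lae(like_lae=0.8, like_oii=0.3, prior_lae=0.4)
print(round(post, 3))  # 0.64
```

The trade-off the abstract mentions is then just the choice of posterior threshold: raising it above 0.5 reduces [O II] contamination at the cost of more missed LAEs, and vice versa.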
Schwartzkopf, Wade C; Bovik, Alan C; Evans, Brian L
2005-12-01
Traditional chromosome imaging has been limited to grayscale images, but recently a 5-fluorophore combinatorial labeling technique (M-FISH) was developed wherein each class of chromosomes binds with a different combination of fluorophores. This results in a multispectral image, where each class of chromosomes has distinct spectral components. In this paper, we develop new methods for automatic chromosome identification by exploiting the multispectral information in M-FISH chromosome images and by jointly performing chromosome segmentation and classification. We (1) develop a maximum-likelihood hypothesis test that uses multispectral information, together with conventional criteria, to select the best segmentation possibility; (2) use this likelihood function to combine chromosome segmentation and classification into a robust chromosome identification system; and (3) show that the proposed likelihood function can also be used as a reliable indicator of errors in segmentation, errors in classification, and chromosome anomalies, which can be indicators of radiation damage, cancer, and a wide variety of inherited diseases. We show that the proposed multispectral joint segmentation-classification method outperforms past grayscale segmentation methods when decomposing touching chromosomes. We also show that it outperforms past M-FISH classification techniques that do not use segmentation information.
Multi-level discriminative dictionary learning with application to large scale image classification.
Shen, Li; Sun, Gang; Huang, Qingming; Wang, Shuhui; Lin, Zhouchen; Wu, Enhua
2015-10-01
The sparse coding technique has shown flexibility and capability in image representation and analysis, and is a powerful tool in many visual applications. Some recent work has shown that incorporating the properties of the task (such as discrimination for classification tasks) into dictionary learning is effective for improving accuracy. However, traditional supervised dictionary learning methods suffer from high computational complexity when dealing with a large number of categories, making them less satisfactory in large-scale applications. In this paper, we propose a novel multi-level discriminative dictionary learning method and apply it to large-scale image classification. Our method takes advantage of hierarchical category correlation to encode multi-level discriminative information. Each internal node of the category hierarchy is associated with a discriminative dictionary and a classification model. The dictionaries at different layers are learnt to capture information at different scales. Moreover, each node at the lower layers also inherits the dictionary of its parent, so that the categories at lower layers can be described with multi-scale information. The learning of dictionaries and associated classification models is jointly conducted by minimizing an overall tree loss. The experimental results on challenging data sets demonstrate that our approach achieves excellent accuracy and competitive computation cost compared with other sparse coding methods for large-scale image classification.
The process and utility of classification and regression tree methodology in nursing research
Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda
2014-01-01
Aim This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design Discussion paper. Data sources English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for Nursing Research Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048
The process and utility of classification and regression tree methodology in nursing research.
Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda
2014-06-01
This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Discussion paper. English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984-2013. Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. © 2013 The Authors. Journal of Advanced Nursing Published by John Wiley & Sons Ltd.
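The heart of classification-tree methodology is recursive binary partitioning: at each node, scan candidate splits of the data and keep the one that minimizes an impurity measure such as the Gini index. A minimal single-split sketch on toy data (real CART software recurses on each side and prunes; this shows only the split search):

```python
import numpy as np

def gini(y):
    """Gini impurity of a binary outcome vector."""
    if len(y) == 0:
        return 0.0
    p = np.mean(y)
    return 2 * p * (1 - p)

def best_split(X, y):
    """Greedy CART-style split search: scan every (feature, threshold)
    pair and keep the one with the lowest weighted Gini impurity."""
    best = (None, None, gini(y))             # start from the unsplit node
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue
            g = (left.mean() * gini(y[left]) +
                 (1 - left.mean()) * gini(y[~left]))
            if g < best[2]:
                best = (j, t, g)
    return best

# toy example: outcome is driven by a threshold on the first covariate
rng = np.random.default_rng(3)
X = rng.random((60, 3))
y = (X[:, 0] > 0.5).astype(int)
feat, thr, impurity = best_split(X, y)
```

On these data the search recovers the generating rule, splitting on the first covariate near 0.5, which is exactly the kind of previously hidden interaction structure the paper argues the method can surface.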
Residential roof condition assessment system using deep learning
NASA Astrophysics Data System (ADS)
Wang, Fan; Kerekes, John P.; Xu, Zhuoyi; Wang, Yandong
2018-01-01
The emergence of high resolution (HR) and ultra high resolution (UHR) airborne remote sensing imagery is enabling humans to move beyond traditional land cover analysis applications to the detailed characterization of surface objects. A residential roof condition assessment method using techniques from deep learning is presented. The proposed method operates on individual roofs and divides the task into two stages: (1) roof segmentation, followed by (2) condition classification of the segmented roof regions. As the first step in this process, a self-tuning method is proposed to segment the images into small homogeneous areas. The segmentation is initialized with simple linear iterative clustering followed by deep learned feature extraction and region merging, with the optimal result selected by an unsupervised index, Q. After the segmentation, a pretrained residual network is fine-tuned on the augmented roof segments using a proposed k-pixel extension technique for classification. The effectiveness of the proposed algorithm was demonstrated on both HR and UHR imagery collected by EagleView over different study sites. The proposed algorithm has yielded promising results and has outperformed traditional machine learning methods using hand-crafted features.
NASA Astrophysics Data System (ADS)
Yao, C.; Zhang, Y.; Zhang, Y.; Liu, H.
2017-09-01
With the rapid development of Precision Agriculture (PA) promoted by high-resolution remote sensing, crop classification of high-resolution remote sensing images is of great significance for agricultural management and estimation. Because features and their surroundings are complex and fragmented at high resolution, the accuracy of traditional classification methods has not been able to meet the requirements of agricultural applications. Accordingly, this paper proposes a classification method for high-resolution agricultural remote sensing images based on convolutional neural networks (CNN). For training, a large number of training samples were produced from panchromatic images of China's GF-1 high-resolution satellite. In the experiment, through training and testing the CNN with the MATLAB deep learning toolbox, crop classification finally reached an accuracy of 99.66 % after gradual parameter tuning during training. By improving the accuracy of image classification and image recognition, this application of CNN provides a reference for the use of remote sensing in PA.
Yu, Yingyan
2014-01-01
Histopathological classification holds a pivotal position in both basic research and the clinical diagnosis and treatment of gastric cancer. Currently, different classification systems are used in basic science and clinical application. In the medical literature, different classifications are used, including the Lauren and WHO systems, which has confused many researchers. The Lauren classification was proposed half a century ago but is still used worldwide. It has the advantages of simplicity and ease of use, together with prognostic significance. The WHO classification scheme is better than the Lauren classification in that it is continuously revised according to progress in gastric cancer research, and it is routinely used in the clinical and pathological diagnosis of common scenarios. Along with progress in genomics, transcriptomics, proteomics and metabolomics research, molecular classification of gastric cancer has become a current hot topic. The traditional therapeutic approach based on the phenotypic characteristics of gastric cancer will most likely be replaced by one based on gene variation. Gene-targeted therapy against the same molecular variation seems more reasonable than traditional chemical treatment based on the same morphological change.
CW-SSIM kernel based random forest for image classification
NASA Astrophysics Data System (ADS)
Fan, Guangzhe; Wang, Zhou; Wang, Jiheng
2010-07-01
The complex wavelet structural similarity (CW-SSIM) index has been proposed as a powerful image similarity metric that is robust to translation, scaling, and rotation of images, but how to employ it in image classification applications has not been deeply investigated. In this paper, we incorporate CW-SSIM as a kernel function into a random forest learning algorithm. This leads to a novel image classification approach that does not require a feature extraction or dimension reduction stage at the front end. We use hand-written digit recognition as an example to demonstrate our algorithm. We compare the performance of the proposed approach with random forest learning based on other kernels, including the widely adopted Gaussian and inner product kernels. Empirical evidence shows that the proposed method is superior in its classification power. We also compare our approach with the direct random forest method without a kernel and with the popular kernel-learning method, the support vector machine. Our test results on both simulated and real-world data suggest that the proposed approach outperforms traditional methods without requiring a feature selection procedure.
A Hyperspectral Image Classification Method Using ISOMAP and RVM
NASA Astrophysics Data System (ADS)
Chang, H.; Wang, T.; Fang, H.; Su, Y.
2018-04-01
Classification is one of the most significant applications of hyperspectral image processing, and of remote sensing generally. Though various algorithms have been proposed to implement and improve this application, drawbacks remain in traditional classification methods. Thus further investigation of aspects such as dimension reduction, data mining, and rational use of spatial information is needed. In this paper, we use a widely utilized global manifold learning approach, isometric feature mapping (ISOMAP), to address the intrinsic nonlinearities of hyperspectral images for dimension reduction. Considering that Euclidean distance is ill-suited to spectral measurement, we substitute the spectral angle (SA) when constructing the neighbourhood graph. Then, the relevance vector machine (RVM) is introduced to perform classification instead of the support vector machine (SVM), for simplicity, generalization, and sparsity; a probabilistic result can thereby be obtained rather than a less informative binary one. Moreover, to take account of the spatial information in the hyperspectral image, we employ a spatial vector formed by the ratios of the different classes around each pixel. Finally, we combine the probabilistic results and spatial factors through a decision criterion to obtain the final classification. To verify the proposed method, we conducted multiple experiments on standard hyperspectral images and compared against several other methods. The results and various evaluation indexes illustrate the effectiveness of our method.
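The spectral angle substituted for Euclidean distance when building the ISOMAP neighbourhood graph can be sketched as follows (a minimal illustration with made-up three-band spectra; function and variable names are ours, not the paper's):

```python
import math

def spectral_angle(x, y):
    """Spectral angle between two spectra, in radians.
    Unlike Euclidean distance, it ignores overall brightness scaling."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    c = max(-1.0, min(1.0, dot / (nx * ny)))  # clamp before arccos
    return math.acos(c)

s1 = [0.2, 0.4, 0.6]
s2 = [0.4, 0.8, 1.2]   # same spectral shape, twice as bright
s3 = [0.6, 0.4, 0.2]   # reversed shape
assert spectral_angle(s1, s2) < 1e-9   # brightness-invariant
assert spectral_angle(s1, s3) > 0.5    # genuinely different shape
```

A k-nearest-neighbour graph built on this measure, rather than on Euclidean distance, then feeds ISOMAP's shortest-path and embedding stages.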
Sequence-based classification and identification of fungi
USDA-ARS?s Scientific Manuscript database
Fungal taxonomy and ecology have been revolutionized by the application of molecular methods and both have increasing connections to genomics and functional biology. However, data streams from traditional specimen- and culture-based systematics are not yet fully integrated with those from metagenomi...
LDA boost classification: boosting by topics
NASA Astrophysics Data System (ADS)
Lei, La; Qiao, Guo; Qimin, Cao; Qitao, Li
2012-12-01
AdaBoost is an efficacious classification algorithm, especially in text categorization (TC) tasks. Setting up a committee of classifiers and voting on documents can achieve high categorization precision. However, the traditional vector space model easily leads to the curse of dimensionality and feature sparsity, which seriously affects classification performance. This article proposes a novel classification algorithm called LDABoost, based on the boosting ideology, which uses Latent Dirichlet Allocation (LDA) to model the feature space. Instead of words or phrases, LDABoost uses latent topics as the features, significantly reducing the feature dimension. An improved naïve Bayes (NB) classifier is designed as the weak learner; it keeps the efficiency advantage of the classic NB algorithm while achieving higher precision. Moreover, a two-stage iterative weighting method, called Cute Integration in this article, is proposed to improve accuracy by integrating the weak classifiers into a strong classifier in a more rational way. Mutual information is used as the metric for weight allocation, and the voting information and categorization decisions of the base classifiers are fully utilized when generating the strong classifier. Experimental results reveal that LDABoost, categorizing in a low-dimensional space, achieves higher accuracy than traditional AdaBoost algorithms and many other classic classification algorithms. Moreover, its runtime is lower than that of different versions of AdaBoost and of TC algorithms based on support vector machines and neural networks.
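LDABoost builds on the standard AdaBoost reweighting step, in which misclassified documents are up-weighted before the next weak learner is trained. A minimal sketch of that generic step (not of the LDA topic modeling or the Cute Integration stage, which the abstract does not specify in detail) might look like:

```python
import math

def adaboost_round(weights, correct):
    """One boosting round: weights are current sample weights, correct is
    a per-sample boolean from the current weak learner."""
    err = sum(w for w, c in zip(weights, correct) if not c)
    alpha = 0.5 * math.log((1 - err) / err)  # weak learner's vote weight
    new_w = [w * math.exp(-alpha if c else alpha)
             for w, c in zip(weights, correct)]
    z = sum(new_w)                            # renormalize to sum to 1
    return alpha, [w / z for w in new_w]

w0 = [0.25] * 4
alpha, w1 = adaboost_round(w0, [True, True, True, False])
# The misclassified fourth sample now carries half the total weight.
```

Repeating this round with successive weak learners, and combining their votes with the alpha weights, yields the strong classifier.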
Cascaded deep decision networks for classification of endoscopic images
NASA Astrophysics Data System (ADS)
Murthy, Venkatesh N.; Singh, Vivek; Sun, Shanhui; Bhattacharya, Subhabrata; Chen, Terrence; Comaniciu, Dorin
2017-02-01
Both traditional and wireless capsule endoscopes can generate tens of thousands of images for each patient. It is desirable to have the majority of irrelevant images filtered out by automatic algorithms during an offline review process, or to have automatic indication of highly suspicious areas during online guidance. This also applies to the newly invented endomicroscopy, where online indication of tumor classification plays a significant role. Image classification is a standard pattern recognition problem and is well studied in the literature. However, performance on challenging endoscopic images still has room for improvement. In this paper, we present a novel Cascaded Deep Decision Network (CDDN) to improve image classification performance over standard deep-neural-network-based methods. During the learning phase, CDDN automatically builds a network which discards samples that are classified with high confidence scores by a previously trained network and concentrates only on the challenging samples, which are handled by subsequent expert shallow networks. We validate CDDN using two different types of endoscopic imaging, a polyp classification dataset and a tumor classification dataset. On both datasets we show that CDDN can outperform other methods by about 10%. In addition, CDDN can also be applied to other image classification problems.
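The cascade idea, in which confidently classified samples are resolved early and only hard samples reach the next expert stage, can be sketched with toy stand-in classifiers (the deep networks themselves are not reproduced here; thresholds and functions below are illustrative):

```python
def cascade_predict(x, stage1, stage2, threshold=0.9):
    """Return (label, which_stage): stage1 answers only when confident."""
    label, confidence = stage1(x)
    if confidence >= threshold:
        return label, "stage1"
    return stage2(x)[0], "stage2"

def stage1(x):   # toy classifier: confident only far from the boundary at 0
    return (1 if x > 0 else 0), min(1.0, abs(x))

def stage2(x):   # toy "expert" fallback that always answers
    return (1 if x > -0.1 else 0), 1.0

assert cascade_predict(2.0, stage1, stage2) == (1, "stage1")
assert cascade_predict(-0.05, stage1, stage2) == (1, "stage2")
```

In CDDN the later stages are trained only on the samples the earlier stages pass along, which is what lets them specialize on the hard cases.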
Natural image classification driven by human brain activity
NASA Astrophysics Data System (ADS)
Zhang, Dai; Peng, Hanyang; Wang, Jinqiao; Tang, Ming; Xue, Rong; Zuo, Zhentao
2016-03-01
Natural image classification has been a hot topic in computer vision and pattern recognition research. Since the performance of an image classification system can be improved by feature selection, many image feature selection methods have been developed. However, existing supervised feature selection methods are typically driven by class label information that is identical for different samples from the same class, ignoring within-class image variability and therefore degrading feature selection performance. In this study, we propose a novel feature selection method driven by human brain activity signals collected with the fMRI technique while human subjects viewed natural images of different categories. The fMRI signals associated with subjects viewing different images encode the human perception of natural images, and therefore may capture image variability within and across categories. We then select image features under the guidance of fMRI signals from brain regions that respond actively to image viewing. In particular, bag-of-words features based on the GIST descriptor are extracted from natural images for classification, and a sparse-regression-based feature selection method is adapted to select the image features that best predict the fMRI signals. Finally, a classification model is built on the selected image features to classify images without fMRI signals. Validation experiments classifying images from four categories, for two subjects, demonstrated that our method achieves much better classification performance than classifiers built on image features selected by traditional feature selection methods.
Wang, Jie; Zeng, Hao-Long; Du, Hongying; Liu, Zeyuan; Cheng, Ji; Liu, Taotao; Hu, Ting; Kamal, Ghulam Mustafa; Li, Xihai; Liu, Huili; Xu, Fuqiang
2018-03-01
Metabolomics generates a profile of the small molecules produced by cellular and tissue metabolism, which can directly reflect the mechanisms of complex networks of biochemical reactions. Traditional metabolomics methods, such as OPLS-DA and PLS-DA, are mainly used for binary class discrimination, yet multiple groups are usually involved in biological systems, especially in brain research: multiple brain regions are involved in neuronal studies of brain metabolic dysfunctions such as alcoholism and Alzheimer's disease. In the current study, 10 different brain regions were utilized for comparative studies between alcohol-preferring and non-preferring rats, and between male and female rats. As many classes are involved (ten different regions and four types of animals), traditional metabolomics methods are no longer efficient at showing differentiation. Here, a novel strategy based on the decision tree algorithm was employed to construct classification models that screen out the major characteristics of the ten brain regions at the same time. Subsequently, this method was also utilized to select the brain regions most related to alcohol preference and gender difference. Compared with traditional multivariate statistical methods, the decision tree could construct acceptable and understandable classification models for multi-class data analysis. The approach could therefore also be applied to other metabolomics studies involving multi-class data. Copyright © 2017 Elsevier B.V. All rights reserved.
A General Architecture for Intelligent Tutoring of Diagnostic Classification Problem Solving
Crowley, Rebecca S.; Medvedeva, Olga
2003-01-01
We report on a general architecture for creating knowledge-based medical training systems to teach diagnostic classification problem solving. The approach is informed by our previous work describing the development of expertise in classification problem solving in Pathology. The architecture envelops the traditional Intelligent Tutoring System design within the Unified Problem-solving Method description Language (UPML) architecture, supporting component modularity and reuse. Based on the domain ontology, domain task ontology and case data, the abstract problem-solving methods of the expert model create a dynamic solution graph. Student interaction with the solution graph is filtered through an instructional layer, which is created by a second set of abstract problem-solving methods and pedagogic ontologies, in response to the current state of the student model. We outline the advantages and limitations of this general approach, and describe its implementation in SlideTutor, a developing Intelligent Tutoring System in Dermatopathology. PMID:14728159
Study on a pattern classification method of soil quality based on simplified learning sample dataset
Zhang, Jiahua; Liu, S.; Hu, Y.; Tian, Y.
2011-01-01
Given the massive amount of soil information involved in current soil quality grade evaluation, this paper constructs an intelligent classification approach for soil quality grade based on classical sampling techniques and an unordered multi-class logistic regression model. As a case study, the learning sample capacity was determined under a given confidence level and estimation accuracy, and the c-means algorithm was used to automatically extract a simplified learning sample dataset from the cultivated soil quality grade evaluation database for the study area, Longchuan County in Guangdong Province; an unordered logistic classifier model was then built, and the computational steps of intelligent soil quality grade classification are given. The results indicate that soil quality grade can be effectively learned and predicted from the extracted simplified dataset through this method, changing the traditional approach to soil quality grade evaluation. © 2011 IEEE.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pichara, Karim; Protopapas, Pavlos
We present an automatic classification method for astronomical catalogs with missing data. We use Bayesian networks, a probabilistic graphical model that allows us to perform inference to predict missing values given observed data and dependency relationships between variables. To learn a Bayesian network from incomplete data, we use an iterative algorithm that utilizes sampling methods and expectation maximization to estimate the distributions and probabilistic dependencies of variables from data with missing values. To test our model, we use three catalogs with missing data (SAGE, Two Micron All Sky Survey, and UBVI) and one complete catalog (MACHO). We examine how classification accuracy changes when information from missing-data catalogs is included, how our method compares to traditional missing-data approaches, and at what computational cost. Integrating these catalogs with missing data, we find that classification of variable objects improves by a few percent overall, and by 15% for quasar detection, while keeping the computational cost the same.
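The iterative fill-in idea behind learning from incomplete data can be illustrated with a much simpler stand-in: impute missing entries from the current estimates, re-estimate, and repeat until stable. This column-mean toy (ours, not the paper's Bayesian-network model, which conditions on dependencies between variables) shows only the loop structure:

```python
def iterative_impute(rows, n_iter=10):
    """rows: list of numeric records; None marks a missing value.
    Fill missing entries with column means, re-estimate, repeat."""
    ncol = len(rows[0])
    filled = [[v if v is not None else 0.0 for v in r] for r in rows]
    for _ in range(n_iter):
        means = [sum(row[j] for row in filled) / len(filled)
                 for j in range(ncol)]
        for i, r in enumerate(rows):
            for j in range(ncol):
                if r[j] is None:
                    filled[i][j] = means[j]
    return filled

data = [[1.0, 10.0], [3.0, None], [5.0, 30.0]]
out = iterative_impute(data)   # the missing entry settles near 20.0
```

Expectation maximization replaces the column mean with the model's conditional expectation given the observed entries, which is what lets the imputations respect the learned dependencies.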
Convex formulation of multiple instance learning from positive and unlabeled bags.
Bao, Han; Sakai, Tomoya; Sato, Issei; Sugiyama, Masashi
2018-05-24
Multiple instance learning (MIL) is a variation of traditional supervised learning problems where data (referred to as bags) are composed of sub-elements (referred to as instances) and only bag labels are available. MIL has a variety of applications such as content-based image retrieval, text categorization, and medical diagnosis. Most previous work on MIL assumes that training bags are fully labeled. However, it is often difficult to obtain enough labeled bags in practical situations, while many unlabeled bags are available. A learning framework called PU classification (positive and unlabeled classification) can address this problem. In this paper, we propose a convex PU classification method to solve an MIL problem. We experimentally show that the proposed method achieves better performance with significantly lower computation costs than an existing method for PU-MIL. Copyright © 2018 Elsevier Ltd. All rights reserved.
Clonal Selection Based Artificial Immune System for Generalized Pattern Recognition
NASA Technical Reports Server (NTRS)
Huntsberger, Terry
2011-01-01
The last two decades have seen a rapid increase in the application of AIS (Artificial Immune Systems), modeled after the human immune system, to a wide range of areas including network intrusion detection, job shop scheduling, classification, pattern recognition, and robot control. JPL (Jet Propulsion Laboratory) has developed an integrated pattern recognition/classification system called AISLE (Artificial Immune System for Learning and Exploration) based on biologically inspired models of B-cell dynamics in the immune system. When used for unsupervised or supervised classification, the method scales linearly with the number of dimensions, has performance that is relatively independent of the total size of the dataset, and has been shown to perform as well as traditional clustering methods. When used for pattern recognition, the method efficiently isolates the appropriate matches in the data set. The paper presents the underlying structure of AISLE and the results from a number of experimental studies.
NASA Astrophysics Data System (ADS)
Liu, J.; Lan, T.; Qin, H.
2017-10-01
Traditional data cleaning identifies dirty data by classifying original data sequences, which is a class-imbalanced problem since the proportion of incorrect data is much smaller than that of correct data for most diagnostic systems in Magnetic Confinement Fusion (MCF) devices. When machine learning algorithms are used to classify diagnostic data based on a class-imbalanced training set, most classifiers are biased towards the major class and show very poor classification rates on the minor class. By transforming the direct classification problem over original data sequences into a classification problem over the physical similarity between data sequences, the class-balancing effect of the Time-Domain Global Similarity (TDGS) method on training set structure is investigated in this paper. Meanwhile, the impact of the improved training set structure on the data cleaning performance of the TDGS method is demonstrated with an application example in the EAST POlarimetry-INTerferometry (POINT) system.
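Why recasting sequence classification as pair-similarity classification rebalances the classes can be seen with a toy count (the numbers are made up, and the similarity judgment itself is not modeled; we simply treat pairs of two correct sequences as "similar" and all other pairs as "dissimilar"):

```python
from math import comb

def class_ratio_direct(n_ok, n_bad):
    """Minor-class fraction when classifying sequences directly."""
    return n_bad / (n_ok + n_bad)

def class_ratio_pairs(n_ok, n_bad):
    """Minor-class fraction over unordered pairs: any pair containing at
    least one incorrect sequence counts as 'dissimilar'."""
    total = comb(n_ok + n_bad, 2)
    similar = comb(n_ok, 2)
    return (total - similar) / total

print(class_ratio_direct(95, 5))   # 0.05
print(class_ratio_pairs(95, 5))    # about 0.098
```

Because every incorrect sequence contributes to many mixed pairs, the rare class is better represented in the pairwise training set than in the original one.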
Sentiment analysis of feature ranking methods for classification accuracy
NASA Astrophysics Data System (ADS)
Joseph, Shashank; Mugauri, Calvin; Sumathy, S.
2017-11-01
Text pre-processing and feature selection are important and critical steps in text mining. Pre-processing large volumes of text is a difficult task, as unstructured raw data are converted into a structured format, and traditional methods of processing and weighting take much time and are less accurate. To overcome this challenge, feature ranking techniques have been devised. The feature set from text pre-processing is fed as input to feature selection, which helps improve text classification accuracy. Of the three feature selection categories available, the filter category is the focus here. Five feature ranking methods, namely document frequency, standard deviation, information gain, chi-square, and weighted log-likelihood ratio, are analyzed.
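One of the filter methods listed, the chi-square statistic over a 2x2 term/class contingency table, can be sketched as follows (toy counts and our own function names, for illustration only):

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic of a 2x2 term/class table.
    n11: in-class docs containing the term, n10: in-class docs without it,
    n01: other-class docs with the term, n00: other-class docs without it."""
    n = n11 + n10 + n01 + n00
    num = n * (n11 * n00 - n10 * n01) ** 2
    den = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    return num / den

# Term A concentrates in the positive class; term B is spread evenly.
score_a = chi_square(40, 10, 10, 40)
score_b = chi_square(25, 25, 25, 25)
assert score_a > score_b   # A ranks higher as a class-discriminating feature
```

Ranking all terms by this score and keeping the top k is the essence of filter-style feature selection: no classifier is trained during the ranking itself.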
Zu, Chen; Jie, Biao; Liu, Mingxia; Chen, Songcan
2015-01-01
Multimodal classification methods using different modalities of imaging and non-imaging data have recently shown great advantages over traditional single-modality-based ones for diagnosis and prognosis of Alzheimer’s disease (AD), as well as its prodromal stage, i.e., mild cognitive impairment (MCI). However, to the best of our knowledge, most existing methods focus on mining the relationship across multiple modalities of the same subjects, while ignoring the potentially useful relationship across different subjects. Accordingly, in this paper, we propose a novel learning method for multimodal classification of AD/MCI, by fully exploring the relationships across both modalities and subjects. Specifically, our proposed method includes two subsequent components, i.e., label-aligned multi-task feature selection and multimodal classification. In the first step, feature selection learning from multiple modalities is treated as a set of different learning tasks, and a group sparsity regularizer is imposed to jointly select a subset of relevant features. Furthermore, to utilize the discriminative information among labeled subjects, a new label-aligned regularization term is added to the objective function of standard multi-task feature selection, where label alignment means that all multi-modality subjects with the same class labels should be closer in the new feature-reduced space. In the second step, a multi-kernel support vector machine (SVM) is adopted to fuse the selected features from the multi-modality data for final classification. To validate our method, we perform experiments on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database using baseline MRI and FDG-PET imaging data. The experimental results demonstrate that our proposed method achieves better classification performance compared with several state-of-the-art methods for multimodal classification of AD/MCI. PMID:26572145
Learning classification models with soft-label information.
Nguyen, Quang; Valizadegan, Hamed; Hauskrecht, Milos
2014-01-01
Learning of classification models in medicine often relies on data labeled by a human expert. Since labeling of clinical data may be time-consuming, finding ways of alleviating the labeling costs is critical for our ability to automatically learn such models. In this paper we propose a new machine learning approach that learns improved binary classification models more efficiently by refining the binary class information in the training phase with soft labels that reflect how strongly the human expert feels about the original class labels. Two types of methods that can learn improved binary classification models from soft labels are proposed. The first relies on probabilistic/numeric labels, the other on ordinal categorical labels. We study and demonstrate the benefits of these methods for learning an alerting model for heparin-induced thrombocytopenia. The experiments are conducted on data from 377 patient instances labeled by three different human experts. The methods are compared using the area under the receiver operating characteristic curve (AUC). Our results show that the new approach learns classification models more efficiently than traditional learning methods, and the improvement in AUC is most remarkable when the number of examples we learn from is small. A classification learning framework that lets us learn from auxiliary soft-label information provided by a human expert is a promising direction for learning classification models from expert labels while reducing the time and cost needed to label data.
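One simple way to use soft labels, not necessarily the paper's, is to convert the expert's confidence into per-example weights; a toy 1-D threshold learner then minimizes confidence-weighted error, so uncertain examples pull less on the decision rule (all data below are made up):

```python
def weighted_threshold(xs, labels, confidences):
    """Pick the 1-D threshold minimizing confidence-weighted error:
    predict 1 when x >= threshold."""
    best = (float("inf"), None)
    for t in sorted(xs):
        err = sum(c for x, y, c in zip(xs, labels, confidences)
                  if (x >= t) != (y == 1))
        best = min(best, (err, t))
    return best[1]

xs     = [0.1, 0.4, 0.5, 0.9]
labels = [0,   0,   1,   1]
conf   = [1.0, 0.3, 0.9, 1.0]   # the expert was unsure about the second case
t = weighted_threshold(xs, labels, conf)
```

With only hard labels, every disagreement costs the same; the soft labels let a confidently labeled example outvote an uncertain one near the boundary.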
Apocrine Hidradenocarcinoma of the Scalp: A Classification Conundrum
Cassarino, David S.; Shih, Hubert B.; Abemayor, Elliot; John, Maie St.
2008-01-01
Introduction The classification of malignant sweat gland lesions is complex. Traditionally, cutaneous sweat gland tumors have been classified by either eccrine or apocrine features. Methods A case report of a 33-year-old Hispanic man with a left scalp mass diagnosed as a malignancy of adnexal origin preoperatively is discussed. After presentation at our multidisciplinary tumor board, excision with ipsilateral neck dissection was undertaken. Results Final pathology revealed an apocrine hidradenocarcinoma. The classification and behavior of this entity are discussed in this report. Conclusion Apocrine hidradenocarcinoma can be viewed as an aggressive malignant lesion of cutaneous sweat glands on a spectrum that involves both eccrine and apoeccrine lesions. PMID:20596988
Deep feature extraction and combination for synthetic aperture radar target classification
NASA Astrophysics Data System (ADS)
Amrani, Moussa; Jiang, Feng
2017-10-01
Feature extraction has always been a difficult problem in synthetic aperture radar automatic target recognition (SAR-ATR), and selecting discriminative features to train a classifier is an important prerequisite. Inspired by the great success of the convolutional neural network (CNN), we address SAR target classification by proposing a feature extraction method that exploits deep features extracted from CNNs on SAR images, yielding more discriminative features and a more robust representation. First, the pretrained VGG-S network is fine-tuned on the moving and stationary target acquisition and recognition (MSTAR) public release database. Second, after simple preprocessing, the fine-tuned network is used as a fixed feature extractor to extract deep features from the processed SAR images. Third, the extracted deep features are fused using traditional concatenation and a discriminant correlation analysis algorithm. Finally, for target classification, a K-nearest neighbors algorithm based on LogDet divergence-based metric learning with triplet constraints is adopted as the baseline classifier. Experiments on MSTAR demonstrate that the proposed method outperforms state-of-the-art methods in classification accuracy.
Local curvature analysis for classifying breast tumors: Preliminary analysis in dedicated breast CT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Juhun, E-mail: leej15@upmc.edu; Nishikawa, Robert M.; Reiser, Ingrid
2015-09-15
Purpose: The purpose of this study is to measure the effectiveness of local curvature measures as novel image features for classifying breast tumors. Methods: A total of 119 breast lesions from 104 noncontrast dedicated breast computed tomography images of women were used in this study. Volumetric segmentation was done using a seed-based segmentation algorithm, and a triangulated surface was extracted from the resulting segmentation. Total, mean, and Gaussian curvatures were then computed. Normalized curvatures were used as classification features. In addition, traditional image features were also extracted, and a forward feature selection scheme was used to select the optimal feature set. Logistic regression was used as a classifier, and leave-one-out cross-validation was utilized to evaluate the classification performance of the features. The area under the receiver operating characteristic curve (AUC) was used as a figure of merit. Results: Among the curvature measures, the normalized total curvature (C_T) showed the best classification performance (AUC of 0.74), while the others showed no classification power individually. Five traditional image features (two shape, two margin, and one texture descriptor) were selected via the feature selection scheme, and the resulting classifier achieved an AUC of 0.83. Among those five features, the radial gradient index (RGI), a margin descriptor, showed the best individual classification performance (AUC of 0.73). A classifier combining RGI and C_T yielded an AUC of 0.81, similar (i.e., with no statistically significant difference) to the classifier with the five traditional image features. Additional comparisons of AUC values between classifiers using different combinations of traditional image features and C_T showed that C_T was able to replace the other four image features for this classification task.
Conclusions: The normalized curvature measure contains useful information for classifying breast tumors. Using it, one can reduce the number of features in a classifier, which may result in classifiers that are more robust across datasets.
A Novel Feature Selection Technique for Text Classification Using Naïve Bayes.
Dey Sarkar, Subhajit; Goswami, Saptarsi; Agarwal, Aman; Aktar, Javed
2014-01-01
With the proliferation of unstructured data, text classification or text categorization has found many applications in topic classification, sentiment analysis, authorship identification, spam detection, and so on. There are many classification algorithms available, and naïve Bayes remains one of the oldest and most popular. On one hand, the implementation of naïve Bayes is simple; on the other hand, it also requires less training data. However, the literature reports that naïve Bayes performs poorly compared to other classifiers in text classification, which makes it unattractive in spite of the simplicity and intuitiveness of the model. In this paper, we propose a two-step feature selection method based first on univariate feature selection and then on feature clustering, where the univariate step reduces the search space and clustering then selects relatively independent feature sets. We demonstrate the effectiveness of our method by a thorough evaluation and comparison over 13 datasets. The performance improvement thus achieved makes naïve Bayes comparable or superior to other classifiers. The proposed algorithm is also shown to outperform traditional methods like greedy-search-based wrappers or CFS.
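For reference, a minimal multinomial naïve Bayes classifier with Laplace smoothing, the baseline that the proposed feature selection is meant to strengthen, looks like this (tiny made-up corpus; function names are ours):

```python
import math
from collections import Counter

def train_nb(docs):
    """docs: list of (tokens, label). Returns per-class priors and
    Laplace-smoothed log-likelihood tables."""
    labels = Counter(lbl for _, lbl in docs)
    vocab = {w for toks, _ in docs for w in toks}
    word_counts = {lbl: Counter() for lbl in labels}
    for toks, lbl in docs:
        word_counts[lbl].update(toks)
    model = {}
    for lbl in labels:
        total = sum(word_counts[lbl].values())
        model[lbl] = (
            math.log(labels[lbl] / len(docs)),
            {w: math.log((word_counts[lbl][w] + 1) / (total + len(vocab)))
             for w in vocab},
        )
    return model

def classify(model, tokens):
    def score(lbl):
        prior, like = model[lbl]
        return prior + sum(like.get(w, 0.0) for w in tokens)
    return max(model, key=score)

docs = [(["cheap", "pills"], "spam"), (["meeting", "agenda"], "ham"),
        (["cheap", "offer"], "spam"), (["project", "agenda"], "ham")]
model = train_nb(docs)
assert classify(model, ["cheap"]) == "spam"
assert classify(model, ["agenda"]) == "ham"
```

Feature selection slots in before `train_nb`: pruning the vocabulary to informative, mutually independent terms is what makes the conditional-independence assumption less damaging.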
NASA Astrophysics Data System (ADS)
Wozniak, Breann M.
The purpose of this study was to examine the effect of process-oriented guided-inquiry learning (POGIL) on non-majors college biology students' understanding of biological classification. This study addressed an area of science instruction, POGIL in the non-majors college biology laboratory, which has yet to be qualitatively and quantitatively researched. A concurrent triangulation mixed methods approach was used. Students' understanding of biological classification was measured in two areas: scores on pre- and posttests (consisting of 11 multiple choice questions), and conceptions of classification as elicited in pre- and post-interviews and instructor reflections. Participants were Minnesota State University, Mankato students enrolled in BIOL 100 Summer Session. One section was taught with the traditional curriculum (n = 6) and the other with the POGIL curriculum (n = 10) developed by the researcher. Three students from each section were selected to take part in pre- and post-interviews. No differences within either teaching method reached significance (p < .05), though there were tendencies in the means: the POGIL group may have scored higher on the posttest (M = 8.830 +/- .477 vs. M = 7.330 +/- .330; z = -1.729, p = .084), and the traditional group may have scored higher on the pretest than the posttest (M = 8.333 +/- .333 vs. M = 7.333 +/- .333; z = -1.650, p = .099). Two themes emerged from the interviews and instructor reflections: 1) After instruction, students had a more extensive understanding of classification in three areas: vocabulary terms, physical characteristics, and types of evidence used to classify. Both groups extended their understanding, but only POGIL students could explain how molecular evidence is used in classification.
2) The challenges preventing students from understanding classification were: familiar animal categories and aquatic habitats, unfamiliar organisms, combining and subdividing initial groupings, and the hierarchical nature of classification. The POGIL students were the only group to surpass these challenges after the teaching intervention. This study shows that POGIL is an effective technique at eliciting students' misconceptions, and addressing these misconceptions, leading to an increase in student understanding of biological classification.
Unsupervised Cryo-EM Data Clustering through Adaptively Constrained K-Means Algorithm
Xu, Yaofang; Wu, Jiayi; Yin, Chang-Cheng; Mao, Youdong
2016-01-01
In single-particle cryo-electron microscopy (cryo-EM), the K-means clustering algorithm is widely used in unsupervised 2D classification of projection images of biological macromolecules. 3D ab initio reconstruction requires accurate unsupervised classification in order to separate molecular projections of distinct orientations. Due to background noise in single-particle images and uncertainty of molecular orientations, the traditional K-means clustering algorithm may classify images into wrong classes and produce classes with a large variation in membership. Overcoming these limitations requires further development of clustering algorithms for cryo-EM data analysis. We propose a novel unsupervised data clustering method building upon the traditional K-means algorithm. By introducing an adaptive constraint term in the objective function, our algorithm not only avoids a large variation in class sizes but also produces more accurate data clustering. Applications of this approach to both simulated and experimental cryo-EM data demonstrate that our algorithm is a significantly improved alternative to the traditional K-means algorithm in single-particle cryo-EM analysis. PMID:27959895
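The constraint idea, penalizing assignments to clusters that are already large, can be sketched in 1-D as follows (the penalty form and all names are illustrative, not the paper's exact adaptive objective):

```python
import random

def balanced_kmeans(xs, k, lam=0.0, n_iter=20, seed=0):
    """1-D K-means whose assignment step adds lam * (current cluster size)
    to the squared distance, discouraging oversized clusters.
    lam=0 recovers plain K-means."""
    rng = random.Random(seed)
    centers = rng.sample(xs, k)
    assign = []
    for _ in range(n_iter):
        sizes = [0] * k
        assign = []
        for x in xs:
            j = min(range(k),
                    key=lambda c: (x - centers[c]) ** 2 + lam * sizes[c])
            sizes[j] += 1
            assign.append(j)
        for c in range(k):
            members = [x for x, j in zip(xs, assign) if j == c]
            if members:
                centers[c] = sum(members) / len(members)
    return assign

# Two well-separated 1-D clusters; plain K-means (lam=0) recovers them.
assign = balanced_kmeans([0.0, 0.1, 0.2, 10.0, 10.1, 10.2], 2)
```

Raising `lam` trades pure distance for balance in the assignment step, which is the qualitative effect the adaptive constraint term aims for on noisy projection images.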
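The adaptive size constraint described in the abstract above can be pictured with a toy K-means loop whose assignment cost adds a penalty proportional to a cluster's current membership, discouraging lopsided classes. This is a hedged sketch of the general idea only; the `lam` penalty and deterministic initialization are assumptions, not the paper's exact objective function or update rule.

```python
def constrained_kmeans(points, k, lam=0.5, iters=10):
    """Toy K-means whose assignment cost adds lam * (current cluster
    size), penalizing very uneven memberships. Centers are initialized
    from evenly spaced samples so the sketch is deterministic."""
    centers = [points[i * len(points) // k] for i in range(k)]
    labels = []
    for _ in range(iters):
        sizes = [0] * k
        labels = []
        for p in points:
            # cost = squared distance + size penalty on each candidate cluster
            costs = [sum((a - b) ** 2 for a, b in zip(p, c)) + lam * sizes[j]
                     for j, c in enumerate(centers)]
            j = min(range(k), key=costs.__getitem__)
            labels.append(j)
            sizes[j] += 1
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(xs) / len(xs) for xs in zip(*members))
    return labels
```

On two well-separated blobs the size penalty is dwarfed by the distance term, so the usual K-means solution is recovered; the penalty only matters when clusters compete for borderline points.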
Soil texture classification algorithm using RGB characteristics of soil images
USDA-ARS?s Scientific Manuscript database
Soil texture has an important influence on agriculture, affecting crop selection, movement of nutrients and water, soil electrical conductivity, and crop growth. Soil texture has traditionally been determined in the laboratory using pipette and hydrometer methods that require a considerable amount o...
Semi-Supervised Projective Non-Negative Matrix Factorization for Cancer Classification.
Zhang, Xiang; Guan, Naiyang; Jia, Zhilong; Qiu, Xiaogang; Luo, Zhigang
2015-01-01
Advances in DNA microarray technologies have made gene expression profiles a significant candidate in identifying different types of cancers. Traditional learning-based cancer identification methods utilize labeled samples to train a classifier, but they are inconvenient for practical application because labels are quite expensive in the clinical cancer research community. This paper proposes a semi-supervised projective non-negative matrix factorization method (Semi-PNMF) to learn an effective classifier from both labeled and unlabeled samples, thus boosting subsequent cancer classification performance. In particular, Semi-PNMF jointly learns a non-negative subspace from concatenated labeled and unlabeled samples and indicates classes by the positions of the maximum entries of their coefficients. Because Semi-PNMF incorporates statistical information from the large volume of unlabeled samples in the learned subspace, it can learn more representative subspaces and boost classification performance. We developed a multiplicative update rule (MUR) to optimize Semi-PNMF and proved its convergence. The experimental results of cancer classification for two multiclass cancer gene expression profile datasets show that Semi-PNMF outperforms the representative methods.
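The multiplicative update rule (MUR) that Semi-PNMF relies on builds on the classic NMF updates. The sketch below is the plain unsupervised variant with assumed toy dimensions, not the paper's semi-supervised projective formulation; its key property is that each update keeps the factors non-negative and does not increase the reconstruction error.

```python
import numpy as np

def nmf_mur(V, rank, iters=100, seed=0, eps=1e-9):
    """Classic NMF via multiplicative update rules: W and H stay
    non-negative because each step multiplies by a non-negative ratio."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 0.1
    H = rng.random((rank, V.shape[1])) + 0.1
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update coefficients
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update basis
    return W, H
```

In the semi-supervised setting, class labels of the labeled samples additionally constrain which entries of the coefficient matrix may dominate; the class of a sample is then read off as the position of its maximum coefficient.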
Post-boosting of classification boundary for imbalanced data using geometric mean.
Du, Jie; Vong, Chi-Man; Pun, Chi-Man; Wong, Pak-Kin; Ip, Weng-Fai
2017-12-01
In this paper, a novel imbalance learning method for binary classes is proposed, named Post-Boosting of classification boundary for Imbalanced data (PBI), which can significantly improve the performance of any trained neural network (NN) classification boundary. The procedure of PBI simply consists of two steps: an (imbalanced) NN learning method is first applied to produce a classification boundary, which is then adjusted by PBI under the geometric mean (G-mean). For data imbalance, the geometric mean of the accuracies of both minority and majority classes is considered, which is statistically more suitable than the common accuracy metric. PBI also has the following advantages over traditional imbalance methods: (i) PBI can significantly improve the classification accuracy on the minority class while improving or maintaining that on the majority class; (ii) PBI is suitable for large data, even with a high imbalance ratio (up to 0.001). For the evaluation of (i), a new metric called the Majority loss/Minority advance ratio (MMR) is proposed that evaluates the loss ratio of the majority class to the minority class. Experiments have been conducted for PBI and several imbalance learning methods over benchmark datasets of different sizes, different imbalance ratios, and different dimensionalities. By analyzing the experimental results, PBI is shown to outperform other imbalance learning methods on almost all datasets. Copyright © 2017 Elsevier Ltd. All rights reserved.
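The G-mean criterion at the heart of PBI is straightforward to compute, and the simplest conceivable form of post-hoc boundary adjustment is a threshold sweep over a trained classifier's scores. The sketch below illustrates only that criterion; PBI itself adjusts an NN decision boundary, not merely a scalar threshold, and the score values here are made up for illustration.

```python
import math

def g_mean(y_true, y_pred):
    """Geometric mean of minority (1) and majority (0) class accuracies."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    sens = tp / pos if pos else 0.0
    spec = tn / neg if neg else 0.0
    return math.sqrt(sens * spec)

def best_threshold(y_true, scores):
    """Post-hoc adjustment of the decision threshold to maximize G-mean."""
    best_t, best_g = 0.5, -1.0
    for t in sorted(set(scores)):
        pred = [1 if s >= t else 0 for s in scores]
        g = g_mean(y_true, pred)
        if g > best_g:
            best_t, best_g = t, g
    return best_t, best_g
```

On imbalanced data the G-mean-optimal threshold typically sits below the default 0.5, trading a little majority-class accuracy for a large minority-class gain.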
NASA Astrophysics Data System (ADS)
Zhao, Lili; Yin, Jianping; Yuan, Lihuan; Liu, Qiang; Li, Kuan; Qiu, Minghui
2017-07-01
Automatic detection of abnormal cells from cervical smear images is in strong demand for the annual diagnosis of women's cervical cancer. For this medical cell recognition problem, there are three different feature sections, namely cytology morphology, nuclear chromatin pathology and region intensity. The challenges of this problem come from combining these features and performing classification accurately and efficiently. Thus, we propose an efficient abnormal cervical cell detection system based on a multi-instance extreme learning machine (MI-ELM) to deal with these two questions in one unified framework. MI-ELM is one of the most promising supervised learning classifiers, as it can handle several feature sections and realistic classification problems analytically. Experiment results over the Herlev dataset demonstrate that the proposed method outperforms three traditional methods for two-class classification in terms of better accuracy and less time.
Gu, Yao; Ni, Yongnian; Kokot, Serge
2012-09-13
A novel, simple and direct fluorescence method for analysis of complex substances and their potential substitutes has been researched and developed. Measurements involved excitation and emission (EEM) fluorescence spectra of powdered, complex, medicinal herbs, Cortex Phellodendri Chinensis (CPC) and the similar Cortex Phellodendri Amurensis (CPA); these substances were compared and discriminated from each other and the potentially adulterated samples (Caulis mahoniae (CM) and David poplar bark (DPB)). Different chemometrics methods were applied for resolution of the complex spectra, and the excitation spectra were found to be the most informative; only the rank-ordering PROMETHEE method was able to classify the samples with single ingredients (CPA, CPC, CM) or those with binary mixtures (CPA/CPC, CPA/CM, CPC/CM). Interestingly, it was essential to use the geometrical analysis for interactive aid (GAIA) display for a full understanding of the classification results. However, these two methods, like the other chemometrics models, were unable to classify composite spectral matrices consisting of data from samples of single ingredients and binary mixtures; this suggested that the excitation spectra of the different samples were very similar. However, the method is useful for classification of single-ingredient samples and, separately, their binary mixtures; it may also be applied for similar classification work with other complex substances.
De Souza, Daiana A; Wang, Ying; Kaftanoglu, Osman; De Jong, David; Amdam, Gro V; Gonçalves, Lionel S; Francoy, Tiago M
2015-01-01
In vitro rearing is an important and useful tool for honey bee (Apis mellifera L.) studies. However, it often results in intercastes between queens and workers, which normally are not seen in hive-reared bees, except when larvae older than three days are grafted for queen rearing. Morphological classification (queen versus worker or intercastes) of bees produced by this method can be subjective and generally depends on size differences. Here, we propose an alternative method for caste classification of female honey bees reared in vitro, based on weight at emergence, ovariole number, spermatheca size, and the size, shape, and features of the head, mandible and basitarsus. Morphological measurements were made with both traditional morphometric and geometric morphometric techniques. The classifications were performed by principal component analysis, using naturally developed queens and workers as controls. First, the analysis included all the characters. Subsequently, a new analysis was made without the information about ovariole number and spermatheca size. Geometric morphometrics was less dependent on ovariole number and spermatheca information for caste and intercaste identification. This is useful, since acquiring information concerning these reproductive structures requires time-consuming dissection, and they are not accessible when abdomens have been removed for molecular assays or in dried specimens. Additionally, geometric morphometrics divided intercastes into more discrete phenotype subsets. We conclude that geometric morphometrics is superior to traditional morphometric techniques for identification and classification of honey bee castes and intermediates.
Fast Image Texture Classification Using Decision Trees
NASA Technical Reports Server (NTRS)
Thompson, David R.
2011-01-01
Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation-hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.
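The "integral image" transform mentioned above is what makes these integer-only texture features cheap: once the cumulative table is built, any rectangular box sum costs just four lookups, regardless of box size. A minimal sketch (generic technique, not NASA's flight implementation):

```python
def integral_image(img):
    """Cumulative-sum table with a zero border row/column, so
    ii[y][x] = sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = (img[y][x] + ii[y][x + 1]
                                + ii[y + 1][x] - ii[y][x])
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum over img[top:bottom][left:right] in O(1): four lookups."""
    return (ii[bottom][right] - ii[top][right]
            - ii[bottom][left] + ii[top][left])
```

Differences of such box sums over neighboring windows yield the integer texture descriptors that a decision tree can split on without any floating-point convolutions.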
Lin, Chin; Hsu, Chia-Jung; Lou, Yu-Sheng; Yeh, Shih-Jen; Lee, Chia-Cheng; Su, Sui-Lung; Chen, Hsiang-Cheng
2017-11-06
Automated disease code classification using free-text medical information is important for public health surveillance. However, traditional natural language processing (NLP) pipelines are limited, so we propose a method combining word embedding with a convolutional neural network (CNN). Our objective was to compare the performance of traditional pipelines (NLP plus supervised machine learning models) with that of word embedding combined with a CNN in conducting a classification task identifying International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) diagnosis codes in discharge notes. We used 2 classification methods: (1) extracting from discharge notes some features (terms, n-gram phrases, and SNOMED CT categories) that we used to train a set of supervised machine learning models (support vector machine, random forests, and gradient boosting machine), and (2) building a feature matrix, by a pretrained word embedding model, that we used to train a CNN. We used these methods to identify the chapter-level ICD-10-CM diagnosis codes in a set of discharge notes. We conducted the evaluation using 103,390 discharge notes covering patients hospitalized from June 1, 2015 to January 31, 2017 in the Tri-Service General Hospital in Taipei, Taiwan. We used the receiver operating characteristic curve as an evaluation measure, and calculated the area under the curve (AUC) and F-measure as the global measure of effectiveness. In 5-fold cross-validation tests, our method had a higher testing accuracy (mean AUC 0.9696; mean F-measure 0.9086) than traditional NLP-based approaches (mean AUC range 0.8183-0.9571; mean F-measure range 0.5050-0.8739). A real-world simulation that split the training sample and the testing sample by date verified this result (mean AUC 0.9645; mean F-measure 0.9003 using the proposed method). 
Further analysis showed that the convolutional layers of the CNN effectively identified a large number of keywords and automatically extracted enough concepts to predict the diagnosis codes. Word embedding combined with a CNN showed outstanding performance compared with traditional methods, needing very little data preprocessing. This shows that future studies will not be limited by incomplete dictionaries. A large amount of unstructured information from free-text medical writing will be extracted by automated approaches in the future, and we believe that the health care field is about to enter the age of big data. ©Chin Lin, Chia-Jung Hsu, Yu-Sheng Lou, Shih-Jen Yeh, Chia-Cheng Lee, Sui-Lung Su, Hsiang-Cheng Chen. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 06.11.2017.
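The keyword-spotting behavior of the convolutional layers described above can be pictured as a single filter sliding over a sequence of word embeddings, followed by max-over-time pooling. The embeddings and filter weights below are toy values chosen for illustration; they are not a pretrained model and not the study's actual architecture.

```python
def conv_maxpool(embeddings, sentence, filt, bias=0.0):
    """One text-CNN building block: slide a filter of width len(filt)
    over the embedded token sequence, apply ReLU, and keep the
    max-over-time activation as the feature value."""
    seq = [embeddings[w] for w in sentence]
    width = len(filt)
    acts = []
    for i in range(len(seq) - width + 1):
        s = bias
        for j in range(width):
            # dot product of filter row j with embedding at position i+j
            s += sum(a * b for a, b in zip(seq[i + j], filt[j]))
        acts.append(max(0.0, s))   # ReLU
    return max(acts)
```

A filter like the one in the test acts as a detector for a particular bigram pattern; hundreds of such filters in parallel give the "large number of keywords" effect the authors report.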
Wang, Shijun; McKenna, Matthew T; Nguyen, Tan B; Burns, Joseph E; Petrick, Nicholas; Sahiner, Berkman; Summers, Ronald M
2012-05-01
In this paper, we present development and testing results for a novel colonic polyp classification method for use as part of a computed tomographic colonography (CTC) computer-aided detection (CAD) system. Inspired by the interpretative methodology of radiologists using 3-D fly-through mode in CTC reading, we have developed an algorithm which utilizes sequences of images (referred to here as videos) for classification of CAD marks. For each CAD mark, we created a video composed of a series of intraluminal, volume-rendered images visualizing the detection from multiple viewpoints. We then framed the video classification question as a multiple-instance learning (MIL) problem. Since a positive (negative) bag may contain negative (positive) instances, which in our case depends on the viewing angles and camera distance to the target, we developed a novel MIL paradigm to accommodate this class of problems. We solved the new MIL problem by maximizing an L2-norm soft margin using semidefinite programming, which can optimize relevant parameters automatically. We tested our method by analyzing a CTC data set obtained from 50 patients from three medical centers. Our proposed method showed significantly better performance compared with several traditional MIL methods.
Short text sentiment classification based on feature extension and ensemble classifier
NASA Astrophysics Data System (ADS)
Liu, Yang; Zhu, Xie
2018-05-01
With the rapid development of Internet social media, mining the emotional tendencies of short texts from the Internet to acquire useful information has attracted the attention of researchers. At present, the commonly used approaches can be divided into rule-based classification and statistical machine learning classification methods. Although micro-blog sentiment analysis has made good progress, there still exist some shortcomings, such as limited accuracy and a strong dependence of the sentiment classification effect on the extracted features. Aiming at the characteristics of Chinese short texts, such as little information, sparse features, and diverse expressions, this paper considers expanding the original text by mining related semantic information from the reviews, forwarding, and other related information. First, this paper uses Word2vec to compute word similarity to extend the feature words. It then uses an ensemble classifier composed of SVM, KNN and HMM to analyze the emotion of short micro-blog texts. The experimental results show that the proposed method can make good use of the comment and forwarding information to extend the original features. Compared with the traditional method, the accuracy, recall and F1 value obtained by this method are all improved.
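The two steps above, similarity-based feature extension and ensemble voting, can be sketched in a few lines. The word vectors below are toy values standing in for trained Word2vec embeddings, and the voting function is plain majority vote; both are assumptions for illustration, not the paper's exact configuration.

```python
import math

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = (math.sqrt(sum(a * a for a in u))
           * math.sqrt(sum(b * b for b in v)))
    return num / den if den else 0.0

def expand_features(words, vectors, vocab, topn=1, min_sim=0.8):
    """Extend a sparse short text with its most similar vocabulary
    words, mimicking Word2vec-based feature extension."""
    extended = list(words)
    for w in words:
        if w not in vectors:
            continue
        sims = sorted(((cosine(vectors[w], vectors[c]), c)
                       for c in vocab if c != w and c in vectors),
                      reverse=True)
        extended += [c for s, c in sims[:topn] if s >= min_sim]
    return extended

def majority_vote(predictions):
    """Combine labels from several base classifiers (e.g. SVM, KNN, HMM)."""
    return max(set(predictions), key=predictions.count)
```

Extending "happy" with "glad" gives the downstream classifiers a denser feature vector, which is exactly what sparse micro-blog texts lack.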
Deep Learning Methods for Underwater Target Feature Extraction and Recognition
Peng, Yuan; Qiu, Mengran; Shi, Jianfei; Liu, Liangliang
2018-01-01
The classification and recognition of underwater acoustic signals have always been an important research topic in the field of underwater acoustic signal processing. Currently, wavelet transforms, the Hilbert-Huang transform, and Mel frequency cepstral coefficients are used as methods of underwater acoustic signal feature extraction. In this paper, a method for feature extraction and identification of underwater noise data based on CNN and ELM is proposed. An automatic feature extraction method for underwater acoustic signals using a deep convolutional network is proposed, and an underwater target recognition classifier is built on an extreme learning machine. Although convolutional neural networks can execute both feature extraction and classification, their classification function mainly relies on a fully connected layer trained by gradient descent; its generalization ability is limited and suboptimal, so an extreme learning machine (ELM) was used in the classification stage. Firstly, the CNN learns deep and robust features, followed by the removal of the fully connected layers. Then an ELM fed with the CNN features is used as the classifier to conduct an excellent classification. Experiments on an actual data set of civil ships obtained a 93.04% recognition rate; compared to traditional Mel frequency cepstral coefficients and Hilbert-Huang features, the recognition rate is greatly improved. PMID:29780407
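The appeal of the ELM classification stage is that training is closed-form: fix a random hidden layer, then solve the output weights with a pseudo-inverse, with no gradient descent at all. The sketch below uses random toy features in place of the paper's CNN features; dimensions and data are assumptions for illustration.

```python
import numpy as np

def elm_train(X, y, n_hidden=20, seed=0):
    """Extreme learning machine: random input weights, tanh hidden
    layer, output weights solved in closed form via pseudo-inverse."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)           # hidden-layer activations
    beta = np.linalg.pinv(H) @ y     # least-squares output weights
    return (W, b, beta)

def elm_predict(X, model):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta
```

With enough random hidden units the hidden-layer matrix has full row rank on small datasets, so even a non-linearly-separable problem like XOR is fit exactly.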
Sub-pixel image classification for forest types in East Texas
NASA Astrophysics Data System (ADS)
Westbrook, Joey
Sub-pixel classification is the extraction of information about the proportion of individual materials of interest within a pixel. Landcover classification at the sub-pixel scale provides more discrimination than traditional per-pixel multispectral classifiers for pixels where the material of interest is mixed with other materials. It allows for the un-mixing of pixels to show the proportion of each material of interest. The materials of interest for this study are pine, hardwood, mixed forest and non-forest. The goal of this project was to perform a sub-pixel classification, which allows a pixel to have multiple labels, and compare the result to a traditional supervised classification, which allows a pixel to have only one label. The satellite image used was a Landsat 5 Thematic Mapper (TM) scene of the Stephen F. Austin Experimental Forest in Nacogdoches County, Texas, and the four cover type classes are pine, hardwood, mixed forest and non-forest. Once classified, a multi-layer raster dataset was created that comprised four raster layers, where each layer showed the percentage of that cover type within the pixel area. Percentage cover type maps were then produced, and the accuracy of each was assessed using a fuzzy error matrix for the sub-pixel classifications; the results were compared to the supervised classification, for which a traditional error matrix was used. The sub-pixel classification using the aerial photo for both training and reference data had the highest overall accuracy (65%) of the three sub-pixel classifications. This was understandable because the analyst can visually observe the cover types actually on the ground for training data and reference data, whereas using the FIA (Forest Inventory and Analysis) plot data, the analyst must assume that an entire pixel contains the exact percentage of a cover type found in a plot. 
An increase in accuracy was found after reclassifying each sub-pixel classification from nine classes with 10 percent intervals to five classes with 20 percent intervals. When compared to the supervised classification, which had a satisfactory overall accuracy of 90%, none of the sub-pixel classifications achieved the same level. However, since traditional per-pixel classifiers assign only one label to pixels throughout the landscape while sub-pixel classifications assign multiple labels to each pixel, the traditional 85% accuracy threshold of acceptance for pixel-based classifications should not apply to sub-pixel classifications. More research is needed in order to define the level of accuracy that is deemed acceptable for sub-pixel classifications.
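The underlying idea of un-mixing a pixel into cover-type proportions can be sketched with plain linear spectral unmixing: solve for the fractions whose weighted sum of endmember spectra best reproduces the observed pixel. This is a generic sketch, not the study's classifier, and the two-band endmember spectra in the test are made-up values.

```python
import numpy as np

def unmix(pixel, endmembers):
    """Estimate sub-pixel fractions by least squares: pixel is modeled
    as fractions @ endmembers (one endmember spectrum per cover type).
    Fractions are then clipped to be non-negative and renormalized to
    sum to 1, so they read as proportions of the pixel area."""
    E = np.asarray(endmembers, float)
    f, *_ = np.linalg.lstsq(E.T, np.asarray(pixel, float), rcond=None)
    f = np.clip(f, 0.0, None)
    return f / f.sum()
```

A pixel that is 70% pine and 30% hardwood spectrally would thus receive both labels with those proportions, instead of the single "pine" label a per-pixel classifier would force.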
Group Treatment in Acquired Brain Injury Rehabilitation
ERIC Educational Resources Information Center
Bertisch, Hilary; Rath, Joseph F.; Langenbahn, Donna M.; Sherr, Rose Lynn; Diller, Leonard
2011-01-01
The current article describes critical issues in adapting traditional group-treatment methods for working with individuals with reduced cognitive capacity secondary to acquired brain injury. Using the classification system based on functional ability developed at the NYU Rusk Institute of Rehabilitation Medicine (RIRM), we delineate the cognitive…
K-Nearest Neighbor Algorithm Optimization in Text Categorization
NASA Astrophysics Data System (ADS)
Chen, Shufeng
2018-01-01
The K-Nearest Neighbor (KNN) classification algorithm is one of the simplest methods of data mining. It has been widely used in classification, regression and pattern recognition. The traditional KNN method has some shortcomings, such as a large amount of sample computation and strong dependence on the sample library capacity. In this paper, a method of representative sample optimization based on the CURE algorithm is proposed. On this basis, a quick algorithm, QKNN (quick k-nearest neighbor), is presented to find the k nearest neighbor samples, which greatly reduces the similarity calculations. The experimental results show that this algorithm can effectively reduce the number of samples and speed up the search for the k nearest neighbor samples to improve the performance of the algorithm.
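The representative-sample idea can be miniaturized as follows: collapse each class to a centroid (a crude stand-in for the paper's CURE-based selection, which keeps several scattered representatives per cluster) and run the nearest-neighbor search against that much smaller set instead of the full sample library.

```python
def reduce_samples(samples, labels):
    """Collapse each class to its centroid; a toy substitute for
    CURE-style representative-sample selection."""
    groups = {}
    for s, l in zip(samples, labels):
        groups.setdefault(l, []).append(s)
    return [(tuple(sum(xs) / len(xs) for xs in zip(*v)), l)
            for l, v in groups.items()]

def quick_knn(query, reps):
    """Classify against the small representative set, cutting the
    number of similarity computations from |library| to |reps|."""
    d2 = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q))
    return min(reps, key=lambda rl: d2(query, rl[0]))[1]
```

With two classes reduced to two centroids, each query costs two distance computations instead of one per training sample.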
Using self-organizing maps to develop ambient air quality classifications: a time series example
2014-01-01
Background: Development of exposure metrics that capture features of the multipollutant environment is needed to investigate health effects of pollutant mixtures. This is a complex problem that requires development of new methodologies. Objective: Present a self-organizing map (SOM) framework for creating ambient air quality classifications that group days with similar multipollutant profiles. Methods: Eight years of day-level data from Atlanta, GA, for ten ambient air pollutants collected at a central monitor location were classified using SOM into a set of day types based on their day-level multipollutant profiles. We present strategies for using SOM to develop a multipollutant metric of air quality and compare results with more traditional techniques. Results: Our analysis found that 16 types of days reasonably describe the day-level multipollutant combinations that appear most frequently in our data. Multipollutant day types ranged from conditions when all pollutants measured low to days exhibiting relatively high concentrations of primary pollutants, secondary pollutants, or both. The temporal nature of class assignments indicated substantial heterogeneity in day type frequency distributions (~1%-14%), relatively short-term durations (<2 day persistence), and long-term and seasonal trends. Meteorological summaries revealed strong day type weather dependencies, and pollutant concentration summaries provided interesting scenarios for further investigation. Comparison with traditional methods found SOM produced similar classifications with added insight regarding between-class relationships. Conclusion: We find SOM to be an attractive framework for developing ambient air quality classifications because the approach eases interpretation of results by allowing users to visualize classifications on an organized map. 
The presented approach provides an appealing tool for developing multipollutant metrics of air quality that can be used to support multipollutant health studies. PMID:24990361
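A self-organizing map of the kind described can be sketched in a few lines: each day's multipollutant profile pulls its best-matching unit, and that unit's neighbors on the map, toward it. The 1-D map, deterministic initialization from the data, and two-pollutant toy profiles below are assumptions for a reproducible sketch, not the study's ten-pollutant configuration.

```python
def train_som(data, n_units=4, iters=200):
    """Minimal 1-D self-organizing map. Units are initialized from the
    data for determinism; the neighborhood is just the adjacent units."""
    units = [list(data[i % len(data)]) for i in range(n_units)]
    for t in range(iters):
        lr = 0.5 * (1 - t / iters)            # decaying learning rate
        x = data[t % len(data)]
        bmu = min(range(n_units), key=lambda u: sum(
            (w - a) ** 2 for w, a in zip(units[u], x)))
        for u in range(n_units):
            # BMU moves most; its map neighbors move half as much
            pull = lr if u == bmu else lr / 2 if abs(u - bmu) == 1 else 0.0
            units[u] = [w + pull * (a - w) for w, a in zip(units[u], x)]
    return units

def day_type(profile, units):
    """Assign a day to the unit (day type) with the closest profile."""
    return min(range(len(units)), key=lambda u: sum(
        (w - a) ** 2 for w, a in zip(units[u], profile)))
```

After training, days with similar profiles land on the same or adjacent units, which is what makes the organized map easy to read as a set of day types.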
A Data Field Method for Urban Remotely Sensed Imagery Classification Considering Spatial Correlation
NASA Astrophysics Data System (ADS)
Zhang, Y.; Qin, K.; Zeng, C.; Zhang, E. B.; Yue, M. X.; Tong, X.
2016-06-01
Spatial correlation between pixels is important information for remotely sensed imagery classification. The data field method and spatial autocorrelation statistics have been utilized to describe and model the spatial information of local pixels. The original data field method can represent the spatial interactions of neighbourhood pixels effectively. However, its focus on measuring the grey level change between the central pixel and the neighbourhood pixels results in exaggerating the contribution of the central pixel to the whole local window. In addition, Geary's C has also been proven to characterise and quantify the spatial correlation between each pixel and its neighbourhood pixels well. But the extracted object is badly delineated, with the distracting salt-and-pepper effect of isolated misclassified pixels. To correct this defect, we introduce the data field method for filtering and noise limitation. Moreover, the original data field method is enhanced by considering each pixel in the window as the central pixel to compute statistical characteristics between it and its neighbourhood pixels. The last step employs a support vector machine (SVM) for the classification of multiple features (e.g. the spectral feature and the spatial correlation feature). In order to validate the effectiveness of the developed method, experiments are conducted on different remotely sensed images containing multiple complex object classes. The results show that the developed method outperforms the traditional method in terms of classification accuracies.
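Geary's C, the autocorrelation statistic discussed above, is straightforward to compute given a 0/1 spatial-weight matrix: values near 1 indicate no spatial correlation, below 1 positive correlation, above 1 negative. The four-pixel chain in the test is a toy neighborhood, not the paper's image windows.

```python
def gearys_c(values, weights):
    """Geary's C over a set of pixels. weights[i][j] is the spatial
    weight between pixels i and j (e.g. 1 for adjacent pixels, else 0).
    C = (n-1) * sum_ij w_ij (x_i - x_j)^2 / (2 * W * sum_i (x_i - mean)^2)."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((v - mean) ** 2 for v in values)
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * (values[i] - values[j]) ** 2
              for i in range(n) for j in range(n))
    return (n - 1) * num / (2 * w_sum * ss)
```

Smooth pixel runs (similar neighbors) drive C below 1, while checkerboard-like alternation drives it above 1, which is why thresholding C alone produces the salt-and-pepper misclassifications the authors set out to filter.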
Zargaran, Arman; Sakhteman, Amirhossein; Faridi, Pouya; Daneshamouz, Saeid; Akbarizadeh, Amin Reza; Borhani-Haghighi, Afshin; Mohagheghzadeh, Abdolali
2017-01-01
Herbal oils have been widely used as medicinal compounds in Iran, dating back thousands of years. Chamomile oil is a widely used example of a traditional oil. We remade chamomile oils and tried to modify them with current knowledge and facilities. Six types of oil (traditional and modified) were prepared. Microbial limit tests and physicochemical tests were performed on them. Also, principal component analysis, hierarchical cluster analysis, and partial least squares discriminant analysis were performed on the spectral data of attenuated total reflectance-infrared in order to obtain insight into the classification pattern of the samples. The results show that we can use modified versions of the chamomile oils (modified Clevenger-type apparatus method and microwave method) with the same content as traditional ones and with less microbial contamination and better physicochemical properties. PMID:28585466
Gene selection for tumor classification using neighborhood rough sets and entropy measures.
Chen, Yumin; Zhang, Zunjun; Zheng, Jianzhong; Ma, Ying; Xue, Yu
2017-03-01
With the development of bioinformatics, tumor classification from gene expression data has become an important and useful technology for cancer diagnosis. Since gene expression data often contain thousands of genes but only a small number of samples, gene selection is a key step for tumor classification. Attribute reduction from rough set theory has been successfully applied to gene selection, as it is data-driven and requires no additional information. However, the traditional rough set method deals with discrete data only. Gene expression data containing real-valued or noisy measurements are usually discretized in a preprocessing step, which may result in poor classification accuracy. In this paper, we propose a novel gene selection method based on the neighborhood rough set model, which can deal with real-valued data while maintaining the original gene classification information. Moreover, this paper introduces an entropy measure within the framework of neighborhood rough sets for tackling the uncertainty and noise of gene expression data. This measure enables the discovery of compact gene subsets. Finally, a gene selection algorithm is designed based on neighborhood granules and the entropy measure. Experiments on two gene expression datasets show that the proposed gene selection is an effective method for improving the accuracy of tumor classification. Copyright © 2017 Elsevier Inc. All rights reserved.
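The neighborhood granules mentioned above can be given a concrete flavor with a small sketch: each sample's granule is the set of samples within a Euclidean radius delta, and an entropy-style measure shrinks as granules grow. This is a generic neighborhood-entropy illustration, not the paper's exact formulation, and the radius values are arbitrary.

```python
# Minimal sketch of a neighborhood-entropy measure on real-valued features.
import numpy as np

def neighborhood_entropy(X, delta):
    """Average -log of the relative neighborhood (granule) size per sample."""
    n = X.shape[0]
    # Pairwise Euclidean distances between all samples
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    sizes = (d <= delta).sum(axis=1)   # each sample belongs to its own granule
    return -np.mean(np.log(sizes / n))

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))
e_small = neighborhood_entropy(X, delta=0.5)   # tiny granules, high entropy
e_large = neighborhood_entropy(X, delta=5.0)   # broad granules, low entropy
```

A gene-selection loop would evaluate this measure on candidate feature subsets and keep those that best preserve the class-conditional granule structure.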
Chinese Sentence Classification Based on Convolutional Neural Network
NASA Astrophysics Data System (ADS)
Gu, Chengwei; Wu, Ming; Zhang, Chuang
2017-10-01
Sentence classification is one of the significant issues in Natural Language Processing (NLP), and feature extraction is often regarded as its key point. Traditional machine learning approaches, such as the Naive Bayes model, cannot take high-level features into consideration. Neural networks for sentence classification can make use of contextual information to achieve better results. In this paper, we focus on classifying Chinese sentences, and most importantly, we propose a novel Convolutional Neural Network (CNN) architecture for Chinese sentence classification. In particular, whereas most previous methods use a softmax classifier for prediction, we embed a linear support vector machine in place of softmax in the deep neural network model, minimizing a margin-based loss to obtain a better result. We also use tanh as the activation function instead of ReLU. The CNN model improves the results of Chinese sentence classification tasks. Experimental results on a Chinese news title database validate the effectiveness of our model.
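The margin-based loss that replaces softmax cross-entropy can be written down compactly. The sketch below is a plain numpy illustration of a Weston-Watkins-style multiclass hinge loss, not the paper's full CNN; the score values are made-up examples.

```python
# Multiclass hinge (margin-based) loss, as used in place of softmax.
import numpy as np

def multiclass_hinge(scores, labels, margin=1.0):
    """Penalize each wrong class whose score comes within `margin`
    of the true-class score; zero loss when the margin is satisfied."""
    n = scores.shape[0]
    correct = scores[np.arange(n), labels][:, None]
    losses = np.maximum(0.0, scores - correct + margin)
    losses[np.arange(n), labels] = 0.0   # no penalty for the true class
    return losses.sum(axis=1).mean()

scores = np.array([[2.0, 0.1, -1.0],    # margin satisfied: zero loss
                   [1.2, 1.5,  0.2]])   # class 0 is within the margin
labels = np.array([0, 1])
loss = multiclass_hinge(scores, labels)  # (0 + 0.7) / 2 = 0.35
```

Minimizing this loss pushes the true-class score at least one margin above every competitor, which is the SVM-style objective the abstract describes.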
NASA Astrophysics Data System (ADS)
Liu, F.; Chen, T.; He, J.; Wen, Q.; Yu, F.; Gu, X.; Wang, Z.
2018-04-01
In recent years, the rapid upgrading and improvement of SAR sensors have provided beneficial complements to traditional optical remote sensing in theory, technology, and data. In this paper, Sentinel-1A SAR data and GF-1 optical data were selected for image fusion, with emphasis placed on dryland crop classification under a complex crop planting structure, taking corn and cotton as the research objects. Considering the differences among various data fusion methods, the principal component analysis (PCA), Gram-Schmidt (GS), Brovey, and wavelet transform (WT) methods were compared with each other, and the GS and Brovey methods proved to be more applicable in the study area. The classification was then conducted using an object-oriented technique. For the GS and Brovey fusion images and the GF-1 optical image, the nearest neighbour algorithm was adopted to perform supervised classification with the same training samples. Based on the sample plots in the study area, an accuracy assessment was subsequently conducted. The overall accuracy and kappa coefficient of the fusion images were both higher than those of the GF-1 optical image, and the GS method performed better than the Brovey method. In particular, the overall accuracy of the GS fusion image was 79.8 %, and the Kappa coefficient was 0.644. Thus, the results showed that the GS and Brovey fusion images were superior to optical images for dryland crop classification. This study suggests that the fusion of SAR and optical images is reliable for dryland crop classification under a complex crop planting structure.
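Of the fusion methods compared, the Brovey transform is simple enough to sketch directly: each multispectral band is scaled by its share of the total band sum and modulated by the high-resolution band. The arrays below are synthetic stand-ins for the resampled GF-1 / Sentinel-1A imagery, and the band count is an assumption.

```python
# Sketch of the Brovey transform for pan-sharpening-style image fusion.
import numpy as np

def brovey_fusion(ms, pan, eps=1e-12):
    """ms: (bands, H, W) multispectral cube resampled to the high-resolution
    grid; pan: (H, W) high-resolution intensity band. Band ratios are
    preserved while spatial detail comes from `pan`."""
    total = ms.sum(axis=0) + eps   # eps guards against division by zero
    return ms / total * pan

rng = np.random.default_rng(2)
ms = rng.uniform(0.1, 1.0, (3, 4, 4))
pan = rng.uniform(0.1, 1.0, (4, 4))
fused = brovey_fusion(ms, pan)
```

A useful sanity check on the transform is that the fused bands sum (almost exactly) back to the high-resolution band at every pixel.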
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-01-01
Background. Support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM over the linear one. Here, a more effective non-linear SVM using a radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies, which focused either merely on the evaluation of different types of SVM or on voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes, in terms of classification accuracy and computational time. Methodology/Principal Findings. Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. The overall performances of the voxel selection and classification methods were then compared. Results showed that: (1) voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM significantly outperformed linear SVM; in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) considering classification accuracy and computational time holistically, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) could achieve better accuracy in less time. Conclusions/Significance. The present work provides the first empirical result on linear and RBF SVM in the classification of fMRI data combined with voxel selection methods. Based on the findings, if only classification accuracy is of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels are two suggested solutions; if users are more concerned about computational time, RBF SVM with a relatively small set of voxels, keeping part of the principal components as features, is a better choice. PMID:21359184
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-02-16
Support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM over the linear one. Here, a more effective non-linear SVM using a radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies, which focused either merely on the evaluation of different types of SVM or on voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes, in terms of classification accuracy and computational time. Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. The overall performances of the voxel selection and classification methods were then compared. Results showed that: (1) voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM significantly outperformed linear SVM; in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) considering classification accuracy and computational time holistically, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) could achieve better accuracy in less time. The present work provides the first empirical result on linear and RBF SVM in the classification of fMRI data combined with voxel selection methods. Based on the findings, if only classification accuracy is of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels are two suggested solutions; if users are more concerned about computational time, RBF SVM with a relatively small set of voxels, keeping part of the principal components as features, is a better choice.
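The study design (select voxels, then compare linear vs RBF kernels) can be mimicked in a few lines. The sketch below uses synthetic data in place of fMRI volumes and a single ANOVA F-score filter standing in for the paper's six voxel-selection schemes; sample counts, feature counts, and k are assumptions.

```python
# Linear vs RBF SVM after a simple feature (voxel) selection step.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic 4-category stand-in for the fMRI classification task
X, y = make_classification(n_samples=200, n_features=500, n_informative=20,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# ANOVA F-score selection keeps the 50 most discriminative "voxels"
sel = SelectKBest(f_classif, k=50).fit(Xtr, ytr)
Xtr_s, Xte_s = sel.transform(Xtr), sel.transform(Xte)

acc = {}
for name, clf in [("linear", SVC(kernel="linear")), ("rbf", SVC(kernel="rbf"))]:
    acc[name] = clf.fit(Xtr_s, ytr).score(Xte_s, yte)
```

Sweeping k over a range and timing each fit would reproduce the accuracy-versus-dimensionality trade-off the abstract reports.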
Hierarchical Gene Selection and Genetic Fuzzy System for Cancer Microarray Data Classification
Nguyen, Thanh; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid
2015-01-01
This paper introduces a novel approach to gene selection based on a substantial modification of the analytic hierarchy process (AHP). The modified AHP systematically integrates the outcomes of individual filter methods to select the most informative genes for microarray classification. Five individual ranking methods, including t-test, entropy, receiver operating characteristic (ROC) curve, Wilcoxon, and signal-to-noise ratio, are employed to rank genes. These ranked genes are then used as inputs for the modified AHP. Additionally, a method that uses a fuzzy standard additive model (FSAM) for cancer classification based on the genes selected by AHP is also proposed in this paper. Traditional FSAM learning is a hybrid process comprising unsupervised structure learning and supervised parameter tuning. A genetic algorithm (GA) is incorporated between unsupervised and supervised training to optimize the number of fuzzy rules. The integration of GA enables FSAM to deal with the high-dimensional, low-sample nature of microarray data and thus enhances the efficiency of the classification. Experiments are carried out on numerous microarray datasets. Results demonstrate the performance dominance of AHP-based gene selection over the single ranking methods. Furthermore, the AHP-FSAM combination shows great accuracy in microarray data classification compared to various competing classifiers. The proposed approach is therefore useful for medical practitioners and clinicians as a decision support system that can be implemented in real medical practice. PMID:25823003
Hierarchical gene selection and genetic fuzzy system for cancer microarray data classification.
Nguyen, Thanh; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid
2015-01-01
This paper introduces a novel approach to gene selection based on a substantial modification of the analytic hierarchy process (AHP). The modified AHP systematically integrates the outcomes of individual filter methods to select the most informative genes for microarray classification. Five individual ranking methods, including t-test, entropy, receiver operating characteristic (ROC) curve, Wilcoxon, and signal-to-noise ratio, are employed to rank genes. These ranked genes are then used as inputs for the modified AHP. Additionally, a method that uses a fuzzy standard additive model (FSAM) for cancer classification based on the genes selected by AHP is also proposed in this paper. Traditional FSAM learning is a hybrid process comprising unsupervised structure learning and supervised parameter tuning. A genetic algorithm (GA) is incorporated between unsupervised and supervised training to optimize the number of fuzzy rules. The integration of GA enables FSAM to deal with the high-dimensional, low-sample nature of microarray data and thus enhances the efficiency of the classification. Experiments are carried out on numerous microarray datasets. Results demonstrate the performance dominance of AHP-based gene selection over the single ranking methods. Furthermore, the AHP-FSAM combination shows great accuracy in microarray data classification compared to various competing classifiers. The proposed approach is therefore useful for medical practitioners and clinicians as a decision support system that can be implemented in real medical practice.
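The idea of fusing several per-gene filter criteria can be sketched with a much simpler aggregation than the paper's modified AHP: rank genes under each criterion, then average the ranks with equal weights. Only two of the five criteria (t-statistic and ROC AUC) are shown, the data are synthetic, and the equal-weight averaging is a stand-in for AHP's pairwise-comparison weighting.

```python
# Simplified rank aggregation in the spirit of multi-filter gene selection.
import numpy as np
from scipy import stats
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n_genes, n_samples = 100, 40
X = rng.normal(size=(n_samples, n_genes))
y = rng.integers(0, 2, n_samples)
X[:, :5] += y[:, None] * 2.0   # five hypothetically informative genes

# Criterion 1: absolute two-sample t-statistic per gene
t_scores = np.abs(stats.ttest_ind(X[y == 0], X[y == 1], axis=0).statistic)
# Criterion 2: distance of the per-gene ROC AUC from chance (0.5)
auc_scores = np.array([abs(roc_auc_score(y, X[:, g]) - 0.5)
                       for g in range(n_genes)])

# Convert each criterion to ranks (0 = best) and average with equal weights
ranks = np.vstack([np.argsort(np.argsort(-t_scores)),
                   np.argsort(np.argsort(-auc_scores))]).mean(axis=0)
selected = np.argsort(ranks)[:10]   # top-10 genes by aggregated rank
```

The informative genes dominate the aggregated ranking, which is the behavior the AHP integration is designed to make robust across many criteria.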
Pang, Shuchao; Yu, Zhezhou; Orgun, Mehmet A
2017-03-01
Highly accurate classification of biomedical images is an essential task in the clinical diagnosis of the numerous medical diseases identified from those images. Traditional image classification methods, which combine hand-crafted image feature descriptors with various classifiers, cannot effectively improve the accuracy rate or meet the high requirements of biomedical image classification. The same holds true for artificial neural network models trained directly on limited biomedical images, or used directly as a black box to extract deep features learned on another, distant dataset. In this study, we propose a highly reliable and accurate end-to-end classifier for all kinds of biomedical images via deep learning and transfer learning. We first apply a domain-transferred deep convolutional neural network to build a deep model, and then develop an overall deep learning architecture trained in a supervised fashion on the raw pixels of the original biomedical images. In our model, we need neither manual design of the feature space, nor an effective feature-vector classifier, nor segmentation of specific detection objects and image patches, which are the main technological difficulties in traditional image classification methods. Moreover, we need not be concerned with whether large training sets of annotated biomedical images, affordable parallel computing resources featuring GPUs, or long training times are available, which are the main obstacles to training deep neural networks for biomedical image classification observed in recent works. With a simple data augmentation method and fast convergence, our algorithm achieves the best accuracy rate and outstanding classification ability for biomedical images. We have evaluated our classifier on several well-known public biomedical datasets and compared it with several state-of-the-art approaches. We propose a robust automated end-to-end classifier for biomedical images based on a domain-transferred deep convolutional neural network model, whose highly reliable and accurate performance has been confirmed on several public biomedical image datasets. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
On Utilizing Optimal and Information Theoretic Syntactic Modeling for Peptide Classification
NASA Astrophysics Data System (ADS)
Aygün, Eser; Oommen, B. John; Cataltepe, Zehra
Syntactic methods in pattern recognition have been used extensively in bioinformatics, and in particular, in the analysis of gene and protein expressions, and in the recognition and classification of bio-sequences. These methods are almost universally distance-based. This paper concerns the use of an Optimal and Information Theoretic (OIT) probabilistic model [11] to achieve peptide classification using the information residing in their syntactic representations. The latter has traditionally been achieved using the edit distances required in the respective peptide comparisons. We advocate that one can model the differences between compared strings as a mutation model consisting of random Substitutions, Insertions and Deletions (SID) obeying the OIT model. Thus, in this paper, we show that the probability measure obtained from the OIT model can be perceived as a sequence similarity metric, using which a Support Vector Machine (SVM)-based peptide classifier, referred to as OIT_SVM, can be devised.
Group-Based Active Learning of Classification Models.
Luo, Zhipeng; Hauskrecht, Milos
2017-05-01
Learning classification models from real-world data often requires additional human expert effort to annotate the data. However, this process can be rather costly, and finding ways of reducing the human annotation effort is critical for this task. The objective of this paper is to develop and study new ways of providing human feedback for efficient learning of classification models by labeling groups of examples. Briefly, unlike traditional active learning methods that seek feedback on individual examples, we develop a new group-based active learning framework that solicits label information on groups of multiple examples. In order to describe groups in a user-friendly way, conjunctive patterns are used to compactly represent them. Our empirical study on 12 UCI data sets demonstrates the advantages and superiority of our approach over both classic instance-based active learning and existing group-based active-learning methods.
A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification
NASA Astrophysics Data System (ADS)
Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun
2016-12-01
Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.
A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification.
Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun
2016-12-01
Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.
A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification
Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun
2016-01-01
Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value. PMID:27905520
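The boosted-ensemble idea behind the Adaboost-BP approach can be sketched with off-the-shelf tools. In the sketch below, sklearn's AdaBoost with shallow decision trees stands in for the paper's 15 BP neural networks (sklearn's MLP does not expose the per-sample weighting AdaBoost needs), and the synthetic data stand in for the extracted image features; only the 15-member ensemble size mirrors the paper.

```python
# AdaBoost ensemble of 15 weak learners, echoing the 15-network Adaboost-BP
# design (decision stumps substituted for BP networks; see lead-in).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# 15 weak learners, each reweighted toward the previous members' mistakes
ada = AdaBoostClassifier(DecisionTreeClassifier(max_depth=2),
                         n_estimators=15, random_state=0).fit(Xtr, ytr)
accuracy = ada.score(Xte, yte)
```

The MapReduce parallelization in the paper distributes the training of the weak learners and the feature extraction across a Hadoop cluster; the ensemble logic itself is unchanged.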
Song, Shu-qi; Zhang, Ya-qiang
2009-12-01
The etiology, pathogenesis, and diagnostic criteria of chronic prostatitis were reviewed in this article. Based on clinical practice, the authors systematically discussed the thoughts and methods for the treatment of chronic prostatitis by integrated traditional Chinese and Western medicine. Meanwhile, advice was given on disputed problems in the clinical study of prostatitis, such as the value of the leukocyte count and bacterial culture in expressed prostatic secretion (EPS) for estimating curative effect, the timing and course of antibiotic treatment, the National Institutes of Health chronic prostatitis symptom index, the classification of traditional Chinese medicine (TCM) syndromes, TCM symptom scores, and the clinical study period.
Improved semi-supervised online boosting for object tracking
NASA Astrophysics Data System (ADS)
Li, Yicui; Qi, Lin; Tan, Shukun
2016-10-01
The advantage of an online semi-supervised boosting method, which treats object tracking as a classification problem, is that it trains a binary classifier from labeled and unlabeled examples, selecting appropriate object features based on real-time changes in the object. However, online semi-supervised boosting faces one key problem: traditional self-training, which uses the classification results to update the classifier itself, often leads to drifting or tracking failure due to the error accumulated during each update of the tracker. To overcome these disadvantages, the contribution of this paper is an improved online semi-supervised boosting method in which the learning process is guided by positive (P) and negative (N) constraints, termed P-N constraints, which restrict the labeling of the unlabeled samples. First, we train the classifier by online semi-supervised boosting. Then, this classifier is used to process the next frame. Finally, the classification is analyzed against the P-N constraints, which verify whether the labels the classifier assigns to unlabeled data are in line with the assumptions made about positive and negative samples. The proposed algorithm can effectively improve the discriminative ability of the classifier and significantly alleviate the drifting problem in tracking applications. In the experiments, we demonstrate real-time tracking with our tracker on several challenging test sequences, where it outperforms other related online tracking methods and achieves promising tracking performance.
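The P-N constraint step can be illustrated in miniature: labels proposed by the classifier on unlabeled samples are overridden whenever they violate the structural assumptions. The distance-based rules below (samples near the tracked object must be positive, samples far from it must be negative) are hypothetical stand-ins for the paper's constraints, and all positions and radii are made up.

```python
# Toy sketch of P-N constraint checking on classifier-proposed labels.
import numpy as np

def apply_pn_constraints(positions, proposed, obj_pos, r_pos=1.0, r_neg=3.0):
    """P-constraint: samples within r_pos of the object must be positive.
    N-constraint: samples beyond r_neg must be negative.
    Labels in between are left as the classifier proposed them."""
    d = np.linalg.norm(positions - obj_pos, axis=1)
    corrected = proposed.copy()
    corrected[d <= r_pos] = 1   # P-constraint overrides
    corrected[d >= r_neg] = 0   # N-constraint overrides
    return corrected

positions = np.array([[0.2, 0.0], [5.0, 0.0], [2.0, 0.0]])
proposed = np.array([0, 1, 1])   # the classifier's (noisy) labels
corrected = apply_pn_constraints(positions, proposed,
                                 obj_pos=np.array([0.0, 0.0]))
# corrected -> [1, 0, 1]: two violations fixed, the middle label kept
```

Feeding the corrected labels (rather than the raw self-training output) back into the boosting update is what limits the accumulated drift.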
Tongue Images Classification Based on Constrained High Dispersal Network.
Meng, Dan; Cao, Guitao; Duan, Ye; Zhu, Minghua; Tu, Liping; Xu, Dong; Xu, Jiatuo
2017-01-01
Computer-aided tongue diagnosis has great potential to play an important role in traditional Chinese medicine (TCM). However, the majority of existing tongue image analysis and classification methods are based on low-level features, which may not provide a holistic view of the tongue. Inspired by deep convolutional neural networks (CNN), we propose a novel feature extraction framework called constrained high dispersal neural networks (CHDNet) to extract unbiased features and reduce human labor for tongue diagnosis in TCM. Previous CNN models have mostly focused on learning convolutional filters and adapting weights between them, but these models have two major issues: redundancy and insufficient capability in handling unbalanced sample distributions. We introduce high dispersal and local response normalization operations to address the issue of redundancy. We also add multiscale feature analysis to avoid sensitivity to deformation. Our proposed CHDNet learns high-level features and provides more classification information during training, which may result in higher accuracy when predicting testing samples. We tested the proposed method on a set of 267 gastritis patients and a control group of 48 healthy volunteers. Test results show that CHDNet is a promising method for tongue image classification in the TCM study.
Classification of hyperspectral imagery with neural networks: comparison to conventional tools
NASA Astrophysics Data System (ADS)
Merényi, Erzsébet; Farrand, William H.; Taranik, James V.; Minor, Timothy B.
2014-12-01
Efficient exploitation of hyperspectral imagery is of great importance in remote sensing. Artificial intelligence approaches have been receiving favorable reviews for classification of hyperspectral data because the complexity of such data challenges the limitations of many conventional methods. Artificial neural networks (ANNs) were shown to outperform traditional classifiers in many situations. However, studies that use the full spectral dimensionality of hyperspectral images to classify a large number of surface covers are scarce, if not non-existent. We advocate the need for methods that can handle the full dimensionality and a large number of classes to retain the discovery potential and the ability to discriminate classes with subtle spectral differences. We demonstrate that such a method exists in the family of ANNs. We compare the maximum likelihood, Mahalanobis distance, minimum distance, spectral angle mapper, and a hybrid ANN classifier on real hyperspectral AVIRIS data, using the full spectral resolution to map 23 cover types with a small training set. Rigorous evaluation of the classification accuracies shows that the ANN outperforms the other methods and achieves ≈90% accuracy on test data.
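Of the conventional baselines listed, the spectral angle mapper (SAM) is the simplest to sketch: each pixel is assigned to the class whose reference spectrum subtends the smallest angle with it. The three-band spectra below are toy values, not AVIRIS data.

```python
# Spectral angle mapper (SAM) baseline classifier.
import numpy as np

def spectral_angle(a, b):
    """Angle (radians) between two spectra, invariant to overall brightness."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def sam_classify(pixels, references):
    """pixels: (n, bands); references: (classes, bands). Returns class ids."""
    angles = np.array([[spectral_angle(p, r) for r in references]
                       for p in pixels])
    return angles.argmin(axis=1)

refs = np.array([[1.0, 0.0, 0.0],    # reference spectrum, class 0
                 [0.0, 1.0, 1.0]])   # reference spectrum, class 1
pixels = np.array([[0.9, 0.1, 0.0],
                   [0.1, 0.8, 0.9]])
labels = sam_classify(pixels, refs)  # -> [0, 1]
```

Because SAM compares only spectral shape, it ignores illumination differences but also the covariance structure that the Mahalanobis and ANN classifiers exploit.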
Milburn, Trelani F; Lonigan, Christopher J; Allan, Darcey M; Phillips, Beth M
2017-04-01
To investigate approaches for identifying young children who may be at risk for later reading-related learning disabilities, this study compared the use of four contemporary methods of indexing learning disability (LD) with older children (i.e., IQ-achievement discrepancy, low achievement, low growth, and dual-discrepancy) to determine risk status in a large sample of 1,011 preschoolers. These children were classified as at risk or not using each method across three early-literacy skills (i.e., language, phonological awareness, print knowledge) and at three levels of severity (i.e., 5th, 10th, 25th percentiles). Chance-corrected affected-status agreement (CCASA) indicated poor agreement among methods, with rates of agreement generally decreasing with greater levels of severity for both single- and two-measure classification, and agreement rates were lower for two-measure classification than for single-measure classification. These low rates of agreement between conventional methods of identifying children at risk for LD represent a significant impediment to identification and intervention for young children considered at risk.
Milburn, Trelani F.; Lonigan, Christopher J.; Allan, Darcey M.; Phillips, Beth M.
2017-01-01
To investigate approaches for identifying young children who may be at risk for later reading-related learning disabilities, this study compared the use of four contemporary methods of indexing learning disability (LD) with older children (i.e., IQ-achievement discrepancy, low achievement, low growth, and dual-discrepancy) to determine risk status in a large sample of 1,011 preschoolers. These children were classified as at risk or not using each method across three early-literacy skills (i.e., language, phonological awareness, print knowledge) and at three levels of severity (i.e., 5th, 10th, 25th percentiles). Chance-corrected affected-status agreement (CCASA) indicated poor agreement among methods, with rates of agreement generally decreasing with greater levels of severity for both single- and two-measure classification, and agreement rates were lower for two-measure classification than for single-measure classification. These low rates of agreement between conventional methods of identifying children at risk for LD represent a significant impediment to identification and intervention for young children considered at risk. PMID:28670102
NASA Astrophysics Data System (ADS)
Uríčková, Veronika; Sádecká, Jana
2015-09-01
The identification of the geographical origin of beverages is one of the most important issues in food chemistry. Spectroscopic methods provide a relatively rapid and low-cost alternative to traditional chemical composition or sensory analyses. This paper reviews the current state of development of ultraviolet (UV), visible (Vis), near-infrared (NIR), and mid-infrared (MIR) spectroscopic techniques combined with pattern recognition methods for determining the geographical origin of both wines and distilled drinks. UV, Vis, and NIR spectra contain broad bands with weak spectral features, limiting their discrimination ability. Despite this expected shortcoming, each of the three spectroscopic ranges (NIR, Vis/NIR, and UV/Vis/NIR) provides average correct classification higher than 82%. Although average correct classification is similar for the NIR and MIR regions, in some instances MIR data processing improves prediction. An advantage of using MIR is that MIR peaks are better defined and more easily assigned than NIR bands. In general, success in classification depends on both the spectral range and the pattern recognition methods. The main problem remains the construction of the databanks needed for all of these methods.
Relevance popularity: A term event model based feature selection scheme for text classification.
Feng, Guozhong; An, Baiguo; Yang, Fengqin; Wang, Han; Zhang, Libiao
2017-01-01
Feature selection is a practical approach for improving the performance of text classification methods by optimizing the feature subsets input to classifiers. In traditional feature selection methods such as information gain and chi-square, the number of documents that contain a particular term (i.e., the document frequency) is often used. However, the frequency with which a given term appears in each document has not been fully investigated, even though it is a promising feature for producing accurate classifications. In this paper, we propose a new feature selection scheme based on a term event multinomial naive Bayes probabilistic model. According to the model assumptions, the matching score function, which is based on the prediction probability ratio, can be factorized. Finally, we derive a feature selection measurement for each term after replacing inner parameters by their estimators. On a benchmark English text dataset (20 Newsgroups) and a Chinese text dataset (MPH-20), numerical experiments with two widely used text classifiers (naive Bayes and support vector machine) demonstrate that our method outperforms representative feature selection methods.
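A term-scoring sketch in the same spirit: under a multinomial naive Bayes model fitted on term counts (not just document frequencies), a term is informative when its class-conditional log probabilities are far apart. The tiny count matrix is made up, and this spread score is a stand-in for the paper's factorized matching-score measurement.

```python
# Scoring terms by class-conditional log-probability spread under
# a term-event multinomial naive Bayes model.
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical term-count matrix: 4 documents x 3 terms, 2 classes
X = np.array([[5, 0, 1],
              [4, 1, 0],
              [0, 5, 1],
              [1, 4, 0]])
y = np.array([0, 0, 1, 1])

nb = MultinomialNB().fit(X, y)
# feature_log_prob_ has shape (classes, terms); a large spread between the
# classes means the term's frequency carries class information
scores = np.abs(nb.feature_log_prob_[0] - nb.feature_log_prob_[1])
ranking = np.argsort(-scores)   # terms ordered most- to least-informative
```

Here terms 0 and 1 concentrate in different classes and rank highest, while term 2 appears evenly and scores near zero; a feature selector would keep the top-ranked terms.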
An unbalanced spectra classification method based on entropy
NASA Astrophysics Data System (ADS)
Liu, Zhong-bao; Zhao, Wen-juan
2017-05-01
How to distinguish the minority spectra from the majority of the spectra is quite important in astronomy. In view of this, an unbalanced spectra classification method based on entropy (USCM) is proposed in this paper to deal with the unbalanced spectra classification problem. USCM greatly improves the performance of traditional classifiers in distinguishing the minority spectra, as it takes the data distribution into consideration in the process of classification. However, its time complexity is exponential in the training size, and therefore it can only deal with small- and medium-scale classification problems; solving the large-scale classification problem is quite important to USCM. It can be shown by mathematical computation that the dual form of USCM is equivalent to the minimum enclosing ball (MEB) problem; therefore, the core vector machine (CVM) is introduced, and USCM based on CVM is proposed to deal with the large-scale classification problem. Several comparative experiments on the 4 subclasses of K-type spectra, 3 subclasses of F-type spectra and 3 subclasses of G-type spectra from the Sloan Digital Sky Survey (SDSS) verify that USCM and USCM based on CVM perform better than kNN (k nearest neighbor) and SVM (support vector machine) in dealing with the problem of rare spectra mining on the small- and medium-scale datasets and the large-scale datasets, respectively.
Acoustics based assessment of respiratory diseases using GMM classification.
Mayorga, P; Druzgalski, C; Morelos, R L; Gonzalez, O H; Vidales, J
2010-01-01
The focus of this paper is to present a method utilizing lung sounds for a quantitative assessment of patient health as it relates to respiratory disorders. To accomplish this, applicable traditional techniques from the speech processing domain were utilized to evaluate lung sounds obtained with a digital stethoscope. Traditional methods for the evaluation of asthma involve auscultation and spirometry, but utilization of the more sensitive electronic stethoscopes currently available, together with quantitative signal analysis methods, offers opportunities for improved diagnosis. In particular, we propose an acoustic evaluation methodology based on Gaussian Mixture Models (GMM), which should assist in broader analysis, identification, and diagnosis of asthma based on the frequency domain analysis of wheezing and crackles.
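As an illustration of likelihood-based acoustic classification in the GMM family, the sketch below fits one Gaussian per class (the single-component special case of a GMM) to hypothetical 2-D acoustic feature vectors and classifies frames by the higher class log-likelihood; the feature values and class separations are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical acoustic features (e.g. cepstral coefficients) for two classes:
# "normal" breath sounds vs "wheeze".
normal = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(60, 2))
wheeze = rng.normal(loc=[2.0, 1.5], scale=0.5, size=(60, 2))
X = np.vstack([normal, wheeze])
y = np.array([0] * 60 + [1] * 60)

def fit_gaussian(data):
    # Maximum-likelihood mean and covariance for one class.
    return data.mean(axis=0), np.cov(data, rowvar=False)

def log_likelihood(x, mu, cov):
    # Gaussian log-density up to a constant shared by all classes.
    d = x - mu
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d @ inv * d).sum(axis=1) - 0.5 * logdet

models = [fit_gaussian(X[y == c]) for c in (0, 1)]
ll = np.column_stack([log_likelihood(X, mu, cov) for mu, cov in models])
pred = ll.argmax(axis=1)     # pick the class whose model explains the frame best
accuracy = (pred == y).mean()
```

A full GMM classifier would fit several components per class via EM; the decision rule (maximum log-likelihood, optionally plus a log-prior) stays the same.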
A review and analysis of neural networks for classification of remotely sensed multispectral imagery
NASA Technical Reports Server (NTRS)
Paola, Justin D.; Schowengerdt, Robert A.
1993-01-01
A literature survey and analysis of the use of neural networks for the classification of remotely sensed multispectral imagery is presented. As part of a brief mathematical review, the backpropagation algorithm, which is the most common method of training multi-layer networks, is discussed with an emphasis on its application to pattern recognition. The analysis is divided into five aspects of neural network classification: (1) input data preprocessing, structure, and encoding; (2) output encoding and extraction of classes; (3) network architecture; (4) training algorithms; and (5) comparisons to conventional classifiers. The advantages of the neural network method over traditional classifiers are its non-parametric nature, arbitrary decision boundary capabilities, easy adaptation to different types of data and input structures, fuzzy output values that can enhance classification, and good generalization for use with multiple images. The disadvantages of the method are slow training time, inconsistent results due to random initial weights, and the requirement of obscure initialization values (e.g., learning rate and hidden layer size). Possible techniques for ameliorating these problems are discussed. It is concluded that, although the neural network method has several unique capabilities, it will become a useful tool in remote sensing only if it is made faster, more predictable, and easier to use.
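A minimal backpropagation example in the spirit of the review's discussion: a one-hidden-layer network trained on synthetic 3-band "pixels" for two invented land-cover classes (all data and hyperparameters are illustrative, not from the survey):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy multispectral pixels: 3 "bands", two hypothetical land-cover classes.
X = np.vstack([rng.normal([0.2, 0.6, 0.3], 0.1, (50, 3)),    # e.g. vegetation
               rng.normal([0.7, 0.3, 0.5], 0.1, (50, 3))])   # e.g. bare soil
y = np.array([0] * 50 + [1] * 50).reshape(-1, 1)

# One-hidden-layer network trained by plain backpropagation.
W1 = rng.standard_normal((3, 8)) * 0.5
b1 = np.zeros(8)
W2 = rng.standard_normal((8, 1)) * 0.5
b2 = np.zeros(1)
lr = 0.5

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(500):
    h = sigmoid(X @ W1 + b1)          # forward pass
    out = sigmoid(h @ W2 + b2)
    d_out = (out - y) / len(X)        # grad of mean cross-entropy w.r.t. output logits
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out            # backward pass: update each layer
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

accuracy = ((out > 0.5).astype(int) == y).mean()
```

The "obscure initialization values" the review mentions correspond here to the learning rate, hidden-layer width, and the random weight scale.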
NASA Astrophysics Data System (ADS)
He, Zhi; Liu, Lin
2016-11-01
Empirical mode decomposition (EMD) and its variants have recently been applied for hyperspectral image (HSI) classification due to their ability to extract useful features from the original HSI. However, it remains a challenging task to effectively exploit the spectral-spatial information by the traditional vector- or image-based methods. In this paper, a three-dimensional (3D) extension of EMD (3D-EMD) is proposed to naturally treat the HSI as a cube and decompose the HSI into varying oscillations (i.e., 3D intrinsic mode functions (3D-IMFs)). To achieve fast 3D-EMD implementation, 3D Delaunay triangulation (3D-DT) is utilized to determine the distances of extrema, while separable filters are adopted to generate the envelopes. Taking the extracted 3D-IMFs as features of different tasks, robust multitask learning (RMTL) is further proposed for HSI classification. In RMTL, pairs of low-rank and sparse structures are formulated by the trace-norm and the l1,2-norm to capture task relatedness and specificity, respectively. Moreover, the optimization problems of RMTL can be efficiently solved by the inexact augmented Lagrangian method (IALM). Compared with several state-of-the-art feature extraction and classification methods, the experimental results conducted on three benchmark data sets demonstrate the superiority of the proposed methods.
Classification of CT examinations for COPD visual severity analysis
NASA Astrophysics Data System (ADS)
Tan, Jun; Zheng, Bin; Wang, Xingwei; Pu, Jiantao; Gur, David; Sciurba, Frank C.; Leader, J. Ken
2012-03-01
In this study we present a computational method for classifying CT examinations by visually assessed emphysema severity. The visual severity categories ranged from 0 to 5 and were rated by an experienced radiologist. The six categories were none, trace, mild, moderate, severe and very severe. Lung segmentation was performed for every input image, and all image features were extracted from the segmented lung only. We adopted a two-level feature representation method for the classification. Five gray level distribution statistics, six gray level co-occurrence matrix (GLCM) features, and eleven gray level run-length (GLRL) features were computed for the segmented lung in each CT image. Then we used wavelet decomposition to obtain the low- and high-frequency components of the input image, and again extracted six GLCM features and eleven GLRL features from the lung region, giving a feature vector of length 56. The CT examinations were classified using a support vector machine (SVM), k-nearest neighbors (KNN), and the traditional threshold (density mask) approach. The SVM classifier had the highest classification performance of all the methods, with an overall sensitivity of 54.4% and a 69.6% sensitivity for discriminating "no" and "trace" visually assessed emphysema. We believe this work may lead to an automated, objective method to categorically classify emphysema severity on CT exams.
Investigating Visitor Profiles as a Valuable Addition to Museum Research
ERIC Educational Resources Information Center
Lewalter, Doris; Phelan, Siëlle; Geyer, Claudia; Specht, Inga; Grüninger, Rahel; Schnotz, Wolfgang
2015-01-01
There is a long tradition of museum research assessing visitors' personal background. In this article, we suggest an insightful way to enhance and intensify visitor analyses and adopt a more integrative approach. To this end, we draw attention to Latent Class Analysis (LCA), a classification method that allows us to investigate visitor profiles…
RRCRank: a fusion method using rank strategy for residue-residue contact prediction.
Jing, Xiaoyang; Dong, Qiwen; Lu, Ruqian
2017-09-02
In the structural biology area, protein residue-residue contacts play a crucial role in protein structure prediction. Researchers have found that predicted residue-residue contacts can effectively constrain the conformational search space, which is significant for de novo protein structure prediction. Over the last few decades, various methods have been developed to predict residue-residue contacts; in particular, significant performance has been achieved by fusion methods in recent years. In this work, a novel fusion method based on a rank strategy is proposed to predict contacts. Unlike the traditional regression or classification strategies, the contact prediction task is regarded as a ranking task. First, two kinds of features are extracted from correlated mutation methods and ensemble machine-learning classifiers, and then the proposed method uses a learning-to-rank algorithm to predict the contact probability of each residue pair. We first perform two benchmark tests for the proposed fusion method (RRCRank) on the CASP11 and CASP12 datasets, respectively. The test results show that the RRCRank method outperforms other well-developed methods, especially for medium and short range contacts. Second, in order to verify the superiority of the ranking strategy, we predict contacts using the traditional regression and classification strategies based on the same features as the ranking strategy. Compared with these two traditional strategies, the proposed ranking strategy shows better performance for the three contact types, in particular for long range contacts. Third, the proposed RRCRank has been compared with several state-of-the-art methods in CASP11 and CASP12. The results show that RRCRank achieves comparable prediction precision and is better than three methods in most assessment metrics.
The learning-to-rank algorithm is introduced to develop a novel rank-based method for the residue-residue contact prediction of proteins, which achieves state-of-the-art performance based on the extensive assessment.
Artificial intelligence in the diagnosis of low back pain.
Mann, N H; Brown, M D
1991-04-01
Computerized methods are used to recognize the characteristics of patient pain drawings. Artificial neural network (ANN) models are compared with expert predictions and traditional statistical classification methods when placing the pain drawings of low back pain patients into one of five clinically significant categories. A discussion is undertaken outlining the differences in these classifiers and the potential benefits of the ANN model as an artificial intelligence technique.
A New Experiment on Bengali Character Recognition
NASA Astrophysics Data System (ADS)
Barman, Sumana; Bhattacharyya, Debnath; Jeon, Seung-Whan; Kim, Tai-Hoon; Kim, Haeng-Kon
This paper presents a method for using a view-based approach in a Bangla Optical Character Recognition (OCR) system, providing a reduced data set to the ANN classification engine rather than the traditional OCR methods. It describes how Bangla characters are processed, trained and then recognized with the use of a backpropagation artificial neural network. This is the first published account of using a segmentation-free optical character recognition system for Bangla with a view-based approach. The methodology presented here assumes that the OCR pre-processor has presented the input images to the classification engine described here. The size and the font face used to render the characters are also significant in both training and classification. The images are first converted into greyscale and then to binary images; these images are then scaled to fit a pre-determined area with a fixed but significant number of pixels. The feature vectors are then formed by extracting the characteristic points, which in this case is simply a series of 0s and 1s of fixed length. Finally, an artificial neural network is chosen for the training and classification process.
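The preprocessing pipeline described (greyscale to binary, scaling to a fixed area, then a fixed-length vector of 0s and 1s) can be sketched as follows; the glyph image, grid size, and threshold are hypothetical:

```python
import numpy as np

def to_feature_vector(gray, grid=(8, 8), threshold=128):
    """Binarize a greyscale glyph image and scale it onto a fixed grid,
    returning a flat vector of 0s and 1s (one bit per grid cell)."""
    binary = (np.asarray(gray) < threshold).astype(int)   # ink = 1
    h, w = binary.shape
    rows = np.array_split(np.arange(h), grid[0])
    cols = np.array_split(np.arange(w), grid[1])
    # A cell is "on" when more than half of its pixels are ink.
    cells = [[int(binary[np.ix_(r, c)].mean() > 0.5) for c in cols]
             for r in rows]
    return np.array(cells).ravel()

# Hypothetical 20x16 glyph: a dark vertical stroke on a light background.
img = np.full((20, 16), 255)
img[:, 6:10] = 0
vec = to_feature_vector(img)
```

The resulting fixed-length binary vector is what a backpropagation network would consume as input.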
Exploring Genome-Wide Expression Profiles Using Machine Learning Techniques.
Kebschull, Moritz; Papapanou, Panos N
2017-01-01
Although contemporary high-throughput -omics methods produce high-dimensional data, the resulting wealth of information is difficult to assess using traditional statistical procedures. Machine learning methods facilitate the detection of additional patterns, beyond the mere identification of lists of features that differ between groups. Here, we demonstrate the utility of (1) supervised classification algorithms in class validation, and (2) unsupervised clustering in class discovery. We use data from our previous work that described the transcriptional profiles of gingival tissue samples obtained from subjects suffering from chronic or aggressive periodontitis (1) to test whether the two diagnostic entities were also characterized by differences on the molecular level, and (2) to search for a novel, alternative classification of periodontitis based on the tissue transcriptomes. Using machine learning technology, we provide evidence for diagnostic imprecision in the currently accepted classification of periodontitis, and demonstrate that a novel, alternative classification based on differences in gingival tissue transcriptomes is feasible. The outlined procedures allow for the unbiased interrogation of high-dimensional datasets for characteristic underlying classes, and are applicable to a broad range of -omics data.
NASA Astrophysics Data System (ADS)
Hu, Yan-Yan; Li, Dong-Sheng
2016-01-01
Hyperspectral images (HSI) consist of many closely spaced bands carrying most of the object information. Due to their high dimensionality and high volume, however, it is hard to obtain satisfactory classification performance. In order to reduce HSI data dimensionality in preparation for high classification accuracy, we propose to combine a band selection method based on artificial immune systems (AIS) with a hybrid-kernel support vector machine (SVM-HK) algorithm. After comparing different kernels for hyperspectral analysis, the approach mixes a radial basis function kernel (RBF-K) with a sigmoid kernel (Sig-K) and applies the optimized hybrid kernels in SVM classifiers. The SVM-HK algorithm is then used to guide the band selection of an improved version of AIS. The AIS is composed of clonal selection and elite antibody mutation, including an evaluation process with an optimum index factor (OIF). Experimental classification was performed on a San Diego Naval Base scene acquired by AVIRIS; the results on this HRS dataset show that the method is able to efficiently remove band redundancy while outperforming the traditional SVM classifier.
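A hybrid kernel of the RBF-plus-sigmoid kind described can be sketched as a convex mixture of the two Gram matrices; the mixing weight and kernel parameters below are hypothetical, not the paper's tuned values:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Squared Euclidean distances between all row pairs.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def sigmoid_kernel(X, Y, alpha=0.01, c=0.0):
    return np.tanh(alpha * X @ Y.T + c)

def hybrid_kernel(X, Y, w=0.7):
    """Convex mixture of an RBF and a sigmoid kernel; the weight w is a
    hypothetical tuning parameter, not a value from the paper."""
    return w * rbf_kernel(X, Y) + (1 - w) * sigmoid_kernel(X, Y)

rng = np.random.default_rng(3)
X = rng.standard_normal((10, 5))    # 10 pixels, 5 selected bands (toy data)
K = hybrid_kernel(X, X)
```

Such a callable can be supplied to kernel-based classifiers (for example, scikit-learn's SVC accepts a callable `kernel` argument).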
Comparison Between Supervised and Unsupervised Classifications of Neuronal Cell Types: A Case Study
Guerra, Luis; McGarry, Laura M; Robles, Víctor; Bielza, Concha; Larrañaga, Pedro; Yuste, Rafael
2011-01-01
In the study of neural circuits, it becomes essential to discern the different neuronal cell types that build the circuit. Traditionally, neuronal cell types have been classified using qualitative descriptors. More recently, several attempts have been made to classify neurons quantitatively, using unsupervised clustering methods. While useful, these algorithms do not take advantage of previous information known to the investigator, which could improve the classification task. For neocortical GABAergic interneurons, the problem to discern among different cell types is particularly difficult and better methods are needed to perform objective classifications. Here we explore the use of supervised classification algorithms to classify neurons based on their morphological features, using a database of 128 pyramidal cells and 199 interneurons from mouse neocortex. To evaluate the performance of different algorithms we used, as a “benchmark,” the test to automatically distinguish between pyramidal cells and interneurons, defining “ground truth” by the presence or absence of an apical dendrite. We compared hierarchical clustering with a battery of different supervised classification algorithms, finding that supervised classifications outperformed hierarchical clustering. In addition, the selection of subsets of distinguishing features enhanced the classification accuracy for both sets of algorithms. The analysis of selected variables indicates that dendritic features were most useful to distinguish pyramidal cells from interneurons when compared with somatic and axonal morphological variables. We conclude that supervised classification algorithms are better matched to the general problem of distinguishing neuronal cell types when some information on these cell groups, in our case being pyramidal or interneuron, is known a priori. 
As a spin-off of this methodological study, we provide several methods to automatically distinguish neocortical pyramidal cells from interneurons, based on their morphologies. © 2010 Wiley Periodicals, Inc. Develop Neurobiol 71: 71–82, 2011 PMID:21154911
Zhang, Jianhua; Li, Sunan; Wang, Rubin
2017-01-01
In this paper, we deal with the Mental Workload (MWL) classification problem based on measured physiological data. First, we discussed the optimal depth (i.e., the number of hidden layers) and parameter optimization algorithms for the Convolutional Neural Networks (CNN). The base CNNs designed were tested according to five classification performance indices, namely Accuracy, Precision, F-measure, G-mean, and required training time. Then we developed an Ensemble Convolutional Neural Network (ECNN) to enhance the accuracy and robustness of the individual CNN model. For the ECNN design, three model aggregation approaches (weighted averaging, majority voting and stacking) were examined, and a resampling strategy was used to enhance the diversity of the individual CNN models. The results of the MWL classification performance comparison indicated that the proposed ECNN framework can effectively improve MWL classification performance and features entirely automatic feature extraction and MWL classification, compared with traditional machine learning methods.
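Two of the aggregation approaches mentioned (weighted averaging and majority voting) reduce to a few lines once each base model's class probabilities are available; the probabilities and weights below are invented for illustration:

```python
import numpy as np

# Class-probability outputs of three hypothetical base CNNs for 4 samples
# and 3 workload levels (each row sums to 1).
p1 = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]])
p2 = np.array([[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.2, 0.5, 0.3], [0.5, 0.4, 0.1]])
p3 = np.array([[0.8, 0.1, 0.1], [0.3, 0.6, 0.1], [0.1, 0.2, 0.7], [0.4, 0.5, 0.1]])
probs = np.stack([p1, p2, p3])                # (models, samples, classes)

# Weighted averaging: combine probabilities, weighting better models higher.
weights = np.array([0.5, 0.3, 0.2])
avg_pred = (weights[:, None, None] * probs).sum(axis=0).argmax(axis=1)

# Majority voting: each model casts one hard vote per sample.
votes = probs.argmax(axis=2)                  # (models, samples)
vote_pred = np.array([np.bincount(votes[:, i], minlength=3).argmax()
                      for i in range(votes.shape[1])])
```

Stacking, the third approach, would instead train a meta-classifier on the stacked base-model outputs.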
Joint deconvolution and classification with applications to passive acoustic underwater multipath.
Anderson, Hyrum S; Gupta, Maya R
2008-11-01
This paper addresses the problem of classifying signals that have been corrupted by noise and unknown linear time-invariant (LTI) filtering such as multipath, given labeled uncorrupted training signals. A maximum a posteriori approach to the deconvolution and classification is considered, which produces estimates of the desired signal, the unknown channel, and the class label. For cases in which only a class label is needed, the classification accuracy can be improved by not committing to an estimate of the channel or signal. A variant of the quadratic discriminant analysis (QDA) classifier is proposed that probabilistically accounts for the unknown LTI filtering, and which avoids deconvolution. The proposed QDA classifier can work either directly on the signal or on features whose transformation by LTI filtering can be analyzed; as an example a classifier for subband-power features is derived. Results on simulated data and real Bowhead whale vocalizations show that jointly considering deconvolution with classification can dramatically improve classification performance over traditional methods over a range of signal-to-noise ratios.
An ordinal classification approach for CTG categorization.
Georgoulas, George; Karvelis, Petros; Gavrilis, Dimitris; Stylios, Chrysostomos D; Nikolakopoulos, George
2017-07-01
Evaluation of the cardiotocogram (CTG) is a standard approach employed during pregnancy and delivery. However, its interpretation requires high-level expertise to decide whether the recording is Normal, Suspicious or Pathological. Therefore, a number of attempts have been carried out over the past three decades to develop automated sophisticated systems. These systems are usually (multiclass) classification systems that assign a category to the respective CTG. However, most of these systems do not take into consideration the natural ordering of the categories associated with CTG recordings. In this work, an algorithm that explicitly takes into consideration the ordering of CTG categories, based on a binary decomposition method, is investigated. The results achieved using the C4.5 decision tree as the base classifier show that the ordinal classification approach is marginally better than the traditional multiclass classification approach, which utilizes the standard C4.5 algorithm, for several performance criteria.
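The binary decomposition idea for ordinal targets can be sketched generically; this is the standard "is the label greater than k?" decomposition, not necessarily the exact variant used in the paper:

```python
import numpy as np

classes = ["Normal", "Suspicious", "Pathological"]   # natural ordering

def to_binary_targets(y, n_classes):
    """Decompose an ordinal label y into n_classes-1 binary questions:
    'is the label greater than k?' for k = 0 .. n_classes-2."""
    return np.array([[int(label > k) for k in range(n_classes - 1)]
                     for label in y])

def from_binary_probs(p_greater):
    """Recombine per-threshold probabilities P(y > k) into class
    probabilities, then pick the most likely class."""
    ext = np.hstack([np.ones((len(p_greater), 1)), p_greater,
                     np.zeros((len(p_greater), 1))])
    class_probs = ext[:, :-1] - ext[:, 1:]   # P(y = k) = P(y > k-1) - P(y > k)
    return class_probs.argmax(axis=1)

y = np.array([0, 1, 2, 1])
targets = to_binary_targets(y, 3)            # training targets for 2 binary models

# Hypothetical outputs of the two binary classifiers: P(y > 0) and P(y > 1).
p = np.array([[0.9, 0.2],      # likely Suspicious
              [0.1, 0.05]])    # likely Normal
pred = from_binary_probs(p)
```

Each binary problem can be handled by any base classifier (the paper uses C4.5); the ordering information lives entirely in the decomposition and recombination.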
Zhao, Junning; Ye, Zuguang
2012-08-01
Toxic classification of traditional Chinese medicine, as a contribution of traditional Chinese medicine (TCM) to the recognition of medicinal toxicity and the rational use of medicinal materials by Chinese people, is now a great issue related to safe medication, sustainable development and internationalization of Chinese medicine. In this article, the origination and development of toxic classification theory are summarized and analyzed. Because toxic classification is an urgent issue related to TCM industrialization, modernization and internationalization, this article makes a systematic analysis of the nature and connotation of toxic classification as well as risk control for the TCM industry due to medicinal toxicity. Based on the toxicity studies, this article makes some recommendations on the toxic classification of Chinese medicinal materials for the revision of the China Pharmacopeia (volume I). From the aspect of scientific research, a new technical guideline for research on toxic classification of Chinese medicine should be formulated based on new biological toxicity test technologies such as Microtox and ADME/Tox, because the present classification based on acute toxicity in mice/rats can no longer meet the modern development of Chinese medicine. The evaluation system and technical SOP for TCM toxic classification should also be established, balancing TCM features and superiority with international requirements. From the aspect of medicine management, the list of toxic medicines and their risk classification should be further improved by the competent government authorities according to scientific research. In the China Pharmacopeia (volume I), descriptions of strong toxicity, toxicity or mild toxicity should be abandoned when describing medicine nature and flavor. This revision might help promote TCM sustainable development and internationalization, and enhance the competitive capacity of Chinese medicine in both domestic and international markets.
However, description of strong toxicity, toxicity or mild toxicity might be used when making cautions for the medicine, stating that the description is based on Chinese classic works. In this way, TCM traditional theory might be inherited and features of Chinese medicine maintained and reflected. Besides, modern findings should be added to the cautions, including dose-response relationship, toxic mechanism, and toxic elements. The traditional toxic descriptions and modern findings, as a whole, can make the caution clear and scientific, and then promote safe medication and TCM modernization and internationalization.
C-learning: A new classification framework to estimate optimal dynamic treatment regimes.
Zhang, Baqun; Zhang, Min
2017-12-11
A dynamic treatment regime is a sequence of decision rules, each corresponding to a decision point, that determine the next treatment based on each individual's own available characteristics and treatment history up to that point. We show that identifying the optimal dynamic treatment regime can be recast as a sequential optimization problem and propose a direct sequential optimization method to estimate the optimal treatment regimes. In particular, at each decision point, the optimization is equivalent to sequentially minimizing a weighted expected misclassification error. Based on this classification perspective, we propose a powerful and flexible C-learning algorithm to learn the optimal dynamic treatment regimes backward sequentially from the last stage until the first stage. C-learning is a direct optimization method that directly targets optimizing decision rules by exploiting powerful optimization/classification techniques, and it allows incorporation of a patient's characteristics and treatment history to improve performance, hence enjoying advantages of both the traditional outcome regression-based methods (Q- and A-learning) and the more recent direct optimization methods. The superior performance and flexibility of the proposed methods are illustrated through extensive simulation studies. © 2017, The International Biometric Society.
Rank-Optimized Logistic Matrix Regression toward Improved Matrix Data Classification.
Zhang, Jianguang; Jiang, Jianmin
2018-02-01
While existing logistic regression suffers from overfitting and often fails to consider structural information, we propose a novel matrix-based logistic regression to overcome these weaknesses. In the proposed method, 2D matrices are directly used to learn two groups of parameter vectors along each dimension without vectorization, which allows the proposed method to fully exploit the underlying structural information embedded inside the 2D matrices. Further, we add a joint [Formula: see text]-norm on the two parameter matrices, which are organized by aligning each group of parameter vectors in columns. This added co-regularization term has two roles: enhancing the effect of regularization and optimizing the rank during the learning process. With our proposed fast iterative solution, we carried out extensive experiments. The results show that, in comparison to both traditional tensor-based methods and vector-based regression methods, our proposed solution achieves better performance for matrix data classification.
A New Method of Facial Expression Recognition Based on SPE Plus SVM
NASA Astrophysics Data System (ADS)
Ying, Zilu; Huang, Mingwei; Wang, Zhen; Wang, Zhewei
A novel method of facial expression recognition (FER) is presented, which uses stochastic proximity embedding (SPE) for data dimension reduction and a support vector machine (SVM) for expression classification. The proposed algorithm is applied to the Japanese Female Facial Expression (JAFFE) database for FER, and better performance is obtained compared with some traditional algorithms such as PCA and LDA. The results further prove the effectiveness of the proposed algorithm.
USDA-ARS?s Scientific Manuscript database
Successful development of approaches to quantify impacts of diverse landuse and associated agricultural management practices on ecosystem services is frequently limited by lack of historical and contemporary landuse data. We hypothesized that recent ground truth data could be used to extrapolate pre...
A Model Assessment and Classification System for Men and Women in Correctional Institutions.
ERIC Educational Resources Information Center
Hellervik, Lowell W.; And Others
The report describes a manpower assessment and classification system for criminal offenders directed towards making practical training and job classification decisions. The model is not concerned with custody classifications except as they affect occupational/training possibilities. The model combines traditional procedures of vocational…
Shamim, Mohammad Tabrez Anwar; Anwaruddin, Mohammad; Nagarajaram, H A
2007-12-15
Fold recognition is a key step in the protein structure discovery process, especially when traditional sequence comparison methods fail to yield convincing structural homologies. Although many methods have been developed for protein fold recognition, their accuracies remain low. This can be attributed to insufficient exploitation of fold discriminatory features. We have developed a new method for protein fold recognition using structural information of amino acid residues and amino acid residue pairs. Since protein fold recognition can be treated as a protein fold classification problem, we have developed a Support Vector Machine (SVM) based classifier approach that uses secondary structural state and solvent accessibility state frequencies of amino acids and amino acid pairs as feature vectors. Among the individual properties examined, secondary structural state frequencies of amino acids gave an overall accuracy of 65.2% for fold discrimination, which is better than the accuracy of any method reported so far in the literature. Combining secondary structural state frequencies with solvent accessibility state frequencies of amino acids and amino acid pairs further improved the fold discrimination accuracy to more than 70%, which is approximately 8% higher than the best available method. In this study we have also tested, for the first time, an all-together multi-class method known as the Crammer and Singer method for protein fold classification. Our studies reveal that the three multi-class classification methods, namely one versus all, one versus one and the Crammer and Singer method, yield similar predictions. The dataset and stand-alone program are available upon request.
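The frequency-based feature vectors described (state frequencies for single residues and residue pairs) can be sketched directly; the three-letter state alphabet and the toy state string are illustrative:

```python
from collections import Counter

def state_frequencies(states, alphabet="HEC"):
    """Frequency of each secondary-structural state (H = helix, E = strand,
    C = coil) in a per-residue state string; a minimal single-residue
    analogue of the paper's feature vectors."""
    counts = Counter(states)
    total = len(states)
    return [counts[s] / total for s in alphabet]

def pair_frequencies(states, alphabet="HEC"):
    """Frequencies of adjacent state pairs (the residue-pair analogue)."""
    pairs = [states[i:i + 2] for i in range(len(states) - 1)]
    counts = Counter(pairs)
    total = len(pairs)
    return [counts[a + b] / total for a in alphabet for b in alphabet]

# Hypothetical per-residue state string for a short protein segment.
seq = "HHHHCCEEEECC"
fv = state_frequencies(seq) + pair_frequencies(seq)
```

Concatenating such frequency vectors (and the analogous solvent-accessibility ones) yields the fixed-length input an SVM can consume.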
Bevilacqua, M; Ciarapica, F E; Giacchetta, G
2008-07-01
This work is an attempt to apply classification tree methods to data regarding accidents in a medium-sized refinery, so as to identify the important relationships between the variables, which can be considered as decision-making rules when adopting any measures for improvement. The results obtained using the CART (Classification And Regression Trees) method proved to be the most precise and, in general, they are encouraging concerning the use of tree diagrams as preliminary explorative techniques for the assessment of the ergonomic, management and operational parameters which influence high accident risk situations. The Occupational Injury analysis carried out in this paper was planned as a dynamic process and can be repeated systematically. The CART technique, which considers a very wide set of objective and predictive variables, shows new cause-effect correlations in occupational safety which had never been previously described, highlighting possible injury risk groups and supporting decision-making in these areas. The use of classification trees must not, however, be seen as an attempt to supplant other techniques, but as a complementary method which can be integrated into traditional types of analysis.
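The core of CART is the recursive search for the split that most reduces node impurity; a single Gini-based split step can be sketched as follows, with invented accident-record features:

```python
import numpy as np

def gini(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - (p ** 2).sum()

def best_split(X, y):
    """Exhaustively search feature/threshold pairs for the split that most
    reduces Gini impurity: the core step CART applies recursively."""
    best = (None, None, -np.inf)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            gain = gini(y) - (len(left) * gini(left) +
                              len(right) * gini(right)) / len(y)
            if gain > best[2]:
                best = (j, t, gain)
    return best

# Hypothetical accident records: [hours_on_shift, maintenance_task (0/1)],
# label 1 = injury occurred.
X = np.array([[2, 0], [3, 0], [4, 1], [9, 1], [10, 1], [11, 0]])
y = np.array([0, 0, 0, 1, 1, 1])
feature, threshold, gain = best_split(X, y)
```

CART applies this search recursively to each child node and then prunes; libraries such as scikit-learn's DecisionTreeClassifier implement the full procedure, and the resulting paths through the tree are the "decision-making rules" the abstract refers to.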
DeepPap: Deep Convolutional Networks for Cervical Cell Classification.
Zhang, Ling; Le Lu; Nogues, Isabella; Summers, Ronald M; Liu, Shaoxiong; Yao, Jianhua
2017-11-01
Automation-assisted cervical screening via Pap smear or liquid-based cytology (LBC) is a highly effective cell-imaging-based cancer detection tool, where cells are partitioned into "abnormal" and "normal" categories. However, the success of most traditional classification methods relies on the presence of accurate cell segmentations. Despite sixty years of research in this field, accurate segmentation remains a challenge in the presence of cell clusters and pathologies. Moreover, previous classification methods are only built upon the extraction of hand-crafted features, such as morphology and texture. This paper addresses these limitations by proposing a method to classify cervical cells directly, without prior segmentation, based on deep features, using convolutional neural networks (ConvNets). First, the ConvNet is pretrained on a natural image dataset. It is subsequently fine-tuned on a cervical cell dataset consisting of adaptively resampled image patches coarsely centered on the nuclei. In the testing phase, aggregation is used to average the prediction scores of a similar set of image patches. The proposed method is evaluated on both Pap smear and LBC datasets. Results show that our method outperforms previous algorithms in classification accuracy (98.3%), area under the curve (0.99), and especially specificity (98.3%), when applied to the Herlev benchmark Pap smear dataset and evaluated using five-fold cross validation. Similar superior performances are also achieved on the HEMLBC (H&E stained manual LBC) dataset. Our method is promising for the development of automation-assisted reading systems in primary cervical screening.
Change classification in SAR time series: a functional approach
NASA Astrophysics Data System (ADS)
Boldt, Markus; Thiele, Antje; Schulz, Karsten; Hinz, Stefan
2017-10-01
Change detection represents a broad field of research in SAR remote sensing, consisting of many different approaches. Besides the simple recognition of change areas, analyzing the type, category, or class of the change areas is at least as important for creating a comprehensive result. Conventional strategies for change classification are based on supervised or unsupervised land-use/land-cover classifications. The main drawback of such approaches is that the quality of the classification result depends directly on the selection of training and reference data. Additionally, supervised processing methods require an experienced operator who capably selects the training samples. This training step is not necessary with unsupervised strategies, but meaningful reference data must nevertheless be available for identifying the resulting classes; consequently, an experienced operator is still indispensable. In this study, an innovative concept for the classification of changes in SAR time series data is proposed. Regarding the drawbacks of traditional strategies given above, it works without any training data. Moreover, the method can be applied by an operator who does not yet have detailed knowledge of the available scenery; this knowledge is provided by the algorithm. The final step of the procedure, whose main aspect is the iterative optimization of an initial class scheme with respect to the categorized change objects, is the assignment of these objects to the finally resulting classes. This assignment step is the subject of this paper.
Reservoir Identification: Parameter Characterization or Feature Classification
NASA Astrophysics Data System (ADS)
Cao, J.
2017-12-01
The ultimate goal of oil and gas exploration is to find oil or gas reservoirs with industrial mining value. Therefore, the core task of modern oil and gas exploration is to identify oil or gas reservoirs on seismic profiles. Traditionally, reservoirs are identified by seismic inversion of a series of physical parameters such as porosity, saturation, permeability, formation pressure, and so on. Due to the heterogeneity of the geological medium, the approximations in the inversion model, and the incompleteness and noisiness of the data, the inversion results are highly uncertain and must be calibrated or corrected with well data. In areas with few or no wells, reservoir identification based on seismic inversion is high-risk. Reservoir identification is essentially a classification issue. In the identification process, the underground rocks are divided into reservoirs with industrial mining value and host rocks without industrial mining value. In addition to the traditional classification by physical parameters, the classification may be achieved using one or a few comprehensive features. By introducing the concept of the seismic-print, we have developed a new reservoir identification method based on seismic-print analysis. Furthermore, we explore the possibility of using deep learning to discover the seismic-print characteristics of oil and gas reservoirs. Preliminary experiments have shown that deep learning on seismic data can distinguish gas reservoirs from host rocks. The combination of seismic-print analysis and seismic deep learning is expected to be a more robust reservoir identification method. The work was supported by NSFC under grant No. 41430323 and No. U1562219, and the National Key Research and Development Program under Grant No. 2016YFC0601
Liu, Zhenli; Liu, Yuanyan; Liu, Chunsheng; Song, Zhiqian; Li, Qing; Zha, Qinglin; Lu, Cheng; Wang, Chun; Ning, Zhangchi; Zhang, Yuxin; Tian, Cheng; Lu, Aiping
2013-07-12
Rhodiola plants are used as a natural remedy in the western world and as a traditional herbal medicine in China, and are valued for their ability to enhance human resistance to stress or fatigue and to promote longevity. Due to the morphological similarities among different species, the identification of the genus remains somewhat controversial, which may affect their safety and effectiveness in clinical use. In this paper, 47 Rhodiola samples of seven species were collected from thirteen local provinces of China. They were identified by their morphological characteristics and genetic and phytochemical taxonomies. Eight bioactive chemotaxonomic markers from four chemical classes (phenylpropanoids, phenylethanol derivatives, flavonoids and phenolic acids) were determined to evaluate and distinguish the chemotaxonomy of Rhodiola samples using an HPLC-DAD/UV method. Hierarchical cluster analysis (HCA) and principal component analysis (PCA) were applied to compare the two classification methods between genetic and phytochemical taxonomy. The established chemotaxonomic classification could be effectively used for Rhodiola species identification.
Electroencephalography epilepsy classifications using hybrid cuckoo search and neural network
NASA Astrophysics Data System (ADS)
Pratiwi, A. B.; Damayanti, A.; Miswanto
2017-07-01
Epilepsy is a condition that affects the brain and causes repeated seizures. These seizures are episodes that can vary from brief and nearly undetectable to long periods of vigorous shaking or brain contractions. Epilepsy can often be confirmed with electroencephalography (EEG). Neural networks have been used in biomedical signal analysis and have successfully classified biomedical signals such as the EEG signal. In this paper, a hybrid of cuckoo search and a neural network is used to recognize EEG signals for epilepsy classification. The weights of the multilayer perceptron are optimized by the cuckoo search algorithm based on its error. The aim of this method is to make the network reach a local or global optimum faster, so that the classification process becomes more accurate. Based on a comparison with the traditional multilayer perceptron, the hybrid of cuckoo search and multilayer perceptron provides better performance in terms of error convergence and accuracy. The proposed method achieves an MSE of 0.001 and an accuracy of 90.0%.
2013-01-01
Background Rhodiola plants are used as a natural remedy in the western world and as a traditional herbal medicine in China, and are valued for their ability to enhance human resistance to stress or fatigue and to promote longevity. Due to the morphological similarities among different species, the identification of the genus remains somewhat controversial, which may affect their safety and effectiveness in clinical use. Results In this paper, 47 Rhodiola samples of seven species were collected from thirteen local provinces of China. They were identified by their morphological characteristics and genetic and phytochemical taxonomies. Eight bioactive chemotaxonomic markers from four chemical classes (phenylpropanoids, phenylethanol derivatives, flavonoids and phenolic acids) were determined to evaluate and distinguish the chemotaxonomy of Rhodiola samples using an HPLC-DAD/UV method. Hierarchical cluster analysis (HCA) and principal component analysis (PCA) were applied to compare the two classification methods between genetic and phytochemical taxonomy. Conclusions The established chemotaxonomic classification could be effectively used for Rhodiola species identification. PMID:23844866
Jaiswara, Ranjana; Nandi, Diptarup; Balakrishnan, Rohini
2013-01-01
Traditional taxonomy based on morphology has often failed in accurate species identification owing to the occurrence of cryptic species, which are reproductively isolated but morphologically identical. Molecular data have thus been used to complement morphology in species identification. The sexual advertisement calls in several groups of acoustically communicating animals are species-specific and can thus complement molecular data as non-invasive tools for identification. Several statistical tools and automated identifier algorithms have been used to investigate the efficiency of acoustic signals in species identification. Despite a plethora of such methods, there is a general lack of knowledge regarding the appropriate usage of these methods in specific taxa. In this study, we investigated the performance of two commonly used statistical methods, discriminant function analysis (DFA) and cluster analysis, in identification and classification based on acoustic signals of field cricket species belonging to the subfamily Gryllinae. Using a comparative approach we evaluated the optimal number of species and calling song characteristics for both the methods that lead to most accurate classification and identification. The accuracy of classification using DFA was high and was not affected by the number of taxa used. However, a constraint in using discriminant function analysis is the need for a priori classification of songs. Accuracy of classification using cluster analysis, which does not require a priori knowledge, was maximum for 6-7 taxa and decreased significantly when more than ten taxa were analysed together. We also investigated the efficacy of two novel derived acoustic features in improving the accuracy of identification. Our results show that DFA is a reliable statistical tool for species identification using acoustic signals. 
Our results also show that cluster analysis of acoustic signals in crickets works effectively for species classification and identification.
Integrative Chemical-Biological Read-Across Approach for Chemical Hazard Classification
Low, Yen; Sedykh, Alexander; Fourches, Denis; Golbraikh, Alexander; Whelan, Maurice; Rusyn, Ivan; Tropsha, Alexander
2013-01-01
Traditional read-across approaches typically rely on the chemical similarity principle to predict chemical toxicity; however, the accuracy of such predictions is often inadequate due to the underlying complex mechanisms of toxicity. Here we report on the development of a hazard classification and visualization method that draws upon both chemical structural similarity and comparisons of biological responses to chemicals measured in multiple short-term assays (”biological” similarity). The Chemical-Biological Read-Across (CBRA) approach infers each compound's toxicity from those of both chemical and biological analogs whose similarities are determined by the Tanimoto coefficient. Classification accuracy of CBRA was compared to that of classical RA and other methods using chemical descriptors alone, or in combination with biological data. Different types of adverse effects (hepatotoxicity, hepatocarcinogenicity, mutagenicity, and acute lethality) were classified using several biological data types (gene expression profiling and cytotoxicity screening). CBRA-based hazard classification exhibited consistently high external classification accuracy and applicability to diverse chemicals. Transparency of the CBRA approach is aided by the use of radial plots that show the relative contribution of analogous chemical and biological neighbors. Identification of both chemical and biological features that give rise to the high accuracy of CBRA-based toxicity prediction facilitates mechanistic interpretation of the models. PMID:23848138
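Since CBRA infers a compound's toxicity from analogs whose similarities are determined by the Tanimoto coefficient, a minimal sketch of that similarity step may help. Here binary fingerprints are modeled as sets of "on" bit positions; the compound names, descriptors, and the choice of k are hypothetical, and only the similarity ranking (not the full read-across inference) is shown.

```python
# A minimal sketch of the Tanimoto similarity used to pick analogs.
# Fingerprints are modeled as sets of "on" bit positions; the compounds
# and descriptors below are hypothetical.

def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two binary fingerprints: |A&B| / |A|B|."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

def nearest_analogs(query, library, k=2):
    """Return the k library compounds most similar to the query."""
    ranked = sorted(library.items(),
                    key=lambda kv: tanimoto(query, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

library = {"cmpd_A": {1, 2, 3, 7}, "cmpd_B": {2, 3, 4}, "cmpd_C": {8, 9}}
print(nearest_analogs({1, 2, 3}, library))  # ['cmpd_A', 'cmpd_B']
```

In CBRA the same neighbor selection is performed twice, once over chemical descriptors and once over biological response profiles, before the neighbors' labels are combined.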
Self-similarity Clustering Event Detection Based on Triggers Guidance
NASA Astrophysics Data System (ADS)
Zhang, Xianfei; Li, Bicheng; Tian, Yuxuan
The traditional method of Event Detection and Characterization (EDC) regards the event detection task as a classification problem. It treats words as samples to train a classifier, which can lead to an imbalance between the classifier's positive and negative samples. Moreover, this method suffers from data sparseness when the corpus is small. This paper does not classify events using words as samples, but instead clusters events to judge event types. It uses self-similarity to converge the value of K in the K-means algorithm under the guidance of event triggers, and optimizes the clustering algorithm. Then, combining named entities and their relative position information, the new method further determines the precise type of each event. The new method avoids the dependence on event templates found in traditional methods, and its event detection results can be used in automatic text summarization, text retrieval, and topic detection and tracking.
MATRIX DISCRIMINANT ANALYSIS WITH APPLICATION TO COLORIMETRIC SENSOR ARRAY DATA
Suslick, Kenneth S.
2014-01-01
With the rapid development of nano-technology, a “colorimetric sensor array” (CSA) which is referred to as an optical electronic nose has been developed for the identification of toxicants. Unlike traditional sensors which rely on a single chemical interaction, CSA can measure multiple chemical interactions by using chemo-responsive dyes. The color changes of the chemo-responsive dyes are recorded before and after exposure to toxicants and serve as a template for classification. The color changes are digitalized in the form of a matrix with rows representing dye effects and columns representing the spectrum of colors. Thus, matrix-classification methods are highly desirable. In this article, we develop a novel classification method, matrix discriminant analysis (MDA), which is a generalization of linear discriminant analysis (LDA) for the data in matrix form. By incorporating the intrinsic matrix-structure of the data in discriminant analysis, the proposed method can improve CSA’s sensitivity and more importantly, specificity. A penalized MDA method, PMDA, is also introduced to further incorporate sparsity structure in discriminant function. Numerical studies suggest that the proposed MDA and PMDA methods outperform LDA and other competing discriminant methods for matrix predictors. The asymptotic consistency of MDA is also established. R code and data are available online as supplementary material. PMID:26783371
[A New Distance Metric between Different Stellar Spectra: the Residual Distribution Distance].
Liu, Jie; Pan, Jing-chang; Luo, A-li; Wei, Peng; Liu, Meng
2015-12-01
The distance metric is an important issue in spectroscopic survey data processing: it defines how the distance between two different spectra is calculated, and classification, clustering, parameter measurement, and outlier mining of spectral data are carried out on that basis. The distance measurement method therefore affects the performance of all of these tasks. With the development of large-scale stellar spectral sky surveys, how to define a more efficient distance metric on stellar spectra has become a very important issue in spectral data processing. Addressing this problem, and fully considering the characteristics and data features of stellar spectra, a new distance measurement method for stellar spectra, named the Residual Distribution Distance, is proposed. Unlike traditional distance metrics for stellar spectra, this method first normalizes the two spectra to the same scale, then calculates the residual at each wavelength, and uses the standard deviation of the residual spectrum as the distance measure. The method can be used for stellar classification, clustering, measurement of stellar atmospheric physical parameters, and so on. This paper takes stellar subcategory classification as an example to test the distance measure. The results show that the distance defined by the proposed method describes the gap between different types of spectra more effectively in classification than other methods, and that it can be applied well in other related applications. This paper also studies the effect of the signal-to-noise ratio (SNR) on the performance of the proposed method.
The results show that the distance is affected by the SNR: the smaller the signal-to-noise ratio, the greater its impact on the distance, while for SNR larger than 10 the signal-to-noise ratio has little effect on classification performance.
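The Residual Distribution Distance described above can be sketched directly from its definition: bring the two spectra to a common scale, form the per-wavelength residual, and return the residual's standard deviation. The least-squares scaling factor used here is one plausible reading of "scaled to the same scale", not necessarily the exact normalization in the paper.

```python
# A minimal sketch of the Residual Distribution Distance: scale one
# spectrum onto the other, take the per-wavelength residual, and use
# its standard deviation as the distance. The least-squares scaling
# factor is an assumption of this sketch.

def residual_distribution_distance(spec_a, spec_b):
    n = len(spec_a)
    # Least-squares factor k minimizing sum((a - k*b)^2).
    k = (sum(a * b for a, b in zip(spec_a, spec_b))
         / sum(b * b for b in spec_b))
    residual = [a - k * b for a, b in zip(spec_a, spec_b)]
    mean = sum(residual) / n
    return (sum((r - mean) ** 2 for r in residual) / n) ** 0.5

# Identical spectral shapes at different flux scales give distance 0:
print(residual_distribution_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # 0.0
```

A metric that is invariant to overall flux scale is desirable for survey spectra, since flux calibration varies between observations of the same spectral type.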
Fuzzy Set Classification of Old-Growth Southern Pine
Don C. Bragg
2002-01-01
I propose the development of a fuzzy set ordination (FSO) approach to old-growth classification of southern pines. A fuzzy systems approach differs from traditional old-growth classification in that it does not require a "crisp" classification where a stand is either "old-growth" or "not old-growth", but allows for fractional membership...
Gender classification from video under challenging operating conditions
NASA Astrophysics Data System (ADS)
Mendoza-Schrock, Olga; Dong, Guozhu
2014-06-01
The literature is abundant with papers on gender classification research. However, the majority of such research assumes that there is enough resolution for the subject's face to be resolved, so most of it actually belongs to the face recognition and facial-feature area. A gap exists for gender classification under challenging operating conditions (different seasonal conditions, different clothing, etc.) and when the subject's face cannot be resolved due to lack of resolution. The Seasonal Weather and Gender (SWAG) Database is a novel database containing subjects walking through a scene under operating conditions that span a calendar year. This paper exploits a subset of that database, the SWAG One dataset, using data mining techniques, traditional classifiers (e.g., Naïve Bayes, Support Vector Machine), and both traditional (Canny edge detection) and non-traditional (height/width ratios) feature extractors to achieve high correct gender classification rates (greater than 85%). Another novelty is the exploitation of frame differentials.
Cognition research and constitutional classification in Chinese medicine.
Wang, Ji; Li, Yingshuai; Ni, Cheng; Zhang, Huimin; Li, Lingru; Wang, Qi
2011-01-01
In the Western medicine system, scholars have explained individual differences in terms of behaviour and thinking, leading to the emergence of various classification theories on individual differences. Traditional Chinese medicine has long observed human constitutions. Modern Chinese medicine studies have also involved study of human constitutions; however, differences exist in the ways traditional and modern Chinese medicine explore individual constitutions. In the late 1970s, the constitutional theory of Chinese medicine was proposed. This theory takes a global and dynamic view of human differences (e.g., the shape of the human body, function, psychology, and other characteristics) based on arguments from traditional Chinese medicine. The establishment of a standard for classifying constitutions into nine modules was critical for clinical application of this theory. In this review, we describe the history and recent research progress of this theory, and compare it with related studies in the western medicine system. Several research methods, including philology, informatics, epidemiology, and molecular biology, in classifying constitutions used in the constitutional theory of Chinese medicine were discussed. In summary, this constitutional theory of Chinese medicine can be used in clinical practice and would contribute to health control of patients.
Fuzzy-Rough Nearest Neighbour Classification
NASA Astrophysics Data System (ADS)
Jensen, Richard; Cornelis, Chris
A new fuzzy-rough nearest neighbour (FRNN) classification algorithm is presented in this paper, as an alternative to Sarkar's fuzzy-rough ownership function (FRNN-O) approach. By contrast to the latter, our method uses the nearest neighbours to construct lower and upper approximations of decision classes, and classifies test instances based on their membership to these approximations. In the experimental analysis, we evaluate our approach with both classical fuzzy-rough approximations (based on an implicator and a t-norm), as well as with the recently introduced vaguely quantified rough sets. Preliminary results are very good, and in general FRNN outperforms FRNN-O, as well as the traditional fuzzy nearest neighbour (FNN) algorithm.
Self-Harmful Behaviors in a Population-Based Sample of Young Adults
ERIC Educational Resources Information Center
Nada-Raja, Shyamala; Skegg, Keren; Langley, John; Morrison, Dianne; Sowerby, Paula
2004-01-01
A birth cohort of 472 women and 494 men aged 26 years was interviewed about a range of self-harmful behaviors first and then asked about suicidal intent. Lifetime prevalence of self-harm using traditional methods of suicide (ICD [International Classification of Diseases] self-harm) was 13%, with 9% of the sample describing at least one such…
Monitoring urban tree cover using object-based image analysis and public domain remotely sensed data
L. Monika Moskal; Diane M. Styers; Meghan Halabisky
2011-01-01
Urban forest ecosystems provide a range of social and ecological services, but due to the heterogeneity of these canopies their spatial extent is difficult to quantify and monitor. Traditional per-pixel classification methods have been used to map urban canopies, however, such techniques are not generally appropriate for assessing these highly variable landscapes....
Support vector machine and principal component analysis for microarray data classification
NASA Astrophysics Data System (ADS)
Astuti, Widi; Adiwijaya
2018-03-01
Cancer is a leading cause of death worldwide, although a significant proportion of cases can be cured if detected early. In recent decades, a technology called the microarray has taken an important role in the diagnosis of cancer. By using data mining techniques, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared to traditional techniques. Microarray data are characterized by small sample sizes but huge dimensionality. Hence, researchers are challenged to provide solutions for microarray data classification with high performance in both accuracy and running time. This research proposes the use of Principal Component Analysis (PCA) as a dimension-reduction method along with a Support Vector Machine (SVM), optimized by kernel functions, as a classifier for microarray data classification. The proposed scheme was applied to seven data sets using 5-fold cross-validation, and evaluation and analysis were then conducted in terms of both accuracy and running time. The results showed that the scheme obtains 100% accuracy for the Ovarian and Lung Cancer data when the Linear and Cubic kernel functions are used. In terms of running time, PCA greatly reduced the running time for every data set.
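The PCA dimension-reduction step described above can be sketched with an SVD of the centered data matrix. This is a generic illustration on synthetic data, not the paper's pipeline: the SVM classifier, the kernel choices, and the cross-validation loop are omitted.

```python
import numpy as np

# A minimal sketch of PCA dimension reduction via SVD of the centered
# data matrix. Data are synthetic; the SVM step from the abstract is
# not reproduced here.

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)                  # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T          # scores in component space

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 1000))              # 20 samples, 1000 "genes"
Z = pca_reduce(X, n_components=5)
print(Z.shape)  # (20, 5)
```

Reducing a few thousand gene-expression features to a handful of components is what makes the subsequent classifier tractable on such small-sample, high-dimensional data.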
Protein classification using probabilistic chain graphs and the Gene Ontology structure.
Carroll, Steven; Pavlovic, Vladimir
2006-08-01
Probabilistic graphical models have been developed in the past for the task of protein classification. In many cases, classifications obtained from the Gene Ontology have been used to validate these models. In this work we directly incorporate the structure of the Gene Ontology into the graphical representation for protein classification. We present a method in which each protein is represented by a replicate of the Gene Ontology structure, effectively modeling each protein in its own 'annotation space'. Proteins are also connected to one another according to different measures of functional similarity, after which belief propagation is run to make predictions at all ontology terms. The proposed method was evaluated on a set of 4879 proteins from the Saccharomyces Genome Database whose interactions were also recorded in the GRID project. Results indicate that direct utilization of the Gene Ontology improves predictive ability, outperforming traditional models that do not take advantage of dependencies among functional terms. Average increase in accuracy (precision) of positive and negative term predictions of 27.8% (2.0%) over three different similarity measures and three subontologies was observed. C/C++/Perl implementation is available from authors upon request.
OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms
Meng, Zhaoyi; Koniges, Alice; He, Yun Helen; ...
2016-09-21
In this paper, we investigate the OpenMP parallelization and optimization of two novel data classification algorithms. The new algorithms are based on graph and PDE solution techniques and provide significant accuracy and performance advantages over traditional data classification algorithms in serial mode. The methods leverage the Nystrom extension to calculate eigenvalues/eigenvectors of the graph Laplacian, and this is a self-contained module that can be used in conjunction with other graph-Laplacian-based methods such as spectral clustering. We use performance tools to collect the hotspots and memory access of the serial codes and use OpenMP as the parallelization language to parallelize the most time-consuming parts. Where possible, we also use library routines. We then optimize the OpenMP implementations and detail the performance on traditional supercomputer nodes (in our case a Cray XC30), and test the optimization steps on emerging testbed systems based on Intel's Knights Corner and Landing processors. We show both performance improvement and strong scaling behavior. Finally, a large number of optimization techniques and analyses are necessary before the algorithm reaches almost ideal scaling.
Weighted Discriminative Dictionary Learning based on Low-rank Representation
NASA Astrophysics Data System (ADS)
Chang, Heyou; Zheng, Hao
2017-01-01
Low-rank representation has been widely used in the field of pattern classification, especially when both training and testing images are corrupted with large noise. The dictionary plays an important role in low-rank representation: with respect to a semantic dictionary, the optimal representation matrix should be block-diagonal. However, traditional dictionary learning methods based on low-rank representation cannot effectively exploit the discriminative information between data and dictionary. To address this problem, this paper proposes weighted discriminative dictionary learning based on low-rank representation, in which a weighted representation regularization term is constructed. The regularization term associates the label information of both training samples and dictionary atoms, and encourages the generation of a discriminative representation with a class-wise block-diagonal structure, which can further improve classification performance when both training and testing images are corrupted with large noise. Experimental results demonstrate the advantages of the proposed method over state-of-the-art methods.
Li, Zhao-Liang
2018-01-01
Few studies have examined hyperspectral remote-sensing image classification with type-II fuzzy sets. This paper addresses image classification based on a hyperspectral remote-sensing technique using an improved interval type-II fuzzy c-means (IT2FCM*) approach. In contrast to other traditional fuzzy c-means-based approaches, the IT2FCM* algorithm considers the ranking of interval numbers and the spectral uncertainty. The classification results on a hyperspectral dataset using the FCM, IT2FCM, and the proposed improved IT2FCM* algorithms show that the IT2FCM* method performs best in terms of clustering accuracy. To validate and demonstrate the separability of the IT2FCM*, four type-I fuzzy validity indexes are employed, and a comparative analysis of these indexes as applied to the FCM and IT2FCM methods is made. These four indexes are also applied to datasets of different spatial and spectral resolution to analyze the effects of spectral and spatial scaling factors on the separability of the FCM, IT2FCM, and IT2FCM* methods. The results of these validity indexes on the hyperspectral datasets show that the improved IT2FCM* algorithm generally has the best values among the three algorithms. The results demonstrate that the IT2FCM* exhibits good performance in hyperspectral remote-sensing image classification because of its ability to handle hyperspectral uncertainty. PMID:29373548
Back to the future: therapies for idiopathic nephrotic syndrome.
Gibson, Keisha L; Glenn, Dorey; Ferris, Maria E
2015-01-01
Roughly 20-40% of individuals with idiopathic nephrotic syndrome will fail to respond to standard therapies and have a high risk of progression to end-stage kidney disease (ESKD). In the last 50 years, no new therapies have been approved specifically for the treatment of these individuals with recalcitrant disease. Recent in vitro, translational, and clinical studies have identified novel targets and pathways that not only expand our understanding of the complex pathophysiology of proteinuric disease but also provide an opportunity to challenge the tradition of relying on the histologic classification of nephrotic diseases to make treatment decisions. The traditional approach of directing care by histological classification, or of choosing second-line therapies on the basis of steroid responsiveness, may soon give way to therapies customized to our expanding understanding of molecular targets. Important non-immunologic mechanisms of widely used immunosuppressive therapies may be just as important in palliating proteinuric disease as their proposed immunologic functions. © 2015 S. Karger AG, Basel.
An automated approach to mapping corn from Landsat imagery
Maxwell, S.K.; Nuckols, J.R.; Ward, M.H.; Hoffer, R.M.
2004-01-01
Most land cover maps generated from Landsat imagery involve classification of a wide variety of land cover types, whereas some studies may only need spatial information on a single cover type. For example, we required a map of corn in order to estimate exposure to agricultural chemicals for an environmental epidemiology study. Traditional classification techniques, which require the collection and processing of costly ground reference data, were not feasible for our application because of the large number of images to be analyzed. We present a new method that has the potential to automate the classification of corn from Landsat satellite imagery, resulting in a more timely product for applications covering large geographical regions. Our approach uses readily available agricultural areal estimates to enable automation of the classification process resulting in a map identifying land cover as ‘highly likely corn,’ ‘likely corn’ or ‘unlikely corn.’ To demonstrate the feasibility of this approach, we produced a map consisting of the three corn likelihood classes using a Landsat image in south central Nebraska. Overall classification accuracy of the map was 92.2% when compared to ground reference data.
A new approach to enhance the performance of decision tree for classifying gene expression data.
Hassan, Md; Kotagiri, Ramamohanarao
2013-12-20
Gene expression data classification is a challenging task due to the large dimensionality and very small number of samples. The decision tree is one of the popular machine learning approaches to such classification problems. However, existing decision tree algorithms use a single gene feature at each node to split the data into child nodes and hence may suffer from poor performance, especially when classifying gene expression datasets. By using a new decision tree algorithm in which each node of the tree consists of more than one gene, we enhance the classification performance of traditional decision tree classifiers. Our method selects suitable genes that are combined using a linear function to form a derived composite feature. To determine the structure of the tree we use the area under the Receiver Operating Characteristic curve (AUC). Experimental analysis demonstrates higher classification accuracy using the new decision tree compared to other existing decision trees in the literature. We experimentally compare the effect of our scheme against other well-known decision tree techniques. Experiments show that our algorithm can substantially boost the classification performance of the decision tree.
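The AUC criterion mentioned above can be computed without any curve-plotting: it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counting half. The sketch below shows only this criterion on hypothetical composite-feature scores; the gene selection and tree construction from the paper are not reproduced.

```python
# A minimal sketch of the AUC criterion: the fraction of
# (positive, negative) pairs in which the positive sample is scored
# higher, ties counting 0.5. Scores are hypothetical composite-feature
# values; labels are 1 (positive) or 0 (negative).

def auc(scores, labels):
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# A perfectly separating composite feature scores AUC = 1.0;
# an uninformative one scores about 0.5:
print(auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))  # 1.0
print(auc([0.9, 0.2, 0.8, 0.1], [1, 0, 0, 1]))  # 0.5
```

Using AUC rather than raw accuracy to rank candidate splits is attractive for gene expression data precisely because the classes are usually imbalanced.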
A model for anomaly classification in intrusion detection systems
NASA Astrophysics Data System (ADS)
Ferreira, V. O.; Galhardi, V. V.; Gonçalves, L. B. L.; Silva, R. C.; Cansian, A. M.
2015-09-01
Intrusion Detection Systems (IDS) are traditionally divided into two types according to the detection methods they employ, namely (i) misuse detection and (ii) anomaly detection. Anomaly detection has been widely used, and its main advantage is the ability to detect new attacks. However, the analysis of the anomalies generated can become expensive, since they often carry no clear information about the malicious events they represent. In this context, this paper presents a model for automated classification of the alerts generated by an anomaly-based IDS. The main goal is either to classify the detected anomalies into well-defined attack taxonomies or to identify whether an alert is a false positive misclassified by the IDS. Some common attacks on computer networks were considered, and we achieved important results that can equip security analysts with better resources for their analyses.
Park, Sang-Sub
2014-01-01
The purpose of this study is to assess the difference in chest compression accuracy between a modified chest compression method that uses a smartphone application and the standardized traditional chest compression method. Of the 70 people who agreed to participate after completing the CPR curriculum, 64 took part (6 were absent). Participants using the modified chest compression method were designated the smartphone group (33 people), and those using the standardized method the traditional group (31 people). Both groups used the same practice and evaluation manikins; in addition, the smartphone group used applications on two smartphone products (G, i) running the Android and iOS operating systems. Measurements were conducted from September 25th to 26th, 2012, and data were analyzed with the SPSS WIN 12.0 program. Compression depth was significantly greater (p < 0.01) in the traditional group (53.77 mm) than in the smartphone group (48.35 mm), and the proportion of proper chest compressions was also higher (p < 0.05) in the traditional group (73.96%) than in the smartphone group (60.51%). The traditional group (3.83 points) likewise showed higher awareness of chest compression accuracy (p < 0.001) than the smartphone group (2.32 points). In a questionnaire administered to the smartphone group only, the main reasons rescuers viewed the modified method negatively were hand-back pain (48.5%) and unstable posture (21.2%).
Analysis of signals under compositional noise with applications to SONAR data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tucker, J. Derek; Wu, Wei; Srivastava, Anuj
2013-07-09
In this paper, we consider the problem of denoising and classification of SONAR signals observed under compositional noise, i.e., signals that have been warped randomly along the x-axis. Traditional techniques do not account for such noise and, consequently, cannot provide a robust classification of signals. We apply a recent framework that: 1) uses a distance-based objective function for data alignment and noise reduction; and 2) leads to warping-invariant distances between signals for robust clustering and classification. We use this framework to introduce two distances that can be used for signal classification: a) a y-distance, which is the distance between the aligned signals; and b) an x-distance that measures the amount of warping needed to align the signals. We focus on the task of clustering and classifying objects using the acoustic spectrum (acoustic color), which is complicated by uncertainties in aspect angles at data collection. Small changes in the aspect angles corrupt signals in a way that amounts to compositional noise. As a result, we demonstrate the use of the developed metrics in classification of acoustic color data and highlight improvements in signal classification over current methods.
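The two distances described above can be sketched on discretized signals. This is a simplified illustration under stated assumptions: the warping function is just a hypothetical index map on [0, 1], and the paper's elastic (square-root velocity) alignment machinery is omitted; the signal values are invented.

```python
# Hedged sketch of the y-distance (between aligned signals) and x-distance
# (amount of warping) on sampled signals. The warping map gamma is assumed,
# not computed; the paper's elastic alignment framework is not reproduced.

import math

def y_distance(f, g):
    """Root-mean-square distance between two (already aligned) signals."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)) / len(f))

def x_distance(gamma):
    """Amount of warping: RMS distance of a warping map from the identity."""
    n = len(gamma)
    identity = [i / (n - 1) for i in range(n)]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(gamma, identity)) / n)

signal_a = [0.0, 0.5, 1.0, 0.5, 0.0]
signal_b = [0.0, 0.4, 0.9, 0.6, 0.1]
gamma = [0.0, 0.3, 0.5, 0.8, 1.0]   # hypothetical warping of the x-axis

print(y_distance(signal_a, signal_b))  # small: signals are nearly aligned
print(x_distance(gamma))               # non-zero: gamma deviates from identity
```

A classifier would then combine both quantities, since two signals can be close in shape (small y-distance) yet require substantial warping (large x-distance).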
Metric learning with spectral graph convolutions on brain connectivity networks.
Ktena, Sofia Ira; Parisot, Sarah; Ferrante, Enzo; Rajchl, Martin; Lee, Matthew; Glocker, Ben; Rueckert, Daniel
2018-04-01
Graph representations are often used to model structured data at an individual or population level and have numerous applications in pattern recognition problems. In the field of neuroscience, where such representations are commonly used to model structural or functional connectivity between a set of brain regions, graphs have proven to be of great importance. This is mainly due to the capability of revealing patterns related to brain development and disease, which were previously unknown. Evaluating similarity between these brain connectivity networks in a manner that accounts for the graph structure and is tailored for a particular application is, however, non-trivial. Most existing methods fail to accommodate the graph structure, discarding information that could be beneficial for further classification or regression analyses based on these similarities. We propose to learn a graph similarity metric using a siamese graph convolutional neural network (s-GCN) in a supervised setting. The proposed framework takes into consideration the graph structure for the evaluation of similarity between a pair of graphs, by employing spectral graph convolutions that allow the generalisation of traditional convolutions to irregular graphs and operates in the graph spectral domain. We apply the proposed model on two datasets: the challenging ABIDE database, which comprises functional MRI data of 403 patients with autism spectrum disorder (ASD) and 468 healthy controls aggregated from multiple acquisition sites, and a set of 2500 subjects from UK Biobank. We demonstrate the performance of the method for the tasks of classification between matching and non-matching graphs, as well as individual subject classification and manifold learning, showing that it leads to significantly improved results compared to traditional methods.
Study on the Classification of GAOFEN-3 Polarimetric SAR Images Using Deep Neural Network
NASA Astrophysics Data System (ADS)
Zhang, J.; Zhang, J.; Zhao, Z.
2018-04-01
Polarimetric Synthetic Aperture Radar (POLSAR) imaging is inherently affected by speckle noise, which reduces the recognition accuracy of traditional image classification methods. Since its introduction, the deep convolutional neural network has reshaped traditional image processing methods and brought the field of computer vision to a new stage, owing to its strong ability to learn deep features and to fit large datasets. Based on the basic characteristics of polarimetric SAR images, this paper studies surface cover classification using deep learning. We fused fully polarimetric SAR features of different scales into RGB images, used them to iteratively train a GoogLeNet convolutional neural network model, and then evaluated the trained model on a validation dataset. First, referring to optical imagery, we labeled the surface cover types of a GF-3 POLSAR image with 8 m resolution and collected samples for each category. To meet GoogLeNet's requirement of a 256 × 256 pixel input, and taking into account the limited SAR resolution, the original image was pre-processed by resampling. POLSAR image slice samples of different scales, with sampling intervals of 2 m and 1 m, were trained separately and validated against the verification dataset. The training accuracy of the GoogLeNet model trained with 2 m resampled polarimetric SAR imagery is 94.89%, and that of the model trained with 1 m resampled imagery is 92.65%.
Carnahan, Brian; Meyer, Gérard; Kuntz, Lois-Ann
2003-01-01
Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches--genetic programming and decision tree induction--were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.
Qualitative and Quantitative Analysis for Facial Complexion in Traditional Chinese Medicine
Zhao, Changbo; Li, Guo-zheng; Li, Fufeng; Wang, Zhi; Liu, Chang
2014-01-01
Facial diagnosis is an important and very intuitive diagnostic method in Traditional Chinese Medicine (TCM). However, due to its qualitative and experience-based subjective nature, traditional facial diagnosis has certain limitations in clinical medicine. Computerized inspection methods provide classification models to recognize facial complexion (including color and gloss). However, previous work has only studied the classification problem of facial complexion, which we consider qualitative analysis. For quantitative analysis, the severity or degree of facial complexion has not yet been reported. This paper aims to provide both qualitative and quantitative analysis of facial complexion. We propose a novel feature representation of facial complexion from the whole face of patients. The features are established with four chromaticity bases split up by luminance distribution in the CIELAB color space. The chromaticity bases are constructed from the facial dominant color using two-level clustering; the optimal luminance distribution is simply determined through experimental comparisons. The features are shown to be more distinctive than previous facial complexion feature representations. Complexion recognition proceeds by training an SVM classifier with the optimal model parameters. In addition, further improved features are developed by the weighted fusion of five local regions. Extensive experimental results show that the proposed features achieve the highest facial color recognition performance, with a total accuracy of 86.89%. Furthermore, the proposed recognition framework can analyze both the color and gloss degrees of facial complexion by learning a ranking function. PMID:24967342
NASA Astrophysics Data System (ADS)
Liu, Wanjun; Liang, Xuejian; Qu, Haicheng
2017-11-01
Hyperspectral image (HSI) classification is one of the most popular topics in the remote sensing community. Traditional and deep learning-based classification methods have been proposed constantly in recent years. In order to improve classification accuracy and robustness, a dimensionality-varied convolutional neural network (DVCNN) is proposed in this paper. DVCNN is a novel deep architecture based on the convolutional neural network (CNN). The input of DVCNN is a set of 3D patches selected from the HSI which contain spectral-spatial joint information. In the feature extraction process, each patch is transformed into several different 1D vectors by 3D convolution kernels, which are able to extract features from spectral-spatial data. The rest of DVCNN is much the same as a general CNN and processes the 2D matrix constituted by all the 1D vectors. Thus, DVCNN can not only extract more accurate and richer features than a CNN, but also fuse spectral-spatial information to improve classification accuracy. Moreover, the robustness of the network on water-absorption bands is enhanced by the spectral-spatial fusion through 3D convolution, and the computation is simplified by dimensionality-varied convolution. Experiments were performed on both the Indian Pines and Pavia University scene datasets, and the results showed that the classification accuracy of DVCNN improved by 32.87% on Indian Pines and 19.63% on Pavia University scene over a spectral-only CNN. The maximum accuracy improvement achieved by DVCNN was 13.72% compared with other state-of-the-art HSI classification methods, and the robustness of DVCNN to water-absorption band noise was demonstrated.
Chen, Zhao; Cao, Yanfeng; He, Shuaibing; Qiao, Yanjiang
2018-01-01
Action ("gongxiao" in Chinese) in traditional Chinese medicine (TCM) is the high-level summary of therapeutic and health-preserving effects under the guidance of TCM theory. TCM-defined herbal properties ("yaoxing" in Chinese) were used in this research. A TCM herbal property (TCM-HP) is a high-level generalization and summary of actions; both derive from more than two thousand years of effective clinical practice in China. However, the specific relationship between TCM-HPs and the actions of TCM is complex and, from a scientific perspective, unclear. Research on this relationship helps to expound the content of TCM-HP theory and is of significance for its development. One hundred and thirty-three herbs, including 88 heat-clearing herbs (HCHs) and 45 blood-activating stasis-resolving herbs (BASRHs), were collected from reputable TCM literature, and their corresponding TCM-HP/action information was collected from the Chinese pharmacopoeia (2015 edition). The Kennard-Stone (K-S) algorithm was used to split the 133 herbs into 100 calibration samples and 33 validation samples. Then, machine learning methods including the support vector machine (SVM) and k-nearest neighbor (kNN), and deep learning methods including the deep belief network (DBN) and convolutional neural network (CNN), were adopted to develop action classification models based on TCM-HP theory. To ensure robustness, the four classification methods were evaluated by tenfold cross validation and by prediction on 20 external validation samples. As a result, 72.7-100% of the 33 validation samples, comprising 17 HCHs and 16 BASRHs, were correctly predicted by the four methods. The DBN and CNN methods gave the best results, with sensitivity, specificity, precision, and accuracy all at 100.00%.
In particular, the predicted results on the external validation set showed that the deep learning methods (DBN, CNN) outperformed the traditional machine learning methods (kNN, SVM) in sensitivity, specificity, precision, and accuracy. Moreover, the distribution patterns of the TCM-HPs of HCHs and BASRHs were analyzed to detect the featured TCM-HPs of these two types of herbs. The featured TCM-HPs of HCHs were cold, bitter, and entering the liver and stomach meridians, while those of BASRHs were warm, bitter and pungent, and entering the liver meridian. The deep learning classification methods showed better generalization ability and accuracy when predicting the heat-clearing and blood-activating stasis-resolving actions based on TCM-HP theory, and would help to improve our understanding of the relationship between herbal property and action, as well as to enrich and develop TCM-HP theory scientifically.
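The Kennard-Stone split mentioned in the entry above is a concrete, self-contained algorithm, so a sketch is possible. This is a hedged illustration: the 2-D toy points below stand in for the study's TCM-HP descriptors, which are not available here.

```python
# Hedged sketch of the Kennard-Stone (K-S) algorithm used to split samples
# into calibration and validation sets. Feature vectors are invented 2-D
# points; the real study used herbal-property descriptors.

def kennard_stone(samples, n_select):
    """Select n_select sample indices that maximally cover the feature space."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # Start with the two most distant samples.
    pairs = [(dist(samples[i], samples[j]), i, j)
             for i in range(len(samples)) for j in range(i + 1, len(samples))]
    _, i, j = max(pairs)
    selected = [i, j]
    remaining = [k for k in range(len(samples)) if k not in selected]

    # Repeatedly add the sample farthest from its nearest selected neighbour.
    while len(selected) < n_select:
        k = max(remaining,
                key=lambda r: min(dist(samples[r], samples[s]) for s in selected))
        selected.append(k)
        remaining.remove(k)
    return sorted(selected)

points = [(0, 0), (0.1, 0.1), (1, 1), (0.9, 1.0), (0, 1), (1, 0)]
calibration = kennard_stone(points, 4)
print(calibration)  # indices of the 4 most space-covering samples
```

The unselected indices form the validation set, mirroring the 100/33 split reported above.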
Data-driven automated acoustic analysis of human infant vocalizations using neural network tools.
Warlaumont, Anne S; Oller, D Kimbrough; Buder, Eugene H; Dale, Rick; Kozma, Robert
2010-04-01
Acoustic analysis of infant vocalizations has typically employed traditional acoustic measures drawn from adult speech acoustics, such as f(0), duration, formant frequencies, amplitude, and pitch perturbation. Here an alternative and complementary method is proposed in which data-derived spectrographic features are central. 1-s-long spectrograms of vocalizations produced by six infants recorded longitudinally between ages 3 and 11 months are analyzed using a neural network consisting of a self-organizing map and a single-layer perceptron. The self-organizing map acquires a set of holistic, data-derived spectrographic receptive fields. The single-layer perceptron receives self-organizing map activations as input and is trained to classify utterances into prelinguistic phonatory categories (squeal, vocant, or growl), identify the ages at which they were produced, and identify the individuals who produced them. Classification performance was significantly better than chance for all three classification tasks. Performance is compared to another popular architecture, the fully supervised multilayer perceptron. In addition, the network's weights and patterns of activation are explored from several angles, for example, through traditional acoustic measurements of the network's receptive fields. Results support the use of this and related tools for deriving holistic acoustic features directly from infant vocalization data and for the automatic classification of infant vocalizations.
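The self-organizing map at the core of the network described above can be sketched in its simplest form: find the best-matching unit for an input and pull nearby map weights toward it. This is a hedged, minimal version; the map size, learning rate, and 2-D toy inputs are invented (the real inputs were spectrogram frames), and neighbourhood decay is omitted.

```python
# Hedged sketch of one self-organizing map (SOM) training step: best-matching
# unit search plus a neighbourhood update on a tiny 1-D map. All sizes and
# rates are illustrative assumptions.

def best_matching_unit(weights, x):
    """Index of the map node whose weight vector is closest to x."""
    def d2(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: d2(weights[i]))

def som_update(weights, x, bmu, lr=0.5, radius=1):
    """Move the BMU and its neighbours (on a 1-D map) toward input x."""
    return [[w + lr * (xi - w) for w, xi in zip(node, x)]
            if abs(i - bmu) <= radius else node
            for i, node in enumerate(weights)]

som = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]   # 3-node 1-D map
x = [0.9, 1.0]
bmu = best_matching_unit(som, x)
som = som_update(som, x, bmu)
print(bmu, som)
```

After training, the map's weight vectors act as the "data-derived spectrographic receptive fields" the abstract describes, and node activations feed the perceptron classifier.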
A Deep Learning Approach for Fault Diagnosis of Induction Motors in Manufacturing
NASA Astrophysics Data System (ADS)
Shao, Si-Yu; Sun, Wen-Jun; Yan, Ru-Qiang; Wang, Peng; Gao, Robert X.
2017-11-01
Extracting features from original signals is a key procedure in traditional fault diagnosis of induction motors, as it directly influences the performance of fault recognition. However, high-quality features require expert knowledge and human intervention. In this paper, a deep learning approach based on deep belief networks (DBN) is developed to learn features from the frequency distribution of vibration signals with the purpose of characterizing the working status of induction motors. It combines the feature extraction procedure with the classification task to achieve automated and intelligent fault diagnosis. The DBN model is built by stacking multiple units of the restricted Boltzmann machine (RBM) and is trained using a layer-by-layer pre-training algorithm. Compared with traditional diagnostic approaches where feature extraction is needed, the presented approach has the ability to learn hierarchical representations, which are suitable for fault classification, directly from the frequency distribution of the measurement data. The structure of the DBN model is investigated, as the scale and depth of the DBN architecture directly affect its classification performance. An experimental study conducted on a machine fault simulator verifies the effectiveness of the deep learning approach for fault diagnosis of induction motors. This research proposes an intelligent diagnosis method for induction motors that utilizes a deep learning model to automatically learn features from sensor data and recognize working status.
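The RBM building block that the abstract stacks into a DBN can be sketched with one contrastive-divergence update. This is a hedged simplification: a deterministic mean-field step replaces Gibbs sampling, bias updates are omitted, and the unit counts, learning rate, and input are invented.

```python
# Hedged sketch: one mean-field CD-1 weight update for a tiny RBM, the
# building block stacked into a DBN. Sampling and bias updates are omitted
# for brevity; all sizes and values are illustrative assumptions.

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probs(v, W, b_h):
    """P(h_j = 1 | v) for each hidden unit; W has one row per hidden unit."""
    return [sigmoid(b + sum(w * vi for w, vi in zip(row, v)))
            for b, row in zip(b_h, W)]

def visible_probs(h, W, b_v):
    """P(v_i = 1 | h) for each visible unit."""
    return [sigmoid(b + sum(W[j][i] * h[j] for j in range(len(h))))
            for i, b in enumerate(b_v)]

def cd1_step(v0, W, b_v, b_h, lr=0.1):
    """One mean-field CD-1 update; returns the new weight matrix."""
    h0 = hidden_probs(v0, W, b_h)      # positive phase
    v1 = visible_probs(h0, W, b_v)     # reconstruction
    h1 = hidden_probs(v1, W, b_h)      # negative phase
    return [[w + lr * (h0[j] * v0[i] - h1[j] * v1[i])
             for i, w in enumerate(row)] for j, row in enumerate(W)]

# Tiny example: 3 visible units, 2 hidden units, zero-initialised weights.
W = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
b_v, b_h = [0.0, 0.0, 0.0], [0.0, 0.0]
v0 = [1.0, 0.0, 1.0]
W = cd1_step(v0, W, b_v, b_h)
print(W)
```

Layer-by-layer pre-training, as described above, would repeat such updates for one RBM, freeze it, and feed its hidden activations to the next RBM in the stack.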
Novel high/low solubility classification methods for new molecular entities.
Dave, Rutwij A; Morris, Marilyn E
2016-09-10
This research describes a rapid solubility classification approach that could be used in the discovery and development of new molecular entities. Compounds (N=635) were divided into two groups based on information available in the literature: high solubility (BDDCS/BCS 1/3) and low solubility (BDDCS/BCS 2/4). We established decision rules for determining solubility classes using measured log solubility in molar units (MLogSM) or measured solubility (MSol) in mg/ml units. ROC curve analysis was applied to determine statistically significant threshold values of MSol and MLogSM. Results indicated that NMEs with MLogSM>-3.05 or MSol>0.30mg/mL will have ≥85% probability of being highly soluble and new molecular entities with MLogSM≤-3.05 or MSol≤0.30mg/mL will have ≥85% probability of being poorly soluble. When comparing solubility classification using the threshold values of MLogSM or MSol with BDDCS, we were able to correctly classify 85% of compounds. We also evaluated solubility classification of an independent set of 108 orally administered drugs using MSol (0.3mg/mL) and our method correctly classified 81% and 95% of compounds into high and low solubility classes, respectively. The high/low solubility classification using MLogSM or MSol is novel and independent of traditionally used dose number criteria.
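The decision rule reported above is simple enough to state directly in code. This is a hedged sketch of that rule only; the function name is illustrative, and the example inputs are invented compounds, not data from the study.

```python
# Hedged sketch of the reported decision rule: a compound is called highly
# soluble if MLogSM > -3.05 or MSol > 0.30 mg/mL, otherwise poorly soluble.
# Function name and example values are illustrative assumptions.

def solubility_class(mlogsm=None, msol=None):
    """Classify 'high' or 'low' solubility from either available measurement."""
    if mlogsm is not None and mlogsm > -3.05:
        return "high"
    if msol is not None and msol > 0.30:
        return "high"
    return "low"

print(solubility_class(mlogsm=-2.1))             # high, via MLogSM
print(solubility_class(msol=0.05))               # low
print(solubility_class(mlogsm=-4.0, msol=0.5))   # high, via MSol
```

Note that values exactly at the thresholds (MLogSM = -3.05, MSol = 0.30 mg/mL) fall in the low-solubility class, matching the "≤" side of the rule as stated.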
NASA Astrophysics Data System (ADS)
Liu, Tao; Im, Jungho; Quackenbush, Lindi J.
2015-12-01
This study provides a novel approach to individual tree crown delineation (ITCD) using airborne Light Detection and Ranging (LiDAR) data in dense natural forests using two main steps: crown boundary refinement based on a proposed Fishing Net Dragging (FiND) method, and segment merging based on boundary classification. FiND starts with approximate tree crown boundaries derived using a traditional watershed method with Gaussian filtering and refines these boundaries using an algorithm that mimics how a fisherman drags a fishing net. Random forest machine learning is then used to classify boundary segments into two classes: boundaries between trees and boundaries between branches that belong to a single tree. Three groups of LiDAR-derived features-two from the pseudo waveform generated along with crown boundaries and one from a canopy height model (CHM)-were used in the classification. The proposed ITCD approach was tested using LiDAR data collected over a mountainous region in the Adirondack Park, NY, USA. Overall accuracy of boundary classification was 82.4%. Features derived from the CHM were generally more important in the classification than the features extracted from the pseudo waveform. A comprehensive accuracy assessment scheme for ITCD was also introduced by considering both area of crown overlap and crown centroids. Accuracy assessment using this new scheme shows the proposed ITCD achieved 74% and 78% as overall accuracy, respectively, for deciduous and mixed forest.
Fu, Chen; Zhang, Nevin Lianwen; Chen, Bao-Xin; Chen, Zhou Rong; Jin, Xiang Lan; Guo, Rong-Juan; Chen, Zhi-Gang; Zhang, Yun-Ling
2017-05-01
To treat patients with vascular mild cognitive impairment (VMCI) using traditional Chinese medicine (TCM), it is necessary to classify the patients into TCM syndrome types and to apply different treatments to different types. In this paper, we investigate how to properly carry out this classification for patients with VMCI aged 50 or above using a novel data-driven method known as latent tree analysis (LTA). A cross-sectional survey on VMCI was carried out in several regions of Northern China between February 2008 and February 2012, resulting in a data set involving 803 patients and 93 symptoms. LTA was performed on the data to reveal symptom co-occurrence patterns, and the patients were partitioned into clusters in multiple ways based on the patterns. The patient clusters were matched up with syndrome types, and population statistics of the clusters were used to quantify the syndrome types and to establish classification rules. Eight syndrome types were identified: Qi deficiency, Qi stagnation, Blood deficiency, Blood stasis, Phlegm-dampness, Fire-heat, Yang deficiency, and Yin deficiency. The prevalence and symptom occurrence characteristics of each syndrome type were determined, and quantitative classification rules were established for determining whether a patient belongs to each of the syndrome types. A solution to the TCM syndrome classification problem for patients with VMCI aged 50 or above is thus established based on the LTA of unlabeled symptom survey data. The results can be used as a reference in clinical practice to improve the quality of syndrome differentiation and to reduce diagnostic variance across physicians. They can also be used for patient selection in research projects aimed at finding biomarkers for the syndrome types and in randomized controlled trials aimed at determining the efficacy of TCM treatments of VMCI.
Boan, Andrea D; Voeks, Jenifer H; Feng, Wuwei Wayne; Bachman, David L; Jauch, Edward C; Adams, Robert J; Ovbiagele, Bruce; Lackland, Daniel T
2014-01-01
The use of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9) diagnostic codes can identify racial disparities in ischemic stroke hospitalizations; however, inclusion of revascularization procedure codes as acute stroke events may affect the magnitude of the risk difference. This study assesses the impact of excluding revascularization procedure codes in the ICD-9 definition of ischemic stroke, compared with the traditional inclusive definition, on racial disparity estimates for stroke incidence and recurrence. Patients discharged with a diagnosis of ischemic stroke (ICD-9 codes 433.00-434.91 and 436) were identified from a statewide inpatient discharge database from 2010 to 2012. Race-age specific disparity estimates of stroke incidence and recurrence and 1-year cumulative recurrent stroke rates were compared between the routinely used traditional classification and a modified classification of stroke that excluded primary ICD-9 cerebral revascularization procedures codes (38.12, 00.61, and 00.63). The traditional classification identified 7878 stroke hospitalizations, whereas the modified classification resulted in 18% fewer hospitalizations (n = 6444). The age-specific black to white rate ratios were significantly higher in the modified than in the traditional classification for stroke incidence (rate ratio, 1.50; 95% confidence interval [CI], 1.43-1.58 vs. rate ratio, 1.24; 95% CI, 1.18-1.30, respectively). In whites, the 1-year cumulative recurrence rate was significantly reduced by 46% (45-64 years) and 49% (≥ 65 years) in the modified classification, largely explained by a higher rate of cerebral revascularization procedures among whites. There were nonsignificant reductions of 14% (45-64 years) and 19% (≥ 65 years) among blacks. 
Including cerebral revascularization procedure codes overestimates hospitalization rates for ischemic stroke and significantly underestimates the racial disparity estimates in stroke incidence and recurrence.
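The modified classification described in this entry is essentially a filter over discharge records, and can be sketched directly. This is a hedged illustration: the record structure (`dx`, `primary_proc` fields) is invented for the example, and real ICD-9 data often stores codes without decimal points.

```python
# Hedged sketch of the modified stroke definition: keep discharges with
# ischemic stroke diagnosis codes (433.00-434.91, 436) but drop records whose
# primary procedure is a cerebral revascularization (38.12, 00.61, 00.63).
# Record fields are illustrative assumptions.

REVASC_PROCS = {"38.12", "00.61", "00.63"}

def is_stroke_dx(code):
    """Ischemic stroke ICD-9 diagnosis codes 433.00-434.91 and 436."""
    return code == "436" or "433.00" <= code <= "434.91"

def modified_stroke_cases(records):
    """Filter discharge records to the modified ischemic-stroke definition."""
    return [r for r in records
            if is_stroke_dx(r["dx"]) and r.get("primary_proc") not in REVASC_PROCS]

discharges = [
    {"id": 1, "dx": "434.91"},                            # kept
    {"id": 2, "dx": "433.10", "primary_proc": "38.12"},   # excluded: revascularization
    {"id": 3, "dx": "436"},                               # kept
    {"id": 4, "dx": "410.01"},                            # not a stroke code
]
print([r["id"] for r in modified_stroke_cases(discharges)])  # [1, 3]
```

The traditional definition would skip the procedure-code exclusion, which is exactly the 18% difference in hospitalization counts the study reports.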
Arrogance analysis of several typical pattern recognition classifiers
NASA Astrophysics Data System (ADS)
Jing, Chen; Xia, Shengping; Hu, Weidong
2007-04-01
Various kinds of classification methods have been developed. However, most classical methods, such as Back-Propagation (BP), Bayesian methods, the Support Vector Machine (SVM) and the Self-Organizing Map (SOM), are arrogant. So-called arrogance, for a human, means that his decisions, even mistaken ones, overstate his actual experience; accordingly, we say he is arrogant if he frequently makes arrogant decisions. Likewise, some classical pattern classifiers exhibit a similar characteristic of arrogance. Given an input feature vector, we say a classifier is arrogant in its classification if its veracity is high yet its experience is low. Typically, for a new sample that is distinguishable from the original training samples, traditional classifiers recognize it as one of the known targets. Clearly, arrogance in classification is an undesirable attribute. Conversely, a classifier is non-arrogant in its classification if there is a reasonable balance between its veracity and its experience. Inquisitiveness is, in many ways, the opposite of arrogance. In nature, inquisitiveness is an eagerness for knowledge characterized by the drive to question and to seek a deeper understanding. The human capacity to doubt present beliefs allows us to acquire new experiences and to learn from our mistakes. Within the discrete world of computers, inquisitive pattern recognition is the constructive investigation and exploitation of conflict in information. Thus, we quantify this balance and discuss new techniques that will detect arrogance in a classifier.
Sandino, Juan; Wooler, Adam; Gonzalez, Felipe
2017-09-24
The increased technological development of Unmanned Aerial Vehicles (UAVs), combined with artificial intelligence and Machine Learning (ML) approaches, has opened the possibility of remote sensing of extensive areas of arid lands. In this paper, a novel approach to the detection of termite mounds using a UAV, hyperspectral imagery, ML and digital image processing is presented. A new pipeline process is proposed to detect termite mounds automatically and, consequently, to reduce detection times. For the classification stage, the outcomes of several ML classification algorithms were studied, and support vector machines were selected as the best approach for image classification of pre-existing termite mounds. Various test conditions were applied to the proposed algorithm, obtaining an overall accuracy of 68%. Images with satisfactory mound detection proved that the method is "resolution-dependent". Mounds were detected regardless of their rotation and position in the aerial image. However, image distortion reduced the number of detected mounds due to the inclusion of a shape analysis method in the object detection phase, and image resolution remains determinant for accurate results. Hyperspectral imagery demonstrated better capabilities for classifying a large set of materials than traditional segmentation methods applied to RGB images alone.
Wang, Shunfang; Nie, Bing; Yue, Kun; Fei, Yu; Li, Wenjia; Xu, Dongshu
2017-12-15
Kernel discriminant analysis (KDA) is a dimension reduction and classification algorithm based on the nonlinear kernel trick, which can be used to treat high-dimensional and complex biological data before classification processes such as protein subcellular localization. Kernel parameters have a great impact on the performance of the KDA model. Specifically, for KDA with the popular Gaussian kernel, selecting the scale parameter is still a challenging problem. This paper therefore introduces the KDA method and proposes a new method for Gaussian kernel parameter selection, based on the fact that, for suitable kernel parameters, the differences between the reconstruction errors of edge normal samples and those of interior normal samples should be maximized. Experiments with various standard data sets for protein subcellular localization show that the overall accuracy of protein classification prediction with KDA is much higher than without it. Meanwhile, the kernel parameter of KDA has a great impact on efficiency, and the proposed method produces an optimal parameter, which makes the new algorithm not only perform as effectively as the traditional ones but also reduce computational time and thus improve efficiency.
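The Gaussian kernel whose scale parameter the entry above tries to select can be sketched concretely, to show why the choice matters. This is a hedged illustration: the data points and sigma values are invented, and the paper's reconstruction-error selection criterion is not reproduced.

```python
# Hedged sketch: the Gaussian kernel at the heart of KDA, evaluated over a
# toy data set for two candidate scale parameters. A sigma that is too small
# makes the kernel matrix near-identity; too large makes all entries near 1.

import math

def gaussian_kernel(x, y, sigma):
    """k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def kernel_matrix(data, sigma):
    return [[gaussian_kernel(a, b, sigma) for b in data] for a in data]

data = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
K_narrow = kernel_matrix(data, 0.5)   # small sigma: off-diagonals near 0
K_wide = kernel_matrix(data, 10.0)    # large sigma: all entries near 1

print(K_narrow[0][1], K_wide[0][1])
```

Either extreme destroys the class structure the discriminant analysis needs, which is the motivation for a data-driven selection criterion such as the one proposed above.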
A robust probabilistic collaborative representation based classification for multimodal biometrics
NASA Astrophysics Data System (ADS)
Zhang, Jing; Liu, Huanxi; Ding, Derui; Xiao, Jianli
2018-04-01
Most traditional biometric recognition systems perform recognition with a single biometric indicator. These systems suffer from noisy data, interclass variations, unacceptable error rates, forged identities, and so on. Because of these inherent problems, attempts to enhance the performance of unimodal biometric systems based on single features are of limited value. Thus, multimodal biometrics is investigated to reduce some of these defects. This paper proposes a new multimodal biometric recognition approach that fuses faces and fingerprints. For more recognizable features, the proposed method extracts block local binary pattern features for all modalities and then combines them into a single framework. For better classification, it employs a robust probabilistic collaborative representation based classifier to recognize individuals. Experimental results indicate that the proposed method improves recognition accuracy compared to unimodal biometrics.
2011-01-01
Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures for Mild Cognitive Impairment (MCI), but it presently has limited value in predicting progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning, such as Neural Networks, Support Vector Machines, and Random Forests, can improve the accuracy, sensitivity, and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptron Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID, and QUEST Classification Trees, and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis, and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve, and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and an area under the ROC curve of Me = 0.90. However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73), with a high area under the ROC curve (Me = 0.73), specificity (Me = 0.73), and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC curve (Me = 0.72), specificity (Me = 0.66), and sensitivity (Me = 0.64).
The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most the sensitivity was around or even below a median value of 0.5. Conclusions When sensitivity, specificity, and overall classification accuracy are all taken into account, Random Forests and Linear Discriminant Analysis rank first among the classifiers tested for predicting dementia from several neuropsychological tests. These methods may be used to improve the accuracy, sensitivity, and specificity of dementia predictions from neuropsychological testing. PMID:21849043
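The comparison protocol above (per-fold accuracies from 5-fold cross-validation) can be sketched in a few lines. A nearest-centroid classifier stands in for the study's ten classifiers, and the cleanly separable toy data are invented for illustration; in practice the per-fold accuracies from several classifiers would then be compared with Friedman's test.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_nearest_centroid(X, y):
    """Per-class mean vectors act as a minimal stand-in classifier."""
    centroids = {}
    for label in set(y):
        pts = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(c) / len(pts) for c in zip(*pts)]
    return centroids

def classify_nearest_centroid(centroids, x):
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab], x))

def cross_val_accuracies(X, y, k=5):
    """Train on k-1 folds, test on the held-out fold, k times."""
    folds = k_fold_indices(len(X), k)
    accs = []
    for test_idx in folds:
        train_idx = [j for f in folds if f is not test_idx for j in f]
        model = fit_nearest_centroid([X[j] for j in train_idx],
                                     [y[j] for j in train_idx])
        hits = sum(classify_nearest_centroid(model, X[j]) == y[j]
                   for j in test_idx)
        accs.append(hits / len(test_idx))
    return accs

# Toy two-class data: one cluster near the origin, one near (10, 10).
X = [[i % 5, i // 5] for i in range(20)] + \
    [[10 + i % 5, 10 + i // 5] for i in range(20)]
y = [0] * 20 + [1] * 20
accs = cross_val_accuracies(X, y)
```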
DIF Trees: Using Classification Trees to Detect Differential Item Functioning
ERIC Educational Resources Information Center
Vaughn, Brandon K.; Wang, Qiu
2010-01-01
A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…
Xia, Wenjun; Mita, Yoshio; Shibata, Tadashi
2016-05-01
Aiming at efficient data condensation and improved accuracy, this paper presents a hardware-friendly template reduction (TR) method for nearest neighbor (NN) classifiers by introducing the concept of critical boundary vectors. A hardware system is also implemented to demonstrate the feasibility of using a field-programmable gate array (FPGA) to accelerate the proposed method. Initially, k-means centers are used as substitutes for the entire template set. Then, to enhance the classification performance, critical boundary vectors are selected by a novel learning algorithm that completes within a single iteration. Moreover, to remove noisy boundary vectors that can mislead the classification, a global categorization scheme has been explored and applied to the algorithm. The global characterization automatically categorizes each classification problem and rapidly selects the boundary vectors according to the nature of the problem. Finally, only the critical boundary vectors and k-means centers are used as the new template set for classification. Experimental results on 24 data sets show that the proposed algorithm can effectively reduce the number of template vectors for classification with a high learning speed. At the same time, it improves the accuracy by an average of 2.17% compared with traditional NN classifiers and also shows greater accuracy than seven other TR methods. We have shown the feasibility of using a proof-of-concept FPGA system of 256 64-D vectors to accelerate the proposed method in hardware. At a 50-MHz clock frequency, the proposed system achieves a 3.86 times higher learning speed than a 3.4-GHz PC, while consuming only 1% of the power used by the PC.
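Only the first stage of the pipeline above, substituting per-class k-means centers for the full template set before nearest-neighbor classification, is easy to sketch without the paper's boundary-vector learning rule. The following minimal version uses made-up data and plain Lloyd iterations; the critical-boundary-vector selection and FPGA acceleration are not reproduced.

```python
import random

def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's algorithm; returns k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        for j, cl in enumerate(clusters):
            if cl:
                centers[j] = [sum(c) / len(cl) for c in zip(*cl)]
    return centers

def reduced_templates(X, y, k_per_class=2):
    """Replace each class's samples with its k-means centers."""
    templates = []
    for label in set(y):
        pts = [x for x, lab in zip(X, y) if lab == label]
        templates += [(c, label) for c in kmeans(pts, k_per_class)]
    return templates

def nn_classify(templates, x):
    """1-NN over the reduced template set."""
    center, label = min(templates,
                        key=lambda t: sum((a - b) ** 2
                                          for a, b in zip(t[0], x)))
    return label

# Invented samples: class 'a' in two small clumps, class 'b' far away.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0, 5], [1, 5],
     [10, 10], [10, 11], [11, 10], [11, 11]]
y = ['a'] * 6 + ['b'] * 4
templates = reduced_templates(X, y, k_per_class=2)
```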
NASA Astrophysics Data System (ADS)
Gong, Lina; Xu, Tao; Zhang, Wei; Li, Xuhong; Wang, Xia; Pan, Wenwen
2017-03-01
The traditional microblog recommendation algorithm suffers from low efficiency and modest effectiveness in the era of big data. To solve these issues, this paper proposes a mixed recommendation algorithm with user clustering. The paper first introduces the state of the microblog marketing industry, then elaborates the user-interest modeling process and the detailed advertisement recommendation methods, and finally compares the mixed recommendation algorithm with the traditional classification algorithm and with the mixed recommendation algorithm without user clustering. The results show that the mixed recommendation algorithm with user clustering achieves good accuracy and recall in microblog advertisement promotion.
Hierarchical semantic cognition for urban functional zones with VHR satellite images and POI data
NASA Astrophysics Data System (ADS)
Zhang, Xiuyuan; Du, Shihong; Wang, Qiao
2017-10-01
As the basic units of urban areas, functional zones are essential for city planning and management, but functional-zone maps are hardly available in most cities, as traditional urban investigations focus mainly on land-cover objects instead of functional zones. As a result, an automatic or semi-automatic method for mapping urban functional zones is highly desirable. Hierarchical semantic cognition (HSC) is presented in this study and serves as a general cognition structure for recognizing urban functional zones. Unlike traditional classification methods, HSC relies on geographic cognition and considers four semantic layers, i.e., visual features, object categories, spatial object patterns, and zone functions, as well as their hierarchical relations. Here, we used HSC to classify functional zones in Beijing with a very-high-resolution (VHR) satellite image and point-of-interest (POI) data. Experimental results indicate that this method produces more accurate results than Support Vector Machine (SVM) and Latent Dirichlet Allocation (LDA), with a higher overall accuracy of 90.8%. Additionally, the contributions of the diverse semantic layers are quantified: the object-category layer is the most important and contributes 54% to functional-zone classification, while the other semantic layers are less important but their contributions cannot be ignored. Consequently, the presented HSC is effective in classifying urban functional zones and can further support urban planning and management.
Wu, Jianning; Wu, Bin
2015-01-01
The accurate identification of gait asymmetry is very beneficial for assessing at-risk gait in clinical applications. This paper investigated a classification method based on a statistical learning algorithm to quantify gait symmetry, under the assumption that the degree of intrinsic change in the dynamical system of gait is associated with different statistical distributions of the gait variables from the left and right lower limbs; that is, discriminating small differences in similarity between the lower limbs is treated as recognizing their different probability distributions. The kinetic gait data of 60 participants were recorded with a strain gauge force platform during normal walking. The classification method, built on an advanced statistical learning algorithm (a support vector machine for binary classification), was adopted to quantitatively evaluate gait symmetry. The experimental results showed that the proposed method could capture more of the intrinsic dynamic information hidden in the gait variables and recognize right-left gait patterns with superior generalization performance. Moreover, the proposed technique could identify small but significant differences between the lower limbs that the traditional symmetry index method for gait missed. The proposed algorithm could become an effective tool for the early identification of gait asymmetry in the elderly in clinical diagnosis. PMID:25705672
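The study's core idea, that left-right asymmetry shows up as statistical separability of the two limbs' gait-variable distributions, can be illustrated without the authors' SVM pipeline: any classifier's holdout accuracy on a left-vs-right labeling task serves as a crude symmetry score, near 0.5 when the distributions overlap (symmetric gait) and near 1.0 when they differ (asymmetric). The 1-NN classifier and one-dimensional Gaussian toy data below are illustrative assumptions, not the paper's method or data.

```python
import random

def one_nn(train, x):
    """Label of the nearest training sample (squared Euclidean distance)."""
    pt, label = min(train,
                    key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], x)))
    return label

def separability(left, right, seed=0):
    """Holdout accuracy of a left-vs-right classifier.
    ~0.5 -> distributions overlap; near 1.0 -> clearly asymmetric."""
    rng = random.Random(seed)
    data = [(p, 'L') for p in left] + [(p, 'R') for p in right]
    rng.shuffle(data)
    split = len(data) // 2
    train, test = data[:split], data[split:]
    return sum(one_nn(train, p) == lab for p, lab in test) / len(test)

# Synthetic stand-ins for a kinetic gait variable from each limb.
rng = random.Random(1)
sym_left = [[rng.gauss(0, 1)] for _ in range(100)]
sym_right = [[rng.gauss(0, 1)] for _ in range(100)]
asym_right = [[rng.gauss(5, 1)] for _ in range(100)]
s_sym = separability(sym_left, sym_right)
s_asym = separability(sym_left, asym_right)
```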
Otitis Media Diagnosis for Developing Countries Using Tympanic Membrane Image-Analysis
Myburgh, Hermanus C.; van Zijl, Willemien H.; Swanepoel, DeWet; Hellström, Sten; Laurent, Claude
2016-01-01
Background Otitis media is one of the most common childhood diseases worldwide, but because of the lack of doctors and health personnel in developing countries it is often misdiagnosed or not diagnosed at all. This may lead to serious and life-threatening complications. There is thus a need for an automated, computer-based image-analysis system that could assist in making accurate otitis media diagnoses anywhere. Methods A method for automated diagnosis of otitis media is proposed. The method uses image-processing techniques to classify otitis media. The system is trained using high-quality pre-assessed images of tympanic membranes, captured by digital video-otoscopes, and classifies undiagnosed images into five otitis media categories based on predefined signs. Several verification tests analyzed the classification capability of the method. Findings An accuracy of 80.6% was achieved for images taken with commercial video-otoscopes, while an accuracy of 78.7% was achieved for images captured on-site with a low-cost custom-made video-otoscope. Interpretation The high accuracy of the proposed otitis media classification system compares well with the classification accuracy of general practitioners and pediatricians (~64% to 80%) using traditional otoscopes, and it therefore holds promise for automated diagnosis of otitis media in medically underserved populations. PMID:27077122
NASA Astrophysics Data System (ADS)
Jiang, Li; Shi, Tielin; Xuan, Jianping
2012-05-01
Generally, the vibration signals of faulty bearings are non-stationary and highly nonlinear under complicated operating conditions. It is therefore a major challenge to extract optimal features that improve classification while simultaneously reducing the feature dimension. Kernel Marginal Fisher Analysis (KMFA) is a novel supervised manifold learning algorithm for feature extraction and dimensionality reduction. In order to avoid the small-sample-size problem in KMFA, we propose regularized KMFA (RKMFA). A simple and efficient intelligent fault diagnosis method based on RKMFA is put forward and applied to the fault recognition of rolling bearings. So as to directly excavate nonlinear features from the original high-dimensional vibration signals, RKMFA constructs two graphs describing intra-class compactness and inter-class separability by combining a traditional manifold learning algorithm with the Fisher criterion. The optimal low-dimensional features are thereby obtained for better classification and finally fed into the simplest K-nearest neighbor (KNN) classifier to recognize the different fault categories of bearings. The experimental results demonstrate that the proposed approach improves fault classification performance and outperforms the other conventional approaches.
Shedding subspecies: The influence of genetics on reptile subspecies taxonomy.
Torstrom, Shannon M; Pangle, Kevin L; Swanson, Bradley J
2014-07-01
The subspecies concept influences multiple aspects of biology and management. The 'molecular revolution' altered traditional, morphology-based methods of subspecies classification by applying genetic analyses, resulting in alternative or contradictory classifications. We evaluated the recent reptile literature for bias in recommendations regarding subspecies status when genetic data were included. Reviewing the characteristics of each study, the genetic variables, the genetic distance values, and the species concepts used, we found that subspecies were more likely to be elevated to species when genetic analysis was used. However, there was no predictive relationship between the variables used and the taxonomic recommendation. There was a significant difference between the median genetic distance values when researchers elevated versus collapsed a subspecies. Our review found nine different species concepts in use when taxonomic change was recommended, and studies incorporating multiple species concepts were more likely to recommend a taxonomic change. Since genetic techniques significantly alter reptile taxonomy, there is a need to establish a standard method for determining the species-subspecies boundary in order to use the subspecies classification effectively for research and conservation purposes. Copyright © 2014 Elsevier Inc. All rights reserved.
Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J
2009-04-01
Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67x3 (67 clusters of three observations) and a 33x6 (33 clusters of six observations) sampling scheme, for assessing the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67x3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size required for data collection. The presence of intercluster correlation can dramatically impact the classification error associated with LQAS analysis.
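A Monte Carlo version of the classification-error question studied above is easy to sketch: draw repeated samples at a known true prevalence, apply an LQAS decision rule, and count misclassifications. The decision threshold (25 positives out of 67x3 = 201 children) and the prevalence values below are hypothetical, and intracluster correlation is ignored for simplicity.

```python
import random

def lqas_classify(sample, decision_threshold):
    """Flag 'high prevalence' when positives reach the decision threshold."""
    return sum(sample) >= decision_threshold

def simulate_error(prevalence, n_clusters=67, cluster_size=3,
                   decision_threshold=25, true_high=False,
                   n_sims=2000, seed=0):
    """Fraction of simulated surveys that misclassify the area."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_sims):
        # Independent children (no intracluster correlation in this sketch).
        sample = [rng.random() < prevalence
                  for _ in range(n_clusters * cluster_size)]
        if lqas_classify(sample, decision_threshold) != true_high:
            errors += 1
    return errors / n_sims

# At 5% true prevalence the area should rarely be flagged as high;
# at 20% it should rarely be missed.
false_high_rate = simulate_error(0.05, true_high=False)
missed_high_rate = simulate_error(0.20, true_high=True)
```

Adding intracluster correlation (e.g. a shared cluster-level risk term) is the natural next step to reproduce the effect the abstract reports.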
a Novel Deep Convolutional Neural Network for Spectral-Spatial Classification of Hyperspectral Data
NASA Astrophysics Data System (ADS)
Li, N.; Wang, C.; Zhao, H.; Gong, X.; Wang, D.
2018-04-01
Spatial and spectral information are obtained simultaneously in hyperspectral remote sensing, and jointly extracting them is one of the most important approaches to hyperspectral image classification. In this paper, a novel deep convolutional neural network (CNN) is proposed that correctly extracts the spectral-spatial information of hyperspectral images. The proposed model not only learns sufficient knowledge from a limited number of samples, but also has powerful generalization ability. The proposed framework, based on three-dimensional convolution, can extract the spectral-spatial features of labeled samples effectively. Although CNNs are robust to distortion, they cannot extract features at different scales through the traditional pooling layer, which has only one size of pooling window. Hence, spatial pyramid pooling (SPP) is introduced into the three-dimensional local convolutional filters for hyperspectral classification. Experimental results with a widely used hyperspectral remote sensing dataset show that the proposed model provides competitive performance.
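The role of spatial pyramid pooling, turning a feature map of any size into a fixed-length vector by max-pooling over successively finer grids, can be shown on a plain 2-D array. The grid levels (1, 2, 4) used below are one common choice, not necessarily the paper's.

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2-D feature map over coarser-to-finer grids, so any
    input size yields a fixed-length vector (1 + 4 + 16 = 21 values here)."""
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                r0, r1 = i * h // n, (i + 1) * h // n
                c0, c1 = j * w // n, (j + 1) * w // n
                # Guarantee at least one element per cell for tiny maps.
                cell = [feature_map[r][c]
                        for r in range(r0, max(r1, r0 + 1))
                        for c in range(c0, max(c1, c0 + 1))]
                pooled.append(max(cell))
    return pooled

# An 8x8 and a 6x6 map both pool to the same 21-dimensional vector.
fm = [[(i * 8 + j) % 7 for j in range(8)] for i in range(8)]
pooled = spatial_pyramid_pool(fm)
```

This size invariance is what lets a fixed classifier head follow convolutional layers fed with variably sized inputs.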
Cognitive context detection in UAS operators using eye-gaze patterns on computer screens
NASA Astrophysics Data System (ADS)
Mannaru, Pujitha; Balasingam, Balakumar; Pattipati, Krishna; Sibley, Ciara; Coyne, Joseph
2016-05-01
In this paper, we demonstrate the use of eye-gaze metrics of unmanned aerial systems (UAS) operators as effective indices of their cognitive workload. Our analyses are based on an experiment in which twenty participants performed pre-scripted UAS missions of three different difficulty levels by interacting with two custom-designed graphical user interfaces (GUIs) displayed side by side. First, we compute several eye-gaze metrics, both traditional eye movement metrics and newly proposed ones, and analyze their effectiveness as cognitive classifiers. Most of the eye-gaze metrics are computed by dividing the computer screen into "cells". Then, we perform several analyses in order to select metrics for effective cognitive context classification in our specific application; the objectives of these analyses are to (i) identify appropriate ways to divide the screen into cells; (ii) select appropriate metrics for training and classification of cognitive features; and (iii) identify a suitable classification method.
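Dividing the screen into cells and summarizing the fixation distribution over them can be sketched directly. The grid size, screen dimensions, and the entropy summary below are illustrative choices, not necessarily the metrics used in the paper.

```python
import math

def cell_of(x, y, screen_w, screen_h, n_cols, n_rows):
    """Map a gaze point (pixels) to a (row, col) cell of the screen grid."""
    col = min(int(x * n_cols / screen_w), n_cols - 1)
    row = min(int(y * n_rows / screen_h), n_rows - 1)
    return row, col

def gaze_entropy(points, screen_w=1920, screen_h=1080, n_cols=4, n_rows=3):
    """Shannon entropy of the fixation distribution over cells; higher
    entropy means gaze spread over more cells (one candidate workload index)."""
    counts = {}
    for x, y in points:
        c = cell_of(x, y, screen_w, screen_h, n_cols, n_rows)
        counts[c] = counts.get(c, 0) + 1
    total = len(points)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```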
High-order distance-based multiview stochastic learning in image classification.
Yu, Jun; Rui, Yong; Tang, Yuan Yan; Tao, Dacheng
2014-12-01
How do we find all images in a larger set of images that have a specific content? Or estimate the position of a specific object relative to the camera? Image classification methods, like the support vector machine (supervised) and the transductive support vector machine (semi-supervised), are invaluable tools for content-based image retrieval, pose estimation, and optical character recognition. However, these methods can only handle images represented by a single feature. In many cases, different features (or multiview data) can be obtained, and how to utilize them efficiently is a challenge. The traditional concatenation scheme, which links the features of different views into one long vector, is inappropriate because each view has its own statistical properties and physical interpretation. In this paper, we propose a high-order distance-based multiview stochastic learning (HD-MSL) method for image classification. HD-MSL effectively combines varied features into a unified representation and integrates the labeling information in a probabilistic framework. In comparison with existing strategies, our approach adopts the high-order distance obtained from a hypergraph to replace the pairwise distance when estimating the probability matrix of the data distribution. In addition, the proposed approach can automatically learn a combination coefficient for each view, which plays an important role in utilizing the complementary information of multiview data. An alternating optimization is designed to solve the objective functions of HD-MSL and to obtain the view combination coefficients and classification scores simultaneously. Experiments on two real-world datasets demonstrate the effectiveness of HD-MSL in image classification.
A Complex Network Approach to Stylometry
Amancio, Diego Raphael
2015-01-01
Statistical methods have been widely employed to study the fundamental properties of language. In recent years, methods from complex and dynamical systems have proved useful for creating several language models. Despite the large number of studies devoted to representing texts with physical models, only a few have shown how the properties of the underlying physical systems can be employed to improve the performance of natural language processing tasks. In this paper, I address this problem by devising complex network methods that are able to improve the performance of current statistical methods. Using a fuzzy classification strategy, I show that the topological properties extracted from texts complement the traditional textual description. In several cases, the performance obtained with hybrid approaches outperformed the results obtained when only traditional or networked methods were used. Because the proposed model is generic, the framework devised here could be straightforwardly used to study similar textual applications where the topology plays a pivotal role in the description of the interacting agents. PMID:26313921
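A minimal example of the networked representation of a text: build a word co-occurrence (adjacency) network and read off a topological property such as node degree. Real stylometry models use far richer topological features; this sketch shows only the construction step, on a made-up sentence.

```python
def cooccurrence_network(text, window=1):
    """Link words that appear within `window` positions of each other,
    then return each word's degree in the resulting undirected network."""
    words = text.lower().split()
    edges = set()
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + 1 + window, len(words))):
            if w != words[j]:
                edges.add(tuple(sorted((w, words[j]))))
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree

degree = cooccurrence_network("the cat sat on the mat")
```

Degree distributions, clustering coefficients, and similar statistics over such networks are the kind of topological features that can complement word-frequency descriptions.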
Bushon, R.N.; Brady, A.M.; Likirdopulos, C.A.; Cireddu, J.V.
2009-01-01
Aims: The aim of this study was to examine a rapid method for detecting Escherichia coli and enterococci in recreational water. Methods and Results: Water samples were assayed for E. coli and enterococci by traditional and immunomagnetic separation/adenosine triphosphate (IMS/ATP) methods. Three sample treatments were evaluated for the IMS/ATP method: double filtration, single filtration, and direct analysis. Pearson's correlation analysis showed strong, significant, linear relations between IMS/ATP and traditional methods for all sample treatments; strongest linear correlations were with the direct analysis (r = 0.62 and 0.77 for E. coli and enterococci, respectively). Additionally, simple linear regression was used to estimate bacteria concentrations as a function of IMS/ATP results. The correct classification of water-quality criteria was 67% for E. coli and 80% for enterococci. Conclusions: The IMS/ATP method is a viable alternative to traditional methods for faecal-indicator bacteria. Significance and Impact of the Study: The IMS/ATP method addresses critical public health needs for the rapid detection of faecal-indicator contamination and has potential for satisfying US legislative mandates requiring methods to detect bathing water contamination in 2 h or less. Moreover, IMS/ATP equipment is considerably less costly and more portable than that for molecular methods, making the method suitable for field applications. © 2009 The Authors.
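Estimating bacteria concentrations as a function of IMS/ATP results, as described above, is an ordinary least-squares fit; a self-contained version with Pearson's r, on invented numbers, looks like this.

```python
def linear_fit(x, y):
    """Ordinary least squares fit y ~ a + b*x (e.g. estimating traditional
    plate counts from rapid IMS/ATP readings)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return a, b

def pearson_r(x, y):
    """Pearson's linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = (sum((xi - mx) ** 2 for xi in x) *
           sum((yi - my) ** 2 for yi in y)) ** 0.5
    return num / den
```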
NASA Astrophysics Data System (ADS)
Aufaristama, Muhammad; Hölbling, Daniel; Höskuldsson, Ármann; Jónsdóttir, Ingibjörg
2017-04-01
The Krafla volcanic system is part of the Icelandic North Volcanic Zone (NVZ). During the Holocene, two eruptive events occurred in Krafla, in 1724-1729 and 1975-1984. The last eruptive episode (1975-1984), known as the "Krafla Fires", comprised nine volcanic eruption episodes. The total area covered by the lavas from this eruptive episode is 36 km2 and the volume is about 0.25-0.3 km3. Lava morphology refers to the characteristics of the surface of a lava flow after solidification. The typical morphology of a lava can be used as a primary basis for classifying lava flows when rheological properties cannot be directly observed during emplacement, and also for better understanding the behavior of lava flow models. Although mapping lava flows in the field is relatively accurate, such traditional methods are time-consuming, especially when the lava covers large areas, as is the case in Krafla. Semi-automatic mapping methods that make use of satellite remote sensing data allow for efficient and fast mapping of lava morphology. In this study, two semi-automatic methods for lava morphology classification are presented and compared using Landsat 8 (30 m spatial resolution) and SPOT-5 (10 m spatial resolution) satellite images. To assess the classification accuracy, the results from semi-automatic mapping were compared to the respective results from visual interpretation. On the one hand, the Spectral Angle Mapper (SAM) classification method was used. With this method, an image is classified according to the spectral similarity between the image reflectance spectra and reference reflectance spectra. SAM successfully produced detailed lava surface morphology maps; however, the pixel-based approach partly leads to a salt-and-pepper effect. On the other hand, we applied the Random Forest (RF) classification method within an object-based image analysis (OBIA) framework.
This statistical classifier uses randomly selected subsets of training samples to produce multiple decision trees. For the final classification of pixels or, in the present case, image objects, the average of the class-assignment probabilities predicted by the different decision trees is used. While the resulting OBIA classification of lava morphology types shows high agreement with the reference data, the approach is sensitive to the segmentation-derived image objects that constitute the base units for classification. Both semi-automatic methods produce reasonable results in the Krafla lava field, even if the identification of the different pahoehoe and aa lava types proved difficult. The use of satellite remote sensing data shows high potential for fast and efficient classification of lava morphology, particularly over large and inaccessible areas.
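The SAM rule itself is compact: classify each pixel by the smallest spectral angle to a set of reference spectra. Because the angle ignores vector length, it is insensitive to overall brightness differences. The two-band reference spectra and class names below are invented for illustration.

```python
import math

def spectral_angle(a, b):
    """Angle (radians) between a pixel spectrum and a reference spectrum."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    # Clamp against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, dot / (na * nb))))

def sam_classify(pixel, references):
    """references: dict class_name -> reference spectrum; pick the class
    whose reference makes the smallest angle with the pixel spectrum."""
    return min(references, key=lambda c: spectral_angle(pixel, references[c]))

# Hypothetical two-band reference spectra for two lava surface types.
references = {'aa': [1.0, 0.1], 'pahoehoe': [0.1, 1.0]}
```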
Learning From Short Text Streams With Topic Drifts.
Li, Peipei; He, Lu; Wang, Haiyan; Hu, Xuegang; Zhang, Yuhong; Li, Lei; Wu, Xindong
2017-09-18
Short text streams such as search snippets and microblogs have become popular on the Web with the emergence of social media. Unlike traditional normal text streams, these data present the characteristics of short length, weak signal, high volume, high velocity, topic drift, etc. Short text stream classification is hence a very challenging and significant task. However, this challenge has received little attention from the research community. Therefore, a new feature extension approach is proposed for short text stream classification with the help of a large-scale semantic network obtained from a Web corpus. It is built on an incremental ensemble classification model for efficiency. First, more semantic contexts based on the senses of terms in short texts are introduced to make up for the data sparsity using the open semantic network, in which all terms are disambiguated by their semantics to reduce the impact of noise. Second, a concept cluster-based topic drift detection method is proposed to effectively track hidden topic drifts. Finally, extensive studies demonstrate that, compared to several well-known concept drift detection methods for data streams, our approach can detect topic drifts effectively, and it can handle short text streams effectively while maintaining efficiency, as compared to several state-of-the-art short text classification approaches.
Segmentation and object-oriented processing of single-season and multi-season Landsat-7 ETM+ data was utilized for the classification of wetlands in a 1560 km2 study area of north central Florida. This segmentation and object-oriented classification outperformed the traditional ...
Jaiswara, Ranjana; Nandi, Diptarup; Balakrishnan, Rohini
2013-01-01
Traditional taxonomy based on morphology has often failed in accurate species identification owing to the occurrence of cryptic species, which are reproductively isolated but morphologically identical. Molecular data have thus been used to complement morphology in species identification. The sexual advertisement calls of several groups of acoustically communicating animals are species-specific and can thus complement molecular data as non-invasive tools for identification. Several statistical tools and automated identifier algorithms have been used to investigate the efficiency of acoustic signals in species identification. Despite a plethora of such methods, there is a general lack of knowledge regarding their appropriate usage in specific taxa. In this study, we investigated the performance of two commonly used statistical methods, discriminant function analysis (DFA) and cluster analysis, in identification and classification based on the acoustic signals of field cricket species belonging to the subfamily Gryllinae. Using a comparative approach, we evaluated, for both methods, the optimal number of species and calling-song characteristics that lead to the most accurate classification and identification. The accuracy of classification using DFA was high and was not affected by the number of taxa used. However, a constraint in using discriminant function analysis is the need for a priori classification of songs. The accuracy of classification using cluster analysis, which does not require a priori knowledge, was highest for 6-7 taxa and decreased significantly when more than ten taxa were analysed together. We also investigated the efficacy of two novel derived acoustic features in improving the accuracy of identification. Our results show that DFA is a reliable statistical tool for species identification using acoustic signals.
Our results also show that cluster analysis of acoustic signals in crickets works effectively for species classification and identification. PMID:24086666
Ali, Safdar; Majid, Abdul; Khan, Asifullah
2014-04-01
Development of an accurate and reliable intelligent decision-making method for the construction of cancer diagnosis systems is one of the fastest growing research areas in the health sciences. Such a decision-making system can provide adequate information for cancer diagnosis and drug discovery. Descriptors derived from the physicochemical properties of protein sequences are very useful for classifying cancerous proteins. Recently, several interesting research studies have been reported on breast cancer classification. To this end, we propose exploiting the physicochemical properties of amino acids in protein primary sequences, such as hydrophobicity (Hd) and hydrophilicity (Hb), for breast cancer classification. The Hd and Hb properties of amino acids are reported in the recent literature to be quite effective in characterizing the constituent amino acids and are used to study protein foldings, interactions, structures, and sequence-order effects. In particular, using these physicochemical properties, we observed that the proline, serine, tyrosine, cysteine, arginine, and asparagine amino acids offer high discrimination between cancerous and healthy proteins. In addition, unlike traditional ensemble classification approaches, the proposed 'IDM-PhyChm-Ens' method was developed by combining the decision spaces of a specific classifier trained on different feature spaces. The feature spaces used were amino acid composition, split amino acid composition, and pseudo amino acid composition. Consequently, we have exploited different feature spaces using the Hd and Hb properties of amino acids to develop an accurate method for the classification of cancerous protein sequences. We developed ensemble classifiers using diverse learning algorithms, such as random forest (RF), support vector machines (SVM), and K-nearest neighbor (KNN), trained on different feature spaces. We observed that, for cancer classification, ensemble-RF performed better than ensemble-SVM and ensemble-KNN.
Our analysis demonstrates that ensemble-RF, ensemble-SVM and ensemble-KNN are more effective than their individual counterparts. The proposed 'IDM-PhyChm-Ens' method has shown improved performance compared to existing techniques.
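The ensemble idea described above — training one classifier per feature space and fusing their decisions — can be sketched minimally as follows. This is an illustrative toy using nearest-centroid base learners and majority voting, not the authors' RF/SVM/KNN implementation; all function names are hypothetical:

```python
from collections import Counter

def fit_nearest_centroid(X, y):
    """Train a nearest-centroid classifier: one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict_nearest_centroid(centroids, x):
    """Assign x to the class whose centroid is closest (squared distance)."""
    dist2 = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda lbl: dist2(centroids[lbl], x))

def ensemble_predict(models, views):
    """Fuse per-feature-space decisions by majority vote, one model per view."""
    votes = [predict_nearest_centroid(m, v) for m, v in zip(models, views)]
    return Counter(votes).most_common(1)[0][0]
```

In this sketch each "view" of a sample plays the role of one feature space (e.g., amino acid composition vs. pseudo amino acid composition); the same base learner is fit separately on each view and the final label is the most common vote.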
Autonomous target recognition using remotely sensed surface vibration measurements
NASA Astrophysics Data System (ADS)
Geurts, James; Ruck, Dennis W.; Rogers, Steven K.; Oxley, Mark E.; Barr, Dallas N.
1993-09-01
The remotely measured surface vibration signatures of tactical military ground vehicles are investigated for use in target classification and identification friend or foe (IFF) systems. The use of remote surface vibration sensing by a laser radar reduces the effects of partial occlusion, concealment, and camouflage experienced by automatic target recognition systems using traditional imagery in a tactical battlefield environment. Linear Predictive Coding (LPC) efficiently represents the vibration signatures and nearest neighbor classifiers exploit the LPC feature set using a variety of distortion metrics. Nearest neighbor classifiers achieve an 88 percent classification rate in an eight class problem, representing a classification performance increase of thirty percent from previous efforts. A novel confidence figure of merit is implemented to attain a 100 percent classification rate with less than 60 percent rejection. The high classification rates are achieved on a target set which would pose significant problems to traditional image-based recognition systems. The targets are presented to the sensor in a variety of aspects and engine speeds at a range of 1 kilometer. The classification rates achieved demonstrate the benefits of using remote vibration measurement in a ground IFF system. The signature modeling and classification system can also be used to identify rotary and fixed-wing targets.
Meta-analysis of teaching methods: a 50k+ student study
NASA Astrophysics Data System (ADS)
Sayre, Eleanor; Archibeque, Benjamin; Gomez, K. Alison; Heckendorf, Tyrel; Madsen, Adrian M.; McKagan, Sarah B.; Schenk, Edward W.; Shepard, Chase; Sorell, Lane; von Korff, Joshua
2015-04-01
The Force Concept Inventory (FCI) and the Force and Motion Conceptual Evaluation (FMCE) are the two most widely-used conceptual tests in introductory mechanics. Because they are so popular, they provide an excellent avenue to compare different teaching methods at different kinds of institutions with varying student populations. We conducted a secondary analysis of all peer-reviewed papers which publish data from US and Canadian colleges and universities. Our data include over fifty thousand students drawn from approximately 100 papers; papers were drawn from Scopus, ERIC, ComPADRE, and journal websites. We augment published data about teaching methods with institutional data such as Carnegie Classification and average SAT scores. We statistically determine the effectiveness of different teaching methods as measured by FCI and FMCE gains and mediated by institutional and course factors. As in the landmark 1998 Hake study, we find that classes using interactive engagement (IE) have significantly larger learning gains than classes using traditional instruction. However, we find that a broader distribution of normalized gains occurs in both traditional and IE classes, and that the differences between IE and traditional instruction have changed over time and are more context dependent.
NASA Astrophysics Data System (ADS)
Sosa, Germán. D.; Cruz-Roa, Angel; González, Fabio A.
2015-01-01
This work addresses the problem of lung sound classification, in particular, the problem of distinguishing between wheeze and normal sounds. Wheezing sound detection is an important step to associate lung sounds with an abnormal state of the respiratory system, usually associated with tuberculosis or other chronic obstructive pulmonary diseases (COPD). The paper presents an approach for automatic lung sound classification, which uses different state-of-the-art sound features in combination with a C-weighted support vector machine (SVM) classifier that works better for unbalanced data. The feature extraction methods used here are commonly applied in speech recognition and related problems because they capture the most informative spectral content from the original signals. The evaluated methods were: Fourier transform (FT), wavelet decomposition using a Wavelet Packet Transform bank of filters (WPT) and Mel Frequency Cepstral Coefficients (MFCC). For comparison, we evaluated and contrasted the proposed approach against previous works using different combinations of features and/or classifiers. The different methods were evaluated on a set of lung sounds including normal and wheezing sounds. A leave-two-out per-case cross-validation approach was used, which, in each fold, chooses as validation set a couple of cases, one including normal sounds and the other including wheezing sounds. Experimental results were reported in terms of traditional classification performance measures: sensitivity, specificity and balanced accuracy. Our best results, using the suggested approach with C-weighted SVM and MFCC, achieve 82.1% balanced accuracy, the best result reported for this problem to date. These results suggest that supervised classifiers based on kernel methods are able to learn better models for this challenging classification problem even using the same feature extraction methods.
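The evaluation metrics named above (sensitivity, specificity, balanced accuracy) are standard and easy to compute; a small self-contained sketch, with hypothetical names, is:

```python
def balanced_accuracy(y_true, y_pred, positive=1):
    """Balanced accuracy = mean of sensitivity (recall on the positive
    class, e.g. wheeze) and specificity (recall on the negative class)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (sensitivity + specificity)
```

Balanced accuracy is the appropriate summary here precisely because the wheeze/normal data are unbalanced: plain accuracy would reward a classifier that always predicts the majority class.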
A Practical and Automated Approach to Large Area Forest Disturbance Mapping with Remote Sensing
Ozdogan, Mutlu
2014-01-01
In this paper, I describe a set of procedures that automate forest disturbance mapping using a pair of Landsat images. The approach is built on the traditional pair-wise change detection method, but is designed to extract training data without user interaction and uses a robust classification algorithm capable of handling incorrectly labeled training data. The steps in this procedure include: i) creating masks for water, non-forested areas, clouds, and cloud shadows; ii) identifying training pixels whose value is above or below a threshold defined by the number of standard deviations from the mean value of the histograms generated from local windows in the short-wave infrared (SWIR) difference image; iii) filtering the original training data through a number of classification algorithms using an n-fold cross validation to eliminate mislabeled training samples; and finally, iv) mapping forest disturbance using a supervised classification algorithm. When applied to 17 Landsat footprints across the U.S. at five-year intervals between 1985 and 2010, the proposed approach produced forest disturbance maps with 80 to 95% overall accuracy, comparable to those obtained from traditional approaches to forest change detection. The primary sources of mis-classification errors included inaccurate identification of forests (errors of commission), issues related to the land/water mask, and clouds and cloud shadows missed during image screening. The approach requires images from the peak growing season, at least for the deciduous forest sites, and cannot readily distinguish forest harvest from natural disturbances or other types of land cover change. The accuracy of detecting forest disturbance diminishes with the number of years between the images that make up the image pair. 
Nevertheless, the relatively high accuracies, little or no user input needed for processing, speed of map production, and simplicity of the approach make the new method especially practical for forest cover change analysis over very large regions. PMID:24717283
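Step (ii) of the procedure above — selecting training pixels whose value deviates from the histogram mean by more than a set number of standard deviations — can be sketched as follows. This is an illustrative 1-D simplification (the actual method works on histograms from local windows of the SWIR difference image), and the names are hypothetical:

```python
def select_disturbed_pixels(diff, k=2.0):
    """Flag pixels of a (flattened) SWIR difference image whose value lies
    more than k standard deviations above the mean as candidate
    'disturbed forest' training pixels."""
    n = len(diff)
    mean = sum(diff) / n
    sd = (sum((v - mean) ** 2 for v in diff) / n) ** 0.5
    return [i for i, v in enumerate(diff) if v > mean + k * sd]
```

The symmetric rule (values below `mean - k * sd`) would analogously pick candidate undisturbed training pixels; mislabeled candidates are then filtered out by the cross-validated classification step (iii).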
NASA Astrophysics Data System (ADS)
Wan, Xiaoqing; Zhao, Chunhui; Gao, Bing
2017-11-01
The integration of an edge-preserving filtering technique in the classification of a hyperspectral image (HSI) has been proven effective in enhancing classification performance. This paper proposes an ensemble strategy for HSI classification using an edge-preserving filter along with a deep learning model and edge detection. First, an adaptive guided filter is applied to the original HSI to reduce the noise in degraded images and to extract powerful spectral-spatial features. Second, the extracted features are fed as input to a stacked sparse autoencoder to adaptively exploit more invariant and deep feature representations; then, a random forest classifier is applied to fine-tune the entire pretrained network and determine the classification output. Third, a Prewitt compass operator is further performed on the HSI to extract the edges of the first principal component after dimension reduction. Moreover, the region-growing rule is applied to the resulting edge logical image to determine the local region for each unlabeled pixel. Finally, the categories of the corresponding neighborhood samples are determined in the original classification map; then, the majority voting mechanism is implemented to generate the final output. Extensive experiments proved that the proposed method achieves competitive performance compared with several traditional approaches.
Xiang, Kun; Li, Yinglei; Ford, William; Land, Walker; Schaffer, J David; Congdon, Robert; Zhang, Jing; Sadik, Omowunmi
2016-02-21
We hereby report the design and implementation of an Autonomous Microbial Cell Culture and Classification (AMC(3)) system for rapid detection of food pathogens. Traditional food testing methods require multistep procedures and long incubation periods, and are thus prone to human error. AMC(3) introduces a "one click approach" to the detection and classification of pathogenic bacteria. Once the cultured materials are prepared, all operations are automatic. AMC(3) is an integrated sensor array platform in a microbial fuel cell system composed of a multi-potentiostat, an automated data collection system (Python program, Yocto Maxi-coupler electromechanical relay module) and a powerful classification program. The classification scheme consists of a Probabilistic Neural Network (PNN), Support Vector Machines (SVM) and General Regression Neural Network (GRNN) oracle-based system. Differential Pulse Voltammetry (DPV) is performed on standard samples or unknown samples. Then, using preset feature extractions and quality control, accepted data are analyzed by the intelligent classification system. In a typical use, thirty-two extracted features were analyzed to correctly classify the following pathogens: Escherichia coli ATCC#25922, Escherichia coli ATCC#11775, and Staphylococcus epidermidis ATCC#12228. An accuracy of 85.4% was recorded for unknown samples, within a shorter time than the industry standard of 24 hours.
Image search engine with selective filtering and feature-element-based classification
NASA Astrophysics Data System (ADS)
Li, Qing; Zhang, Yujin; Dai, Shengyang
2001-12-01
With the growth of the Internet and storage capability in recent years, images have become a widespread information format on the World Wide Web. However, it has become increasingly hard to search for images of interest, and effective image search engines for the WWW need to be developed. We propose in this paper a selective filtering process and a novel approach for image classification based on feature elements in the image search engine we developed for the WWW. First, a selective filtering process is embedded in a general web crawler to filter out meaningless GIF-format images. Two parameters that can be obtained easily are used in the filtering process. Our classification approach first extracts feature elements from images instead of feature vectors. Compared with feature vectors, feature elements can better capture visual meanings of the image according to the subjective perception of human beings. Unlike traditional image classification methods, our feature-element-based approach does not calculate distances between vectors in the feature space, but instead seeks associations between feature elements and the class attribute of the image. Experiments are presented to show the efficiency of the proposed approach.
SENTINEL-1 and SENTINEL-2 Data Fusion for Wetlands Mapping: Balikdami, Turkey
NASA Astrophysics Data System (ADS)
Kaplan, G.; Avdan, U.
2018-04-01
Wetlands provide a number of environmental and socio-economic benefits such as their ability to store floodwaters and improve water quality, providing habitats for wildlife and supporting biodiversity, as well as aesthetic values. Remote sensing technology has proven to be a useful and frequently applied tool in monitoring and mapping wetlands. Combining optical and microwave satellite data can help with mapping and monitoring the biophysical characteristics of wetlands and wetlands' vegetation. Also, fusing radar and optical remote sensing data can increase the wetland classification accuracy. In this paper, data from the fine spatial resolution optical satellite, Sentinel-2, and the Synthetic Aperture Radar satellite, Sentinel-1, were fused for mapping wetlands. Both Sentinel-1 and Sentinel-2 images were pre-processed. After the pre-processing, vegetation indices were calculated using the Sentinel-2 bands and the results were included in the fusion data set. For the classification of the fused data, three different classification approaches were used and compared. The results showed significant improvement in the wetland classification using both multispectral and microwave data. Also, the presence of the red edge bands and the vegetation indices used in the data set showed significant improvement in the discrimination between wetlands and other vegetated areas. The statistical results of the fusion of the optical and radar data showed high wetland mapping accuracy, with an overall classification accuracy of approximately 90% in the object-based classification method. For future research, we recommend multi-temporal image use, terrain data collection, as well as a comparison of the used method with traditional image fusion techniques.
Ferreira, Emmanoela N; da S Mourão, José; Rocha, Pollyana D; Nascimento, Douglas M; da S Q Bezerra, Dandara Monalisa Mariz
2009-01-01
Background Folk taxonomy is a sub-area of ethnobiology that studies how traditional communities classify, identify and name their natural resources. The present work was undertaken in two traditional communities (Barra de Mamanguape and Tramataia). The objective of this study was to investigate the ethnobiological classification of the local crabs and swimming crabs used by the crustacean gatherers of the Mamanguape River Estuary (MRE), Paraíba State, Brazil. Methods The methodology used here involved a combination of qualitative methods (open interviews, semi-structured interviews, direct observations, guided tours, surveys, and interviews in synchronic and diachronic situations that cross-checked and repeated identifications) and quantitative methods (Venn diagram). A total of 32 men and women were interviewed in the two communities. Specimens of the local crustaceans were collected and identified by the harvesters themselves, subsequently fixed in formalin, conserved in 70% ethyl alcohol, identified using appropriate specialized literature, and then deposited in the laboratory of the Zoology Department of the State University of Paraíba. Results The crustacean gatherers we studied were observed to group crustaceans according to their similarities and differences, producing a hierarchical classification system containing four levels of decreasing taxonomic order: unique beginner, life-form, generic, and specific. A sequential and/or semantic classification system that is used to classify the ontogeny of the female swimming crab was also identified. Of the nine folk generics identified, 44.5% were monotypic and 55.5% were polytypic, subdivided into 15 folk specifics. An identification key was elaborated with the data obtained about the folk polytypic generics.
Conclusion The detailed knowledge concerning the crabs and swimming crabs revealed by the MRE crustacean gatherers demonstrates that these people possess vast knowledge concerning these marine resources. This local knowledge provides a rich but little-known source of information that will aid future ecological and/or zoological studies in the region and will be indispensable for producing management plans to help guarantee the sustainability of these local natural resources. PMID:19671153
Compressed learning and its applications to subcellular localization.
Zheng, Zhong-Long; Guo, Li; Jia, Jiong; Xie, Chen-Mao; Zeng, Wen-Cai; Yang, Jie
2011-09-01
One of the main challenges faced by biological applications is to predict protein subcellular localization accurately in an automatic fashion. To achieve this, a wide variety of machine learning methods have been proposed in recent years. Most of them focus on finding the optimal classification scheme, and fewer take simplifying the complexity of biological systems into account. Traditionally, such bio-data are analyzed by first performing feature selection before classification. Motivated by CS (Compressed Sensing) theory, we propose a methodology that performs compressed learning with a sparseness criterion such that feature selection and dimension reduction are merged into one analysis. The proposed methodology decreases the complexity of the biological system while increasing protein subcellular localization accuracy. Experimental results are quite encouraging, indicating that the aforementioned sparse methods are quite promising in dealing with complicated biological problems, such as predicting the subcellular localization of Gram-negative bacterial proteins.
NASA Astrophysics Data System (ADS)
Lee, Min Jin; Hong, Helen; Shim, Kyu Won; Kim, Yong Oock
2017-03-01
This paper proposes morphological descriptors representing the degree of skull deformity for craniosynostosis in head CT images and a hierarchical classifier model distinguishing among normal skulls and different types of craniosynostosis. First, to compare the deformity surface model with a mean normal surface model, mean normal surface models are generated for each age range and the mean normal surface model is deformed to the deformity surface model via multi-level three-stage registration. Second, four shape features including local distance and area ratio indices are extracted for each of the five cranial bones. Finally, a hierarchical SVM classifier is proposed to distinguish between normal and deformed skulls. As a result, the proposed method showed improved classification results compared to the traditional cranial index. Our method can be used for the early diagnosis, surgical planning, and postsurgical assessment of craniosynostosis as well as quantitative analysis of skull deformity.
Tuarob, Suppawong; Tucker, Conrad S; Salathe, Marcel; Ram, Nilam
2014-06-01
The role of social media as a source of timely and massive information has become more apparent since the era of Web 2.0. Multiple studies illustrated the use of information in social media to discover biomedical and health-related knowledge. Most methods proposed in the literature employ traditional document classification techniques that represent a document as a bag of words. These techniques work well when documents are rich in text and conform to standard English; however, they are not optimal for social media data where sparsity and noise are norms. This paper aims to address the limitations posed by the traditional bag-of-word based methods and propose to use heterogeneous features in combination with ensemble machine learning techniques to discover health-related information, which could prove to be useful to multiple biomedical applications, especially those needing to discover health-related knowledge in large scale social media data. Furthermore, the proposed methodology could be generalized to discover different types of information in various kinds of textual data. Social media data is characterized by an abundance of short social-oriented messages that do not conform to standard languages, both grammatically and syntactically. The problem of discovering health-related knowledge in social media data streams is then transformed into a text classification problem, where a text is identified as positive if it is health-related and negative otherwise. We first identify the limitations of the traditional methods which train machines with N-gram word features, then propose to overcome such limitations by utilizing the collaboration of machine learning based classifiers, each of which is trained to learn a semantically different aspect of the data. The parameter analysis for tuning each classifier is also reported.
Three data sets are used in this research. The first data set comprises approximately 5000 hand-labeled tweets, and is used for cross validation of the classification models in the small scale experiment, and for training the classifiers in the real-world large scale experiment. The second data set is a random sample of real-world Twitter data in the US. The third data set is a random sample of real-world Facebook Timeline posts. Two sets of evaluations are conducted to investigate the proposed model's ability to discover health-related information in the social media domain: small scale and large scale evaluations. The small scale evaluation employs 10-fold cross validation on the labeled data, and aims to tune parameters of the proposed models, and to compare with the state-of-the-art method. The large scale evaluation tests the trained classification models on the native, real-world data sets, and is needed to verify the ability of the proposed model to handle the massive heterogeneity in real-world social media. The small scale experiment reveals that the proposed method is able to mitigate the limitations in the well established techniques existing in the literature, resulting in performance improvement of 18.61% (F-measure). The large scale experiment further reveals that the baseline fails to perform well on larger data with higher degrees of heterogeneity, while the proposed method is able to yield reasonably good performance and outperform the baseline by 46.62% (F-measure) on average. Copyright © 2014 Elsevier Inc. All rights reserved.
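The N-gram word features whose limitations the paper discusses are straightforward to compute; a minimal sketch (hypothetical names, simple whitespace tokenization rather than a production tokenizer) is:

```python
def ngram_features(text, n=2):
    """Bag of word n-grams: counts of each contiguous n-word sequence.
    On short, noisy social media posts this representation is very sparse,
    which is the limitation motivating heterogeneous features."""
    tokens = text.lower().split()
    grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = {}
    for g in grams:
        counts[g] = counts.get(g, 0) + 1
    return counts
```

A tweet-length message yields only a handful of bigrams, so two health-related posts often share no features at all — exactly the sparsity problem the ensemble of semantically different classifiers is meant to mitigate.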
Towards a phylogenetic classification of Leptothecata (Cnidaria, Hydrozoa)
Maronna, Maximiliano M.; Miranda, Thaís P.; Peña Cantero, Álvaro L.; Barbeitos, Marcos S.; Marques, Antonio C.
2016-01-01
Leptothecata are hydrozoans whose hydranths are covered by perisarc and gonophores and whose medusae bear gonads on their radial canals. They develop complex polypoid colonies and exhibit considerable morphological variation among species with respect to growth, defensive structures and mode of development. For instance, several lineages within this order have lost the medusa stage. Depending on the author, traditional taxonomy in hydrozoans may be either polyp- or medusa-oriented. Therefore, the absence of the latter stage in some lineages may lead to very different classification schemes. Molecular data have proved useful in elucidating this taxonomic challenge. We analyzed a super matrix of new and published rRNA gene sequences (16S, 18S and 28S), employing newly proposed methods to measure branch support and improve phylogenetic signal. Our analysis recovered new clades not recognized by traditional taxonomy and corroborated some recently proposed taxa. We offer a thorough taxonomic revision of the Leptothecata, erecting new orders, suborders, infraorders and families. We also discuss the origination and diversification dynamics of the group from a macroevolutionary perspective. PMID:26821567
Network-based high level data classification.
Silva, Thiago Christiano; Zhao, Liang
2012-06-01
Traditional supervised data classification considers only physical features (e.g., distance or similarity) of the input data. Here, this type of learning is called low level classification. On the other hand, the human (animal) brain performs both low and high orders of learning and readily identifies patterns according to the semantic meaning of the input data. Data classification that considers not only physical attributes but also the pattern formation is, here, referred to as high level classification. In this paper, we propose a hybrid classification technique that combines both types of learning. The low level term can be implemented by any classification technique, while the high level term is realized by the extraction of features of the underlying network constructed from the input data. Thus, the former classifies the test instances by their physical features or class topologies, while the latter measures the compliance of the test instances to the pattern formation of the data. Our study shows that the proposed technique not only can realize classification according to the pattern formation, but also is able to improve the performance of traditional classification techniques. Furthermore, as the complexity of the class configuration increases, for example through greater mixture among different classes, a larger portion of the high level term is required to obtain correct classification. This feature confirms that the high level classification has a special importance in complex situations of classification. Finally, we show how the proposed technique can be employed in a real-world application, where it is capable of identifying variations and distortions of handwritten digit images. As a result, it supplies an improvement in the overall pattern recognition rate.
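The combination of the two terms can be sketched as a per-class convex combination of a low-level (physical-feature) score and a high-level (pattern-formation) score. The sketch below is a schematic of that weighting idea only — the scores themselves, which the paper derives from a classifier and from network measures respectively, are taken as given, and `lam` plays the role of the mixing proportion:

```python
def hybrid_membership(low, high, lam=0.3):
    """Per-class hybrid membership: (1 - lam) * low-level score
    + lam * high-level score, with lam weighting the high-level term."""
    return {c: (1 - lam) * low[c] + lam * high[c] for c in low}

def hybrid_classify(low, high, lam=0.3):
    """Predict the class with the largest hybrid membership."""
    scores = hybrid_membership(low, high, lam)
    return max(scores, key=scores.get)
```

As the abstract notes, more complex class configurations call for a larger `lam`: when classes mix heavily, the pattern-formation term increasingly decides the outcome.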
Elsawy, Amr S; Eldawlatly, Seif; Taher, Mohamed; Aly, Gamal M
2014-01-01
The current trend to use Brain-Computer Interfaces (BCIs) with mobile devices mandates the development of efficient EEG data processing methods. In this paper, we demonstrate the performance of a Principal Component Analysis (PCA) ensemble classifier for P300-based spellers. We recorded EEG data from multiple subjects using the Emotiv neuroheadset in the context of a classical oddball P300 speller paradigm. We compare the performance of the proposed ensemble classifier to the performance of traditional feature extraction and classifier methods. Our results demonstrate the capability of the PCA ensemble classifier to classify P300 data recorded using the Emotiv neuroheadset with an average accuracy of 86.29% on cross-validation data. In addition, offline testing of the recorded data reveals an average classification accuracy of 73.3% that is significantly higher than that achieved using traditional methods. Finally, we demonstrate the effect of the parameters of the P300 speller paradigm on the performance of the method.
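The PCA stage of such a pipeline — projecting recorded EEG epochs onto a few principal components before classification — can be sketched generically as below. This is a standard PCA-via-SVD sketch using NumPy, not the authors' ensemble implementation, and `pca_fit`/`pca_transform` are hypothetical names:

```python
import numpy as np

def pca_fit(X, k):
    """Learn the top-k principal axes of X (rows = samples) via SVD
    of the centred data matrix."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def pca_transform(X, mu, components):
    """Project rows of X onto the learned principal axes."""
    return (X - mu) @ components.T
```

An ensemble variant in the spirit of the paper would fit several such projections (e.g., on different channel subsets or resampled trials), train a classifier on each, and combine the decisions.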
Jiang, Biao; Jia, Yan; He, Congfen
2018-05-11
Traditional skincare involves the subjective classification of skin into 4 categories (oily, dry, mixed, and neutral) prior to skin treatment. Following the development of noninvasive skin measurement methods and skin imaging technology, scientists have developed efficacy-based skincare products based on the physiological characteristics of skin under different conditions. Currently, the emergence of skinomics and systems biology has facilitated the development of precision skincare. In this article, the evolution of skincare based on the physiological states of the skin (from traditional skincare and efficacy-based skincare to precision skincare) is described. In doing so, we highlight skinomics and systems biology, with particular emphasis on the importance of skin lipidomics and microbiomes in precision skincare. The emerging trends of precision skincare are anticipated. © 2018 Wiley Periodicals, Inc.
EEG-based driver fatigue detection using hybrid deep generic model.
Phyo Phyo San; Sai Ho Ling; Rifai Chai; Tran, Yvonne; Craig, Ashley; Hung Nguyen
2016-08-01
Classification of electroencephalography (EEG)-based applications is one of the important processes for biomedical engineering. Driver fatigue is a major cause of traffic accidents worldwide and has been considered a significant problem in recent decades. In this paper, a hybrid deep generic model (DGM)-based support vector machine is proposed for accurate detection of driver fatigue. Traditionally, a probabilistic DGM with deep architecture is quite good at learning invariant features, but it is not always optimal for classification because its trainable parameters are in the middle layer. Alternatively, a Support Vector Machine (SVM) itself is unable to learn complicated invariance, but produces good decision surfaces when applied to well-behaved features. Consolidating unsupervised high-level feature extraction (DGM) with SVM classification makes the integrated framework stronger, with the two components mutually enhancing feature extraction and classification. The experimental results showed that the proposed DGM-based driver fatigue monitoring system achieves a testing accuracy of 73.29% with 91.10% sensitivity and 55.48% specificity. In short, the proposed hybrid DGM-based SVM is an effective method for the detection of driver fatigue in EEG.
Two-dimensional shape classification using generalized Fourier representation and neural networks
NASA Astrophysics Data System (ADS)
Chodorowski, Artur; Gustavsson, Tomas; Mattsson, Ulf
2000-04-01
A shape-based classification method is developed based upon the Generalized Fourier Representation (GFR). GFR can be regarded as an extension of traditional polar Fourier descriptors, suitable for description of closed objects, both convex and concave, with or without holes. Explicit relations of GFR coefficients to regular moments, moment invariants and affine moment invariants are given in the paper. The dual linear relation between GFR coefficients and regular moments was used to compare shape features derived from GFR descriptors and Hu's moment invariants. The GFR was then applied to a clinical problem within oral medicine and used to represent the contours of lesions in the oral cavity. The lesions studied were leukoplakia and different forms of lichenoid reactions. Shape features were extracted from GFR coefficients in order to classify potentially cancerous oral lesions. Alternative classifiers were investigated based on a multilayer perceptron with different architectures and extensions. The overall classification accuracy for recognition of potentially cancerous oral lesions when using the neural network classifier was 85%, while the classification between leukoplakia and reticular lichenoid reactions gave a 96% (5-fold cross-validated) recognition rate.
Diniz, Paulo Henrique Gonçalves Dias; Barbosa, Mayara Ferreira; de Melo Milanez, Karla Danielle Tavares; Pistonesi, Marcelo Fabián; de Araújo, Mário César Ugulino
2016-02-01
In this work we propose a method to verify the differentiating characteristics of simple tea infusions prepared in boiling water alone (simulating a home-made cup of tea), which represent the final product as ingested by consumers. For this purpose we used UV-Vis spectroscopy and variable selection through the Successive Projections Algorithm associated with Linear Discriminant Analysis (SPA-LDA) for simultaneous classification of the teas according to their variety and geographic origin. For comparison, KNN, CART, SIMCA, PLS-DA, and PCA-LDA were also used. SPA-LDA and PCA-LDA provided significantly better results for classification of the five studied classes (Argentinean green tea, Brazilian green tea, Argentinean black tea, Brazilian black tea, and Sri Lankan black tea). The proposed methodology provides a simpler, faster, and more affordable classification of simple tea infusions, and can be used as an alternative to traditional tea quality evaluation by skilled tasters, which is inherently subjective and cannot assess geographic origin.
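The PCA-LDA baseline used for comparison above is straightforward to reproduce in outline. The sketch below uses synthetic Gaussian-peak "spectra" rather than real UV-Vis measurements, and the three classes are invented for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline

# Synthetic "UV-Vis spectra": three classes with absorption peaks at
# different wavelengths, plus measurement noise (toy data, not real teas).
rng = np.random.default_rng(1)
wavelengths = np.linspace(200, 800, 120)

def spectra(center, n):
    base = np.exp(-((wavelengths - center) / 60.0) ** 2)
    return base + rng.normal(0, 0.05, (n, wavelengths.size))

X = np.vstack([spectra(350, 30), spectra(450, 30), spectra(550, 30)])
y = np.repeat([0, 1, 2], 30)

# PCA compresses the collinear spectral variables; LDA then finds the
# discriminant directions in the reduced score space.
model = make_pipeline(PCA(n_components=10), LinearDiscriminantAnalysis())
model.fit(X, y)
print(round(model.score(X, y), 2))  # training accuracy on the toy spectra
```

SPA-LDA differs in that it selects a small subset of original wavelengths instead of PCA scores, which keeps the model interpretable in terms of specific spectral bands.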
Chiranjeevi, Pojala; Gopalakrishnan, Viswanath; Moogi, Pratibha
2015-09-01
Facial expression recognition is one of the open problems in computer vision. Robust neutral face recognition in real time is a major challenge for various supervised learning-based facial expression recognition methods. This is because supervised methods cannot accommodate, in a limited amount of training data, all of the appearance variability across faces with respect to race, pose, lighting, facial biases, and so on. Moreover, processing each and every frame to classify emotions is not required, as the user stays neutral for the majority of the time in typical applications like video chat or photo album/web browsing. Detecting the neutral state at an early stage, and thereby bypassing those frames in emotion classification, would save computational power. In this paper, we propose a light-weight neutral-versus-emotion classification engine that acts as a pre-processor to traditional supervised emotion classification approaches. It dynamically learns the neutral appearance at key emotion (KE) points using a statistical texture model constructed from a set of reference neutral frames for each user. The proposed method is made robust to various types of user head motion by accounting for affine distortions based on the statistical texture model. Robustness to dynamic shifts of the KE points is achieved by evaluating similarities on a subset of neighborhood patches around each KE point, using prior information about the directionality of the specific facial action units acting on that KE point. The proposed method, as a result, improves emotion recognition (ER) accuracy and simultaneously reduces the computational complexity of the ER system, as validated on multiple databases.
Spotting East African mammals in open savannah from space.
Yang, Zheng; Wang, Tiejun; Skidmore, Andrew K; de Leeuw, Jan; Said, Mohammed Y; Freer, Jim
2014-01-01
Knowledge of population dynamics is essential for managing and conserving wildlife. Traditional methods of counting wild animals, such as aerial surveys or ground counts, not only disturb the animals but can also be labour intensive and costly. New, commercially available very high-resolution satellite images offer great potential for accurate estimates of animal abundance over large open areas. However, little research has been conducted in the area of satellite-aided wildlife census, although computer processing speeds and image analysis algorithms have vastly improved. This paper explores the possibility of detecting large animals in the open savannah of Maasai Mara National Reserve, Kenya, from very high-resolution GeoEye-1 satellite images. A hybrid image classification method was employed for this specific purpose, incorporating the advantages of both pixel-based and object-based image classification approaches. This was performed in two steps: first, a pixel-based image classification method (an artificial neural network) was applied to classify potential targets with similar spectral reflectance at the pixel level; then, an object-based image classification method was used to further differentiate animal targets from the surrounding landscape through the application of expert knowledge. As a result, the large animals in two pilot study areas were successfully detected with an average count error of 8.2%, an omission error of 6.6%, and a commission error of 13.7%. The results show for the first time that it is feasible to perform automated detection and counting of large wild animals in open savannahs from space, providing a complementary and alternative approach to conventional wildlife survey techniques.
Yoon, Jong H.; Tamir, Diana; Minzenberg, Michael J.; Ragland, J. Daniel; Ursu, Stefan; Carter, Cameron S.
2009-01-01
Background: Multivariate pattern analysis is an alternative method of analyzing fMRI data that is capable of decoding distributed neural representations. We applied this method to test the hypothesis of impaired distributed representations in schizophrenia, and compared the results with traditional GLM-based univariate analysis. Methods: 19 schizophrenia patients and 15 control subjects viewed two runs of stimuli: exemplars of faces, scenes, objects, and scrambled images. To verify engagement with the stimuli, subjects completed a 1-back matching task. A multi-voxel pattern classifier was trained to identify category-specific activity patterns on one run of fMRI data; classification testing was conducted on the remaining run. Correlation of voxel-wise activity across runs evaluated variance over time in activity patterns. Results: Patients performed the task less accurately. This group difference was reflected in the pattern analysis results, with diminished classification accuracy in patients compared to controls (59% and 72%, respectively). In contrast, there was no group difference in GLM-based univariate measures. In both groups, classification accuracy was significantly correlated with behavioral measures, and both groups showed a highly significant correlation between inter-run correlations and classification accuracy. Conclusions: Distributed representations of visual objects are impaired in schizophrenia. This impairment is correlated with diminished task performance, suggesting that decreased integrity of cortical activity patterns is reflected in impaired behavior. Comparisons with univariate results suggest greater sensitivity of pattern analysis in detecting group differences in neural activity and a reduced likelihood of non-specific factors driving these results. PMID:18822407
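The cross-run decoding scheme (train the pattern classifier on one run, test on the other) can be sketched with a generic linear classifier. Everything here is synthetic: the "voxel" patterns, category structure, and noise level are invented for illustration, and the abstract does not specify this particular classifier:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy multi-voxel patterns: 4 stimulus categories, 2 "runs" of fMRI data.
rng = np.random.default_rng(2)
n_voxels, per_cat = 50, 20
centers = rng.normal(0, 1, (4, n_voxels))     # category-specific patterns

def make_run():
    X = np.vstack([c + rng.normal(0, 0.5, (per_cat, n_voxels))
                   for c in centers])
    y = np.repeat(np.arange(4), per_cat)
    return X, y

X_train, y_train = make_run()                 # run 1: train the decoder
X_test, y_test = make_run()                   # run 2: test generalization

# Cross-run decoding: accuracy here plays the role of the 59% / 72%
# classification accuracies reported above.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(round(clf.score(X_test, y_test), 2))
```

Decoding accuracy across runs is high only if the category-specific patterns are stable over time, which is exactly the integrity measure the study probes.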
Superiority of artificial neural networks for a genetic classification procedure.
Sant'Anna, I C; Tomaz, R S; Silva, G N; Nascimento, M; Bhering, L L; Cruz, C D
2015-08-19
The correct classification of individuals is extremely important for the preservation of genetic variability and for the maximization of yield in breeding programs using phenotypic traits and genetic markers. The Fisher and Anderson discriminant functions are commonly used multivariate statistical techniques for these situations, allowing the allocation of an initially unknown individual to predefined groups. However, for higher levels of similarity, such as those found in backcrossed populations, these methods have proven inefficient. Recently, much research has been devoted to developing a new computing paradigm, artificial neural networks (ANNs), which can be used to solve many statistical problems, including classification problems. The aim of this study was to evaluate the feasibility of ANNs as a technique for evaluating genetic diversity by comparing their performance with that of traditional methods. The discriminant functions were ineffective in discriminating the populations, with error rates of 23-82%, preventing the correct discrimination of individuals between populations. The ANN was effective in classifying populations with both low and high differentiation, such as those derived from a genetic design established from backcrosses, even in cases of low differentiation of the data sets. The ANN appears to be a promising technique for solving classification problems, since the number of individuals classified incorrectly by the ANN was always lower than that of the discriminant functions. We envisage a potentially relevant application of this improved procedure in the genomic classification of markers to distinguish between breeds and accessions.
Chatterjee, Sankhadeep; Dey, Nilanjan; Shi, Fuqian; Ashour, Amira S; Fong, Simon James; Sen, Soumya
2018-04-01
Dengue fever detection and classification have a vital role due to the recent outbreaks of different kinds of dengue fever. Recent advancements in microarray technology can be employed for such a classification process. Several studies have established that the gene selection phase plays a significant role in classifier performance. Accordingly, the current study focused on detecting two different variations, namely dengue fever (DF) and dengue hemorrhagic fever (DHF). A modified bag-of-features method is proposed to select the most promising genes for the classification process. Afterward, a modified cuckoo search optimization algorithm is engaged to support the artificial neural network (ANN-MCS) in classifying the unknown subjects into three different classes, namely DF, DHF, and a third class containing convalescent and normal cases. The proposed method has been compared with three other well-known classifiers, namely the multilayer perceptron feed-forward network (MLP-FFN), an artificial neural network (ANN) trained with cuckoo search (ANN-CS), and an ANN trained with PSO (ANN-PSO). Experiments were carried out with different numbers of clusters for the initial bag-of-features-based feature selection phase. After obtaining the reduced dataset, the hybrid ANN-MCS model was employed for the classification process. The results were compared in terms of confusion-matrix-based performance metrics. The experimental results indicated a highly statistically significant improvement with the proposed classifier over the traditional ANN-CS model.
Clustering-based Feature Learning on Variable Stars
NASA Astrophysics Data System (ADS)
Mackenzie, Cristóbal; Pichara, Karim; Protopapas, Pavlos
2016-04-01
The success of automatic classification of variable stars depends strongly on the lightcurve representation. Usually, lightcurves are represented as a vector of many descriptors designed by astronomers called features. These descriptors are expensive in terms of computing, require substantial research effort to develop, and do not guarantee a good classification. Today, lightcurve representation is not entirely automatic; algorithms must be designed and manually tuned up for every survey. The amounts of data that will be generated in the future mean astronomers must develop scalable and automated analysis pipelines. In this work we present a feature learning algorithm designed for variable objects. Our method works by extracting a large number of lightcurve subsequences from a given set, which are then clustered to find common local patterns in the time series. Representatives of these common patterns are then used to transform lightcurves of a labeled set into a new representation that can be used to train a classifier. The proposed algorithm learns the features from both labeled and unlabeled lightcurves, overcoming the bias using only labeled data. We test our method on data sets from the Massive Compact Halo Object survey and the Optical Gravitational Lensing Experiment; the results show that our classification performance is as good as and in some cases better than the performance achieved using traditional statistical features, while the computational cost is significantly lower. With these promising results, we believe that our method constitutes a significant step toward the automation of the lightcurve classification pipeline.
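The core of the method, clustering lightcurve subsequences and re-representing each lightcurve as a histogram over the learned patterns, can be sketched as follows. The data, window width, and cluster count below are toy choices, not the paper's tuned values:

```python
import numpy as np
from sklearn.cluster import KMeans

# Unsupervised "bag of patterns" for time series: cluster lightcurve
# subsequences, then represent each lightcurve by a histogram over the
# learned pattern clusters (a minimal sketch of the paper's idea).
rng = np.random.default_rng(3)

def subsequences(series, width=8):
    return np.array([series[i:i + width]
                     for i in range(len(series) - width + 1)])

# Toy "lightcurves": noisy sinusoids (periodic variables) vs. pure noise.
curves = [np.sin(np.linspace(0, 6 * np.pi, 64)) + rng.normal(0, 0.1, 64)
          for _ in range(10)]
curves += [rng.normal(0, 1, 64) for _ in range(10)]

# Learn common local patterns from all (unlabeled) subsequences.
all_subs = np.vstack([subsequences(c) for c in curves])
km = KMeans(n_clusters=12, n_init=5, random_state=0).fit(all_subs)

def represent(series):
    """Transform a lightcurve into a fixed-length pattern histogram."""
    labels = km.predict(subsequences(series))
    hist = np.bincount(labels, minlength=km.n_clusters)
    return hist / hist.sum()

print(represent(curves[0]).shape)  # (12,)
```

The resulting fixed-length vectors can then train any standard classifier, which is what replaces the hand-designed statistical features.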
Neural net applied to anthropological material: a methodical study on the human nasal skeleton.
Prescher, Andreas; Meyers, Anne; Gerf von Keyserlingk, Diedrich
2005-07-01
A new information-processing method, an artificial neural net, was applied to characterise the variability of anthropological features of the human nasal skeleton. The aim was to find different types of nasal skeletons. A neural net with 15 x 15 nodes was trained on 17 standard anthropological parameters taken from 184 skulls of the Aachen collection. The trained neural net delivers its classification as a two-dimensional map, in which different types of noses are locally separated. Rare and frequent types may be distinguished after one pass of the complete collection through the net. Statistical descriptive analysis, hierarchical cluster analysis, and discriminant analysis were applied to the same data set; these parallel applications allowed comparison of the new approach with the more traditional ones. In general, the classification by the neural net corresponds with cluster analysis and discriminant analysis. However, it goes beyond these classifications because of the possibility of differentiating the types in multi-dimensional dependencies. Furthermore, places in the map are kept blank for intermediate forms, which may be theoretically expected but were not included in the training set. In conclusion, the application of a neural network is a suitable method for investigating large collections of biological material. The resulting classification may be helpful in anatomy and anthropology as well as in forensic medicine, and may be used to characterise the peculiarity of a whole set as well as to find particular cases within it.
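The two-dimensional classification map described above behaves like a Kohonen self-organizing map (an interpretation on our part; the abstract does not name the architecture). A minimal SOM training loop on stand-in data might look like:

```python
import numpy as np

rng = np.random.default_rng(4)
grid, n_features = 15, 17                     # 15x15 map, 17 parameters
weights = rng.normal(0.0, 1.0, (grid, grid, n_features))
coords = np.stack(np.meshgrid(np.arange(grid), np.arange(grid),
                              indexing="ij"), axis=-1)

# Stand-in for 184 skulls x 17 standardized measurements (not real data).
X = rng.normal(0.0, 1.0, (184, n_features))

for epoch in range(20):
    lr = 0.5 * (1 - epoch / 20)               # decaying learning rate
    sigma = 4.0 * (1 - epoch / 20) + 0.5      # shrinking neighborhood radius
    for x in X:
        # Best-matching unit (BMU): node whose weights are closest to x.
        d = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(d), d.shape)
        # Pull the BMU and its grid neighbors toward the sample.
        g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1)
                   / (2.0 * sigma ** 2))
        weights += lr * g[..., None] * (x - weights)

# Each specimen maps to a 2-D grid position; nearby positions mean
# similar types, and unused positions correspond to the blank map cells.
d0 = np.linalg.norm(weights - X[0], axis=-1)
bmu0 = np.unravel_index(np.argmin(d0), d0.shape)
print(bmu0)
```

Because the map preserves topology, "blank" grid cells between occupied regions naturally correspond to the theoretically expected intermediate forms the abstract mentions.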
A classification model of Hyperion image base on SAM combined decision tree
NASA Astrophysics Data System (ADS)
Wang, Zhenghai; Hu, Guangdao; Zhou, YongZhang; Liu, Xin
2009-10-01
Monitoring the Earth using imaging spectrometers has necessitated more accurate analyses and new applications of remote sensing. A very high-dimensional input space requires an exponentially large amount of data to adequately and reliably represent the classes in that space. Moreover, as the input dimensionality increases, the hypothesis space grows exponentially, which makes classification performance highly unreliable. Traditional classification algorithms therefore struggle, and classification of hyperspectral images remains challenging; new algorithms have to be developed for hyperspectral data classification. The Spectral Angle Mapper (SAM) is a physically based spectral classification that uses an n-dimensional angle to match pixels to reference spectra. The algorithm determines the spectral similarity between two spectra by calculating the angle between them, treating them as vectors in a space with dimensionality equal to the number of bands. A key difficulty is that the SAM threshold must be defined manually, and the classification precision depends on how reasonable that threshold is. To resolve this problem, this paper proposes a new automatic classification model for remote sensing images using SAM combined with a decision tree. It can automatically choose an appropriate SAM threshold and improve the classification precision of SAM based on the analysis of field spectra. The test area, located in Heqing, Yunnan, was imaged by the EO-1 Hyperion imaging spectrometer using 224 bands in the visible and near infrared. The area included limestone areas, rock fields, soil, and forests, and was classified into four different vegetation and soil types. The results show that this method chooses an appropriate SAM threshold and effectively eliminates the disturbance and influence of unwanted objects, improving the classification precision. Compared with the likelihood classification based on field survey data, the classification precision of this model is 9.9% higher.
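The spectral angle at the heart of SAM is simple to compute: treat each spectrum as a vector in band space and take the angle between them. The reference spectrum below is hypothetical, and the automatic threshold selection described in the paper is not shown:

```python
import numpy as np

def spectral_angle(a, b):
    """Spectral Angle Mapper (SAM) similarity: the angle (in radians)
    between two spectra treated as vectors in band space.
    Smaller angle = more similar material."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

ref = np.array([0.1, 0.3, 0.5, 0.4])        # hypothetical reference spectrum
pixel_same = ref * 2.0                      # same shape, brighter illumination
pixel_diff = np.array([0.5, 0.4, 0.2, 0.1]) # different spectral shape

print(round(spectral_angle(ref, pixel_same), 4))  # ~0: SAM ignores brightness
print(round(spectral_angle(ref, pixel_diff), 4))  # large angle: different material
```

Because the angle is insensitive to overall magnitude, SAM is robust to illumination differences; the remaining design question, which the paper automates with a decision tree, is where to set the angle threshold for each class.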
Otitis Media Diagnosis for Developing Countries Using Tympanic Membrane Image-Analysis.
Myburgh, Hermanus C; van Zijl, Willemien H; Swanepoel, DeWet; Hellström, Sten; Laurent, Claude
2016-03-01
Otitis media is one of the most common childhood diseases worldwide, but because of a lack of doctors and health personnel in developing countries it is often misdiagnosed or not diagnosed at all. This may lead to serious, life-threatening complications. There is thus a need for an automated, computer-based image-analysis system that could assist in making accurate otitis media diagnoses anywhere. A method for automated diagnosis of otitis media is proposed. The method uses image-processing techniques to classify otitis media. The system is trained using high-quality pre-assessed images of tympanic membranes, captured by digital video-otoscopes, and classifies undiagnosed images into five otitis media categories based on predefined signs. Several verification tests analyzed the classification capability of the method. An accuracy of 80.6% was achieved for images taken with commercial video-otoscopes, while an accuracy of 78.7% was achieved for images captured on-site with a low-cost custom-made video-otoscope. The high accuracy of the proposed otitis media classification system compares well with the classification accuracy of general practitioners and pediatricians (~64% to 80%) using traditional otoscopes, and therefore holds promise for automated diagnosis of otitis media in medically underserved populations.
Zhang, Qian; Jun, Se-Ran; Leuze, Michael; Ussery, David; Nookaew, Intawat
2017-01-01
The development of rapid, economical genome sequencing has shed new light on the classification of viruses. As of October 2016, the National Center for Biotechnology Information (NCBI) database contained >2 million viral genome sequences and a reference set of ~4000 viral genome sequences that cover a wide range of known viral families. Whole-genome sequences can be used to improve viral classification and provide insight into the viral “tree of life”. However, due to the lack of evolutionary conservation amongst diverse viruses, it is not feasible to build a viral tree of life using traditional phylogenetic methods based on conserved proteins. In this study, we used an alignment-free method that uses k-mers as genomic features for a large-scale comparison of complete viral genomes available in RefSeq. To determine the optimal feature length, k (an essential step in constructing a meaningful dendrogram), we designed a comprehensive strategy that combines three approaches: (1) cumulative relative entropy, (2) average number of common features among genomes, and (3) the Shannon diversity index. This strategy was used to determine k for all 3,905 complete viral genomes in RefSeq. The resulting dendrogram shows consistency with the viral taxonomy of the ICTV and the Baltimore classification of viruses. PMID:28102365
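The building blocks named above, k-mer features and the Shannon diversity index, can be sketched in a few lines. The sequence below is a toy string, not a viral genome, and this omits the other two criteria (cumulative relative entropy and average number of common features):

```python
import math
from collections import Counter

def kmer_counts(seq, k):
    """Count k-mers in a DNA sequence (alignment-free genomic features)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def shannon_diversity(counts):
    """Shannon diversity index over the k-mer frequency distribution,
    one of the paper's three criteria for choosing the feature length k."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

seq = "ATGCGATATGCGATTAGCGA"   # toy sequence, not a real viral genome
for k in (2, 3, 4):
    print(k, round(shannon_diversity(kmer_counts(seq, k)), 3))
```

As k grows, k-mers become more distinctive and diversity rises, but the feature space also becomes sparser; balancing this trade-off is why the paper combines three criteria to pick k per genome.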
Facilitating Women's Involvement in Non-Traditional Occupations.
ERIC Educational Resources Information Center
Wolansky, William D.
This paper examines three topics related to women's involvement in non-traditional occupations: (1) the historical origin of occupational classification; (2) the influence of World War II on women's expanded participation in the workforce; and (3) women's entry into non-traditional occupations. The industrial revolution in Europe and later in the…
ERIC Educational Resources Information Center
Wolken, Lawrence C.
1984-01-01
Defines the predominate classifications of economic systems: traditional, command, market, capitalism, socialism, and communism. Considers property rights, role of government, economic freedom, incentives, market structure, economic goals and means of achieving those goals for each classification. Identifies 26 print and audio-visual sources for…
Rapid river classification using GIS-delineated functional process zones
Traditional classification of rivers does not take into consideration how rivers function within the ecosystem. Using factors such as hydrology and geomorphology that directly affect ecosystem structure and function provides a means of classifying river systems into hydrogeomorp...
Local Knowledge and Conservation of Seagrasses in the Tamil Nadu State of India
2011-01-01
Local knowledge systems are not considered in the conservation of fragile seagrass marine ecosystems. In fact, little is known about the utility of seagrasses in local coastal communities. This is intriguing given that some local communities rely on seagrasses to sustain their livelihoods and have relocated their villages to areas with a rich diversity and abundance of seagrasses. The purpose of this study is to assist in conservation efforts regarding seagrasses through identifying Traditional Ecological Knowledge (TEK) from local knowledge systems of seagrasses from 40 coastal communities along the eastern coast of India. We explore the assemblage of scientific and local traditional knowledge concerning the 1. classification of seagrasses (comparing scientific and traditional classification systems), 2. utility of seagrasses, 3. Traditional Ecological Knowledge (TEK) of seagrasses, and 4. current conservation efforts for seagrass ecosystems. Our results indicate that local knowledge systems consist of a complex classification of seagrass diversity that considers the role of seagrasses in the marine ecosystem. This fine-scaled ethno-classification gives rise to five times the number of taxa (10 species = 50 local ethnotaxa), each with a unique role in the ecosystem and utility within coastal communities, including the use of seagrasses for medicine (e.g., treatment of heart conditions, seasickness, etc.), food (nutritious seeds), fertilizer (nutrient rich biomass) and livestock feed (goats and sheep). Local communities are concerned about the loss of seagrass diversity and have considerable local knowledge that is valuable for conservation and restoration plans. This study serves as a case study example of the depth and breadth of local knowledge systems for a particular ecosystem that is in peril. 
Key words: local health and nutrition, traditional ecological knowledge (TEK), conservation and natural resources management, consensus, ethnomedicine, ethnotaxa, cultural heritage PMID:22112297
Integrating human and machine intelligence in galaxy morphology classification tasks
NASA Astrophysics Data System (ADS)
Beck, Melanie R.; Scarlata, Claudia; Fortson, Lucy F.; Lintott, Chris J.; Simmons, B. D.; Galloway, Melanie A.; Willett, Kyle W.; Dickinson, Hugh; Masters, Karen L.; Marshall, Philip J.; Wright, Darryl
2018-06-01
Quantifying galaxy morphology is a challenging yet scientifically rewarding task. As the scale of data continues to increase with upcoming surveys, traditional classification methods will struggle to handle the load. We present a solution through an integration of visual and automated classifications, preserving the best features of both human and machine. We demonstrate the effectiveness of such a system through a re-analysis of visual galaxy morphology classifications collected during the Galaxy Zoo 2 (GZ2) project. We reprocess the top-level question of the GZ2 decision tree with a Bayesian classification aggregation algorithm dubbed SWAP, originally developed for the Space Warps gravitational lens project. Through a simple binary classification scheme, we increase the classification rate nearly 5-fold classifying 226 124 galaxies in 92 d of GZ2 project time while reproducing labels derived from GZ2 classification data with 95.7 per cent accuracy. We next combine this with a Random Forest machine learning algorithm that learns on a suite of non-parametric morphology indicators widely used for automated morphologies. We develop a decision engine that delegates tasks between human and machine and demonstrate that the combined system provides at least a factor of 8 increase in the classification rate, classifying 210 803 galaxies in just 32 d of GZ2 project time with 93.1 per cent accuracy. As the Random Forest algorithm requires a minimal amount of computational cost, this result has important implications for galaxy morphology identification tasks in the era of Euclid and other large-scale surveys.
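A SWAP-style Bayesian aggregation step can be sketched as a running posterior update, where each volunteer's vote is weighted by that volunteer's estimated accuracy. This is an assumed simplification (a single accuracy per user, binary task), not the project's full algorithm:

```python
# Minimal sketch of SWAP-style Bayesian vote aggregation: each classifier
# (human volunteer) has an estimated probability of labeling correctly,
# and each binary vote updates the subject's posterior via Bayes' rule.

def update(prior, vote, p_correct):
    """Posterior P(subject is 'featured') after one binary vote.
    vote=1 means the volunteer said 'featured'; p_correct is that
    volunteer's estimated accuracy on either class (a simplification)."""
    if vote == 1:
        like_pos, like_neg = p_correct, 1 - p_correct
    else:
        like_pos, like_neg = 1 - p_correct, p_correct
    num = like_pos * prior
    return num / (num + like_neg * (1 - prior))

p = 0.5                                  # uninformative prior
for vote, acc in [(1, 0.8), (1, 0.7), (0, 0.6), (1, 0.9)]:
    p = update(p, vote, acc)
print(round(p, 3))   # posterior after four accuracy-weighted votes
```

Once the posterior crosses a retirement threshold in either direction, the subject leaves the human queue, which is what drives the large gains in classification rate reported above.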
An Investigation of Automatic Change Detection for Topographic Map Updating
NASA Astrophysics Data System (ADS)
Duncan, P.; Smit, J.
2012-08-01
Changes to the landscape are constantly occurring, and it is essential for geospatial and mapping organisations that these changes are regularly detected and captured so that map databases can be updated to reflect the current status of the landscape. The Chief Directorate of National Geospatial Information (CD: NGI), South Africa's national mapping agency, currently relies on manual methods of detecting and capturing these changes. These manual methods are time consuming and labour intensive, and rely on the skills and interpretation of the operator. It is therefore necessary to move towards more automated methods in the production process at CD: NGI. The aim of this research is to investigate a methodology for automatic or semi-automatic change detection for the purpose of updating topographic databases. The method investigated detects changes through image classification as well as spatial analysis, and is focused on urban landscapes. The major data inputs into this study are high-resolution aerial imagery and existing topographic vector data. Initial results indicate that traditional pixel-based image classification approaches are unsatisfactory for large-scale land-use mapping and that object-oriented approaches hold more promise. Even with object-oriented image classification, however, generalization of techniques on a broad scale has provided inconsistent results. A solution may lie in a hybrid approach of pixel- and object-oriented techniques.
Suzanne M. Joy; R. M. Reich; Richard T. Reynolds
2003-01-01
Traditional land classification techniques for large areas that use Landsat Thematic Mapper (TM) imagery are typically limited to the fixed spatial resolution of the sensors (30m). However, the study of some ecological processes requires land cover classifications at finer spatial resolutions. We model forest vegetation types on the Kaibab National Forest (KNF) in...
Protecting traditional knowledge of Chinese medicine: concepts and proposals.
Liu, Changhua; Gu, Man
2011-06-01
With the development of the knowledge economy, knowledge has become one of the most important resources for social progress and economic development. Some countries have proposed measures for the protection of their own traditional knowledge. Traditional Chinese medicine belongs to the category of intangible cultural heritage because it is an important part of Chinese cultural heritage. Today the value of traditional knowledge of Chinese medicine is widely recognized by the domestic and international public. This paper discusses the definition of traditional knowledge of Chinese medicine and its protection, and evaluates research on its classification. We review the present status of the protection of traditional knowledge of Chinese medicine and tentatively put forward some possible ideas and methods for its protection. Our goal is to find a way to strengthen the vitality of traditional Chinese medicine and consolidate its foundation. We believe that if we could establish a suitable sui generis system for traditional knowledge (sui generis is a Latin term meaning "of its own kind", often used in discussions about protecting the rights of indigenous peoples; we use it here to emphasize that protection of traditional knowledge of Chinese medicine cannot be achieved through existing legal means alone, due to its unique characteristics), a more favorable environment for the preservation and development of traditional Chinese medicine would ultimately be created.
NASA Astrophysics Data System (ADS)
McClanahan, James Patrick
Eddy Current Testing (ECT) is a Non-Destructive Examination (NDE) technique that is widely used in power generating plants (both nuclear and fossil) to test the integrity of heat exchanger (HX) and steam generator (SG) tubing. Specifically for this research, laboratory-generated, flawed tubing data were examined. The purpose of this dissertation is to develop and implement an automated method for the classification and advanced characterization of defects in HX and SG tubing. These two improvements enhanced the robustness of characterization as compared to traditional bobbin-coil ECT data analysis methods. A more robust classification and characterization of tube flaws in situ (while the SG is on-line but the plant is not operating) should provide valuable information to the power industry. The following are the conclusions reached from this research. A feature extraction program acquiring relevant information from both the mixed absolute and differential data was successfully implemented. The continuous wavelet transform (CWT) was utilized to extract more information from the mixed, complex differential data. Image processing techniques used to extract the information contained in the generated CWT classified the data with a high success rate. The data were accurately classified utilizing the compressed feature vector and a Bayes classification system. An estimation of the upper bound for the probability of error, using the Bhattacharyya distance, was successfully applied to the Bayesian classification. The classified data were separated according to flaw type (classification) to enhance characterization. The characterization routine used dedicated, flaw-type-specific ANNs that made the characterization of tube flaws more robust. The inclusion of outliers may help complete the feature space so that classification accuracy is increased.
Given that the eddy current test signals appear very similar, there may not be sufficient information to make an extremely accurate (>95%) classification or an advanced characterization using this system. A larger database is necessary for more accurate system learning.
Sea Ice Detection Based on an Improved Similarity Measurement Method Using Hyperspectral Data.
Han, Yanling; Li, Jue; Zhang, Yun; Hong, Zhonghua; Wang, Jing
2017-05-15
Hyperspectral remote sensing technology can acquire nearly continuous spectrum information and rich sea ice image information, thus providing an important means of sea ice detection. However, the correlation and redundancy among hyperspectral bands reduce the accuracy of traditional sea ice detection methods. Based on the spectral characteristics of sea ice, this study presents an improved similarity measurement method based on linear prediction (ISMLP) to detect sea ice. First, the first original band with a large amount of information is determined based on mutual information theory. Subsequently, a second original band with the least similarity is chosen by the spectral correlation measuring method. Finally, subsequent bands are selected through the linear prediction method, and a support vector machine classifier model is applied to classify sea ice. In experiments performed on images of Baffin Bay and Bohai Bay, comparative analyses were conducted to compare the proposed method and traditional sea ice detection methods. Our proposed ISMLP method achieved the highest classification accuracies (91.18% and 94.22%) in both experiments. These results show that the ISMLP method exhibits better overall performance than the other methods and can be effectively applied to hyperspectral sea ice detection.
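The greedy band-selection pipeline this abstract describes (an information-rich first band, a least-similar second band, then linear-prediction-based additions) can be sketched roughly as follows. This is a simplified reading, not the paper's exact method: band variance stands in for the mutual-information criterion, absolute correlation for the similarity measure, and the least-squares residual for the linear-prediction step.

```python
import numpy as np

def select_bands(cube, k=3):
    """Greedy band selection sketch (assumed simplification of ISMLP).

    cube: (n_bands, n_pixels) array of band spectra.
    1) first band  = highest variance (proxy for information content),
    2) second band = lowest absolute correlation with the first,
    3) further bands = largest linear-prediction residual from the chosen set.
    """
    n_bands, _ = cube.shape
    chosen = [int(np.argmax(cube.var(axis=1)))]
    corr = np.abs(np.corrcoef(cube))              # band-by-band |correlation|
    chosen.append(int(np.argmin(corr[chosen[0]])))
    while len(chosen) < k:
        X = cube[chosen].T                        # pixels x chosen bands
        best, best_err = None, -1.0
        for b in range(n_bands):
            if b in chosen:
                continue
            # least-squares linear prediction of band b from the chosen bands
            coef, *_ = np.linalg.lstsq(X, cube[b], rcond=None)
            err = float(np.linalg.norm(cube[b] - X @ coef))
            if err > best_err:
                best, best_err = b, err
        chosen.append(best)
    return chosen
```

The band with the worst linear-prediction fit is, by this criterion, the one carrying the most information not already covered by the selected subset.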
NASA Astrophysics Data System (ADS)
Andreon, S.; Gargiulo, G.; Longo, G.; Tagliaferri, R.; Capuano, N.
2000-12-01
Astronomical wide-field imaging performed with new large-format CCD detectors poses data reduction problems of unprecedented scale, which are difficult to deal with using traditional interactive tools. We present here NExt (Neural Extractor), a new neural network (NN) based package capable of detecting objects and performing both deblending and star/galaxy classification in an automatic way. Traditionally, in astronomical images, objects are first distinguished from the noisy background by searching for sets of connected pixels having brightnesses above a given threshold; they are then classified as stars or as galaxies through diagnostic diagrams having variables chosen according to the astronomer's taste and experience. In the extraction step, assuming that images are well sampled, NExt requires only the simplest a priori definition of `what an object is' (i.e. it keeps all structures composed of more than one pixel) and performs the detection via an unsupervised NN, approaching detection as a clustering problem that has been thoroughly studied in the artificial intelligence literature. The first part of the NExt procedure consists of an optimal compression of the redundant information contained in the pixels via a mapping from pixel intensities to a subspace identified through principal component analysis. At magnitudes fainter than the completeness limit, stars are usually almost indistinguishable from galaxies, and therefore the parameters characterizing the two classes do not lie in disconnected subspaces, thus preventing the use of unsupervised methods. We therefore adopted a supervised NN (i.e. a NN that first finds the rules to classify objects from examples and then applies them to the whole data set). In practice, each object is classified depending on its membership of the regions mapping the input feature space in the training set.
In order to obtain an objective and reliable classification, instead of using an arbitrarily defined set of features we use a NN to select the most significant features among the large number of measured ones, and then we use these selected features to perform the classification task. In order to optimize the performance of the system, we implemented and tested several different models of NN. The comparison of the NExt performance with that of the best detection and classification package known to the authors (SExtractor) shows that NExt is at least as effective as the best traditional packages.
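The PCA compression step described above (mapping pixel intensities to a lower-dimensional subspace before classification) can be illustrated with a small sketch via the SVD; the function name and interface here are invented for the example, not NExt's API.

```python
import numpy as np

def pca_compress(vectors, k):
    """Project row vectors onto their top-k principal components.

    vectors: (n_samples, n_features) array of pixel-intensity vectors.
    Returns the k-dimensional projections, the component matrix (k x
    n_features, rows orthonormal), and the mean used for centering.
    """
    mu = vectors.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal directions
    U, s, Vt = np.linalg.svd(vectors - mu, full_matrices=False)
    components = Vt[:k]
    return (vectors - mu) @ components.T, components, mu
```

Keeping only the leading components discards the redundant, low-variance directions, which is exactly the compression role PCA plays in the pipeline above.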
Patient-Specific Deep Architectural Model for ECG Classification
Luo, Kan; Cuschieri, Alfred
2017-01-01
Heartbeat classification is a crucial step for arrhythmia diagnosis during electrocardiographic (ECG) analysis. The new scenario of wireless body sensor network- (WBSN-) enabled ECG monitoring puts forward a higher-level demand for this traditional ECG analysis task. Previously reported methods mainly addressed this requirement with shallow structured classifiers and expert-designed features. In this study, a modified frequency slice wavelet transform (MFSWT) was first employed to produce a time-frequency image for each heartbeat signal. Then a deep learning (DL) method was applied for heartbeat classification. Here, we proposed a novel model incorporating automatic feature abstraction and a deep neural network (DNN) classifier. Features were automatically abstracted by a stacked denoising auto-encoder (SDA) from the transferred time-frequency image. The DNN classifier was constructed from an encoder layer of the SDA and a softmax layer. In addition, a deterministic patient-specific heartbeat classifier was achieved by fine-tuning on heartbeat samples that included a small subset of individual samples. The performance of the proposed model was evaluated on the MIT-BIH arrhythmia database. Results showed that an overall accuracy of 97.5% was achieved using the proposed model, confirming that the proposed DNN model is a powerful tool for heartbeat pattern recognition. PMID:29065597
[Hyperspectral remote sensing image classification based on SVM optimized by clonal selection].
Liu, Qing-Jie; Jing, Lin-Hai; Wang, Meng-Fei; Lin, Qi-Zhong
2013-03-01
Model selection for support vector machines (SVM), involving the selection of kernel and margin parameter values, is usually time-consuming and greatly impacts both the training efficiency of the SVM model and the final classification accuracy of an SVM hyperspectral remote sensing image classifier. First, based on combinatorial optimization theory and the cross-validation method, an artificial immune clonal selection algorithm is introduced for the optimal selection of the SVM kernel parameter σ and margin parameter C (CSSVM) to improve the training efficiency of the SVM model. Then an experiment classifying AVIRIS imagery of the Indian Pines site in the USA was performed to test the novel CSSVM against a traditional SVM classifier tuned by the general grid-searching cross-validation method (GSSVM). Evaluation indexes, including SVM model training time, classification overall accuracy (OA), and Kappa index, were analyzed quantitatively for both CSSVM and GSSVM. The OA of CSSVM on the test samples and on the whole image is 85.1% and 81.58%, each differing from that of GSSVM by less than 0.08%; the Kappa indexes reach 0.8213 and 0.7728, each differing from that of GSSVM by less than 0.001; and the model training time of CSSVM is between 1/6 and 1/10 of that of GSSVM. Therefore, CSSVM is a fast and accurate algorithm for hyperspectral image classification and is superior to GSSVM.
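The clonal selection idea behind this parameter search (clone the fittest candidates, mutate the clones, keep the best) can be sketched on a toy objective. Everything here, from the population size to the mutation schedule, is an illustrative assumption rather than the paper's CSSVM configuration; in the real setting the objective would be a cross-validated SVM accuracy over (σ, C).

```python
import random

def clonal_selection(objective, bounds, pop=20, gens=40, seed=0):
    """Minimal clonal-selection optimizer sketch (assumed simplification).

    objective: function of a parameter list, maximized.
    bounds: list of (lo, hi) per parameter, e.g. [(C_lo, C_hi), (g_lo, g_hi)].
    """
    rng = random.Random(seed)
    P = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop)]
    for _ in range(gens):
        scored = sorted(P, key=objective, reverse=True)
        elite = scored[: pop // 4]
        clones = []
        for rank, ind in enumerate(elite):
            sigma = 0.1 * (rank + 1)          # lower-ranked elites mutate more
            for _ in range(3):
                clones.append([min(hi, max(lo, x + rng.gauss(0, sigma * (hi - lo))))
                               for x, (lo, hi) in zip(ind, bounds)])
        P = scored[: pop - len(clones)] + clones   # elitism: best survive intact
    return max(P, key=objective)
```

Because the best individuals are carried over unchanged each generation, the search never loses its incumbent solution while the clones explore around it.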
Postcraniometric sex and ancestry estimation in South Africa: a validation study.
Liebenberg, Leandi; Krüger, Gabriele C; L'Abbé, Ericka N; Stull, Kyra E
2018-05-24
With the acceptance of the Daubert criteria as the standards for best practice in forensic anthropological research, more emphasis is being placed on the validation of published methods. Methods, both traditional and novel, need to be validated, adjusted, and refined for optimal performance within forensic anthropological analyses. Recently, a custom postcranial database of modern South Africans was created for use in Fordisc 3.1. Classification accuracies of up to 85% for ancestry estimation and 98% for sex estimation were achieved using a multivariate approach. To measure the external validity and report more realistic performance statistics, an independent sample was tested. The postcrania from 180 black, white, and colored South Africans were measured and classified using the custom postcranial database. A decrease in accuracy was observed for both ancestry estimation (79%) and sex estimation (95%) of the validation sample. When incorporating both sex and ancestry simultaneously, the method achieved 70% accuracy, and 79% accuracy when sex-specific ancestry analyses were run. Classification matrices revealed that postcrania were more likely to misclassify as a result of ancestry rather than sex. While both sex and ancestry influence the size of an individual, sex differences are more marked in the postcranial skeleton and are therefore easier to identify. The external validity of the postcranial database was verified and therefore shown to be a useful tool for forensic casework in South Africa. While the classification rates were slightly lower than the original method, this is expected when a method is generalized.
Luna-José, Azucena de Lourdes; Aguilar, Beatriz Rendón
2012-07-12
Traditional classification systems represent cognitive processes of human cultures around the world. They synthesize specific conceptions of nature, as well as the cumulative learning, beliefs, and customs of a particular human community or society. Traditional knowledge has been analyzed from different viewpoints, one of which corresponds to the analysis of ethnoclassifications. In this work, a brief analysis of the botanical traditional knowledge among Zapotecs of the municipality of San Agustín Loxicha, Oaxaca was conducted. The purposes of this study were: a) to analyze the traditional ecological knowledge of local plant resources through the folk classification of both landscapes and plants and b) to determine the role that this knowledge has played in plant resource management and conservation. The study was developed in five communities of San Agustín Loxicha. During field trips, plant specimens were collected and shown to local people in order to obtain their Spanish or Zapotec names; through interviews with local people, we obtained names and identified classification categories of plants, vegetation units, and soil types. We found a logical structure in Zapotec plant names, based on linguistic terms as well as morphological and ecological characteristics. We followed the classification principles proposed by Berlin [6] in order to build a hierarchical structure of life forms, names, and other characteristics mentioned by people. We recorded 757 plant names. Most of them (67%) have an equivalent Zapotec name and the remaining 33% have mixed names with Zapotec and Spanish terms. Plants were categorized as native plants, plants introduced in pre-Hispanic times, or plants introduced later. All of them are grouped in a hierarchical classification, which includes life form, generic, specific, and varietal categories. Monotypic and polytypic names are used to further classify plants.
This holistic classification system plays an important role for local people in many aspects: it helps to organize and make sense of the diversity, to understand the interrelations among plants, soil, and vegetation, and to classify their physical space, since they relate plants to a particular vegetation unit and a kind of soil. The locals also make rational use of these elements, because they know which crops can grow in each vegetation unit and which places are suitable for collecting plants. These aspects are interconnected and could be fundamental for the rational use and management of plant resources.
Classification, disease, and diagnosis.
Jutel, Annemarie
2011-01-01
Classification shapes medicine and guides its practice. Understanding classification must be part of the quest to better understand the social context and implications of diagnosis. Classifications are part of the human work that provides a foundation for the recognition and study of illness: deciding how the vast expanse of nature can be partitioned into meaningful chunks, stabilizing and structuring what is otherwise disordered. This article explores the aims of classification, their embodiment in medical diagnosis, and the historical traditions of medical classification. It provides a brief overview of the aims and principles of classification and their relevance to contemporary medicine. It also demonstrates how classifications operate as social framing devices that enable and disable communication, assert and refute authority, and are important items for sociological study.
29 CFR 510.25 - Traditional functions of government.
Code of Federal Regulations, 2012 CFR
2012-07-01
... whose primary function falls within one or more of the activities listed in paragraph (a) or (b) of this... 29 Labor 3 2012-07-01 2012-07-01 false Traditional functions of government. 510.25 Section 510.25... RICO Classification of Industries § 510.25 Traditional functions of government. (a) Section 6(c)(4) of...
Abreu-Y Abreu, A T; González Sánchez, C B; Villanueva Sáenz, E; Valdovinos Díaz, M A
2010-01-01
With the introduction of high-resolution manometry (HRM) and esophageal topography, a novel classification (the Chicago Classification) has been proposed for the diagnosis of esophageal motor disorders (EMD). Clinical differences from the traditional classification are currently under evaluation. To investigate differences between the Chicago (CC) and traditional (TC) classifications in the diagnosis of EMD, consecutive patients with an indication for esophageal manometry were studied. HRM was performed with a 36-sensor solid-state catheter and ManoView software (V2.0). Conventional manometric tracings were analyzed by an investigator blinded to the results of HRM, and diagnoses by CC and TC were compared. Two hundred patients were studied, 106 (53%) of them women, with a mean age of 43.4 (range 16-84) years. Preoperative evaluation for GERD was the most frequent indication (152 patients, 76%). Achalasia (8), scleroderma (2), and peristaltic dysfunction (60 vs. 59) were diagnosed similarly by CC and TC. Spastic disorders were identified more frequently by CC: nutcracker esophagus (NC) in 3, spastic NC in 3, and segmental NC in 11 patients, versus NC in 5 patients by TC. Three patients had spasm with CC and 1 with TC. Nonspecific motor disorder was diagnosed by TC, and 2 patients had functional obstruction with CC. Hypotensive lower esophageal sphincter was identified in 63 patients with CC vs. 57 with TC. Spastic disorders and functional obstruction were the EMD better identified by HRM and CC.
Median Robust Extended Local Binary Pattern for Texture Classification.
Liu, Li; Lao, Songyang; Fieguth, Paul W; Guo, Yulan; Wang, Xiaogang; Pietikäinen, Matti
2016-03-01
Local binary patterns (LBP) are considered among the most computationally efficient high-performance texture features. However, the LBP method is very sensitive to image noise and is unable to capture macrostructure information. To address these disadvantages, in this paper we introduce a novel descriptor for texture classification, the median robust extended LBP (MRELBP). Different from the traditional LBP and many LBP variants, MRELBP compares regional image medians rather than raw image intensities. A multiscale LBP-type descriptor is computed by efficiently comparing image medians over a novel sampling scheme, which can capture both microstructure and macrostructure texture information. A comprehensive evaluation on benchmark data sets reveals MRELBP's high performance: it is robust to gray-scale variations, rotation changes, and noise, at a low computational cost. MRELBP produces the best classification scores of 99.82%, 99.38%, and 99.77% on three popular Outex test suites. More importantly, MRELBP is shown to be highly robust to image noise, including Gaussian noise, Gaussian blur, salt-and-pepper noise, and random pixel corruption.
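The core MRELBP idea of thresholding neighbors against a robust regional median rather than the raw center pixel can be sketched for a single 3x3 scale. The full descriptor is multiscale and considerably more elaborate; this is only a minimal illustration of the median-based thresholding.

```python
import numpy as np

def median_lbp(img):
    """Median-based LBP sketch (simplified from the MRELBP idea).

    For each interior pixel, the 8 neighbors are thresholded against the
    median of the 3x3 neighborhood instead of the raw center intensity,
    so a single noisy pixel cannot flip the reference value.
    """
    H, W = img.shape
    codes = np.zeros((H - 2, W - 2), dtype=np.uint8)
    # neighbor offsets in a fixed circular order (bit k of the code)
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            ref = np.median(img[i - 1:i + 2, j - 1:j + 2])  # robust reference
            code = 0
            for k, (di, dj) in enumerate(offs):
                if img[i + di, j + dj] >= ref:
                    code |= 1 << k
            codes[i - 1, j - 1] = code
    return codes
```

On a flat region every neighbor equals the median, so all 8 bits are set (code 255); texture shows up as departures from that pattern.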
Arana-Daniel, Nancy; Gallegos, Alberto A; López-Franco, Carlos; Alanís, Alma Y; Morales, Jacob; López-Franco, Adriana
2016-01-01
With the increasing power of computers, the amount of data that can be processed in small periods of time has grown exponentially, as has the importance of classifying large-scale data efficiently. Support vector machines have shown good results classifying large amounts of high-dimensional data, such as data generated by protein structure prediction, spam recognition, medical diagnosis, optical character recognition, text classification, etc. Most state-of-the-art approaches for large-scale learning use traditional optimization methods, such as quadratic programming or gradient descent, which makes the use of evolutionary algorithms for training support vector machines an area to be explored. The present paper proposes an approach that is simple to implement based on evolutionary algorithms and Kernel-Adatron for solving large-scale classification problems, focusing on protein structure prediction. The functional properties of proteins depend upon their three-dimensional structures. Knowing the structures of proteins is crucial for biology and can lead to improvements in areas such as medicine, agriculture and biofuels.
Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J
2009-01-01
Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67×3 (67 clusters of three observations) and a 33×6 (33 clusters of six observations) sampling scheme, to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67×3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can dramatically impact the classification error associated with LQAS analysis. PMID:20011037
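An LQAS decision rule and a Monte Carlo estimate of its classification error can be sketched as follows. The sample size, the decision threshold, and the crude "true class" rule are illustrative assumptions for the demo, not the study's parameters, and the simulation below draws simple random samples rather than the clustered designs the paper evaluates.

```python
import random

def lqas_classify(sample, d):
    """LQAS decision rule: flag as 'high prevalence' when the number of
    positive cases in the sample exceeds the decision threshold d."""
    return sum(sample) > d

def misclassification_rate(p_true, n=201, d=59, trials=2000, seed=1):
    """Monte Carlo sketch of LQAS classification error under simple
    random sampling (n=201 mirrors a 67x3 total; d is an illustrative
    threshold, not taken from the paper)."""
    rng = random.Random(seed)
    truly_high = p_true > d / n            # crude 'true class' for the demo
    errors = 0
    for _ in range(trials):
        sample = [rng.random() < p_true for _ in range(n)]
        if lqas_classify(sample, d) != truly_high:
            errors += 1
    return errors / trials
```

Far from the threshold prevalence the rule almost never errs; near it, the error rate climbs steeply, which is why threshold placement drives LQAS design.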
Shultz, Jeffrey W
2018-01-09
A new species of leiobunine harvestman from the Chiricahua Mountains of Arizona is described. The species lacks pro- and retrolateral submarginal rows of coxal denticles, a feature often considered diagnostic for the polyphyletic Nelima, and has greatly reduced ventral dentition on the palpal claw, as in the monotypic Leuronychus. In most other respects, the species is uniquely similar to members of a clade from central and western Mexico currently in the poly- and/or paraphyletic Leiobunum. These traits include a supracheliceral lamina with a wide transverse plate and a canaliculate ocularium, with an anterior surface that slopes dorsoposteriorly and a posterior surface that bulges rearward and is constricted at its base. There is thus a conflict between classification using traditional diagnostic characters and classification using unique similarity of non-traditional characters. The problem is exacerbated by the problematic status of each candidate genus. Here the species is placed in Leiobunum as L. silum sp. nov., a decision that gives weight to probable phylogenetic affinity with species currently placed in that genus. Leiobunum silum provides an excellent example of the limits of traditional typological classification and the need for a broad-scale morphological and molecular revision of sclerosomatid harvestmen.
Feature Inference Learning and Eyetracking
ERIC Educational Resources Information Center
Rehder, Bob; Colner, Robert M.; Hoffman, Aaron B.
2009-01-01
Besides traditional supervised classification learning, people can learn categories by inferring the missing features of category members. It has been proposed that feature inference learning promotes learning a category's internal structure (e.g., its typical features and interfeature correlations) whereas classification promotes the learning of…
Wong, Raymond
2013-01-01
Voice biometrics is a physiological characteristic: each person's voice is unique. Due to this uniqueness, voice classification has found useful applications in classifying speakers' gender, mother tongue or ethnicity (accent), and emotional state, as well as in identity verification and verbal command control. In this paper, we adopt a new preprocessing method named Statistical Feature Extraction (SFX) for extracting important features for training a classification model, based on a piecewise transformation that treats an audio waveform as a time series. Using SFX we can faithfully remodel the statistical characteristics of the time series; together with spectral analysis, a substantial number of features are extracted in combination. An ensemble is used to select only the influential features for classification model induction. We focus on comparing the effects of various popular data mining algorithms on multiple datasets. Our experiment consists of classification tests over four typical categories of human voice data, namely Female and Male, Emotional Speech, Speaker Identification, and Language Recognition. The experiments yield encouraging results supporting the fact that heuristically choosing significant features from both time and frequency domains indeed produces better performance in voice classification than traditional signal processing techniques alone, such as wavelets and LPC-to-CC. PMID:24288684
Linder, Roland; Orth, Isabelle; Hagen, E Christian; van der Woude, Fokko J; Schmitt, Wilhelm H
2011-06-01
To investigate the operating characteristics of the American College of Rheumatology (ACR) traditional format criteria for Wegener's granulomatosis (WG), the Sørensen criteria for WG and microscopic polyangiitis (MPA), and the Chapel Hill nomenclature for WG and MPA. Further, to develop and validate improved criteria for distinguishing WG from MPA by an artificial neural network (ANN) and by traditional approaches [classification tree (CT), logistic regression (LR)]. All criteria were applied to 240 patients with WG and 78 patients with MPA recruited by a multicenter study. To generate new classification criteria (ANN, CT, LR), 23 clinical measurements were assessed. Validation was performed by applying the same approaches to an independent monocenter cohort of 46 patients with WG and 21 patients with MPA. A total of 70.8% of the patients with WG and 7.7% of the patients with MPA from the multicenter cohort fulfilled the ACR criteria for WG (accuracy 76.1%). The accuracy of the Chapel Hill criteria for WG and MPA was only 35.0% and 55.3% (Sørensen criteria: 67.2% and 92.4%). In contrast, the ANN and CT achieved an accuracy of 94.3%, based on 4 measurements (involvement of nose, sinus, ear, and pulmonary nodules), all associated with WG. LR led to an accuracy of 92.8%. Inclusion of antineutrophil cytoplasmic antibodies did not improve the allocation. Validation of methods resulted in accuracy of 91.0% (ANN and CT) and 88.1% (LR). The ACR, Sørensen, and Chapel Hill criteria did not reliably separate WG from MPA. In contrast, an appropriately trained ANN and a CT differentiated between these disorders and performed better than LR.
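Of the three classifiers compared in this study, logistic regression is the traditional baseline, and its mechanics can be sketched with a plain gradient-descent implementation on a toy binary problem. The features and data below are invented for illustration, not the WG/MPA clinical measurements.

```python
import math

def train_logistic(X, y, lr=0.5, epochs=500):
    """Plain logistic regression via stochastic gradient descent.

    X: list of feature lists; y: list of 0/1 labels.
    Returns the learned weights w and intercept b.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))        # sigmoid probability
            g = p - yi                             # gradient of log-loss wrt z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, xi):
    """Classify as 1 when the linear score crosses zero."""
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
```

Classification trees and ANNs replace this single linear boundary with axis-aligned splits or learned nonlinear ones, which is what let them separate WG from MPA more accurately in the study.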
Wang, Anran; Wang, Jian; Lin, Hongfei; Zhang, Jianhai; Yang, Zhihao; Xu, Kan
2017-12-20
Biomedical event extraction is one of the frontier domains in biomedical research. Its two main subtasks, trigger identification and argument detection, can both be treated as classification problems. However, traditional state-of-the-art methods are based on support vector machines (SVM) with massive manually designed one-hot features, which require enormous work yet fail to capture semantic relations among words. In this paper, we propose a multiple distributed representation method for biomedical event extraction. The method combines dependency-based word embeddings of the context with task-based features, both represented in a distributed way, as the input for training deep learning models. Finally, we used a softmax classifier to label the candidate examples. The experimental results on the Multi-Level Event Extraction (MLEE) corpus show higher F-scores of 77.97% in trigger identification and 58.31% overall, compared to the state-of-the-art SVM method. Our distributed representation method avoids the semantic gap and the curse of dimensionality that afflict traditional one-hot representations. The promising results demonstrate that our proposed method is effective for biomedical event extraction.
A machine-learned computational functional genomics-based approach to drug classification.
Lötsch, Jörn; Ultsch, Alfred
2016-12-01
The public accessibility of "big data" about the molecular targets of drugs and the biological functions of genes allows novel data-science-based approaches to pharmacology that link drugs directly with their effects on pathophysiologic processes. This provides a phenotypic path to drug discovery and repurposing. This paper compares the performance of a functional genomics-based criterion to the traditional drug target-based classification. Knowledge discovery in the DrugBank and Gene Ontology databases allowed the construction of a "drug target versus biological process" matrix as a combination of "drug versus genes" and "genes versus biological processes" matrices. As a canonical example, such matrices were constructed for classical analgesic drugs. These matrices were projected onto a toroid grid of 50 × 82 artificial neurons using a self-organizing map (SOM). The distance and cluster structure of the high-dimensional feature space of the matrices was visualized on top of this SOM using a U-matrix. The cluster structure emerging on the U-matrix provided a correct classification of the analgesics into two main classes of opioid and non-opioid analgesics. The classification was flawless with both the functional genomics and the traditional target-based criterion. The functional genomics approach inherently included the drugs' modulatory effects on biological processes. The main pharmacological actions known from pharmacological science were captured, e.g., actions on lipid signaling for non-opioid analgesics, which comprised many NSAIDs, and actions on neuronal signal transmission for opioid analgesics. Using machine-learning techniques for computational drug classification in a comparative assessment, a functional genomics-based criterion was found to be similarly suitable for drug classification as the traditional target-based criterion.
This supports a utility of functional genomics-based approaches to computational system pharmacology for drug discovery and repurposing.
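The matrix construction described above, a "drug versus biological process" matrix obtained as the product of a "drug versus genes" matrix and a "genes versus biological processes" matrix, can be illustrated with toy incidence matrices. The entries below are made up for the demo, not taken from DrugBank or Gene Ontology.

```python
import numpy as np

# Toy "drug versus genes" incidence matrix: rows = drugs, cols = genes.
drug_gene = np.array([[1, 0, 1],      # drug A targets genes 1 and 3
                      [0, 1, 0]])     # drug B targets gene 2

# Toy "genes versus biological processes" annotation matrix.
gene_process = np.array([[1, 0],      # gene 1 -> process 1
                         [0, 1],      # gene 2 -> process 2
                         [1, 1]])     # gene 3 -> both processes

# A drug touches a process if any of its target genes is annotated with it:
# matrix product, then thresholded to a boolean incidence matrix.
drug_process = (drug_gene @ gene_process) > 0
```

Rows of `drug_process` are the per-drug feature vectors that, in the paper, are fed to the self-organizing map for clustering.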
Classification Influence of Features on Given Emotions and Its Application in Feature Selection
NASA Astrophysics Data System (ADS)
Xing, Yin; Chen, Chuang; Liu, Li-Long
2018-04-01
In order to solve the problem that high-dimensional speech emotion features contain a large amount of redundant data, we analyze the extracted speech emotion features in depth and select the better ones. Firstly, a given emotion is classified by each feature individually. Secondly, the recognition rates are ranked in descending order. Then, the optimal threshold over the features is determined by a rate criterion. Finally, the better features are obtained. When applied to the Berlin and Chinese emotional data sets, the experimental results show that this feature selection method outperforms other traditional methods.
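The selection procedure above amounts to ranking per-feature recognition rates and cutting at a threshold. A minimal sketch; the feature names, rates, and the 0.50 cutoff are made-up illustrations, whereas the paper derives its threshold from a rate criterion:

```python
# Hypothetical recognition rate of each feature when used alone
# to classify one given emotion.
rates = {"pitch_mean": 0.62, "energy_var": 0.55, "mfcc1": 0.71,
         "zcr": 0.40, "jitter": 0.48}

threshold = 0.50  # assumed cutoff standing in for the paper's rate criterion

# Rank features by recognition rate, descending, then keep those above threshold.
ranked = sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
selected = [name for name, r in ranked if r >= threshold]
print(selected)  # → ['mfcc1', 'pitch_mean', 'energy_var']
```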
Disregarding population specificity: its influence on the sex assessment methods from the tibia.
Kotěrová, Anežka; Velemínská, Jana; Dupej, Ján; Brzobohatá, Hana; Pilný, Aleš; Brůžek, Jaroslav
2017-01-01
Forensic anthropology has developed classification techniques for sex estimation of unknown skeletal remains, for example population-specific discriminant function analyses. These methods were designed for populations that lived mostly in the late nineteenth and twentieth centuries. Their level of reliability or misclassification is important for practical use in today's forensic practice; it is, however, unknown. We addressed the question of what the likelihood of errors would be if population specificity of discriminant functions of the tibia were disregarded. Moreover, five classification functions in a Czech sample were proposed (accuracies 82.1-87.5 %, sex bias ranged from -1.3 to -5.4 %). We measured ten variables traditionally used for sex assessment of the tibia on a sample of 30 male and 26 female models from recent Czech population. To estimate the classification accuracy and error (misclassification) rates ignoring population specificity, we selected published classification functions of tibia for the Portuguese, south European, and the North American populations. These functions were applied on the dimensions of the Czech population. Comparing the classification success of the reference and the tested Czech sample showed that females from Czech population were significantly overestimated and mostly misclassified as males. Overall accuracy of sex assessment significantly decreased (53.6-69.7 %), sex bias -29.4-100 %, which is most probably caused by secular trend and the generally high variability of body size. Results indicate that the discriminant functions, developed for skeletal series representing geographically and chronologically diverse populations, are not applicable in current forensic investigations. Finally, implications and recommendations for future research are discussed.
NASA Astrophysics Data System (ADS)
Sukuta, Sydney; Bruch, Reinhard F.
2002-05-01
The goal of this study is to test the feasibility of using noise factor/eigenvector bands as general clinical analytical tools for diagnosis. We developed a new technique, Noise Band Factor Cluster Analysis (NBFCA), to diagnose benign tumors via their Fourier transform IR fiber optic evanescent wave spectral data for the first time. The middle IR region of human normal skin tissue and of benign and melanoma tumors was analyzed using this new diagnostic technique. Our results are not in full agreement with pathological classifications; hence there is a possibility that our approach could complement or improve these traditional classification schemes. Moreover, the use of NBFCA makes it much easier to delineate class boundaries, hence this method provides results with much higher certainty.
Texture Feature Extraction and Classification for Iris Diagnosis
NASA Astrophysics Data System (ADS)
Ma, Lin; Li, Naimin
Applying computer-aided techniques in iris image processing, and combining occidental iridology with traditional Chinese medicine, is a challenging research area in digital image processing and artificial intelligence. This paper proposes an iridology model that consists of iris image pre-processing, texture feature analysis, and disease classification. For pre-processing, a 2-step iris localization approach is proposed; for pathological feature extraction, a 2-D Gabor filter based texture analysis and a texture fractal dimension estimation method are proposed; finally, support vector machines are constructed to recognize 2 typical diseases, namely alimentary canal disease and nervous system disease. Experimental results show that the proposed iridology diagnosis model is quite effective and promising for medical diagnosis and health surveillance for both hospital and public use.
Jaccard distance based weighted sparse representation for coarse-to-fine plant species recognition.
Zhang, Shanwen; Wu, Xiaowei; You, Zhuhong
2017-01-01
Leaf-based plant species recognition plays an important role in ecological protection; however, its application to large and modern leaf databases has been a long-standing obstacle due to the computational cost and feasibility. Recognizing such limitations, we propose a Jaccard distance based sparse representation (JDSR) method which adopts a two-stage, coarse-to-fine strategy for plant species recognition. In the first stage, we use the Jaccard distance between the test sample and each training sample to coarsely determine the candidate classes of the test sample. The second stage includes a Jaccard distance based weighted sparse representation based classification (WSRC), which aims to approximately represent the test sample in the training space and classify it by the approximation residuals. Since the training model of our JDSR method involves far fewer but more informative representatives, this method is expected to overcome the limitation of high computational and memory costs in traditional sparse representation based classification. Comparative experimental results on a public leaf image database demonstrate that the proposed method outperforms other existing feature extraction and SRC based plant recognition methods in terms of both accuracy and computational speed.
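The coarse stage described above narrows the candidate classes using Jaccard distance between binarized feature vectors. A minimal sketch with a hypothetical toy leaf dataset; the fine WSRC stage is omitted:

```python
import numpy as np

def jaccard_distance(a, b):
    """Jaccard distance between two binary feature vectors:
    1 - |intersection| / |union|."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    inter = np.logical_and(a, b).sum()
    return 1.0 - inter / union

# Toy training set: (class label, binarized leaf feature vector).
train = [("oak",   [1, 1, 0, 0, 1]),
         ("oak",   [1, 1, 1, 0, 1]),
         ("maple", [0, 0, 1, 1, 0]),
         ("maple", [0, 1, 1, 1, 0]),
         ("birch", [1, 0, 0, 1, 1])]
test = [1, 1, 0, 0, 1]

# Coarse stage: keep the k classes whose nearest training sample is closest.
k = 2
best = {}
for label, vec in train:
    d = jaccard_distance(test, vec)
    best[label] = min(d, best.get(label, 1.0))
candidates = sorted(best, key=best.get)[:k]
print(candidates)  # → ['oak', 'birch']
```

Only the surviving `candidates` classes would be passed on to the sparse-representation stage, which is what saves computation on large databases.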
NASA Astrophysics Data System (ADS)
Malekmohammadi, Bahram; Ramezani Mehrian, Majid; Jafari, Hamid Reza
2012-11-01
One of the most important water-resources management strategies for arid lands is managed aquifer recharge (MAR). In establishing a MAR scheme, site selection is the prime prerequisite that can be assisted by geographic information system (GIS) tools. One of the most important uncertainties in the site-selection process using GIS is finite ranges or intervals resulting from data classification. In order to reduce these uncertainties, a novel method has been developed involving the integration of multi-criteria decision making (MCDM), GIS, and a fuzzy inference system (FIS). The Shemil-Ashkara plain in the Hormozgan Province of Iran was selected as the case study; slope, geology, groundwater depth, potential for runoff, land use, and groundwater electrical conductivity have been considered as site-selection factors. By defining fuzzy membership functions for the input layers and the output layer, and by constructing fuzzy rules, a FIS has been developed. Comparison of the results produced by the proposed method and the traditional simple additive weighted (SAW) method shows that the proposed method yields more precise results. In conclusion, fuzzy-set theory can be an effective method to overcome associated uncertainties in classification of geographic information data.
Fusion of shallow and deep features for classification of high-resolution remote sensing images
NASA Astrophysics Data System (ADS)
Gao, Lang; Tian, Tian; Sun, Xiao; Li, Hang
2018-02-01
Effective spectral and spatial pixel description plays a significant role in the classification of high-resolution remote sensing images. Current approaches to pixel-based feature extraction are of two main kinds: one includes the widely used principal component analysis (PCA) and gray level co-occurrence matrix (GLCM) as representatives of shallow spectral and shape features, and the other refers to deep learning-based methods which employ deep neural networks and have greatly improved classification accuracy. However, the former traditional features are insufficient to depict the complex distribution of high-resolution images, while the deep features demand plenty of samples to train the network, otherwise overfitting easily occurs if only limited samples are involved in the training. In view of the above, we propose a GLCM-based convolutional neural network (CNN) approach to extract features and implement classification for high-resolution remote sensing images. The employment of GLCM is able to represent the original images and eliminate redundant information and undesired noise. Meanwhile, taking shallow features as the input of the deep network contributes to better guidance and interpretability. In consideration of the amount of samples, strategies such as L2 regularization and dropout are used to prevent overfitting. A fine-tuning strategy is also used in our study to reduce training time and further enhance the generalization performance of the network. Experiments with popular data sets such as the PaviaU data validate that our proposed method leads to a performance improvement compared to the individual approaches involved.
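As a reminder of what the shallow input to such a network looks like, here is a minimal gray-level co-occurrence matrix for a single horizontal offset, with the classic Haralick contrast feature computed from it. The 4-level toy image is illustrative, not data from the paper:

```python
import numpy as np

def glcm(img, levels=4, dx=1, dy=0):
    """Gray-level co-occurrence matrix for one offset (dx, dy),
    normalized to a joint probability table."""
    img = np.asarray(img)
    h, w = img.shape
    M = np.zeros((levels, levels))
    for i in range(h):
        for j in range(w):
            i2, j2 = i + dy, j + dx
            if 0 <= i2 < h and 0 <= j2 < w:
                M[img[i, j], img[i2, j2]] += 1  # count co-occurring gray pairs
    return M / M.sum()

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
P = glcm(img)
# Haralick contrast: expected squared gray-level difference of pixel pairs.
ii, jj = np.indices(P.shape)
contrast = np.sum(P * (ii - jj) ** 2)
print(round(contrast, 3))  # → 0.583
```

In the paper's pipeline, texture tables like `P` (or features derived from them) replace raw pixels as the CNN input, which is what reduces redundancy and noise.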
Guo, Yanrong; Gao, Yaozong; Shao, Yeqin; Price, True; Oto, Aytekin; Shen, Dinggang
2014-01-01
Purpose: Automatic prostate segmentation from MR images is an important task in various clinical applications such as prostate cancer staging and MR-guided radiotherapy planning. However, the large appearance and shape variations of the prostate in MR images make the segmentation problem difficult to solve. Traditional Active Shape/Appearance Model (ASM/AAM) has limited accuracy on this problem, since its basic assumption, i.e., both shape and appearance of the targeted organ follow Gaussian distributions, is invalid in prostate MR images. To this end, the authors propose a sparse dictionary learning method to model the image appearance in a nonparametric fashion and further integrate the appearance model into a deformable segmentation framework for prostate MR segmentation. Methods: To drive the deformable model for prostate segmentation, the authors propose nonparametric appearance and shape models. The nonparametric appearance model is based on a novel dictionary learning method, namely distributed discriminative dictionary (DDD) learning, which is able to capture fine distinctions in image appearance. To increase the differential power of traditional dictionary-based classification methods, the authors' DDD learning approach takes three strategies. First, two dictionaries for prostate and nonprostate tissues are built, respectively, using the discriminative features obtained from minimum redundancy maximum relevance feature selection. Second, linear discriminant analysis is employed as a linear classifier to boost the optimal separation between prostate and nonprostate tissues, based on the representation residuals from sparse representation. Third, to enhance the robustness of the authors' classification method, multiple local dictionaries are learned for local regions along the prostate boundary (each with small appearance variations), instead of learning one global classifier for the entire prostate. 
These discriminative dictionaries are located on different patches of the prostate surface and trained to adaptively capture the appearance in different prostate zones, thus achieving better local tissue differentiation. For each local region, multiple classifiers are trained based on the randomly selected samples and finally assembled by a specific fusion method. In addition to this nonparametric appearance model, a prostate shape model is learned from the shape statistics using a novel approach, sparse shape composition, which can model non-Gaussian distributions of shape variation and regularize the 3D mesh deformation by constraining it within the observed shape subspace. Results: The proposed method has been evaluated on two datasets consisting of T2-weighted MR prostate images. For the first (internal) dataset, the classification effectiveness of the authors' improved dictionary learning has been validated by comparing it with three other variants of traditional dictionary learning methods. The experimental results show that the authors' method yields a Dice Ratio of 89.1% compared to the manual segmentation, which is more accurate than the three state-of-the-art MR prostate segmentation methods under comparison. For the second dataset, the MICCAI 2012 challenge dataset, the authors' proposed method yields a Dice Ratio of 87.4%, which also achieves better segmentation accuracy than other methods under comparison. Conclusions: A new magnetic resonance image prostate segmentation method is proposed based on the combination of deformable model and dictionary learning methods, which achieves more accurate segmentation performance on prostate T2 MR images. PMID:24989402
Waugh, Mark H.; Hopwood, Christopher J.; Krueger, Robert F.; Morey, Leslie C.; Pincus, Aaron L.; Wright, Aidan G. C.
2016-01-01
The Diagnostic and Statistical Manual of Mental Disorders Fifth Edition (DSM-5) Section III Alternative Model for Personality Disorders (AMPD; APA, 2013) represents an innovative system for simultaneous psychiatric classification and psychological assessment of personality disorders (PD). The AMPD combines major paradigms of personality assessment and provides an original, heuristic, flexible, and practical framework that enriches clinical thinking and practice. Origins, emerging research, and clinical application of the AMPD for diagnosis and psychological assessment are reviewed. The AMPD integrates assessment and research traditions, facilitates case conceptualization, is easy to learn and use, and assists in providing patient feedback. New as well as existing tests and psychometric methods may be used to operationalize the AMPD for clinical assessments. PMID:28450760
Waugh, Mark H; Hopwood, Christopher J; Krueger, Robert F; Morey, Leslie C; Pincus, Aaron L; Wright, Aidan G C
2017-04-01
The Diagnostic and Statistical Manual of Mental Disorders Fifth Edition (DSM-5) Section III Alternative Model for Personality Disorders (AMPD; APA, 2013) represents an innovative system for simultaneous psychiatric classification and psychological assessment of personality disorders (PD). The AMPD combines major paradigms of personality assessment and provides an original, heuristic, flexible, and practical framework that enriches clinical thinking and practice. Origins, emerging research, and clinical application of the AMPD for diagnosis and psychological assessment are reviewed. The AMPD integrates assessment and research traditions, facilitates case conceptualization, is easy to learn and use, and assists in providing patient feedback. New as well as existing tests and psychometric methods may be used to operationalize the AMPD for clinical assessments.
Yang, X; Le, D; Zhang, Y L; Liang, L Z; Yang, G; Hu, W J
2016-10-18
To explore a crown form classification method for the upper central incisor that is more objective and scientific than the traditional classification method, based on a standardized photography technique, and to analyze the relationship between the crown form of upper central incisors and papilla filling in periodontally healthy Chinese Han-nationality youth. In the study, 180 periodontally healthy Chinese youth (75 males and 105 females) aged 20-30 (24.3±4.5) years were included. With the standardized upper central incisor photography technique, pictures of 360 upper central incisors were obtained. Each tooth was classified as triangular, ovoid, or square by 13 experienced specialists in prosthodontics independently, and the final classification was decided by the majority of evaluators in order to ensure objectivity. The standardized digital photos were also used to evaluate the gingival papilla filling situation. The papilla filling result was recorded as present or absent according to naked-eye observation. The papilla filling rates of the different crown forms were analyzed. Statistical analyses were performed with SPSS 19.0. The proportions of triangular, ovoid, and square forms of the upper central incisor in Chinese Han-nationality youth were 31.4% (113/360), 37.2% (134/360), and 31.4% (113/360), respectively, and no statistical difference was found between males and females. The average κ value between each pair of evaluators was 0.381. The average κ value rose to 0.563 when each evaluator was compared with the final classification result. In the study, 24 upper central incisors without contact were excluded, and the papilla filling rates of triangular, ovoid, and square crowns were 56.4% (62/110), 69.6% (87/125), and 76.2% (77/101), respectively. The papilla filling rate of the square form was higher (P=0.007). The proportion of clinical crown forms of the upper central incisor in Chinese Han-nationality youth was obtained.
Compared with the triangular form, the square form is found to favor a gingival papilla that fills the interproximal embrasure space. The consistency of the present classification method for the upper central incisor is not satisfactory, which indicates that a new classification method, more scientific and objective than the present one, remains to be found.
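The inter-rater agreement figures above are Cohen's kappa values, which discount the agreement expected by chance from each rater's marginal label frequencies. A minimal sketch with hypothetical crown-form ratings from two evaluators:

```python
def cohens_kappa(r1, r2):
    """Cohen's kappa for two raters assigning categorical labels:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    assert len(r1) == len(r2)
    n = len(r1)
    cats = set(r1) | set(r2)
    observed = sum(a == b for a, b in zip(r1, r2)) / n
    # Chance agreement from the product of each rater's marginal frequencies.
    expected = sum((r1.count(c) / n) * (r2.count(c) / n) for c in cats)
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of six teeth by two evaluators.
a = ["triangular", "ovoid", "ovoid", "square", "triangular", "square"]
b = ["triangular", "ovoid", "square", "square", "ovoid", "square"]
print(round(cohens_kappa(a, b), 3))  # → 0.5
```

A κ of 0.381 between evaluator pairs, as reported above, falls in the range conventionally read as fair-to-moderate agreement.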
ERIC Educational Resources Information Center
Berman, Sanford
1980-01-01
Criticizes the 19th edition of "Dewey Decimal Classification" for violating traditional classification goals for library materials and ignoring the desires of libraries and other users. A total reform is proposed to eliminate Phoenix schedules and to accept only those relocations approved by an editorial board of users. (RAA)
Shi, Ting-Ting; Zhang, Xiao-Bo; Guo, Lan-Ping; Huang, Lu-Qi
2017-11-01
The herbs used as materials for traditional Chinese medicine are usually planted in mountainous areas where the natural environment is suitable. As mountain terrain is complex and the distribution of planting plots is scattered, traditional survey methods have difficulty obtaining accurate planting areas. It is of great significance for the conservation and utilization of traditional Chinese medicine resources to study methods for extracting the planting area of Chinese herbal medicines based on remote sensing, enabling dynamic monitoring and reserve estimation. In this paper, taking the Panax notoginseng plots in Wenshan prefecture of Yunnan province as an example, China-made GF-1 multispectral remote sensing images with a 16 m×16 m resolution were obtained. Then, the time series that best reflect the spectral differences between P. notoginseng sheds and background objects were selected, and a decision tree model for extracting P. notoginseng plots was constructed according to the spectral characteristics of the surface features. The results showed that the remote sensing classification method based on the decision tree model could effectively extract P. notoginseng plots in the study area. The method can provide technical support for the extraction of P. notoginseng plots at the county level. Copyright© by the Chinese Pharmaceutical Association.
Guo, Yanrong; Gao, Yaozong; Shao, Yeqin; Price, True; Oto, Aytekin; Shen, Dinggang
2014-07-01
Automatic prostate segmentation from MR images is an important task in various clinical applications such as prostate cancer staging and MR-guided radiotherapy planning. However, the large appearance and shape variations of the prostate in MR images make the segmentation problem difficult to solve. Traditional Active Shape/Appearance Model (ASM/AAM) has limited accuracy on this problem, since its basic assumption, i.e., both shape and appearance of the targeted organ follow Gaussian distributions, is invalid in prostate MR images. To this end, the authors propose a sparse dictionary learning method to model the image appearance in a nonparametric fashion and further integrate the appearance model into a deformable segmentation framework for prostate MR segmentation. To drive the deformable model for prostate segmentation, the authors propose nonparametric appearance and shape models. The nonparametric appearance model is based on a novel dictionary learning method, namely distributed discriminative dictionary (DDD) learning, which is able to capture fine distinctions in image appearance. To increase the differential power of traditional dictionary-based classification methods, the authors' DDD learning approach takes three strategies. First, two dictionaries for prostate and nonprostate tissues are built, respectively, using the discriminative features obtained from minimum redundancy maximum relevance feature selection. Second, linear discriminant analysis is employed as a linear classifier to boost the optimal separation between prostate and nonprostate tissues, based on the representation residuals from sparse representation. Third, to enhance the robustness of the authors' classification method, multiple local dictionaries are learned for local regions along the prostate boundary (each with small appearance variations), instead of learning one global classifier for the entire prostate. 
These discriminative dictionaries are located on different patches of the prostate surface and trained to adaptively capture the appearance in different prostate zones, thus achieving better local tissue differentiation. For each local region, multiple classifiers are trained based on the randomly selected samples and finally assembled by a specific fusion method. In addition to this nonparametric appearance model, a prostate shape model is learned from the shape statistics using a novel approach, sparse shape composition, which can model non-Gaussian distributions of shape variation and regularize the 3D mesh deformation by constraining it within the observed shape subspace. The proposed method has been evaluated on two datasets consisting of T2-weighted MR prostate images. For the first (internal) dataset, the classification effectiveness of the authors' improved dictionary learning has been validated by comparing it with three other variants of traditional dictionary learning methods. The experimental results show that the authors' method yields a Dice Ratio of 89.1% compared to the manual segmentation, which is more accurate than the three state-of-the-art MR prostate segmentation methods under comparison. For the second dataset, the MICCAI 2012 challenge dataset, the authors' proposed method yields a Dice Ratio of 87.4%, which also achieves better segmentation accuracy than other methods under comparison. A new magnetic resonance image prostate segmentation method is proposed based on the combination of deformable model and dictionary learning methods, which achieves more accurate segmentation performance on prostate T2 MR images.
The functional therapeutic chemical classification system.
Croset, Samuel; Overington, John P; Rebholz-Schuhmann, Dietrich
2014-03-15
Drug repositioning is the discovery of new indications for compounds that have already been approved and used in a clinical setting. Recently, some computational approaches have been suggested to unveil new opportunities in a systematic fashion, by taking into consideration gene expression signatures or chemical features, for instance. We present here a novel method based on knowledge integration using semantic technologies to capture the functional role of approved chemical compounds. In order to computationally generate repositioning hypotheses, we used the Web Ontology Language to formally define the semantics of over 20 000 terms with axioms to correctly denote various modes of action (MoA). Based on an integration of public data, we have automatically assigned over a thousand approved drugs into these MoA categories. The resulting new resource is called the Functional Therapeutic Chemical Classification System and was further evaluated against the content of the traditional Anatomical Therapeutic Chemical Classification System. We illustrate how the new classification can be used to generate drug repurposing hypotheses, using Alzheimer's disease as a use-case. https://www.ebi.ac.uk/chembl/ftc; https://github.com/loopasam/ftc. croset@ebi.ac.uk Supplementary data are available at Bioinformatics online.
Feature generation and representations for protein-protein interaction classification.
Lan, Man; Tan, Chew Lim; Su, Jian
2009-10-01
Automatically detecting protein-protein interaction (PPI) relevant articles is a crucial step for large-scale biological database curation. Previous work adopted POS tagging, shallow parsing, and sentence splitting techniques, but achieved worse performance than the simple bag-of-words representation. In this paper, we generated and investigated multiple types of feature representations in order to further improve the performance of the PPI text classification task. Besides the traditional domain-independent bag-of-words approach and term weighting methods, we also explored other domain-dependent features, i.e. protein-protein interaction trigger keywords, protein named entities, and advanced ways of incorporating Natural Language Processing (NLP) output. The integration of these multiple features has been evaluated on the BioCreAtIvE II corpus. The experimental results showed that both the advanced use of NLP output and the integration of bag-of-words and NLP output improved the performance of text classification. Specifically, in comparison with the best performance achieved in the BioCreAtIvE II IAS, the feature-level and classifier-level integration of multiple features improved the classification performance by 2.71% and 3.95%, respectively.
NASA Astrophysics Data System (ADS)
Wan, Xiaoqing; Zhao, Chunhui; Wang, Yanchun; Liu, Wu
2017-11-01
This paper proposes a novel classification paradigm for hyperspectral images (HSI) using feature-level fusion and deep learning-based methodologies. The operation is carried out in three main steps. First, during a pre-processing stage, wave atoms are introduced into the bilateral filter to smooth the HSI; this strategy can effectively attenuate noise and restore texture information. Meanwhile, high-quality spectral-spatial features can be extracted from the HSI by taking geometric closeness and photometric similarity among pixels into consideration simultaneously. Second, higher-order statistics techniques are introduced into hyperspectral data classification for the first time to characterize the phase correlations of spectral curves. Third, multifractal spectrum features are extracted to characterize the singularities and self-similarities of spectral shapes. Feature-level fusion is then applied to the extracted spectral-spatial features along with the higher-order statistics and multifractal spectrum features. Finally, a stacked sparse autoencoder is utilized to learn more abstract and invariant high-level features from the multiple feature sets, and a random forest classifier is employed to perform supervised fine-tuning and classification. Experimental results on two real hyperspectral data sets demonstrate that the proposed method outperforms some traditional alternatives.
Yu, Hualong; Hong, Shufang; Yang, Xibei; Ni, Jun; Dan, Yuanyuan; Qin, Bin
2013-01-01
DNA microarray technology can measure the activities of tens of thousands of genes simultaneously, which provides an efficient way to diagnose cancer at the molecular level. Although this strategy has attracted significant research attention, most studies neglect an important problem, namely, that most DNA microarray datasets are skewed, which causes traditional learning algorithms to produce inaccurate results. Some studies have considered this problem, yet they merely focus on the binary-class problem. In this paper, we dealt with the multiclass imbalanced classification problem, as encountered in cancer DNA microarrays, by using ensemble learning. We utilized a one-against-all coding strategy to transform the multiclass problem into multiple binary-class problems, applying to each of them the feature subspace method, an evolving version of random subspace that generates multiple diverse training subsets. Next, we introduced one of two different correction technologies, namely, decision threshold adjustment or random undersampling, into each training subset to alleviate the damage of class imbalance. Specifically, support vector machine was used as the base classifier, and a novel voting rule called counter voting was presented for making the final decision. Experimental results on eight skewed multiclass cancer microarray datasets indicate that, unlike many traditional classification approaches, our methods are insensitive to class imbalance.
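The one-against-all decomposition with random undersampling can be sketched as follows. The toy dataset and helper function are illustrative, and the feature-subspace step, SVM base classifiers, and counter-voting rule are omitted:

```python
import random

def one_vs_all_undersample(X, y, target, seed=0):
    """Build one balanced binary training set for class `target`:
    positives are samples of `target`; negatives are randomly undersampled
    down to the same size, a sketch of the correction step described above."""
    rng = random.Random(seed)
    pos = [x for x, lab in zip(X, y) if lab == target]
    neg = [x for x, lab in zip(X, y) if lab != target]
    neg = rng.sample(neg, min(len(pos), len(neg)))  # random undersampling
    data = [(x, 1) for x in pos] + [(x, 0) for x in neg]
    rng.shuffle(data)
    return data

# Toy skewed dataset: class "A" is rare, class "B" is common.
X = list(range(12))
y = ["A", "B", "B", "B", "A", "B", "B", "B", "B", "A", "B", "B"]
balanced = one_vs_all_undersample(X, y, "A")
print(len(balanced), sum(lab for _, lab in balanced))  # → 6 3
```

In the full scheme, one such balanced set is built per class (and per feature subspace), a classifier is trained on each, and the per-class votes are combined to pick the final label.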
Wang, Yiqin; Yan, Hanxia; Yan, Jianjun; Yuan, Fengyin; Xu, Zhaoxia; Liu, Guoping; Xu, Wenjie
2015-01-01
Objective. This research provides objective and quantitative parameters of the traditional Chinese medicine (TCM) pulse conditions for distinguishing between patients with coronary heart disease (CHD) and normal people by using the proposed classification approach based on Hilbert-Huang transform (HHT) and random forest. Methods. The energy and the sample entropy features were extracted by applying the HHT to TCM pulses, treating these pulse signals as time series. By using the random forest classifier, the extracted two types of features and their combination were, respectively, used as input data to establish classification models. Results. Statistical results showed that there were significant differences in the pulse energy and sample entropy between the CHD group and the normal group. Moreover, when the energy features, sample entropy features, and their combination were input as pulse feature vectors, the corresponding average recognition rates were 84%, 76.35%, and 90.21%, respectively. Conclusion. The proposed approach could be appropriately used to analyze pulses of patients with CHD, which can lay a foundation for research on objective and quantitative criteria for disease diagnosis or Zheng differentiation. PMID:26180536
Guo, Rui; Wang, Yiqin; Yan, Hanxia; Yan, Jianjun; Yuan, Fengyin; Xu, Zhaoxia; Liu, Guoping; Xu, Wenjie
2015-01-01
Objective. This research provides objective and quantitative parameters of the traditional Chinese medicine (TCM) pulse conditions for distinguishing between patients with coronary heart disease (CHD) and normal people by using the proposed classification approach based on Hilbert-Huang transform (HHT) and random forest. Methods. The energy and the sample entropy features were extracted by applying the HHT to TCM pulses, treating these pulse signals as time series. By using the random forest classifier, the extracted two types of features and their combination were, respectively, used as input data to establish classification models. Results. Statistical results showed that there were significant differences in the pulse energy and sample entropy between the CHD group and the normal group. Moreover, when the energy features, sample entropy features, and their combination were input as pulse feature vectors, the corresponding average recognition rates were 84%, 76.35%, and 90.21%, respectively. Conclusion. The proposed approach could be appropriately used to analyze pulses of patients with CHD, which can lay a foundation for research on objective and quantitative criteria for disease diagnosis or Zheng differentiation.
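Sample entropy, one of the two feature types above, measures the irregularity of a time series. A minimal sketch using a common simplified formulation (N - m + 1 templates of length m); the synthetic signals are illustrative, whereas the study computes the feature on HHT-decomposed pulse signals:

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """SampEn(m, r): negative log of the conditional probability that
    sequences matching for m points (Chebyshev distance <= r, self-matches
    excluded) also match for m + 1 points."""
    x = np.asarray(x, float)
    if r is None:
        r = 0.2 * x.std()  # conventional tolerance: 20% of the signal's std
    def count_matches(m):
        templates = np.array([x[i:i + m] for i in range(len(x) - m + 1)])
        count = 0
        for i in range(len(templates)):
            d = np.max(np.abs(templates - templates[i]), axis=1)
            count += np.sum(d <= r) - 1  # exclude the self-match
        return count
    b, a = count_matches(m), count_matches(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else np.inf

rng = np.random.default_rng(0)
regular = np.sin(np.linspace(0, 8 * np.pi, 200))  # predictable signal
noisy = rng.standard_normal(200)                  # irregular signal
print(sample_entropy(regular) < sample_entropy(noisy))  # → True
```

Lower values indicate a more regular, self-similar signal, which is why the feature can discriminate between pulse-wave groups.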
Sentiments Analysis of Reviews Based on ARCNN Model
NASA Astrophysics Data System (ADS)
Xu, Xiaoyu; Xu, Ming; Xu, Jian; Zheng, Ning; Yang, Tao
2017-10-01
The sentiment analysis of product reviews is designed to help customers understand the status of a product. Traditional methods of sentiment analysis rely on the input of a fixed-length feature vector, which is a performance bottleneck of the basic encoder-decoder architecture. In this paper, we propose an attention mechanism with a BRNN-CNN model, referred to as the ARCNN model. In order to analyze the semantic relations between words well and avoid the curse of dimensionality, we use the GloVe algorithm to train the vector representations for words. Then, the ARCNN model is proposed to deal with the problem of training deep features. Specifically, the BRNN handles variable-length inputs and preserves time-series information, while the CNN captures deeper semantic connections. Moreover, the attention mechanism can automatically learn from the data and optimize the allocation of weights. Finally, a softmax classifier is designed to complete the sentiment classification of reviews. Experiments show that the proposed method improves the accuracy of sentiment classification compared with benchmark methods.
Iliyasu, Abdullah M; Fatichah, Chastine
2017-12-19
A quantum hybrid (QH) intelligent approach that blends the adaptive search capability of the quantum-behaved particle swarm optimisation (QPSO) method with the intuitionistic rationality of the traditional fuzzy k-nearest neighbours (Fuzzy k-NN) algorithm (known simply as the Q-Fuzzy approach) is proposed for efficient feature selection and classification of cells in cervical smear (CS) images. From an initial multitude of 17 features describing the geometry, colour, and texture of the CS images, the QPSO stage of our proposed technique is used to select the best subset of features (i.e., global best particles), a pruned-down collection of seven features. Using a dataset of almost 1000 images, performance evaluation of our proposed Q-Fuzzy approach assesses the impact of our feature selection on classification accuracy by way of three experimental scenarios that are compared alongside two other approaches: the All-features approach (i.e., classification without prior feature selection) and another hybrid technique combining the standard PSO algorithm with the Fuzzy k-NN technique (the P-Fuzzy approach). In the first and second scenarios, we further divided the assessment criteria into classification accuracy based on the choice of best features and accuracy across the different categories of cervical cells. In the third scenario, we introduced new QH hybrid techniques, i.e., QPSO combined with other supervised learning methods, and compared their classification accuracy alongside our proposed Q-Fuzzy approach. Furthermore, we employed statistical approaches to establish qualitative agreement with regard to the feature selection in experimental scenarios 1 and 3. The synergy between QPSO and Fuzzy k-NN in the proposed Q-Fuzzy approach improves classification accuracy, as manifest in the reduction in the number of cell features, which is crucial for effective cervical cancer detection and diagnosis.
New Course Design: Classification Schemes and Information Architecture.
ERIC Educational Resources Information Center
Weinberg, Bella Hass
2002-01-01
Describes a course developed at St. John's University (New York) in the Division of Library and Information Science that relates traditional classification schemes to information architecture and Web sites. Highlights include functional aspects of information architecture, that is, the way content is structured; assignments; student reactions; and…
Classification of high dimensional multispectral image data
NASA Technical Reports Server (NTRS)
Hoffbeck, Joseph P.; Landgrebe, David A.
1993-01-01
A method for classifying high dimensional remote sensing data is described. The technique uses a radiometric adjustment to allow a human operator to identify and label training pixels by visually comparing the remotely sensed spectra to laboratory reflectance spectra. Training pixels for materials without obvious spectral features are identified by traditional means. Features which are effective for discriminating between the classes are then derived from the original radiance data and used to classify the scene. This technique is applied to Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data taken over Cuprite, Nevada in 1992, and the results are compared to an existing geologic map. The technique performed well despite noisy data and despite the fact that some of the materials in the scene lack absorption features. No adjustment for the atmosphere or other scene variables was made to the data classified. While the experimental results compare favorably with an existing geologic map, the primary purpose of this research was to demonstrate the classification method rather than to map the geology of the Cuprite scene.
Fully Convolutional Network Based Shadow Extraction from GF-2 Imagery
NASA Astrophysics Data System (ADS)
Li, Z.; Cai, G.; Ren, H.
2018-04-01
There are many shadows in high-spatial-resolution satellite images, especially in urban areas. Although shadows on imagery severely affect the extraction of land-cover or land-use information, they provide auxiliary information for building extraction, for which image classification alone can hardly achieve satisfactory accuracy. This paper focuses on building-shadow extraction by designing a fully convolutional network trained with samples collected from GF-2 satellite imagery over the urban region of Changchun city. By means of spatial filtering and the calculation of adjacency relationships along the sunlight direction, small patches from vegetation or bridges were eliminated from the preliminarily extracted shadows. Finally, the building shadows were separated. The building-shadow information extracted by the proposed method was compared with results from traditional object-oriented supervised classification algorithms, showing that the deep learning approach can improve accuracy to a large extent.
Automatic Modulation Classification Based on Deep Learning for Unmanned Aerial Vehicles.
Zhang, Duona; Ding, Wenrui; Zhang, Baochang; Xie, Chunyu; Li, Hongguang; Liu, Chunhui; Han, Jungong
2018-03-20
Deep learning has recently attracted much attention due to its excellent performance in processing audio, image, and video data. However, few studies are devoted to the field of automatic modulation classification (AMC). It is one of the most well-known research topics in communication signal recognition and remains challenging for traditional methods due to complex disturbance from other sources. This paper proposes a heterogeneous deep model fusion (HDMF) method to solve the problem in a unified framework. The contributions include the following: (1) a convolutional neural network (CNN) and long short-term memory (LSTM) are combined in two different ways without prior knowledge involved; (2) a large database, including eleven types of single-carrier modulation signals with various noises as well as a fading channel, is collected with various signal-to-noise ratios (SNRs) based on a real geographical environment; and (3) experimental results demonstrate that HDMF is very capable of coping with the AMC problem and achieves much better performance when compared with the independent networks.
Automatic Modulation Classification Based on Deep Learning for Unmanned Aerial Vehicles
Ding, Wenrui; Zhang, Baochang; Xie, Chunyu; Li, Hongguang; Liu, Chunhui; Han, Jungong
2018-01-01
Deep learning has recently attracted much attention due to its excellent performance in processing audio, image, and video data. However, few studies are devoted to the field of automatic modulation classification (AMC). It is one of the most well-known research topics in communication signal recognition and remains challenging for traditional methods due to complex disturbance from other sources. This paper proposes a heterogeneous deep model fusion (HDMF) method to solve the problem in a unified framework. The contributions include the following: (1) a convolutional neural network (CNN) and long short-term memory (LSTM) are combined in two different ways without prior knowledge involved; (2) a large database, including eleven types of single-carrier modulation signals with various noises as well as a fading channel, is collected with various signal-to-noise ratios (SNRs) based on a real geographical environment; and (3) experimental results demonstrate that HDMF is very capable of coping with the AMC problem and achieves much better performance when compared with the independent networks. PMID:29558434
[DNA barcoding and its utility in commonly-used medicinal snakes].
Huang, Yong; Zhang, Yue-yun; Zhao, Cheng-jian; Xu, Yong-li; Gu, Ying-le; Huang, Wen-qi; Lin, Kui; Li, Li
2015-03-01
Identification accuracy of traditional Chinese medicine is crucial for its research, production, and application. DNA barcoding based on the mitochondrial gene coding for cytochrome c oxidase subunit I (COI) is increasingly used for identification of traditional Chinese medicine. Using universal barcoding primers for sequencing, we examined the feasibility of the DNA barcoding method for identifying commonly-used medicinal snakes (a total of 109 samples belonging to 19 species, 15 genera, and 6 families). Phylogenetic trees were constructed using neighbor-joining. The results indicated that the mean G + C content (46.5%) was lower than the A + T content (53.5%). As calculated with the Kimura 2-parameter model, the mean intraspecific genetic distances of Trimeresurus albolabris, Ptyas dhumnades, and Lycodon rufozonatus were greater than 2%. Phylogenetic relationships further suggested that the identification of one sample of T. albolabris was erroneous. The identification of some samples of P. dhumnades was also incorrect, namely P. korros originally identified as P. dhumnades. Factors underlying the intraspecific genetic distance differences of L. rufozonatus need further study. Therefore, DNA barcoding for identification of medicinal snakes is feasible and greatly complements the morphological classification method. Further study of its application to the identification of traditional Chinese medicine is warranted.
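For reference, the Kimura 2-parameter distance used in the study above can be computed from a pair of aligned sequences as follows (a minimal sketch; the example sequences are made up):

```python
from math import log

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences:
    d = -0.5 * ln((1 - 2P - Q) * sqrt(1 - 2Q)),
    where P and Q are the transition and transversion proportions."""
    assert len(seq1) == len(seq2)
    n = transitions = transversions = 0
    for a, b in zip(seq1, seq2):
        n += 1
        if a == b:
            continue
        same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
        if same_class:
            transitions += 1        # A<->G or C<->T
        else:
            transversions += 1      # purine <-> pyrimidine
    p, q = transitions / n, transversions / n
    return -0.5 * log((1 - 2 * p - q) * (1 - 2 * q) ** 0.5)

d = k2p_distance("ATGCTGACCT", "ATACTGACCT")  # one A<->G transition in 10 sites
print(round(d, 4))  # 0.1116
```

Species-level barcoding thresholds like the 2% figure quoted above are conventionally computed with exactly this model.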
A user-friendly SSVEP-based brain-computer interface using a time-domain classifier.
Luo, An; Sullivan, Thomas J
2010-04-01
We introduce a user-friendly steady-state visual evoked potential (SSVEP)-based brain-computer interface (BCI) system. Single-channel EEG is recorded using a low-noise dry electrode. Compared to traditional gel-based multi-sensor EEG systems, a dry sensor proves to be more convenient, comfortable and cost effective. A hardware system was built that displays four LED light panels flashing at different frequencies and synchronizes with EEG acquisition. The visual stimuli have been carefully designed such that potential risk to photosensitive people is minimized. We describe a novel stimulus-locked inter-trace correlation (SLIC) method for SSVEP classification using EEG time-locked to stimulus onsets. We studied how the performance of the algorithm is affected by different selection of parameters. Using the SLIC method, the average light detection rate is 75.8% with very low error rates (an 8.4% false positive rate and a 1.3% misclassification rate). Compared to a traditional frequency-domain-based method, the SLIC method is more robust (resulting in less annoyance to the users) and is also suitable for irregular stimulus patterns.
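A simplified reading of the stimulus-locked inter-trace idea is that epochs cut at stimulus onsets should correlate with one another only when an SSVEP at the stimulus frequency is present. That statistic can be sketched as follows (the sampling rate, epoch length, and noise levels are illustrative, not those of the actual system):

```python
import numpy as np

def inter_trace_correlation(eeg, onsets, epoch_len):
    """Mean pairwise Pearson correlation between single-trial traces
    time-locked to stimulus onsets (a simplified SLIC-style statistic)."""
    epochs = np.array([eeg[t:t + epoch_len] for t in onsets])
    c = np.corrcoef(epochs)                 # trial-by-trial correlation matrix
    iu = np.triu_indices_from(c, k=1)       # off-diagonal pairs only
    return c[iu].mean()

fs, f_stim = 250, 10                        # sampling rate, stimulus frequency (Hz)
t = np.arange(fs * 4) / fs
rng = np.random.default_rng(2)
ssvep = np.sin(2 * np.pi * f_stim * t) + 0.3 * rng.normal(size=t.size)
noise = rng.normal(size=t.size)

onsets = np.arange(0, fs * 4 - fs // 2, fs // 2)   # one epoch every 0.5 s
print(inter_trace_correlation(ssvep, onsets, fs // 2) >
      inter_trace_correlation(noise, onsets, fs // 2))   # locked traces correlate
```

Because the statistic is computed in the time domain, it tolerates irregular stimulus patterns that would smear a Fourier-based detector, which matches the robustness claim above.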
Real-time ultrasonic weld evaluation system
NASA Astrophysics Data System (ADS)
Katragadda, Gopichand; Nair, Satish; Liu, Harry; Brown, Lawrence M.
1996-11-01
Ultrasonic testing techniques are currently used as an alternative to radiography for detecting, classifying, and sizing weld defects, and for evaluating weld quality. Typically, ultrasonic weld inspections are performed manually, which requires significant operator expertise and time. Thus, in recent years, the emphasis is on developing automated methods to aid or replace operators in critical weld inspections where inspection time, reliability, and operator safety are major issues. During this period, significant advances were made in the areas of weld defect classification and sizing. Very few of these methods, however, have found their way into the market, largely due to the lack of an integrated approach enabling real-time implementation. Also, little research effort was directed toward improving weld acceptance criteria. This paper presents an integrated system utilizing state-of-the-art techniques for complete automation of the weld inspection procedure. The modules discussed include transducer tracking, classification, sizing, and weld acceptance criteria. Transducer tracking was studied by experimentally evaluating sonic and optical position-tracking techniques; details of this evaluation are presented. Classification is obtained using a multi-layer perceptron. Results from different feature extraction schemes, including a new method based on a combination of time- and frequency-domain signal representations, are given. Algorithms developed to automate defect registration and sizing are discussed. A fuzzy-logic weld acceptance criterion is presented, describing how this scheme provides improved robustness compared with traditional flow-diagram standards.
Creating a Canonical Scientific and Technical Information Classification System for NCSTRL+
NASA Technical Reports Server (NTRS)
Tiffany, Melissa E.; Nelson, Michael L.
1998-01-01
The purpose of this paper is to describe the new subject classification system for the NCSTRL+ project. NCSTRL+ is a canonical digital library (DL) based on the Networked Computer Science Technical Report Library (NCSTRL). The current NCSTRL+ classification system uses the NASA Scientific and Technical Information (STI) subject classifications, which have a bias towards the aerospace, aeronautics, and engineering disciplines. Examination of other scientific and technical information classification systems showed similar discipline-centric weaknesses. Traditional, library-oriented classification systems represent all disciplines, but are too generalized to serve the needs of a scientifically and technically oriented digital library. The lack of a suitable existing classification system led to the creation of a lightweight, balanced, general classification system that allows the mapping of more specialized classification schemes into the new framework. The resulting classification system gives equal weight to all STI disciplines while remaining compact and lightweight.
From local to global – Contributions of Indian psychiatry to international psychiatry
Murthy, R. Srinivasa
2010-01-01
Indian psychiatrists have actively engaged with world psychiatry by contributing to understanding and care of persons with mental disorders based on the religious, cultural and social aspects of Indian life. The contributions are significant in the areas of outlining the scope of mental health, classification of mental disorders, understanding the course of mental disorders, psychotherapy, traditional methods of care, role of family in mental health care and care of the mentally ill in the community settings. PMID:21836699
Galpert, Deborah; Fernández, Alberto; Herrera, Francisco; Antunes, Agostinho; Molina-Ruiz, Reinaldo; Agüero-Chapin, Guillermin
2018-05-03
The development of new ortholog detection algorithms and the improvement of existing ones are of major importance in functional genomics. We have previously introduced a successful supervised pairwise ortholog classification approach implemented in a big data platform that considered several pairwise protein features and the low ortholog pair ratios found between two annotated proteomes (Galpert, D et al., BioMed Research International, 2015). The supervised models were built and tested using a Saccharomycete yeast benchmark dataset proposed by Salichos and Rokas (2011). Although several pairwise protein features were combined in a supervised big data approach, they were all, to some extent, alignment-based features, and the proposed algorithms were evaluated on a single test set. Here, we aim to evaluate the impact of alignment-free features on the performance of supervised models implemented in the Spark big data platform for pairwise ortholog detection in several related yeast proteomes. Spark Random Forest and Decision Trees with oversampling and undersampling techniques, built with only alignment-based similarity measures or combined with several alignment-free pairwise protein features, showed the highest classification performance for ortholog detection in three yeast proteome pairs. Although such supervised approaches outperformed traditional methods, there were no significant differences between the exclusive use of alignment-based similarity measures and their combination with alignment-free features, even within the twilight zone of the studied proteomes. Only when alignment-based and alignment-free features were combined in Spark Decision Trees with imbalance management could a higher success rate (98.71%) within the twilight zone be achieved, for a yeast proteome pair that underwent a whole genome duplication.
The feature selection study showed that alignment-based features were top-ranked for the best classifiers while the runners-up were alignment-free features related to amino acid composition. The incorporation of alignment-free features in supervised big data models did not significantly improve ortholog detection in yeast proteomes regarding the classification qualities achieved with just alignment-based similarity measures. However, the similarity of their classification performance to that of traditional ortholog detection methods encourages the evaluation of other alignment-free protein pair descriptors in future research.
An unsupervised classification technique for multispectral remote sensing data.
NASA Technical Reports Server (NTRS)
Su, M. Y.; Cummings, R. E.
1973-01-01
Description of a two-part clustering technique consisting of (a) a sequential statistical clustering, which is essentially a sequential variance analysis, and (b) a generalized K-means clustering. In this composite clustering technique, the output of (a) is a set of initial clusters which are input to (b) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by traditional supervised maximum-likelihood classification techniques.
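A minimal sketch of such a composite scheme might look like the following; the sequential pass here is a simple distance-threshold stand-in for the paper's sequential variance analysis, and the threshold and toy data are assumptions:

```python
import numpy as np

def sequential_centers(X, threshold):
    """Pass 1: crude sequential clustering -- start a new cluster whenever
    a point is farther than `threshold` from every existing centre."""
    centers = [X[0]]
    for x in X[1:]:
        d = np.linalg.norm(np.array(centers) - x, axis=1)
        if d.min() > threshold:
            centers.append(x)
    return np.array(centers)

def kmeans_refine(X, centers, iters=20):
    """Pass 2: generalized K-means iterations seeded with the pass-1 centres."""
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                            else centers[k] for k in range(len(centers))])
    return labels, centers

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(m, 0.2, size=(50, 2)) for m in ((0, 0), (3, 0), (0, 3))])
labels, centers = kmeans_refine(X, sequential_centers(X, threshold=1.5))
print(len(centers))  # three well-separated groups -> 3 centres
```

The appeal of the two-pass design, as in the abstract, is that the iterative K-means stage no longer needs the number of clusters or their seeds specified by hand.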
Initial study of Schroedinger eigenmaps for spectral target detection
NASA Astrophysics Data System (ADS)
Dorado-Munoz, Leidy P.; Messinger, David W.
2016-08-01
Spectral target detection refers to the process of searching for a specific material with a known spectrum over a large area containing materials with different spectral signatures. Traditional target detection methods in hyperspectral imagery (HSI) require assuming that the data fit some statistical or geometric model and, based on the model, estimating parameters to define a hypothesis test in which one class (i.e., the target class) is chosen over the others (i.e., the background class). Nonlinear manifold learning methods such as Laplacian eigenmaps (LE) have extensively shown their potential in HSI processing, specifically in classification or segmentation. Recently, Schroedinger eigenmaps (SE), which is built upon LE, has been introduced as a semisupervised classification method. In SE, the former Laplacian operator is replaced by the Schroedinger operator. The Schroedinger operator includes, by definition, a potential term V that steers the transformation in certain directions, improving the separability between classes. In this regard, we propose a methodology for target detection that is not based on the traditional schemes and does not need the estimation of statistical or geometric parameters. This method is based on SE, where the potential term V incorporates prior knowledge about the target class and steers the transformation in directions where the target location in the new space is known and the separability between target and background is augmented. An initial study of how SE can be used in a target detection scheme for HSI is shown here. In-scene pixel and spectral signature detection approaches are presented. The HSI data used comprise various target panels for testing simultaneous detection of multiple objects with different complexities.
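The core operator substitution can be illustrated in a few lines: build a graph Laplacian L from pairwise affinities, add a diagonal potential alpha·V that is nonzero on the known target samples, and embed with the low eigenvectors of L + alpha·V. This is a dense toy sketch; the parameter values and the skipping of the first eigenvector are simplifying assumptions:

```python
import numpy as np

def schroedinger_embedding(X, sigma=1.0, alpha=10.0, target_idx=(), dim=2):
    """Embed points with the Schroedinger operator L + alpha*V, where L is the
    graph Laplacian of a Gaussian affinity graph and V is a diagonal
    'potential' that pins down the known target samples."""
    d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))      # dense Gaussian affinity
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(1)) - W               # unnormalized graph Laplacian
    V = np.zeros(len(X))
    V[list(target_idx)] = 1.0               # potential on labeled target points
    E = L + alpha * np.diag(V)              # Schroedinger operator
    vals, vecs = np.linalg.eigh(E)          # ascending eigenvalues
    return vecs[:, 1:dim + 1]               # skip the near-constant eigenvector

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.2, (20, 3)), rng.normal(2, 0.2, (20, 3))])
Y = schroedinger_embedding(X, target_idx=range(5))
print(Y.shape)  # (40, 2)
```

With alpha = 0 this reduces to ordinary Laplacian eigenmaps; increasing alpha pulls the labeled target points toward a known region of the embedding, which is what the detection scheme above exploits.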
NASA Astrophysics Data System (ADS)
Xue, Zhaohui; Du, Peijun; Li, Jun; Su, Hongjun
2017-02-01
The generally limited availability of training data relative to the usually high data dimension poses a great challenge to accurate classification of hyperspectral imagery, especially for identifying crops characterized by highly correlated spectra. Traditional parametric classification models are problematic due to the need for non-singular class-specific covariance matrices. In this research, a novel sparse graph regularization (SGR) method is presented, aiming at robust crop mapping using hyperspectral imagery with very few in situ data. The core of SGR lies in propagating labels from known data to unknown, which is driven by: (1) the fraction matrix generated for the large unknown data by using an effective sparse representation algorithm with respect to the few training data serving as the dictionary; (2) the prediction function estimated for the few training data by formulating a regularization model based on a sparse graph. The labels of the large unknown data can then be obtained by maximizing the posterior probability distribution based on these two ingredients. SGR is discriminative, data-adaptive, robust to noise, and efficient, which makes it unique with regard to previously proposed approaches, and it has high potential for discriminating crops, especially when facing insufficient training data and a high-dimensional spectral space. The study area is located at Zhangye basin in the middle reaches of the Heihe watershed, Gansu, China, where eight crop types were mapped with Compact Airborne Spectrographic Imager (CASI) and Shortwave Infrared Airborne Spectrographic Imager (SASI) hyperspectral data. Experimental results demonstrate that the proposed method significantly outperforms other traditional and state-of-the-art methods.
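The label-propagation idea at the heart of SGR — spreading a few known labels over a graph of all pixels — can be illustrated with the classic closed-form propagation F = (1−α)(I − αS)⁻¹Y. Note this sketch uses a simple Gaussian affinity graph rather than the paper's sparse-representation graph, and the data are synthetic:

```python
import numpy as np

def propagate_labels(W, y_known, known_idx, alpha=0.9):
    """Graph label propagation: F = (1-alpha) * (I - alpha*S)^-1 * Y,
    with S the symmetrically normalized affinity matrix."""
    n = W.shape[0]
    d = W.sum(1)
    S = W / np.sqrt(np.outer(d, d))             # D^-1/2 W D^-1/2
    Y = np.zeros((n, y_known.max() + 1))
    Y[known_idx, y_known] = 1.0                 # one-hot seeds for the few labels
    F = (1 - alpha) * np.linalg.solve(np.eye(n) - alpha * S, Y)
    return F.argmax(1)                          # maximum-posterior label per node

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 0.3, (25, 2)), rng.normal(3, 0.3, (25, 2))])
W = np.exp(-((X[:, None] - X[None]) ** 2).sum(-1))
np.fill_diagonal(W, 0.0)

labels = propagate_labels(W, np.array([0, 1]), known_idx=[0, 25])
print((labels[:25] == 0).all() and (labels[25:] == 1).all())
```

Only two of the fifty points are labeled, mirroring the "very few in situ data" regime the abstract targets.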
A statistical approach for validating eSOTER and digital soil maps in front of traditional soil maps
NASA Astrophysics Data System (ADS)
Bock, Michael; Baritz, Rainer; Köthe, Rüdiger; Melms, Stephan; Günther, Susann
2015-04-01
During the European research project eSOTER, three different Digital Soil Maps (DSM) were developed for the pilot area Chemnitz 1:250,000 (FP7 eSOTER project, grant agreement nr. 211578). The core task of the project was to revise the SOTER method for the interpretation of soil and terrain data. One of the working hypotheses was that eSOTER does not only provide terrain data with typical soil profiles, but that the new products actually perform like a conceptual soil map. The three eSOTER maps for the pilot area differed considerably in spatial representation and content of soil classes. In this study we compare the three eSOTER maps against existing reconnaissance soil maps, keeping in mind that traditional soil maps have many subjective issues and intended bias regarding the overestimation and emphasis of certain features. Hence, a true validation of the proper representation of modeled soil maps is hardly possible; rather, a statistical comparison between modeled and empirical approaches is possible. If eSOTER data represent conceptual soil maps, then different eSOTER, DSM and conventional maps from various sources and different regions could be harmonized towards consistent new data sets for large areas, including the whole European continent. One of the eSOTER maps was developed closely following the traditional SOTER method: terrain classification data (derived from the SRTM DEM) were combined with lithology data (a re-interpreted geological map); the corresponding terrain units were then extended with soil information: a very dense regional soil profile data set was used to define soil mapping units based on a statistical grouping of terrain units. The second map is a pure DSM map using continuous terrain parameters instead of terrain classification; radiospectrometric data were used to supplement parent material information from geology maps. The classification method Random Forest was used.
The third approach predicts soil diagnostic properties based on covariates similar to DSM practices; in addition, multi-temporal MODIS data were used; the resulting soil map is the product of these diagnostic layers, producing a map of soil reference groups (classified according to WRB). Because the third approach was applied to a larger test area in central Europe and, compared to the first two approaches, worked with coarser input data, comparability is only partly fulfilled. To evaluate the usability of the three eSOTER maps, and to make a comparison among them, traditional soil maps at 1:200,000 and 1:50,000 were used as reference data sets. Three statistical methods were applied: (i) in a moving window, the distribution of the soil classes of each DSM product was compared to that of the soil maps by calculating the corrected coefficient of contingency, (ii) the predictive power of each of the eSOTER maps was determined, and (iii) the degree of consistency was derived. The latter is based on a weighting of the match of occurring class combinations via expert knowledge and recalculating the proportions of map appearance with these weights. To re-check the validation results, a field study by local soil experts was conducted. The results show clearly that the first eSOTER approach, based on terrain classification and re-interpreted parent material information, has the greatest similarity with traditional soil maps. The spatial differentiation offered by such an approach is well suited to serve as a conceptual soil map. Therefore, eSOTER can be a tool for soil mappers to generate conceptual soil maps in a faster and more consistent way. This conclusion is at least valid for overview scales such as 1:250,000.
Sobre prestamos y clasificaciones linguisticas (Regarding Borrowing and Linguistic Classification).
ERIC Educational Resources Information Center
Key, Mary Ritchie
1988-01-01
This article explores the traditionally accepted etymologies of several lexical borrowings in the indigenous languages of the Americas within the framework of comparative linguistics and linguistic classification. The first section presents a general discussion of the problem of tracing lexical borrowings in this context. The section features a…
Wang, Xue-Yong; Liao, Cai-Li; Liu, Si-Qi; Liu, Chun-Sheng; Shao, Ai-Juan; Huang, Lu-Qi
2013-05-01
This paper puts forward a more accurate method for the identification of Chinese materia medica (CMM), the systematic identification of Chinese materia medica (SICMM), which may solve difficulties that ordinary traditional methods of CMM identification cannot. The concepts, mechanisms, and methods of SICMM are systematically introduced, and its feasibility is demonstrated by experiments. The establishment of SICMM will solve problems in the identification of Chinese materia medica not only in phenotypic characters such as morphology, microstructure, and chemical constituents, but also in the further discovery of the evolution and classification of species, subspecies, and populations in medicinal plants. The establishment of SICMM will advance the identification of CMM and open a broader space for study.
Real time system design of motor imagery brain-computer interface based on multi band CSP and SVM
NASA Astrophysics Data System (ADS)
Zhao, Li; Li, Xiaoqin; Bian, Yan
2018-04-01
Motor imagery (MI) is an effective method to promote the recovery of limbs in patients after stroke, and an online MI brain-computer interface (BCI) system can enhance patients' participation and accelerate their recovery. The traditional method processes the electroencephalogram (EEG) induced by MI with the common spatial pattern (CSP) algorithm, which extracts information from a single frequency band. In order to further improve the classification accuracy of the system, information from two characteristic frequency bands is extracted here. The effectiveness of the proposed feature extraction method is verified by offline analysis of competition data and by analysis of the online system.
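A single-band CSP stage of such a pipeline can be sketched as below. The band-pass filtering, the second frequency band, and the SVM trained on concatenated log-variance features are omitted for brevity, and the toy trials are synthetic:

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_a, trials_b, n_filters=2):
    """Common spatial patterns: solve the generalized eigenproblem
    Ca w = lambda (Ca + Cb) w; the extreme eigenvectors maximize the
    variance ratio between the two motor-imagery classes."""
    def mean_cov(trials):
        return np.mean([np.cov(t) for t in trials], axis=0)  # t: (channels, samples)
    Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
    vals, vecs = eigh(Ca, Ca + Cb)                 # ascending eigenvalues
    order = np.argsort(vals)
    pick = np.r_[order[:n_filters // 2], order[-(n_filters - n_filters // 2):]]
    return vecs[:, pick].T                         # (n_filters, channels)

rng = np.random.default_rng(6)
# Toy 4-channel trials: class A strong on channel 0, class B on channel 3
trials_a = [np.diag([3, 1, 1, 1]) @ rng.normal(size=(4, 200)) for _ in range(20)]
trials_b = [np.diag([1, 1, 1, 3]) @ rng.normal(size=(4, 200)) for _ in range(20)]

Wcsp = csp_filters(trials_a, trials_b)
feat = np.log(np.var(Wcsp @ trials_a[0], axis=1))  # log-variance CSP features
print(Wcsp.shape, feat.shape)
```

The multi-band variant described above would band-pass each trial into two frequency bands, run this filter computation per band, and feed the concatenated log-variance features to the SVM classifier.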
Machine learning in heart failure: ready for prime time.
Awan, Saqib Ejaz; Sohel, Ferdous; Sanfilippo, Frank Mario; Bennamoun, Mohammed; Dwivedi, Girish
2018-03-01
The aim of this review is to present an up-to-date overview of the application of machine learning methods in heart failure including diagnosis, classification, readmissions and medication adherence. Recent studies have shown that the application of machine learning techniques may have the potential to improve heart failure outcomes and management, including cost savings by improving existing diagnostic and treatment support systems. Recently developed deep learning methods are expected to yield even better performance than traditional machine learning techniques in performing complex tasks by learning the intricate patterns hidden in big medical data. The review summarizes the recent developments in the application of machine and deep learning methods in heart failure management.
Li, Jing; Hong, Wenxue
2014-12-01
Feature extraction and feature selection are important issues in pattern recognition. Based on the geometric algebra representation of vectors, a new feature extraction method using blade coefficients of geometric algebra was proposed in this study. At the same time, an improved differential evolution (DE) feature selection method was proposed to address the resulting high dimensionality. Simple linear discriminant analysis was used as the classifier. The 10-fold cross-validation (10 CV) classification accuracy on a public breast cancer biomedical dataset exceeded 96%, proving superior to that of the original features and a traditional feature extraction method.
NASA Astrophysics Data System (ADS)
Wang, Le
2003-10-01
Modern forest management poses an increasing need for detailed knowledge of forest information at different spatial scales. At the forest level, information on tree species assemblage is desired, whereas at or below the stand level, individual-tree-related information is preferred. Remote Sensing provides an effective tool to extract the above information at multiple spatial scales in the continuous time domain. To date, the increasing volume and ready availability of high-spatial-resolution data have led to a much wider application of remotely sensed products. Nevertheless, to make effective use of the improving spatial resolution, conventional pixel-based classification methods are far from satisfactory. Correspondingly, developing object-based methods becomes a central challenge for researchers in the field of Remote Sensing. This thesis focuses on the development of methods for accurate individual tree identification and tree species classification. We develop a method in which individual tree crown boundaries and treetop locations are derived under a unified framework. We apply a two-stage approach with edge detection followed by marker-controlled watershed segmentation. Treetops are modeled from radiometric and geometric aspects. Specifically, treetops are assumed to be represented by local radiation maxima and to be located near the center of the tree crown. As a result, a marker image is created from the derived treetops to guide a watershed segmentation that further differentiates overlapping trees and produces a segmented image comprised of individual tree crowns. The image segmentation method developed achieves a promising result for a 256 x 256 CASI image. Further effort is then made to extend our methods to multiple scales constructed from a wavelet decomposition.
Scale-consistency and geometric-consistency checks are designed to examine the gradients along the scale space in order to separate true crown boundaries from unwanted textures occurring due to branches and twigs. As a result of the inverse wavelet transform, the tree crown boundary is enhanced while the unwanted textures are suppressed. Based on the enhanced image, an improvement is achieved when applying the two-stage method to a high resolution aerial photograph. To improve tree species classification, we develop a new method to choose the optimal scale parameter with the aid of the Bhattacharyya Distance (BD), a well-known index of class separability in traditional pixel-based classification. The optimal scale parameter is then fed into a region-growing-based segmentation as a break-off value. Our object classification achieves a better accuracy in separating tree species when compared to the conventional Maximum Likelihood Classification (MLC). In summary, we develop two object-based methods for identifying individual trees and classifying tree species from high-spatial-resolution imagery. Both methods achieve promising results and will promote the integration of Remote Sensing and GIS in forest applications.
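The "treetops as local radiation maxima" assumption can be illustrated with a toy brightness image; a real pipeline would run edge detection and marker-controlled watershed segmentation on top of markers like these:

```python
import numpy as np

def local_maxima(img, win=1):
    """Treetop candidates: pixels that are the strict maximum of their
    (2*win+1)^2 neighbourhood -- the 'local radiation maxima' assumption."""
    h, w = img.shape
    peaks = []
    for i in range(win, h - win):
        for j in range(win, w - win):
            patch = img[i - win:i + win + 1, j - win:j + win + 1]
            if img[i, j] == patch.max() and (patch == patch.max()).sum() == 1:
                peaks.append((i, j))
    return peaks

# Toy canopy: two Gaussian 'crowns' in a 40x40 brightness image
yy, xx = np.mgrid[0:40, 0:40]
img = (np.exp(-((yy - 12) ** 2 + (xx - 12) ** 2) / 30.0) +
       np.exp(-((yy - 28) ** 2 + (xx - 28) ** 2) / 30.0))

print(local_maxima(img))  # [(12, 12), (28, 28)]
```

Each detected peak would become one marker, so the watershed stage grows exactly one segment per tree even when crowns overlap.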
Predicting complications of percutaneous coronary intervention using a novel support vector method
Lee, Gyemin; Gurm, Hitinder S; Syed, Zeeshan
2013-01-01
Objective To explore the feasibility of a novel approach using an augmented one-class learning algorithm to model in-laboratory complications of percutaneous coronary intervention (PCI). Materials and methods Data from the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2) multicenter registry for the years 2007 and 2008 (n=41 016) were used to train models to predict 13 different in-laboratory PCI complications using a novel one-plus-class support vector machine (OP-SVM) algorithm. The performance of these models in terms of discrimination and calibration was compared to the performance of models trained using the following classification algorithms on BMC2 data from 2009 (n=20 289): logistic regression (LR), one-class support vector machine classification (OC-SVM), and two-class support vector machine classification (TC-SVM). For the OP-SVM and TC-SVM approaches, variants of the algorithms with cost-sensitive weighting were also considered. Results The OP-SVM algorithm and its cost-sensitive variant achieved the highest area under the receiver operating characteristic curve for the majority of the PCI complications studied (eight cases). Similar improvements were observed for the Hosmer–Lemeshow χ2 value (seven cases) and the mean cross-entropy error (eight cases). Conclusions The OP-SVM algorithm based on an augmented one-class learning problem improved discrimination and calibration across different PCI complications relative to LR and traditional support vector machine classification. Such an approach may have value in a broader range of clinical domains. PMID:23599229
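The OP-SVM algorithm itself is not available in standard libraries, but the one-class baseline it augments is. A hedged sketch with scikit-learn on synthetic stand-in data, training only on complication-free cases and scoring outliers as potential complications (feature counts, class shifts, and the nu parameter are illustrative, not registry values):

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Synthetic stand-in for registry features: complication-free procedures
# cluster near the origin; complicated ones are shifted.
normal_train = rng.normal(0.0, 1.0, size=(500, 4))

# One-class learning: fit on the majority (complication-free) class only.
clf = OneClassSVM(nu=0.05, gamma="scale").fit(normal_train)

X_test = np.vstack([rng.normal(0.0, 1.0, size=(100, 4)),
                    rng.normal(3.0, 1.0, size=(20, 4))])
y_test = np.array([0] * 100 + [1] * 20)       # 1 = in-lab complication
# decision_function is large for inliers; negate it to score risk.
auc = roc_auc_score(y_test, -clf.decision_function(X_test))
```

The appeal of the one-class setup for rare complications is that the model never needs a balanced sample of positive cases to train.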
Li, Hui-Fang; Zhang, Dong; Qu, Wen-Jun; Wang, Hai-Lin; Liu, Yang; Borjigdai, Almaz; Cui, Jian; Dong, Zheng-Qi
2016-04-01
The solubility and permeability of four flavonoids (kaempferol, hesperidin, apigenin, and genistein) were tested according to the theory of the biopharmaceutics classification system (BCS), and their absorption mechanisms were explored. Solubility was investigated by the method for the determination of solubility in the Chinese Pharmacopoeia 2010. To determine the apparent permeability of these compounds, appropriate concentrations were selected by the MTT method for cell transport experiments in a Caco-2 cell model established by in vitro cell culture. The compounds were then classified under the BCS according to their solubility and permeability. In addition, to explore the absorption mechanisms, bidirectional transport experiments were conducted at high, medium, and low concentrations in the Caco-2 cell model. The study indicated that kaempferol, hesperidin, apigenin, and genistein all have low solubility and high permeability and therefore belong to BCS class II, and that the absorption mechanism of kaempferol was active transport, whereas hesperidin, apigenin, and genistein were absorbed by passive transport. This study carried out an initial exploration of methods for determining the solubility and permeability of flavonoids and provides a theoretical reference for further research on the BCS in traditional Chinese medicine. Copyright© by the Chinese Pharmaceutical Association.
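The standard readouts of a bidirectional Caco-2 assay are the apparent permeability Papp = (dQ/dt) / (A * C0) and the efflux ratio Papp(B→A) / Papp(A→B). A small sketch of those two formulas; the transport rates, monolayer area, and donor concentration below are invented for illustration, not the study's measurements:

```python
def papp(dq_dt_ug_per_s, area_cm2, c0_ug_per_ml):
    """Apparent permeability Papp = (dQ/dt) / (A * C0), in cm/s.
    dQ/dt in ug/s, area in cm^2, C0 in ug/mL (= ug/cm^3)."""
    return dq_dt_ug_per_s / (area_cm2 * c0_ug_per_ml)

def efflux_ratio(papp_b_to_a, papp_a_to_b):
    """A ratio near 1 is consistent with passive diffusion; a ratio well
    above ~2 is commonly read as carrier-mediated efflux."""
    return papp_b_to_a / papp_a_to_b

# Illustrative (made-up) rates for a 1.12 cm^2 monolayer, C0 = 10 ug/mL:
p_ab = papp(2.0e-4, 1.12, 10.0)   # apical -> basolateral
p_ba = papp(2.2e-4, 1.12, 10.0)   # basolateral -> apical
ratio = efflux_ratio(p_ba, p_ab)
```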
Butail, Sachit; Salerno, Philip; Bollt, Erik M; Porfiri, Maurizio
2015-12-01
Traditional approaches for the analysis of collective behavior entail digitizing the position of each individual, followed by evaluation of pertinent group observables, such as cohesion and polarization. Machine learning may enable considerable advancements in this area by affording the classification of these observables directly from images. While such methods have been successfully implemented in the classification of individual behavior, their potential in the study of collective behavior is largely untested. In this paper, we compare three methods for the analysis of collective behavior: simple tracking (ST) without resolving occlusions, machine learning with real data (MLR), and machine learning with synthetic data (MLS). These methods are evaluated on videos recorded from an experiment studying the effect of ambient light on the shoaling tendency of giant danios. In particular, we compute the average nearest-neighbor distance (ANND) and polarization using the three methods and compare the values with manually verified ground-truth data. To further assess possible dependence on sampling rate for computing ANND, the comparison is also performed at a low frame rate. Results show that while ST is the most accurate at the higher frame rate for both ANND and polarization, at the low frame rate there is no significant difference in ANND accuracy between the three methods. In terms of computational speed, MLR and MLS take significantly less time to process an image, with MLS better addressing constraints related to the generation of training data. Finally, all methods are able to successfully detect a significant difference in ANND as the ambient light intensity is varied, irrespective of the direction of intensity change.
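The two group observables used above have compact definitions: ANND averages each individual's distance to its nearest neighbour, and polarization is the modulus of the mean unit-heading vector. A minimal numpy sketch (the positions and headings are toy values, not tracking output):

```python
import numpy as np

def annd(positions):
    """Average nearest-neighbour distance over a group of 2-D positions."""
    p = np.asarray(positions, dtype=float)
    d = np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # exclude self-distances
    return d.min(axis=1).mean()

def polarization(headings_rad):
    """Modulus of the mean unit-heading vector: 1 = perfectly aligned,
    0 = headings cancel out."""
    h = np.asarray(headings_rad, dtype=float)
    u = np.stack([np.cos(h), np.sin(h)], axis=1)
    return np.linalg.norm(u.mean(axis=0))

# Four fish on a unit square, all heading the same way:
group_annd = annd([[0, 0], [0, 1], [1, 0], [1, 1]])   # each NN is 1 away
group_pol = polarization([0.2, 0.2, 0.2, 0.2])        # fully aligned
```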
2012-01-01
Traditional classification systems represent the cognitive processes of human cultures around the world. Such a system synthesizes specific conceptions of nature, as well as the cumulative learning, beliefs and customs that are part of a particular human community or society. Traditional knowledge has been analyzed from different viewpoints, one of which corresponds to the analysis of ethnoclassifications. In this work, a brief analysis of botanical traditional knowledge among the Zapotecs of the municipality of San Agustín Loxicha, Oaxaca was conducted. The purposes of this study were: a) to analyze the traditional ecological knowledge of local plant resources through the folk classification of both landscapes and plants and b) to determine the role that this knowledge has played in plant resource management and conservation. The study was developed in five communities of San Agustín Loxicha. From field trips, plant specimens were collected and shown to local people in order to obtain the Spanish or Zapotec names; through interviews with local people, we obtained names and identified classification categories of plants, vegetation units, and soil types. We found a logical structure in Zapotec plant names, based on linguistic terms as well as morphological and ecological characteristics. We followed the classification principles proposed by Berlin [6] in order to build a hierarchical structure of life forms, names and other characteristics mentioned by people. We recorded 757 plant names. Most of them (67%) have an equivalent Zapotec name and the remaining 33% had mixed names with Zapotec and Spanish terms. Plants were categorized as native plants, plants introduced in pre-Hispanic times, or plants introduced later. All of them are grouped in a hierarchical classification, which includes life form, generic, specific, and varietal categories. Monotypic and polytypic names are used to further classify plants.
This holistic classification system plays an important role for local people in many aspects: it helps to organize and make sense of plant diversity, to understand the interrelation among plants, soil and vegetation, and to classify their physical space, since they relate plants to a particular vegetation unit and a kind of soil. The locals also make rational use of these elements, because they know which crops can grow in each vegetation unit and which places are suitable for collecting plants. These aspects are interconnected and could be fundamental for the rational use and management of plant resources. PMID:22789155
Chen, Xiao Yu; Ma, Li Zhuang; Chu, Na; Zhou, Min; Hu, Yiyang
2013-01-01
Chronic hepatitis B (CHB) is a serious public health problem, and Traditional Chinese Medicine (TCM) plays an important role in the control and treatment of CHB. In TCM treatment, zheng discrimination is the most important step. In this paper, an approach based on CFS-GA (Correlation-based Feature Selection with a Genetic Algorithm) and the C5.0 boosted decision tree is used for zheng classification and progression analysis in the TCM treatment of CHB. CFS-GA performs better than typical CFS. The attribute subset acquired by CFS-GA is classified by the C5.0 boosted decision tree for TCM zheng classification of CHB, and the C5.0 decision tree outperforms two typical decision trees, NBTree and REPTree, on CFS-GA, CFS, and nonselection in comparison. Based on the critical indicators from the C5.0 decision tree, important lab indicators in zheng progression are obtained by stepwise discriminant analysis for expressing TCM zhengs in CHB, and alterations of the important indicators are also analyzed during zheng progression. In conclusion, all three decision trees perform better on CFS-GA than on CFS and nonselection, and the C5.0 decision tree outperforms the two typical decision trees on both attribute selection and nonselection.
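The CFS criterion that the genetic algorithm searches over has a closed form: a subset scores high when its features correlate with the class but not with each other. A small sketch of that merit function on synthetic data (Pearson correlation stands in for the symmetrical-uncertainty measure used in full CFS, and the GA wrapper that searches subsets is omitted):

```python
import numpy as np

def cfs_merit(X, y):
    """CFS merit of a feature subset: high when features correlate with
    the class (r_cf) but not with each other (r_ff).
    merit = k * r_cf / sqrt(k + k*(k-1) * r_ff)"""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(k)])
    if k == 1:
        return r_cf
    r_ff = np.mean([abs(np.corrcoef(X[:, i], X[:, j])[0, 1])
                    for i in range(k) for j in range(i + 1, k)])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

rng = np.random.default_rng(1)
y = rng.normal(size=200)                   # continuous stand-in for the class
relevant = y + 0.1 * rng.normal(size=200)  # informative attribute
noise = rng.normal(size=200)               # irrelevant attribute
m_good = cfs_merit(np.column_stack([relevant]), y)
m_mixed = cfs_merit(np.column_stack([relevant, noise]), y)  # diluted merit
```

A GA would then evolve bit-string subsets scored by this merit, which is what CFS-GA adds over plain greedy CFS.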
Spectral Data Reduction via Wavelet Decomposition
NASA Technical Reports Server (NTRS)
Kaewpijit, S.; LeMoigne, J.; El-Ghazawi, T.; Rood, Richard (Technical Monitor)
2002-01-01
The greatest advantage gained from hyperspectral imagery is that narrow spectral features can be used to give more information about materials than was previously possible with broad-band multispectral imagery. For many applications, however, the larger data volumes from such hyperspectral sensors present a challenge for traditional processing techniques. For example, the identification of each ground surface pixel by its corresponding reflectance spectral signature is still one of the most difficult challenges in the exploitation of this advanced technology, because of the immense volume of data collected. Therefore, conventional classification methods require a preprocessing step of dimension reduction to conquer the so-called "curse of dimensionality." Spectral data reduction using wavelet decomposition could be useful, as it not only reduces the data volume but also preserves the distinctions between spectral signatures. This characteristic is related to the intrinsic property of wavelet transforms of preserving high- and low-frequency features during signal decomposition, thereby preserving the peaks and valleys found in typical spectra. Compared to the most widespread dimension reduction technique, Principal Component Analysis (PCA), at the same compression rate, we show that wavelet reduction yields better classification accuracy for hyperspectral data processed with a conventional supervised classifier such as a maximum likelihood method.
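The reduction step described above can be sketched in a few lines with PyWavelets: decompose each per-pixel spectrum with an orthogonal wavelet and keep only the approximation coefficients, halving the length at each level. A minimal illustration on a synthetic 64-band signature (the wavelet choice and level are illustrative, not the paper's configuration):

```python
import numpy as np
import pywt

def wavelet_reduce(spectrum, levels):
    """Reduce a 1-D spectral signature by keeping only the approximation
    coefficients of an orthogonal wavelet decomposition."""
    coeffs = pywt.wavedec(spectrum, "haar", mode="periodization", level=levels)
    return coeffs[0]       # approximation: length N / 2**levels

# A 64-band synthetic signature reduced 8-fold (levels=3 -> 8 coefficients).
spectrum = np.sin(np.linspace(0, 3 * np.pi, 64)) + 0.1
reduced = wavelet_reduce(spectrum, levels=3)
```

Because the Haar transform with periodization is orthonormal, the full coefficient set conserves the signal's energy; discarding only detail coefficients keeps the smooth peaks and valleys that distinguish spectral signatures.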
NASA Astrophysics Data System (ADS)
Rajwa, Bartek; Bayraktar, Bulent; Banada, Padmapriya P.; Huff, Karleigh; Bae, Euiwon; Hirleman, E. Daniel; Bhunia, Arun K.; Robinson, J. Paul
2006-10-01
Bacterial contamination by Listeria monocytogenes puts the public at risk and is also costly for the food-processing industry. Traditional methods for pathogen identification require complicated sample preparation for reliable results. Previously, we reported the development of a noninvasive optical forward-scattering system for rapid identification of Listeria colonies grown on solid surfaces. The presented system applied computer-vision and pattern-recognition techniques to classify the scatter patterns formed by bacterial colonies irradiated with laser light. This report shows an extension of the proposed method. A new scatterometer equipped with a high-resolution CCD chip and the application of two additional sets of image features for classification allow for higher accuracy and lower error rates. Features based on Zernike moments are supplemented by Tchebichef moments and Haralick texture descriptors in the new version of the algorithm. Fisher's criterion has been used for feature selection to decrease the training time of the machine learning systems. An algorithm based on support vector machines was used for classification of the patterns. Low error rates determined by cross-validation, reproducibility of the measurements, and robustness of the system show that the proposed technology can be implemented in automated devices for detection and classification of pathogenic bacteria.
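Fisher's criterion, used above to rank features before SVM training, has a simple two-class form: the squared distance between class means divided by the sum of class variances. An illustrative sketch on synthetic features (not the scatterometer data; feature 1 is constructed to be separable, feature 0 is noise):

```python
import numpy as np

def fisher_score(x, y):
    """Fisher's criterion for one feature over two classes (0/1):
    (mu0 - mu1)^2 / (var0 + var1). Larger = more separable."""
    x, y = np.asarray(x, dtype=float), np.asarray(y)
    a, b = x[y == 0], x[y == 1]
    return (a.mean() - b.mean()) ** 2 / (a.var() + b.var())

def select_top_k(X, y, k):
    """Rank features by Fisher score and keep the indices of the top k."""
    scores = np.array([fisher_score(X[:, j], y) for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:k]

rng = np.random.default_rng(0)
y = np.array([0] * 50 + [1] * 50)
X = np.column_stack([rng.normal(size=100),           # noise feature
                     y * 5 + rng.normal(size=100)])  # separable feature
top = select_top_k(X, y, k=1)
```

Filtering on this score before training is what cuts the SVM training time: the classifier only ever sees the discriminative features.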
Bredesen, Ida Marie; Bjøro, Karen; Gunningberg, Lena; Hofoss, Dag
2016-05-01
Pressure ulcers (PUs) are a problem in health care. Staff competency is paramount to PU prevention, and education is essential to increase skills in pressure ulcer classification and risk assessment. Currently, no pressure ulcer learning programs are available in Norwegian. The aim was to develop and test an e-learning program for assessment of pressure ulcer risk and pressure ulcer classification. Forty-four nurses working in acute care hospital wards or nursing homes participated and were randomly assigned to two groups: an e-learning program group (intervention) and a traditional classroom lecture group (control). Data were collected immediately before and after training, and again after three months. The study was conducted at one nursing home and two hospitals between May and December 2012. Accuracy of risk assessment (five patient cases) and pressure ulcer classification (40 photos [normal skin, pressure ulcer categories I-IV] split into two sets) was measured by comparing nurse evaluations in each of the two groups to a pre-established standard based on ratings by experts in pressure ulcer classification and risk assessment. Inter-rater reliability was measured by exact percent agreement and multi-rater Fleiss kappa. A Mann-Whitney U test was used for continuous sum score variables. The e-learning program did not improve Braden subscale scoring. For pressure ulcer classification, however, the intervention group scored significantly higher than the control group on several of the categories in the post-test immediately after training. However, after three months there were no significant differences in classification skills between the groups. An e-learning program appears to have a greater effect on the accuracy of pressure ulcer classification than classroom teaching in the short term. For proficiency in Braden scoring, no significant effect of educational method on learning results was detected. Copyright © 2016 Elsevier Ltd. All rights reserved.
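The multi-rater agreement statistic used above, Fleiss' kappa, can be computed directly from an items-by-categories count matrix. A minimal sketch (the two-item rating matrix in the usage example is invented, not the study's data):

```python
import numpy as np

def fleiss_kappa(ratings):
    """Fleiss' kappa from an items x categories count matrix,
    where each row sums to the number of raters n."""
    r = np.asarray(ratings, dtype=float)
    n = r[0].sum()                       # raters per item
    # Per-item observed agreement, then its mean.
    p_item = (np.sum(r ** 2, axis=1) - n) / (n * (n - 1))
    p_bar = p_item.mean()
    # Chance agreement from overall category proportions.
    p_cat = r.sum(axis=0) / r.sum()
    p_exp = np.sum(p_cat ** 2)
    return (p_bar - p_exp) / (1 - p_exp)

# Three raters, two items, two categories (e.g. "ulcer" / "no ulcer"):
kappa = fleiss_kappa([[3, 0],    # unanimous on item 1
                      [2, 1]])   # split 2-1 on item 2
```

Kappa is 1 for perfect agreement, 0 at chance level, and negative when raters agree less often than chance.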
Xie, Duoli; Shi, Tieliu; Wen, Chengping
2017-01-01
Traditional Chinese Medicine (TCM) has been widely used as a complementary medicine in Acute Myeloid Leukemia (AML) treatment. In this study, we proposed a new classification of Chinese Medicines (CMs) by integrating the latest discoveries in disease molecular mechanisms and traditional medicine theory. We screened out a set of chemical compounds on basis of AML differential expression genes and chemical-protein interactions and then mapped them to Traditional Chinese Medicine Integrated Database. 415 CMs contain those compounds and they were categorized into 8 groups according to the Traditional Chinese Pharmacology. Pathway analysis and synthetic lethality gene pairs were applied to analyze the dissimilarity, generality and intergroup relations of different groups. We defined hub CM pairs and alternative CM groups based on the analysis result and finally proposed a formula to form an effective anti-AML prescription which combined the hub CM pairs with alternative CMs according to patients’ molecular features. Our method of formulating CMs based on patients’ stratification provides novel insights into the new usage of conventional CMs and will promote TCM modernization. PMID:28454110
Huang, Lin; Li, Haichang; Xie, Duoli; Shi, Tieliu; Wen, Chengping
2017-06-27
Traditional Chinese Medicine (TCM) has been widely used as a complementary medicine in Acute Myeloid Leukemia (AML) treatment. In this study, we proposed a new classification of Chinese Medicines (CMs) by integrating the latest discoveries in disease molecular mechanisms and traditional medicine theory. We screened out a set of chemical compounds on basis of AML differential expression genes and chemical-protein interactions and then mapped them to Traditional Chinese Medicine Integrated Database. 415 CMs contain those compounds and they were categorized into 8 groups according to the Traditional Chinese Pharmacology. Pathway analysis and synthetic lethality gene pairs were applied to analyze the dissimilarity, generality and intergroup relations of different groups. We defined hub CM pairs and alternative CM groups based on the analysis result and finally proposed a formula to form an effective anti-AML prescription which combined the hub CM pairs with alternative CMs according to patients' molecular features. Our method of formulating CMs based on patients' stratification provides novel insights into the new usage of conventional CMs and will promote TCM modernization.
A Quantitative Analysis of Pulsed Signals Emitted by Wild Bottlenose Dolphins.
Luís, Ana Rita; Couchinho, Miguel N; Dos Santos, Manuel E
2016-01-01
Common bottlenose dolphins (Tursiops truncatus) produce a wide variety of vocal emissions for communication and echolocation, of which the pulsed repertoire has been the most difficult to categorize. Packets of high-repetition, broadband pulses are still largely reported under the general designation of burst-pulses, and traditional attempts to classify these emissions rely mainly on their aural characteristics and on graphical aspects of spectrograms. Here, we present a quantitative analysis of pulsed signals emitted by wild bottlenose dolphins in the Sado estuary, Portugal (2011-2014), and test the reliability of a traditional classification approach. Acoustic parameters (minimum frequency, maximum frequency, peak frequency, duration, repetition rate and inter-click interval) were extracted from 930 pulsed signals, previously categorized using a traditional approach. Discriminant function analysis revealed a high reliability of the traditional classification approach (93.5% of pulsed signals were consistently assigned to their aurally based categories). According to the discriminant function analysis (Wilks' Λ = 0.11, F(3, 2.41) = 282.75, P < 0.001), repetition rate is the feature that best enables the discrimination of different pulsed signals (structure coefficient = 0.98). Classification using hierarchical cluster analysis led to a similar categorization pattern: two main signal types with distinct magnitudes of repetition rate were clustered into five groups. The pulsed signals described here present significant differences in their time-frequency features, especially repetition rate (P < 0.001), inter-click interval (P < 0.001) and duration (P < 0.001). We document the occurrence of a distinct signal type, short burst-pulses, and highlight the existence of a diverse repertoire of pulsed vocalizations emitted in graded sequences.
The use of quantitative analysis of pulsed signals is essential to improve classifications and to better assess the contexts of emission, geographic variation and the functional significance of pulsed signals.
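Discriminant function analysis of the sort reported above can be sketched with scikit-learn's linear discriminant analysis. The acoustic parameters below are invented stand-ins (two signal types separated mainly by repetition rate, mirroring the study's finding), not the Sado estuary recordings:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
n = 200
# Hypothetical parameters per signal: [repetition rate (pulses/s), duration (ms)].
slow = np.column_stack([rng.normal(100, 10, n),   # burst-pulse-like type
                        rng.normal(40, 5, n)])
fast = np.column_stack([rng.normal(300, 10, n),   # high-repetition type
                        rng.normal(42, 5, n)])
X = np.vstack([slow, fast])
y = np.array([0] * n + [1] * n)

# Fit the discriminant function and check resubstitution accuracy.
lda = LinearDiscriminantAnalysis().fit(X, y)
accuracy = lda.score(X, y)
```

Because the two types differ strongly on one axis, the discriminant function loads almost entirely on repetition rate, analogous to the high structure coefficient reported in the study.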
Spectral Band Selection for Urban Material Classification Using Hyperspectral Libraries
NASA Astrophysics Data System (ADS)
Le Bris, A.; Chehata, N.; Briottet, X.; Paparoditis, N.
2016-06-01
In urban areas, information concerning very high resolution land cover, and especially material maps, is necessary for several city modelling or monitoring applications; that is, knowledge concerning roofing materials or the different kinds of ground areas is required. Airborne remote sensing techniques appear convenient for providing such information at a large scale. However, results obtained using most traditional processing methods based on usual red-green-blue-near-infrared multispectral images remain limited for such applications. A possible way to improve classification results is to enhance the imagery's spectral resolution using superspectral or hyperspectral sensors. This study intends to design a superspectral sensor dedicated to urban material classification, focusing particularly on the selection of optimal spectral band subsets for such a sensor. First, reflectance spectral signatures of urban materials were collected from 7 spectral libraries. Then, spectral optimization was performed using this data set. The band selection workflow included two steps, first optimising the number of spectral bands using an incremental method and then examining several possible optimised band subsets using a stochastic algorithm. The same wrapper relevance criterion, relying on a confidence measure of a Random Forest classifier, was used at both steps. To cope with the limited number of available spectra for several classes, additional synthetic spectra were generated from the collection of reference spectra: intra-class variability was simulated by multiplying reference spectra by a random coefficient. Finally, the selected band subsets were evaluated considering the classification quality reached using an RBF SVM classifier. It was confirmed that a limited band subset was sufficient to classify common urban materials.
The important contribution of bands from the Short Wave Infra-Red (SWIR) spectral domain (1000-2400 nm) to material classification was also shown.
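The incremental, wrapper-based step of the workflow above can be sketched as a greedy forward search scored by a random-forest wrapper. A toy illustration on synthetic "bands" (the relevance criterion here is plain cross-validated accuracy rather than the paper's confidence measure, and band 0 is constructed to carry the class signal):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def forward_select(X, y, n_bands, cv=3, seed=0):
    """Greedy incremental band selection with a random-forest wrapper:
    at each step, add the band that most improves CV accuracy."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(n_bands):
        def score(band):
            cols = selected + [band]
            clf = RandomForestClassifier(n_estimators=50, random_state=seed)
            return cross_val_score(clf, X[:, cols], y, cv=cv).mean()
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 150)
# Band 0 carries the class signal; the other four bands are noise.
X = np.column_stack([y + 0.2 * rng.normal(size=150)] +
                    [rng.normal(size=150) for _ in range(4)])
bands = forward_select(X, y, n_bands=2)
```

The incremental curve of scores over subset size is also how the optimal number of bands can be chosen before handing candidate subsets to a stochastic search.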
Fei, Baowei; Yang, Xiaofeng; Nye, Jonathon A.; Aarsvold, John N.; Raghunath, Nivedita; Cervo, Morgan; Stark, Rebecca; Meltzer, Carolyn C.; Votaw, John R.
2012-01-01
Purpose: Combined MR/PET is a relatively new, hybrid imaging modality. A human MR/PET prototype system consisting of a Siemens 3T Trio MR and a brain PET insert was installed and tested at our institution. Its present design does not offer measured attenuation correction (AC) using traditional transmission imaging. This study presents the development of quantification tools, including MR-based AC, for brain imaging with combined MR/PET. Methods: The developed quantification tools include image registration, segmentation, classification, and MR-based AC. These components were integrated into a single scheme for processing MR/PET data. The segmentation method is multiscale and based on the Radon transform of brain MR images; it was developed to segment the skull on T1-weighted MR images. A modified fuzzy C-means classification scheme was developed to classify brain tissue into gray matter, white matter, and cerebrospinal fluid. Classified tissue is assigned an attenuation coefficient so that AC factors can be generated. PET emission data are then reconstructed using a three-dimensional ordered-subsets expectation maximization method with the MR-based AC map. Ten subjects had separate MR and PET scans. The PET data with [11C]PIB were acquired using a high-resolution research tomograph (HRRT). MR-based AC was compared with transmission (TX)-based AC on the HRRT. Seventeen volumes of interest were drawn manually on each subject image to compare the PET activities between the MR-based and TX-based AC methods. Results: For skull segmentation, the overlap ratio between our segmented results and the ground truth is 85.2 ± 2.6%. Attenuation correction results from the ten subjects show that the difference between the MR- and TX-based methods was <6.5%. Conclusions: MR-based AC compared favorably with conventional transmission-based AC. Quantitative tools including registration, segmentation, classification, and MR-based AC have been developed for use in combined MR/PET.
PMID:23039679
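The tissue classifier above is a modified fuzzy C-means; the standard algorithm it builds on alternates between a weighted center update and a membership update. A minimal generic sketch (not the authors' modified scheme; synthetic 2-D points stand in for voxel intensities):

```python
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, iters=100, seed=0):
    """Minimal fuzzy C-means: returns cluster centers and the membership
    matrix U (n_samples x c). m > 1 is the fuzzifier."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # memberships sum to 1
    for _ in range(iters):
        W = U ** m
        # Centers: membership-weighted means of the data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Memberships: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1) + 1e-12
        U = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2 / (m - 1)), axis=2)
    return centers, U

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, size=(20, 2)),    # "tissue class" 1
               rng.normal(10.0, 0.3, size=(20, 2))])  # "tissue class" 2
centers, U = fuzzy_cmeans(X, c=2)
```

In an AC pipeline, each voxel's dominant membership would then map to a tissue class, and that class to an attenuation coefficient.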
Identification of Malicious Web Pages by Inductive Learning
NASA Astrophysics Data System (ADS)
Liu, Peishun; Wang, Xuefang
Malicious web pages have become an increasing threat to computer systems in recent years. Traditional anti-virus techniques typically focus on detection of the static signatures of malware and are ineffective against these new threats because they cannot deal with zero-day attacks. In this paper, a novel classification method for detecting malicious web pages is presented. The method performs generalization and specialization of attack patterns based on inductive learning, which can be used to update and expand the knowledge base. An attack pattern is established from an example and generalized by inductive learning, so that it can be used to detect unknown attacks whose behavior is similar to the example.
Polar cloud and surface classification using AVHRR imagery - An intercomparison of methods
NASA Technical Reports Server (NTRS)
Welch, R. M.; Sengupta, S. K.; Goroch, A. K.; Rabindra, P.; Rangaraj, N.; Navar, M. S.
1992-01-01
Six Advanced Very High-Resolution Radiometer local area coverage (AVHRR LAC) arctic scenes are classified into ten classes. Three different classifiers are examined: (1) the traditional stepwise discriminant analysis (SDA) method; (2) the feed-forward back-propagation (FFBP) neural network; and (3) the probabilistic neural network (PNN). More than 200 spectral and textural measures are computed. These are reduced to 20 features using sequential forward selection. Theoretical accuracy of the classifiers is determined using the bootstrap approach. Overall accuracy is 85.6 percent, 87.6 percent, and 87.0 percent for the SDA, FFBP, and PNN classifiers, respectively, with standard deviations of approximately 1 percent.
Power quality analysis based on spatial correlation
NASA Astrophysics Data System (ADS)
Li, Jiangtao; Zhao, Gang; Liu, Haibo; Li, Fenghou; Liu, Xiaoli
2018-03-01
With industrialization and urbanization, electricity plays an ever greater role in production and daily life, so power quality prediction is of growing significance. Traditional power quality analysis methods include power quality data compression, disturbance event pattern classification, and disturbance parameter calculation. Under certain conditions, these methods can predict power quality. This paper analyses the temporal variation of power quality in one provincial power grid in China, and the spatial distribution of power quality is analyzed based on spatial autocorrelation. The paper aims to show that this geographic research approach is effective for mining latent information in power quality data.
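The standard global measure of spatial autocorrelation is Moran's I. A minimal sketch on a 2x2 grid with rook-adjacency weights (the grid and values are toy data, not grid measurements; a perfect checkerboard gives I = -1, maximal negative autocorrelation):

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I:
    I = (n / W) * sum_ij w_ij * z_i * z_j / sum_i z_i^2, with z = x - mean(x).
    W is the sum of all spatial weights."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)

# Rook-adjacency weights for a 2x2 grid flattened row-major:
w = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
I_checker = morans_i([1, 0, 0, 1], w)   # perfectly alternating values
```

Values near +1 indicate that similar power-quality levels cluster spatially, near 0 that they are randomly arranged, and near -1 that neighbours systematically differ.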
Lake bed classification using acoustic data
Yin, Karen K.; Li, Xing; Bonde, John; Richards, Carl; Cholwek, Gary
1998-01-01
As part of our effort to identify lake bed surficial substrates using remote sensing data, this work designs pattern classifiers by multivariate statistical methods. The probability distribution of the preprocessed acoustic signal is analyzed first. A confidence-region approach is then adopted to improve the design of the existing classifier. A technique for further isolation is proposed which minimizes the expected loss from misclassification. The devices constructed are applicable for real-time lake bed categorization. A minimax approach is suggested to treat more general cases where the a priori probability distribution of the substrate types is unknown. Comparison of the suggested methods with traditional likelihood ratio tests is discussed.
Liu, Bao-Cheng; Ji, Guang
2017-07-01
Incorporating "-omics" studies with environmental interactions could help elucidate the biological mechanisms responsible for Traditional Chinese Medicine (TCM) patterns. Based on the authors' own experiences, this review outlines a model of an ideal combination of "-omics" biomarkers, environmental factors, and TCM pattern classifications; provides a narrative review of the relevant genetic and TCM studies; and lists several successful integrative examples. Two integration tools are briefly introduced. The first is the integration of modern devices into objective diagnostic methods of TCM patterning, which would improve current clinical decision-making and practice. The second is the use of biobanks and data platforms, which could broadly support biological and medical research. Such efforts will transform current medical management and accelerate the progression of precision medicine.
Tansaz, Mojgan; Nazemiyeh, Hossein; Fazljou, Seyed Mohammad Bagher
2018-01-01
Introduction Menstrual bleeding cessation is one of the most frequent gynecologic disorders among women of reproductive age, and treatment is based on hormone therapy. Due to the increasing demand for alternative medicine remedies in the field of women's diseases, the present study overviews medicinal plants used to treat oligomenorrhea and amenorrhea according to the pharmaceutical textbooks of traditional Persian medicine (TPM) and reviews the evidence in conventional medicine. Methods This systematic review was designed and performed in 2017 to gather information regarding herbal medications for oligomenorrhea and amenorrhea in TPM and conventional medicine. The study had several steps: searching the Iranian traditional medicine literature and extracting the emmenagogue plants, classifying the plants, searching the electronic databases, and finding evidence. To search traditional Persian medicine references, the Noor digital library, which includes several ancient traditional medical references, was used. The classification of plants was done based on the repetition and potency of the plants in the ancient literature. The required data were gathered using databases such as PubMed, Scopus, Google Scholar, Cochrane Library, Science Direct, and Web of Knowledge. Results Of all 198 emmenagogue medicinal plants found in TPM, 87 were specified to be more effective in treating oligomenorrhea and amenorrhea. In the second part of the study, a search of the conventional medicine literature found 12 studies, which together investigated 8 plants: Vitex agnus-castus, Trigonella foenum-graecum, Foeniculum vulgare, Cinnamomum verum, Paeonia lactiflora, Sesamum indicum, Mentha longifolia, and Urtica dioica. Conclusion. Traditional Persian medicine has proposed many different medicinal plants for the treatment of oligomenorrhea and amenorrhea.
Although only a few plants have been proven effective for the treatment of menstrual irregularities, the results and the classification in the present study can be used as an outline for future studies and treatment. PMID:29744355
2011-01-01
Background The avian family Cettiidae, including the genera Cettia, Urosphena, Tesia, Abroscopus and Tickellia and Orthotomus cucullatus, has recently been proposed based on analysis of a small number of loci and species. The close relationship of most of these taxa was unexpected, and called for a comprehensive study based on multiple loci and dense taxon sampling. In the present study, we infer the relationships of all except one of the species in this family using one mitochondrial and three nuclear loci. We use traditional gene tree methods (Bayesian inference, maximum likelihood bootstrapping, parsimony bootstrapping), as well as a recently developed Bayesian species tree approach (*BEAST) that accounts for lineage sorting processes that might produce discordance between gene trees. We also analyse mitochondrial DNA for a larger sample, comprising multiple individuals and a large number of subspecies of polytypic species. Results There are many topological incongruences among the single-locus trees, although none of these is strongly supported. The multi-locus tree inferred using concatenated sequences and the species tree agree well with each other, and are overall well resolved and well supported by the data. The main discrepancy between these trees concerns the most basal split. Both methods infer the genus Cettia to be highly non-monophyletic, as it is scattered across the entire family tree. Deep intraspecific divergences are revealed, and one or two species and one subspecies are inferred to be non-monophyletic (differences between methods). Conclusions The molecular phylogeny presented here is strongly inconsistent with the traditional, morphology-based classification. The remarkably high degree of non-monophyly in the genus Cettia is likely to be one of the most extraordinary examples of misconceived relationships in an avian genus. 
The phylogeny suggests instances of parallel evolution, as well as highly unequal rates of morphological divergence in different lineages. This complex morphological evolution apparently misled earlier taxonomists. These results underscore the well-known but still often neglected problem of basing classifications on overall morphological similarity. Based on the molecular data, a revised taxonomy is proposed. Although the traditional and species tree methods inferred much the same tree in the present study, the assumption by species tree methods that all species are monophyletic is a limitation in these methods, as some currently recognized species might have more complex histories. PMID:22142197
ERIC Educational Resources Information Center
Golick, Douglas A.; Heng-Moss, Tiffany M.; Steckelberg, Allen L.; Brooks, David. W.; Higley, Leon G.; Fowler, David
2013-01-01
The purpose of the study was to determine whether undergraduate students receiving web-based instruction based on traditional, key character, or classification instruction differed in their performance of insect identification tasks. All groups showed a significant improvement in insect identifications on pre- and post-two-dimensional picture…
GRB 060614: a Fake Short Gamma-Ray Burst
NASA Astrophysics Data System (ADS)
Caito, L.; Bernardini, M. G.; Bianco, C. L.; Dainotti, M. G.; Guida, R.; Ruffini, R.
2008-05-01
The explosion of GRB 060614 produced a deep break in the GRB scenario and opened new horizons of investigation because it can't be traced back to any traditional scheme of classification. In fact, it has features both of long bursts and of short bursts and, above all, it is the first case of long duration near GRB without any bright Ib/c associated Supernova. We will show that, in our canonical GRB scenario [1], this ``anomalous'' situation finds a natural interpretation and allows us to discuss a possible variation to the traditional classification scheme, introducing the distinction between ``genuine'' and ``fake'' short bursts.
NASA Astrophysics Data System (ADS)
Caito, L.; Bernardini, M. G.; Bianco, C. L.; Dainotti, M. G.; Guida, R.; Ruffini, R.
2008-01-01
The explosion of GRB 060614, detected by the Swift satellite, produced a deep break in the GRB scenario opening new horizons of investigation, because it can't be traced back to any traditional scheme of classification. In fact, it manifests peculiarities both of long bursts and of short bursts. Above all, it is the first case of long duration near GRB without any bright Ib/c associated Supernova. We will show that, in our canonical GRB scenario [1], this ``anomalous'' situation finds a natural interpretation and allows us to discuss a possible variation to the traditional classification scheme, introducing the distinction between ``genuine'' and ``fake'' short bursts.
Lu, Yao; Harrington, Peter B
2010-08-01
Direct methylation and solid-phase microextraction (SPME) were used as a sample preparation technique for classification of bacteria based on fatty acid methyl ester (FAME) profiles. Methanolic tetramethylammonium hydroxide was applied as a dual-function reagent to saponify and derivatize whole-cell bacterial fatty acids into FAMEs in one step, and SPME was used to extract the bacterial FAMEs from the headspace. Compared with traditional alkaline saponification and sample preparation using liquid-liquid extraction, the method presented in this work avoids using comparatively large amounts of inorganic and organic solvents and greatly decreases the sample preparation time as well. Characteristic gas chromatography/mass spectrometry (GC/MS) FAME profiles were obtained for six bacterial species. The difference between Gram-positive and Gram-negative bacteria was clearly visualized with the application of principal component analysis of the GC/MS data of bacterial FAMEs. A cross-validation study using ten bootstrap Latin partitions and the fuzzy rule-building expert system demonstrated 87 ± 3% correct classification efficiency.
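The PCA-based separation of Gram-positive and Gram-negative profiles described above can be sketched in a few lines. The data here are synthetic stand-ins for FAME peak-area profiles; the feature count and group means are invented for illustration, not taken from the study:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic FAME "profiles": 30 features (peak areas), two groups with
# different mean composition standing in for Gram-positive/negative.
gram_pos = rng.normal(loc=1.0, scale=0.1, size=(20, 30))
gram_neg = rng.normal(loc=0.0, scale=0.1, size=(20, 30))
X = np.vstack([gram_pos, gram_neg])

# Project onto the first two principal components; the dominant
# between-group variance lands on PC1, so the groups separate there.
scores = PCA(n_components=2).fit_transform(X)
separation = abs(scores[:20, 0].mean() - scores[20:, 0].mean())
```

Plotting `scores` coloured by group reproduces the kind of visual separation the abstract reports.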
Classification without labels: learning from mixed samples in high energy physics
NASA Astrophysics Data System (ADS)
Metodiev, Eric M.; Nachman, Benjamin; Thaler, Jesse
2017-10-01
Modern machine learning techniques can be used to construct powerful models for difficult collider physics problems. In many applications, however, these models are trained on imperfect simulations due to a lack of truth-level information in the data, which risks the model learning artifacts of the simulation. In this paper, we introduce the paradigm of classification without labels (CWoLa) in which a classifier is trained to distinguish statistical mixtures of classes, which are common in collider physics. Crucially, neither individual labels nor class proportions are required, yet we prove that the optimal classifier in the CWoLa paradigm is also the optimal classifier in the traditional fully-supervised case where all label information is available. After demonstrating the power of this method in an analytical toy example, we consider a realistic benchmark for collider physics: distinguishing quark- versus gluon-initiated jets using mixed quark/gluon training samples. More generally, CWoLa can be applied to any classification problem where labels or class proportions are unknown or simulations are unreliable, but statistical mixtures of the classes are available.
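The CWoLa idea, training on mixture labels yet recovering the pure-class classifier, can be illustrated with a one-dimensional toy. The distributions and signal fractions below are invented for the sketch and are not the paper's benchmark:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def sample(n, frac_signal):
    """Draw a mixed sample: signal ~ N(+1, 1), background ~ N(-1, 1)."""
    n_sig = int(n * frac_signal)
    sig = rng.normal(+1.0, 1.0, size=(n_sig, 1))
    bkg = rng.normal(-1.0, 1.0, size=(n - n_sig, 1))
    return np.vstack([sig, bkg])

# Two mixed samples with different (unknown to the classifier) signal fractions.
m1, m2 = sample(5000, 0.8), sample(5000, 0.2)
X = np.vstack([m1, m2])
y = np.concatenate([np.ones(len(m1)), np.zeros(len(m2))])  # mixture labels only

clf = LogisticRegression().fit(X, y)

# Evaluate on *pure* samples: the mixture-trained classifier still
# separates signal from background, as the CWoLa result predicts.
pure_sig = rng.normal(+1.0, 1.0, size=(1000, 1))
pure_bkg = rng.normal(-1.0, 1.0, size=(1000, 1))
acc = (clf.predict(pure_sig).mean() + (1 - clf.predict(pure_bkg)).mean()) / 2
```

Because the likelihood ratio of the mixtures is monotonically related to that of the pure classes, the learned decision boundary sits near the Bayes-optimal one even though no per-event truth labels were used.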
Multicategory nets of single-layer perceptrons: complexity and sample-size issues.
Raudys, Sarunas; Kybartas, Rimantas; Zavadskas, Edmundas Kazimieras
2010-05-01
The standard cost function of multicategory single-layer perceptrons (SLPs) does not minimize the classification error rate. In order to reduce classification error, it is necessary to: 1) reject the traditional cost function, 2) obtain near-optimal pairwise linear classifiers by specially organized SLP training with optimal stopping, and 3) fuse their decisions properly. To improve classification in unbalanced training-set situations, we introduce an unbalance-correcting term. It was found that fusion based on the Kullback-Leibler (K-L) distance and the Wu-Lin-Weng (WLW) method results in approximately the same performance when sample sizes are relatively small. This observation is explained by the theoretically known fact that excessive minimization of inexact criteria can become harmful. Comprehensive comparative investigations of six real-world pattern recognition (PR) problems demonstrated that SLP-based pairwise classifiers are comparable to, and as often as not outperform, linear support vector (SV) classifiers in moderate-dimensional situations. Colored noise injection, used to design pseudovalidation sets, proves to be a powerful tool for mitigating finite-sample problems in moderate-dimensional PR tasks.
Classification without labels: learning from mixed samples in high energy physics
Metodiev, Eric M.; Nachman, Benjamin; Thaler, Jesse
2017-10-25
Modern machine learning techniques can be used to construct powerful models for difficult collider physics problems. In many applications, however, these models are trained on imperfect simulations due to a lack of truth-level information in the data, which risks the model learning artifacts of the simulation. In this paper, we introduce the paradigm of classification without labels (CWoLa) in which a classifier is trained to distinguish statistical mixtures of classes, which are common in collider physics. Crucially, neither individual labels nor class proportions are required, yet we prove that the optimal classifier in the CWoLa paradigm is also the optimal classifier in the traditional fully-supervised case where all label information is available. After demonstrating the power of this method in an analytical toy example, we consider a realistic benchmark for collider physics: distinguishing quark- versus gluon-initiated jets using mixed quark/gluon training samples. More generally, CWoLa can be applied to any classification problem where labels or class proportions are unknown or simulations are unreliable, but statistical mixtures of the classes are available.
Classification without labels: learning from mixed samples in high energy physics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Metodiev, Eric M.; Nachman, Benjamin; Thaler, Jesse
Modern machine learning techniques can be used to construct powerful models for difficult collider physics problems. In many applications, however, these models are trained on imperfect simulations due to a lack of truth-level information in the data, which risks the model learning artifacts of the simulation. In this paper, we introduce the paradigm of classification without labels (CWoLa) in which a classifier is trained to distinguish statistical mixtures of classes, which are common in collider physics. Crucially, neither individual labels nor class proportions are required, yet we prove that the optimal classifier in the CWoLa paradigm is also the optimal classifier in the traditional fully-supervised case where all label information is available. After demonstrating the power of this method in an analytical toy example, we consider a realistic benchmark for collider physics: distinguishing quark- versus gluon-initiated jets using mixed quark/gluon training samples. More generally, CWoLa can be applied to any classification problem where labels or class proportions are unknown or simulations are unreliable, but statistical mixtures of the classes are available.
Wan, Shixiang; Duan, Yucong; Zou, Quan
2017-09-01
Predicting the subcellular localization of proteins is an important and challenging problem. Traditional experimental approaches are often expensive and time-consuming. Consequently, a growing number of research efforts employ a series of machine learning approaches to predict the subcellular location of proteins. There are two main challenges among the state-of-the-art prediction methods. First, most of the existing techniques are designed to deal with multi-class rather than multi-label classification, which ignores connections between multiple labels. In reality, multiple locations of particular proteins imply that there are vital and unique biological significances that deserve special focus and cannot be ignored. Second, techniques for handling imbalanced data in multi-label classification problems are necessary, but have never been employed. To solve these two issues, we have developed an ensemble multi-label classifier called HPSLPred, which can be applied for multi-label classification with an imbalanced protein source. For convenience, a user-friendly webserver has been established at http://server.malab.cn/HPSLPred. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
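HPSLPred itself is an ensemble served via the webserver above; as a generic illustration of the multi-label setting it addresses, the sketch below trains one binary classifier per label on synthetic data, with a simple class-weighting counterweight to label imbalance (all data and parameters here are invented):

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic multi-label data standing in for protein feature vectors;
# each sample may carry several subcellular-location labels at once.
X, Y = make_multilabel_classification(n_samples=400, n_features=20,
                                      n_classes=5, random_state=0)

# One binary classifier per label; class_weight="balanced" is a simple
# stand-in for the imbalance handling the paper argues is necessary.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000,
                                             class_weight="balanced"))
clf.fit(X, Y)
pred = clf.predict(X)  # (n_samples, n_labels) binary indicator matrix
```

The key structural point is that `Y` and `pred` are indicator matrices, not single class labels, so connections between co-occurring labels can be evaluated jointly.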
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guo, Yanrong; Shao, Yeqin; Gao, Yaozong
Purpose: Automatic prostate segmentation from MR images is an important task in various clinical applications such as prostate cancer staging and MR-guided radiotherapy planning. However, the large appearance and shape variations of the prostate in MR images make the segmentation problem difficult to solve. Traditional Active Shape/Appearance Model (ASM/AAM) has limited accuracy on this problem, since its basic assumption, i.e., both shape and appearance of the targeted organ follow Gaussian distributions, is invalid in prostate MR images. To this end, the authors propose a sparse dictionary learning method to model the image appearance in a nonparametric fashion and further integrate the appearance model into a deformable segmentation framework for prostate MR segmentation. Methods: To drive the deformable model for prostate segmentation, the authors propose nonparametric appearance and shape models. The nonparametric appearance model is based on a novel dictionary learning method, namely distributed discriminative dictionary (DDD) learning, which is able to capture fine distinctions in image appearance. To increase the differential power of traditional dictionary-based classification methods, the authors' DDD learning approach takes three strategies. First, two dictionaries for prostate and nonprostate tissues are built, respectively, using the discriminative features obtained from minimum redundancy maximum relevance feature selection. Second, linear discriminant analysis is employed as a linear classifier to boost the optimal separation between prostate and nonprostate tissues, based on the representation residuals from sparse representation. Third, to enhance the robustness of the authors' classification method, multiple local dictionaries are learned for local regions along the prostate boundary (each with small appearance variations), instead of learning one global classifier for the entire prostate.
These discriminative dictionaries are located on different patches of the prostate surface and trained to adaptively capture the appearance in different prostate zones, thus achieving better local tissue differentiation. For each local region, multiple classifiers are trained based on the randomly selected samples and finally assembled by a specific fusion method. In addition to this nonparametric appearance model, a prostate shape model is learned from the shape statistics using a novel approach, sparse shape composition, which can model non-Gaussian distributions of shape variation and regularize the 3D mesh deformation by constraining it within the observed shape subspace. Results: The proposed method has been evaluated on two datasets consisting of T2-weighted MR prostate images. For the first (internal) dataset, the classification effectiveness of the authors' improved dictionary learning has been validated by comparing it with three other variants of traditional dictionary learning methods. The experimental results show that the authors' method yields a Dice Ratio of 89.1% compared to the manual segmentation, which is more accurate than the three state-of-the-art MR prostate segmentation methods under comparison. For the second dataset, the MICCAI 2012 challenge dataset, the authors' proposed method yields a Dice Ratio of 87.4%, which also achieves better segmentation accuracy than other methods under comparison. Conclusions: A new magnetic resonance image prostate segmentation method is proposed based on the combination of deformable model and dictionary learning methods, which achieves more accurate segmentation performance on prostate T2 MR images.
Zhang, Xiaodong; Jing, Shasha; Gao, Peiyi; Xue, Jing; Su, Lu; Li, Weiping; Ren, Lijie; Hu, Qingmao
2016-01-01
Segmentation of infarcts at hyperacute stage is challenging as they exhibit substantial variability which may even be hard for experts to delineate manually. In this paper, a sparse representation based classification method is explored. For each patient, four volumetric data items including three volumes of diffusion weighted imaging and a computed asymmetry map are employed to extract patch features which are then fed to dictionary learning and classification based on sparse representation. Elastic net is adopted to replace the traditional L0-norm/L1-norm constraints on sparse representation to stabilize the sparse code. To decrease computation cost and to reduce false positives, regions-of-interest are determined to confine candidate infarct voxels. The proposed method has been validated on 98 consecutive patients recruited within 6 hours from onset. It is shown that the proposed method handles infarcts with intensity variability and ill-defined edges well, yielding a significantly higher Dice coefficient (0.755 ± 0.118) than the other two methods and their enhanced versions by confining their segmentations within the regions-of-interest (average Dice coefficient less than 0.610). The proposed method could provide a potential tool to quantify infarcts from diffusion weighted imaging at hyperacute stage with accuracy and speed to assist the decision making especially for thrombolytic therapy.
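The core mechanism, sparse-representation classification by reconstruction residual, can be sketched with elastic-net coding over per-class dictionaries. Everything below (dictionary sizes, atom distributions, the test patch) is synthetic and only illustrates the residual-comparison rule, not the paper's imaging pipeline:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(2)

# Two class-specific dictionaries (columns = atoms); synthetic stand-ins
# for, e.g., infarct vs. normal-tissue patch dictionaries.
D0 = rng.normal(0.0, 1.0, size=(25, 40))
D1 = rng.normal(3.0, 1.0, size=(25, 40))

def residual(D, x):
    """Elastic-net sparse code of x over D; return reconstruction error."""
    en = ElasticNet(alpha=0.1, l1_ratio=0.5, fit_intercept=False,
                    max_iter=5000)
    en.fit(D, x)  # treat atoms as features, the patch as the target
    return np.linalg.norm(x - D @ en.coef_)

# A test patch drawn as a sparse combination of class-1 atoms plus noise.
coef = np.zeros(40)
coef[[3, 7, 19]] = [0.5, 0.3, 0.2]
x = D1 @ coef + rng.normal(0, 0.05, 25)

# Classify by whichever dictionary reconstructs the patch better.
label = int(residual(D1, x) < residual(D0, x))
```

The elastic-net penalty (L1 plus L2) plays the stabilizing role the abstract describes: it keeps the code sparse while behaving better than a pure L1/L0 constraint when atoms are correlated.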
Machine learning in APOGEE. Unsupervised spectral classification with K-means
NASA Astrophysics Data System (ADS)
Garcia-Dias, Rafael; Allende Prieto, Carlos; Sánchez Almeida, Jorge; Ordovás-Pascual, Ignacio
2018-05-01
Context. The volume of data generated by astronomical surveys is growing rapidly. Traditional analysis techniques in spectroscopy either demand intensive human interaction or are computationally expensive. In this scenario, machine learning, and unsupervised clustering algorithms in particular, offer interesting alternatives. The Apache Point Observatory Galactic Evolution Experiment (APOGEE) offers a vast data set of near-infrared stellar spectra, which is perfect for testing such alternatives. Aims: Our research applies an unsupervised classification scheme based on K-means to the massive APOGEE data set. We explore whether the data are amenable to classification into discrete classes. Methods: We apply the K-means algorithm to 153 847 high resolution spectra (R ≈ 22 500). We discuss the main virtues and weaknesses of the algorithm, as well as our choice of parameters. Results: We show that a classification based on normalised spectra captures the variations in stellar atmospheric parameters, chemical abundances, and rotational velocity, among other factors. The algorithm is able to separate the bulge and halo populations, and distinguish dwarfs, sub-giants, RC, and RGB stars. However, a discrete classification in flux space does not result in a neat organisation in the parameters' space. Furthermore, the lack of obvious groups in flux space causes the results to be fairly sensitive to the initialisation, and disrupts the efficiency of commonly-used methods to select the optimal number of clusters. Our classification is publicly available, including extensive online material associated with the APOGEE Data Release 12 (DR12). Conclusions: Our description of the APOGEE database can help greatly with the identification of specific types of targets for various applications. We find a lack of obvious groups in flux space, and identify limitations of the K-means algorithm in dealing with this kind of data. 
Full Tables B.1-B.4 are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/612/A98
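The classification step on normalised spectra can be sketched as follows; the "spectra" here are small synthetic templates plus noise (real APOGEE spectra have far more pixels, R ≈ 22 500), so only the workflow is illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

rng = np.random.default_rng(3)

# Synthetic "spectra": three noisy copies of distinct templates stand in
# for groups of stars with different atmospheric parameters.
templates = rng.uniform(0.5, 1.0, size=(3, 200))
X = np.vstack([t + rng.normal(0, 0.01, size=(100, 200)) for t in templates])
X = normalize(X)  # cluster normalised spectra, as in the study

# K-means with multiple initialisations; the paper notes the result is
# sensitive to initialisation when flux space lacks obvious groups.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
labels = km.labels_
```

On this clean toy the clusters recover the templates exactly; the abstract's caution is precisely that real flux space is not this cleanly clustered, which is why initialisation and the choice of K matter there.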
Epileptic seizure detection in EEG signal with GModPCA and support vector machine.
Jaiswal, Abeg Kumar; Banka, Haider
2017-01-01
Epilepsy is one of the most common neurological disorders caused by recurrent seizures. Electroencephalograms (EEGs) record neural activity and can detect epilepsy. Visual inspection of an EEG signal for epileptic seizure detection is a time-consuming process and may lead to human error; therefore, recently, a number of automated seizure detection frameworks were proposed to replace these traditional methods. Feature extraction and classification are two important steps in these procedures. Feature extraction focuses on finding the informative features that could be used for classification and correct decision-making. Therefore, proposing effective feature extraction techniques for seizure detection is of great significance. Principal Component Analysis (PCA) is a dimensionality reduction technique used in different fields of pattern recognition including EEG signal classification. Global modular PCA (GModPCA) is a variation of PCA. In this paper, an effective framework with GModPCA and Support Vector Machine (SVM) is presented for epileptic seizure detection in EEG signals. The feature extraction is performed with GModPCA, whereas SVM trained with radial basis function kernel performed the classification between seizure and nonseizure EEG signals. Seven different experimental cases were conducted on the benchmark epilepsy EEG dataset. The system performance was evaluated using 10-fold cross-validation. In addition, we prove analytically that GModPCA has less time and space complexities as compared to PCA. The experimental results show that EEG signals have strong inter-sub-pattern correlations. GModPCA and SVM have been able to achieve 100% accuracy for the classification between normal and epileptic signals. Along with this, seven different experimental cases were tested. The classification results of the proposed approach were better than those of some existing methods proposed in the literature.
The time and space complexities of GModPCA are also found to be lower than those of PCA. This study suggests that GModPCA and SVM could be used for automated epileptic seizure detection in EEG signals.
Assessing Hurricane Katrina Damage to the Mississippi Gulf Coast Using IKONOS Imagery
NASA Technical Reports Server (NTRS)
Spruce, Joseph; McKellip, Rodney
2006-01-01
Hurricane Katrina hit southeastern Louisiana and the Mississippi Gulf Coast as a Category 3 hurricane with storm surges as high as 9 m. Katrina devastated several coastal towns by destroying or severely damaging hundreds of homes. Several Federal agencies are assessing storm impacts and assisting recovery using high-spatial-resolution remotely sensed data from satellite and airborne platforms. High-quality IKONOS satellite imagery was collected on September 2, 2005, over southwestern Mississippi. Pan-sharpened IKONOS multispectral data and ERDAS IMAGINE software were used to classify post-storm land cover for coastal Hancock and Harrison Counties. This classification included a storm debris category of interest to FEMA for disaster mitigation. The classification resulted from combining traditional unsupervised and supervised classification techniques. Higher spatial resolution aerial and handheld photography were used as reference data. Results suggest that traditional classification techniques and IKONOS data can map wood-dominated storm debris in open areas if relevant training areas are used to develop the unsupervised classification signatures. IKONOS data also enabled other hurricane damage assessment, such as flood-deposited mud on lawns and vegetation foliage loss from the storm. IKONOS data has also aided regional Katrina vegetation damage surveys from multidate Land Remote Sensing Satellite and Moderate Resolution Imaging Spectroradiometer data.
Audio stream classification for multimedia database search
NASA Astrophysics Data System (ADS)
Artese, M.; Bianco, S.; Gagliardi, I.; Gasparini, F.
2013-03-01
Search and retrieval of huge archives of Multimedia data is a challenging task. A classification step is often used to reduce the number of entries on which to perform the subsequent search. In particular, when new entries of the database are continuously added, a fast classification based on simple threshold evaluation is desirable. In this work we present a CART-based (Classification And Regression Tree [1]) classification framework for audio streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History (AESS) [2], which is mainly composed of popular songs and other audio records describing the popular traditions handed down generation by generation, such as traditional fairs and customs. The peculiarities of this database are that it is continuously updated; the audio recordings are acquired in an unconstrained environment; and it is difficult for non-expert human users to create the ground-truth labels. In our experiments, half of all the available audio files have been randomly extracted and used as training set. The remaining ones have been used as test set. The classifier has been trained to distinguish among three different classes: speech, music, and song. All the audio files in the dataset have been previously manually labeled into the three classes defined above by domain experts.
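The appeal of CART here is that the fitted tree reduces classification to a cascade of threshold comparisons, which is cheap to apply to newly added recordings. A minimal sketch, using invented audio descriptors rather than the AESS features:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(5)

# Toy audio descriptors (imagine zero-crossing rate, spectral centroid,
# energy) for the three AESS classes; values are purely illustrative.
def make_class(center, n=60):
    return rng.normal(center, 0.3, size=(n, 3))

X = np.vstack([make_class([0, 0, 0]),   # speech
               make_class([2, 2, 2]),   # music
               make_class([2, 0, 2])])  # song
y = np.repeat([0, 1, 2], 60)

# CART builds axis-aligned threshold splits, so classifying a new
# recording is just a handful of comparisons down the tree.
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
acc = tree.score(X, y)
```

With real features the tree depth and split thresholds would be chosen by cross-validation rather than fixed.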
Training Classifiers with Shadow Features for Sensor-Based Human Activity Recognition.
Fong, Simon; Song, Wei; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K L
2017-02-27
In this paper, a novel training/testing process for building/using a classification model based on human activity recognition (HAR) is proposed. Traditionally, HAR has been accomplished by a classifier that learns the activities of a person by training with skeletal data obtained from a motion sensor, such as Microsoft Kinect. These skeletal data are the spatial coordinates (x, y, z) of different parts of the human body. The numeric information forms time series, temporal records of movement sequences that can be used for training a classifier. In addition to the spatial features that describe current positions in the skeletal data, new features called 'shadow features' are used to improve the supervised learning efficacy of the classifier. Shadow features are inferred from the dynamics of body movements, thereby modelling the underlying momentum of the performed activities. They provide extra dimensions of information for characterising activities in the classification process, and thereby significantly improve the classification accuracy. Two cases of HAR are tested using a classification model trained with shadow features: one using a wearable sensor and the other a Kinect-based remote sensor. Our experiments can demonstrate the advantages of the new method, which will have an impact on human activity detection research.
Training Classifiers with Shadow Features for Sensor-Based Human Activity Recognition
Fong, Simon; Song, Wei; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K. L.
2017-01-01
In this paper, a novel training/testing process for building/using a classification model based on human activity recognition (HAR) is proposed. Traditionally, HAR has been accomplished by a classifier that learns the activities of a person by training with skeletal data obtained from a motion sensor, such as Microsoft Kinect. These skeletal data are the spatial coordinates (x, y, z) of different parts of the human body. The numeric information forms time series, temporal records of movement sequences that can be used for training a classifier. In addition to the spatial features that describe current positions in the skeletal data, new features called ‘shadow features’ are used to improve the supervised learning efficacy of the classifier. Shadow features are inferred from the dynamics of body movements, thereby modelling the underlying momentum of the performed activities. They provide extra dimensions of information for characterising activities in the classification process, and thereby significantly improve the classification accuracy. Two cases of HAR are tested using a classification model trained with shadow features: one using a wearable sensor and the other a Kinect-based remote sensor. Our experiments can demonstrate the advantages of the new method, which will have an impact on human activity detection research. PMID:28264470
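One simple way to derive dynamics-based features from positional time series is to append frame-to-frame differences (a velocity proxy) to the coordinates. This is a rough stand-in for the paper's shadow features, whose exact construction may differ; the two "activities" below are invented curves with identical positions but different speeds:

```python
import numpy as np

def shadow_features(frames):
    """Augment skeletal coordinate frames with simple dynamic features.

    frames: (T, d) array of joint coordinates over time. The appended
    columns are frame-to-frame differences, modelling movement momentum.
    """
    velocity = np.diff(frames, axis=0, prepend=frames[:1])
    return np.hstack([frames, velocity])

# Two activities tracing the same circle at different speeds: their
# position columns overlap, but the velocity columns separate them.
t = np.linspace(0, 2 * np.pi, 50)[:, None]
slow = np.hstack([np.sin(t), np.cos(t)])
fast = np.hstack([np.sin(4 * t), np.cos(4 * t)])

sf_slow, sf_fast = shadow_features(slow), shadow_features(fast)
```

A classifier trained on the augmented arrays sees extra dimensions that encode how fast the body parts move, which is the intuition behind the reported accuracy gains.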
Mander, Luke; Li, Mao; Mio, Washington; Fowlkes, Charless C; Punyasena, Surangi W
2013-11-07
Taxonomic identification of pollen and spores uses inherently qualitative descriptions of morphology. Consequently, identifications are restricted to categories that can be reliably classified by multiple analysts, resulting in the coarse taxonomic resolution of the pollen and spore record. Grass pollen represents an archetypal example; it is not routinely identified below family level. To address this issue, we developed quantitative morphometric methods to characterize surface ornamentation and classify grass pollen grains. This produces a means of quantifying morphological features that are traditionally described qualitatively. We used scanning electron microscopy to image 240 specimens of pollen from 12 species within the grass family (Poaceae). We classified these species by developing algorithmic features that quantify the size and density of sculptural elements on the pollen surface, and measure the complexity of the ornamentation they form. These features yielded a classification accuracy of 77.5%. In comparison, a texture descriptor based on modelling the statistical distribution of brightness values in image patches yielded a classification accuracy of 85.8%, and seven human subjects achieved accuracies between 68.33 and 81.67%. The algorithmic features we developed directly relate to biologically meaningful features of grass pollen morphology, and could facilitate direct interpretation of unsupervised classification results from fossil material.
On the nature of global classification
NASA Technical Reports Server (NTRS)
Wheelis, M. L.; Kandler, O.; Woese, C. R.
1992-01-01
Molecular sequencing technology has brought biology into the era of global (universal) classification. Methodologically and philosophically, global classification differs significantly from traditional, local classification. The need for uniformity requires that higher level taxa be defined on the molecular level in terms of universally homologous functions. A global classification should reflect both principal dimensions of the evolutionary process: genealogical relationship and quality and extent of divergence within a group. The ultimate purpose of a global classification is not simply information storage and retrieval; such a system should also function as a heuristic representation of the evolutionary paradigm that exerts a directing influence on the course of biology. The global system envisioned allows paraphyletic taxa. To retain maximal phylogenetic information in these cases, minor notational amendments in existing taxonomic conventions should be adopted.
Divorcing Strain Classification from Species Names.
Baltrus, David A
2016-06-01
Confusion about strain classification and nomenclature permeates modern microbiology. Although taxonomists have traditionally acted as gatekeepers of order, the numbers of, and speed at which, new strains are identified has outpaced the opportunity for professional classification for many lineages. Furthermore, the growth of bioinformatics and database-fueled investigations have placed metadata curation in the hands of researchers with little taxonomic experience. Here I describe practical challenges facing modern microbial taxonomy, provide an overview of complexities of classification for environmentally ubiquitous taxa like Pseudomonas syringae, and emphasize that classification can be independent of nomenclature. A move toward implementation of relational classification schemes based on inherent properties of whole genomes could provide sorely needed continuity in how strains are referenced across manuscripts and data sets. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Xu, Z.; Guan, K.; Peng, B.; Casler, N. P.; Wang, S. W.
2017-12-01
Landscape has complex three-dimensional features. These 3D features are difficult to extract using conventional methods. Small-footprint LiDAR provides an ideal way for capturing these features. Existing approaches, however, have been relegated to raster or metric-based (two-dimensional) feature extraction from the upper or bottom layer, and thus are not suitable for resolving morphological and intensity features that could be important to fine-scale land cover mapping. Therefore, this research combines airborne LiDAR and multi-temporal Landsat imagery to classify land cover types of Williamson County, Illinois that has diverse and mixed landscape features. Specifically, we applied a 3D convolutional neural network (CNN) method to extract features from LiDAR point clouds by (1) creating an occupancy grid and an intensity grid at 1-meter resolution, and then (2) normalizing and incorporating data into a 3D CNN feature extractor for many epochs of learning. The learned features (e.g., morphological features, intensity features, etc.) were combined with multi-temporal spectral data to enhance the performance of land cover classification based on a Support Vector Machine classifier. We used photo interpretation for training and testing data generation. The classification results show that our approach outperforms traditional methods using LiDAR derived feature maps, and promises to serve as an effective methodology for creating high-quality land cover maps through fusion of complementary types of remote sensing data.
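The gridding step, turning an irregular point cloud into a regular 3D array a CNN can consume, can be sketched as below. Grid size, resolution, and the synthetic cloud are invented for illustration; the abstract's pipeline additionally builds an intensity grid and then trains the 3D CNN:

```python
import numpy as np

def occupancy_grid(points, resolution=1.0, shape=(16, 16, 16)):
    """Bin a LiDAR point cloud (N, 3) into a binary occupancy grid.

    Each point is floored to a voxel index at the given resolution;
    points falling outside the grid extent are discarded.
    """
    idx = np.floor(points / resolution).astype(int)
    keep = np.all((idx >= 0) & (idx < np.array(shape)), axis=1)
    grid = np.zeros(shape, dtype=np.uint8)
    grid[tuple(idx[keep].T)] = 1
    return grid

rng = np.random.default_rng(7)
cloud = rng.uniform(0, 16, size=(500, 3))  # synthetic points, metres
grid = occupancy_grid(cloud)

# A single point at (2.5, 3.1, 0.2) m occupies voxel (2, 3, 0).
probe = occupancy_grid(np.array([[2.5, 3.1, 0.2]]))
```

Stacking such occupancy and intensity grids along a channel axis yields the multi-channel 3D input that a voxel CNN expects.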
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regression, for binary outcomes, and multiple linear regression for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method, developed by Breiman in 1984, is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups according to a certain criterion. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have been published to date. Two previously published examples using CART analysis in this field are explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
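A minimal CART illustration with scikit-learn's DecisionTreeClassifier; the data and the bleeding-risk rule are synthetic stand-ins for illustration, not the haemophilia studies discussed in the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
age = rng.uniform(10, 70, 200)
factor_level = rng.uniform(0, 40, 200)    # hypothetical factor VIII activity (%)
bleeds = (factor_level < 10).astype(int)  # invented rule: bleeds if activity < 10%

X = np.column_stack([age, factor_level])
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, bleeds)
print(tree.score(X, bleeds))              # -> 1.0
```

The fitted tree recovers the single factor-level split, which is exactly the kind of easily interpreted partition that makes CART attractive to non-statisticians.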
Improving the performance of univariate control charts for abnormal detection and classification
NASA Astrophysics Data System (ADS)
Yiakopoulos, Christos; Koutsoudaki, Maria; Gryllias, Konstantinos; Antoniadis, Ioannis
2017-03-01
Bearing failures in rotating machinery can cause machine breakdown and economic loss if no effective actions are taken in time. Therefore, it is of prime importance to detect accurately the presence of faults, especially at their early stage, to prevent subsequent damage and reduce costly downtime. Machinery fault diagnosis follows a roadmap of data acquisition, feature extraction, and diagnostic decision making, in which mechanical vibration fault feature extraction is the foundation and the key to obtaining an accurate diagnostic result. A challenge in this area is the selection of the most sensitive features for various types of fault, especially when the characteristics of failures are difficult to extract. Thus, a plethora of complex data-driven fault diagnosis methods are fed by prominent features, which are extracted and reduced through traditional or modern algorithms. Since most of the available datasets are captured during normal operating conditions, a number of novelty detection methods able to work when only normal data are available have been developed over the last decade. In this study, a hybrid method combining univariate control charts and a feature extraction scheme is introduced, focusing on abnormal change detection and classification, under the assumption that measurements under normal operating conditions of the machinery are available. The feature extraction method integrates morphological operators and Morlet wavelets. The effectiveness of the proposed methodology is validated on two different experimental cases with bearing faults, demonstrating that the proposed approach can improve the fault detection and classification performance of conventional control charts.
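The univariate control chart idea, learning 3-sigma Shewhart limits from healthy-condition features only, can be sketched as below; the feature values are invented and the sketch omits the paper's morphological/Morlet feature extraction.

```python
import numpy as np

def control_limits(baseline):
    """3-sigma Shewhart limits estimated from healthy-condition features only."""
    mu, sigma = baseline.mean(), baseline.std(ddof=1)
    return mu - 3 * sigma, mu + 3 * sigma

rng = np.random.default_rng(1)
healthy = rng.normal(1.0, 0.05, 500)   # e.g. a wavelet-energy feature, normal operation
lcl, ucl = control_limits(healthy)

new_feature = 1.4                      # feature from a possibly faulty bearing
alarm = not (lcl <= new_feature <= ucl)
print(alarm)                           # -> True
```

Because the limits are fitted to normal-condition data alone, this is a novelty detection scheme: no fault examples are needed at training time.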
Groenendyk, Derek G.; Ferré, Ty P.A.; Thorp, Kelly R.; Rice, Amy K.
2015-01-01
Soils lie at the interface between the atmosphere and the subsurface and are a key component that controls ecosystem services, food production, and many other processes at the Earth's surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resource Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous for use in applications ranging from trafficability to irrigation management to flood control.
The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization of landscape function, suggest that hydrologic-process-based classifications should be incorporated into environmental process models and can be used to define application-specific maps of hydrologic function. PMID:26121466
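A toy version of the clustering step, grouping soils by simulated drainage response rather than texture, might look like this; the synthetic exponential recession curves below stand in for HYDRUS-1D output.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
t = np.linspace(0, 10, 50)
# two hypothetical response families: fast- and slow-draining soils
fast = np.exp(-1.5 * t) + rng.normal(0, 0.01, (20, 50))
slow = np.exp(-0.2 * t) + rng.normal(0, 0.01, (20, 50))
responses = np.vstack([fast, slow])   # one simulated drainage curve per soil

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(responses)
```

Each soil is represented by its whole simulated response curve, so two soils of different texture that drain alike land in the same class, which is the paper's central point.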
Lovette, I.J.; Perez-Eman, J. L.; Sullivan, J.P.; Banks, R.C.; Fiorentino, I.; Cordoba-Cordoba, S.; Echeverry-Galvis, M.; Barker, F.K.; Burns, K.J.; Klicka, J.; Lanyon, Scott M.; Bermingham, E.
2010-01-01
The birds in the family Parulidae-commonly termed the New World warblers or wood-warblers-are a classic model radiation for studies of ecological and behavioral differentiation. Although the monophyly of a 'core' wood-warbler clade is well established, no phylogenetic hypothesis for this group has included a full sampling of wood-warbler species diversity. We used parsimony, maximum likelihood, and Bayesian methods to reconstruct relationships among all genera and nearly all wood-warbler species, based on a matrix of mitochondrial DNA (5840 nucleotides) and nuclear DNA (6 loci, 4602 nucleotides) characters. The resulting phylogenetic hypotheses provide a highly congruent picture of wood-warbler relationships, and indicate that the traditional generic classification of these birds recognizes many non-monophyletic groups. We recommend a revised taxonomy in which each of 14 genera (Seiurus, Helmitheros, Mniotilta, Limnothlypis, Protonotaria, Parkesia, Vermivora, Oreothlypis, Geothlypis, Setophaga, Myioborus, Cardellina, Basileuterus, Myiothlypis) corresponds to a well-supported clade; these nomenclatural changes also involve subsuming a number of well-known, traditional wood-warbler genera (Catharopeza, Dendroica, Ergaticus, Euthlypis, Leucopeza, Oporornis, Parula, Phaeothlypis, Wilsonia). We provide a summary phylogenetic hypothesis that will be broadly applicable to investigations of the historical biogeography, processes of diversification, and evolution of trait variation in this well studied avian group. © 2010 Elsevier Inc.
Vd'ačný, Peter; Bourland, William A; Orsi, William; Epstein, Slava S; Foissner, Wilhelm
2011-05-01
The class Litostomatea is a highly diverse ciliate taxon comprising hundreds of species ranging from aerobic, free-living predators to anaerobic endocommensals. This is traditionally reflected by classifying the Litostomatea into the subclasses Haptoria and Trichostomatia. The morphological classifications of the Haptoria conflict with the molecular phylogenies, which indicate polyphyly and numerous homoplasies. Thus, we analyzed the genealogy of 53 in-group species with morphological and molecular methods, including 12 new sequences from free-living taxa. The phylogenetic analyses and some strong morphological traits show: (i) body polarization and simplification of the oral apparatus as main evolutionary trends in the Litostomatea and (ii) three distinct lineages (subclasses): the Rhynchostomatia comprising Tracheliida and Dileptida; the Haptoria comprising Lacrymariida, Haptorida, Didiniida, Pleurostomatida and Spathidiida; and the Trichostomatia. The curious Homalozoon cannot be assigned to any of the haptorian orders, but is basal to a clade containing the Didiniida and Pleurostomatida. The internal relationships of the Spathidiida remain obscure because many of them and some "traditional" haptorids form separate branches within the basal polytomy of the order, indicating one or several radiations and convergent evolution. Due to the high divergence in the 18S rRNA gene, the chaeneids and cyclotrichiids are classified incertae sedis. Copyright © 2011 Elsevier Inc. All rights reserved.
Palm-Vein Classification Based on Principal Orientation Features
Zhou, Yujia; Liu, Yaqin; Feng, Qianjin; Yang, Feng; Huang, Jing; Nie, Yixiao
2014-01-01
Personal recognition using palm-vein patterns has emerged as a promising alternative for human recognition because of its uniqueness, stability, live body identification, flexibility, and resistance to counterfeiting. With the expanding application of palm-vein pattern recognition, the corresponding growth of databases has resulted in long response times. To shorten the response time of identification, this paper proposes a simple and useful classification for palm-vein identification based on principal direction features. In the registration process, the Gaussian-Radon transform is adopted to extract the orientation matrix and then compute the principal direction of a palm-vein image based on the orientation matrix. The database can be classified into six bins based on the value of the principal direction. In the identification process, the principal direction of the test sample is first extracted to ascertain the corresponding bin. One-by-one matching with the training samples is then performed in the bin. To improve recognition efficiency while maintaining recognition accuracy, the two neighboring bins of the corresponding bin are also searched to identify the input palm-vein image. Evaluation experiments are conducted on three different databases, namely, PolyU, CASIA, and the database of this study. Experimental results show that the searching range of one test sample in PolyU, CASIA, and our database by the proposed method for palm-vein identification can be reduced to 14.29%, 14.50%, and 14.28%, with retrieval accuracy of 96.67%, 96.00%, and 97.71%, respectively. With 10,000 training samples in the database, the execution time of the identification process by the traditional method is 18.56 s, while that by the proposed approach is 3.16 s. The experimental results confirm that the proposed approach is more efficient than the traditional method, especially for a large database. PMID:25383715
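The binning logic, quantizing a principal orientation into six bins and searching the matched bin plus its two neighbours, can be sketched as follows; the bin count matches the paper, but the function names and the [0, 180)-degree angle convention are assumptions.

```python
N_BINS = 6  # principal orientations quantized into six bins, as in the paper

def direction_bin(angle_deg):
    """Map a principal orientation in [0, 180) degrees to one of six bins."""
    return int(angle_deg // (180 / N_BINS)) % N_BINS

def candidate_bins(b):
    """Search the matched bin plus its two neighbours, wrapping around."""
    return [(b - 1) % N_BINS, b, (b + 1) % N_BINS]

print(direction_bin(95.0))    # -> 3
print(candidate_bins(0))      # -> [5, 0, 1]
```

Searching 3 of 6 bins explains the reported ~14% search range: only samples whose principal direction falls in the neighbourhood of the query's bin are matched one by one.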
NASA Astrophysics Data System (ADS)
Burton, Dallas Jonathan
Laser-based diagnostics has been a topic of research in many areas, most notably environmental studies, military defense technologies, and medicine. In this dissertation, a novel laser-based optical diagnostic method, differential laser-induced perturbation spectroscopy (DLIPS), has been implemented in a spectroscopy mode and expanded into an imaging mode in combination with fluorescence techniques. The DLIPS method takes advantage of deep ultraviolet (UV) laser perturbation at sub-ablative energy fluences to photochemically cleave bonds and alter fluorescence signal response before and after perturbation. The resulting difference spectrum or differential image adds more information about the target specimen, and can be used in combination with traditional fluorescence techniques for detection of certain materials, characterization of many materials and biological specimens, and diagnosis of various human skin conditions. The differential aspect allows for mitigation of patient or sample variation, and has the potential to develop into a powerful, noninvasive optical sensing tool. The studies in this dissertation encompass efforts to continue the fundamental research on DLIPS, including expansion of the method to an imaging mode. Five primary studies have been carried out and presented. These include the use of DLIPS in a spectroscopy mode for analysis of nitrogen-based explosives on various substrates, classification of Caribbean fruit flies versus Caribbean fruit flies that have been irradiated with gamma rays, and diagnosis of human skin cancer lesions. The nitrogen-based explosives and Caribbean fruit flies have been analyzed with the DLIPS scheme using the imaging modality, providing complementary information to the spectroscopic scheme.
In each study, a comparison between absolute fluorescence signals and DLIPS responses showed that DLIPS statistically outperformed traditional fluorescence techniques with regards to regression error and classification.
Should Classification of the UK Honours Degree Have a Future?
ERIC Educational Resources Information Center
Elton, Lewis
2004-01-01
The classified honours degree has so much prestige and so venerable a tradition that only very serious and systemic changes could justify the question as to whether classification has a future. However, while this paper argues that such changes have indeed taken place in the past 30 years, the main arguments for change are pedagogical. The…
The research of "blind" spot in the LVQ network
NASA Astrophysics Data System (ADS)
Guo, Zhanjie; Nan, Shupo; Wang, Xiaoli
2017-04-01
Competitive neural networks are now widely used in pattern recognition, classification, and other tasks, and show clear advantages over traditional clustering methods. They nevertheless remain inadequate in several respects and need further improvement. Building on the Learning Vector Quantization network proposed by Kohonen [1], this paper addresses the large training error that arises when a network contains "blind" spots, by introducing threshold-based learning rules, and implements the approach in Matlab.
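For reference, the basic LVQ1 update that such threshold rules extend can be sketched as below (a hypothetical helper, not the paper's code). A "blind" prototype is one that never wins the distance competition and is therefore never updated; the paper's threshold rules target winner selection, which this sketch leaves standard.

```python
import numpy as np

def lvq1_step(prototypes, proto_labels, x, y, lr=0.1):
    """One LVQ1 update: pull the winner toward x if labels agree, else push it away."""
    w = int(np.argmin(np.linalg.norm(prototypes - x, axis=1)))  # nearest prototype
    sign = 1.0 if proto_labels[w] == y else -1.0
    prototypes[w] += sign * lr * (x - prototypes[w])
    return w

protos = np.array([[0.0, 0.0], [5.0, 5.0]])
labels = [0, 1]
lvq1_step(protos, labels, np.array([1.0, 1.0]), 0)   # correct label: pulled closer
print(protos[0])                                     # -> [0.1 0.1]
```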
Uberon, an integrative multi-species anatomy ontology
2012-01-01
We present Uberon, an integrated cross-species ontology consisting of over 6,500 classes representing a variety of anatomical entities, organized according to traditional anatomical classification criteria. The ontology represents structures in a species-neutral way and includes extensive associations to existing species-centric anatomical ontologies, allowing integration of model organism and human data. Uberon provides a necessary bridge between anatomical structures in different taxa for cross-species inference. It uses novel methods for representing taxonomic variation, and has proved to be essential for translational phenotype analyses. Uberon is available at http://uberon.org PMID:22293552
NASA Astrophysics Data System (ADS)
Qi, Yong; Lei, Kai; Zhang, Lizeqing; Xing, Ximing; Gou, Wenyue
2018-06-01
This paper introduces the development of self-service, medical-data-assisted diagnosis software for cervical cancer based on artificial neural networks (SVN, FNN, KNN). The system is built around the idea of a self-service platform, supported by the application and innovation of neural network algorithms in medical data identification. Furthermore, it combines advanced methods from various fields to address the complexity and inaccuracy of handling cervical cancer data in traditional manual diagnosis.
International experience on the use of artificial neural networks in gastroenterology.
Grossi, E; Mancini, A; Buscema, M
2007-03-01
In this paper, we reconsider the scientific background for the use of artificial intelligence tools in medicine. A review of some recent significant papers shows that artificial neural networks, among the more advanced and effective artificial intelligence techniques, can improve the classification accuracy and survival prediction for a number of gastrointestinal diseases. We discuss the 'added value' that artificial neural network-based tools can bring to the field of gastroenterology, at both the research and clinical application level, when compared with traditional statistical or clinical-pathological methods.
Gait recognition based on integral outline
NASA Astrophysics Data System (ADS)
Ming, Guan; Fang, Lv
2017-02-01
The replacement of traditional security technology by biometric identification has become a trend, and gait recognition has become a research hot spot because gait features are difficult to imitate or steal. This paper presents a gait recognition system based on the integral outline of the human body. The system has three important aspects: the preprocessing of gait images, feature extraction, and classification. Finally, a polling method is used to evaluate the performance of the system, and the problems existing in gait recognition and future directions of development are summarized.
Benchmarking protein classification algorithms via supervised cross-validation.
Kertész-Farkas, Attila; Dhir, Somdutta; Sonego, Paolo; Pacurar, Mircea; Netoteia, Sergiu; Nijveen, Harm; Kuzniar, Arnold; Leunissen, Jack A M; Kocsor, András; Pongor, Sándor
2008-04-24
Development and testing of protein classification algorithms are hampered by the fact that the protein universe is characterized by groups vastly different in the number of members, in average protein size, similarity within group, etc. Datasets based on traditional cross-validation (k-fold, leave-one-out, etc.) may not give reliable estimates on how an algorithm will generalize to novel, distantly related subtypes of the known protein classes. Supervised cross-validation, i.e., selection of test and train sets according to the known subtypes within a database has been successfully used earlier in conjunction with the SCOP database. Our goal was to extend this principle to other databases and to design standardized benchmark datasets for protein classification. Hierarchical classification trees of protein categories provide a simple and general framework for designing supervised cross-validation strategies for protein classification. Benchmark datasets can be designed at various levels of the concept hierarchy using a simple graph-theoretic distance. A combination of supervised and random sampling was selected to construct reduced size model datasets, suitable for algorithm comparison. Over 3000 new classification tasks were added to our recently established protein classification benchmark collection that currently includes protein sequence (including protein domains and entire proteins), protein structure and reading frame DNA sequence data. We carried out an extensive evaluation based on various machine-learning algorithms such as nearest neighbor, support vector machines, artificial neural networks, random forests and logistic regression, used in conjunction with comparison algorithms, BLAST, Smith-Waterman, Needleman-Wunsch, as well as 3D comparison methods DALI and PRIDE. The resulting datasets provide lower, and in our opinion more realistic estimates of the classifier performance than do random cross-validation schemes. 
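The idea of supervised cross-validation, holding out whole known subtypes so the test set contains only groups unseen during training, can be sketched with scikit-learn's LeaveOneGroupOut; synthetic clusters stand in for protein subtypes here.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)
X_parts, y, groups = [], [], []
# three synthetic "subtypes" per class, each with its own cluster centre
for g, (centre, label) in enumerate([(0, 0), (1, 0), (2, 0), (10, 1), (11, 1), (12, 1)]):
    X_parts.append(rng.normal(centre, 0.3, (15, 2)))
    y += [label] * 15
    groups += [g] * 15
X, y, groups = np.vstack(X_parts), np.array(y), np.array(groups)

scores = []
for tr, te in LeaveOneGroupOut().split(X, y, groups):   # each fold holds out one subtype
    clf = KNeighborsClassifier(n_neighbors=3).fit(X[tr], y[tr])
    scores.append(clf.score(X[te], y[te]))
print(np.mean(scores))
```

Unlike random k-fold, no member of the held-out subtype ever appears in training, which is what yields the lower, more realistic performance estimates the authors report.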
Corcoran, Jennifer M.; Knight, Joseph F.; Gallant, Alisa L.
2013-01-01
Wetland mapping at the landscape scale using remotely sensed data requires both affordable data and an efficient accurate classification method. Random forest classification offers several advantages over traditional land cover classification techniques, including a bootstrapping technique to generate robust estimations of outliers in the training data, as well as the capability of measuring classification confidence. Though the random forest classifier can generate complex decision trees with a multitude of input data and still not run a high risk of overfitting, there is a great need to reduce computational and operational costs by including only key input data sets without sacrificing a significant level of accuracy. Our main questions for this study site in Northern Minnesota were: (1) how does classification accuracy and confidence of mapping wetlands compare using different remote sensing platforms and sets of input data; (2) what are the key input variables for accurate differentiation of upland, water, and wetlands, including wetland type; and (3) which datasets and seasonal imagery yield the best accuracy for wetland classification. Our results show the key input variables include terrain (elevation and curvature) and soils descriptors (hydric), along with an assortment of remotely sensed data collected in the spring (satellite visible, near infrared, and thermal bands; satellite normalized vegetation index and Tasseled Cap greenness and wetness; and horizontal-horizontal (HH) and horizontal-vertical (HV) polarization using L-band satellite radar). We undertook this exploratory analysis to inform decisions by natural resource managers charged with monitoring wetland ecosystems and to aid in designing a system for consistent operational mapping of wetlands across landscapes similar to those found in Northern Minnesota.
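A compact sketch of using random forest importances to screen candidate input layers, as the study does for terrain and spectral variables; the data and the wetland rule below are invented.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
n = 400
elevation = rng.normal(300, 20, n)   # hypothetical terrain layer (m)
greenness = rng.uniform(0, 1, n)     # hypothetical Tasseled Cap greenness
noise = rng.uniform(0, 1, n)         # an uninformative layer
wetland = ((elevation < 300) & (greenness > 0.5)).astype(int)  # invented rule

X = np.column_stack([elevation, greenness, noise])
rf = RandomForestClassifier(n_estimators=200, oob_score=True,
                            random_state=0).fit(X, wetland)
print(rf.feature_importances_)       # the noise layer ranks last
```

The out-of-bag score (`rf.oob_score_`) comes for free from the bootstrapping the abstract mentions, and low-importance layers are candidates for removal to cut operational costs.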
A Visual mining based framework for classification accuracy estimation
NASA Astrophysics Data System (ADS)
Arun, Pattathal Vijayakumar
2013-12-01
Classification techniques have been widely used in different remote sensing applications, where correct classification of mixed pixels is a tedious task. Traditional approaches adopt various statistical parameters but do not facilitate effective visualisation. Data mining tools are proving very helpful in the classification process. We propose a visual mining based framework for accuracy assessment of classification techniques using open source tools such as WEKA and PREFUSE. In integration, these tools can provide an efficient approach for obtaining information about improvements in classification accuracy and help in refining the training data set. We have illustrated the framework by investigating the effects of various resampling methods on classification accuracy and found that bilinear (BL) resampling is best suited for preserving radiometric characteristics. We have also investigated the optimal number of folds required for effective analysis of LISS-IV images.
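The accuracy-assessment core of such a framework reduces to summaries of the confusion matrix; a minimal sketch follows, with invented matrix values.

```python
import numpy as np

def accuracy_summary(cm):
    """Overall accuracy and per-class recall from a confusion matrix (rows = reference)."""
    cm = np.asarray(cm, dtype=float)
    overall = np.trace(cm) / cm.sum()
    recall = np.diag(cm) / cm.sum(axis=1)
    return overall, recall

cm = [[50, 5], [10, 35]]   # e.g. reference counts for two land cover classes
oa, rec = accuracy_summary(cm)
print(oa)                  # -> 0.85
```

Comparing such summaries across resampling methods or fold counts is what the visual mining framework helps a user do at a glance.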
NASA Astrophysics Data System (ADS)
Maksimyuk, V. A.; Storozhuk, E. A.; Chernyshenko, I. S.
2012-11-01
Variational finite-difference methods of solving linear and nonlinear problems for thin and nonthin shells (plates) made of homogeneous isotropic (metallic) and orthotropic (composite) materials are analyzed and their classification principles and structure are discussed. Scalar and vector variational finite-difference methods that implement the Kirchhoff-Love hypotheses analytically or algorithmically using Lagrange multipliers are outlined. The Timoshenko hypotheses are implemented in a traditional way, i.e., analytically. The stress-strain state of metallic and composite shells of complex geometry is analyzed numerically. The numerical results are presented in the form of graphs and tables and used to assess the efficiency of using the variational finite-difference methods to solve linear and nonlinear problems of the statics of shells (plates).
Efficient method of image edge detection based on FSVM
NASA Astrophysics Data System (ADS)
Cai, Aiping; Xiong, Xiaomei
2013-07-01
For efficient object edge detection in digital images, this paper studies traditional methods and SVM-based algorithms. Analysis of the Canny edge detection algorithm shows that it produces some pseudo-edges and has poor noise immunity. To provide a reliable edge extraction method, a new detection algorithm based on a fuzzy support vector machine (FSVM) is proposed. It consists of several steps: first, classification samples are trained and different membership functions are assigned to different samples. Then a new training sample set is formed by increasing the penalty on some misclassified sub-samples, and the new FSVM classification model is trained and tested on it. Finally, the edges of the object image are extracted using the model. Experimental results show that good edge detection images are obtained, and experiments with added noise show that the method has good anti-noise capability.
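The fuzzy-membership idea can be approximated in scikit-learn by passing per-sample weights to an ordinary SVC; this is a sketch of the concept, not the paper's FSVM formulation, and the membership function and synthetic edge/non-edge features are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(5)
edge = rng.normal([2.0, 2.0], 0.3, (50, 2))      # invented edge-pixel features
non_edge = rng.normal([0.0, 0.0], 0.3, (50, 2))
X = np.vstack([edge, non_edge])
y = np.array([1] * 50 + [0] * 50)

# hypothetical fuzzy membership: down-weight samples far from their class centre
centres = np.where(y[:, None] == 1, 2.0, 0.0)
membership = 1.0 / (1.0 + np.linalg.norm(X - centres, axis=1))

clf = SVC(kernel="rbf").fit(X, y, sample_weight=membership)
```

Down-weighting outlying (likely noisy) samples is what gives FSVM its robustness to noise relative to a plain SVM, where every sample contributes equally to the margin penalty.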
A classification of user-generated content into consumer decision journey stages.
Vázquez, Silvia; Muñoz-García, Óscar; Campanella, Inés; Poch, Marc; Fisas, Beatriz; Bel, Nuria; Andreu, Gloria
2014-10-01
In the last decades, the availability of digital user-generated documents from social media has dramatically increased. This massive growth of user-generated content has also affected traditional shopping behaviour. Customers have embraced new communication channels such as microblogs and social networks that enable them not only to talk with friends and acquaintances about their shopping experiences, but also to search for opinions expressed by complete strangers as part of their decision-making processes. Uncovering how customers feel about specific products or brands and detecting purchase habits and preferences has traditionally been a costly and highly time-consuming task which involved the use of methods such as focus groups and surveys. However, the new scenario calls for a deep assessment of current market research techniques in order to better interpret and profit from this ever-growing stream of attitudinal data. With this purpose, we present a novel analysis and classification of user-generated content in terms of its belonging to one of the four stages of the Consumer Decision Journey (Court et al., 2009), i.e., the purchase process from the moment when a customer is aware of the existence of the product to the moment when he or she buys, experiences, and talks about it. Using a corpus of short texts written in English and Spanish and extracted from different social media, we identify a set of linguistic patterns for each purchase stage that are then used in a rule-based classifier. Additionally, we use machine learning algorithms to automatically identify business indicators such as the Marketing Mix elements (McCarthy and Brogowicz, 1981). The classification of the purchase stages achieves an average precision of 74%, and the proposed classification of texts by the Marketing Mix elements expressed achieves an average precision of 75% across all the elements analysed. Copyright © 2014 Elsevier Ltd. All rights reserved.
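A rule-based stage classifier of the kind described reduces to ordered pattern matching; the patterns below are invented for illustration, not the linguistic patterns identified in the paper.

```python
import re

# Toy rules mapping posts to consumer-decision-journey stages (illustrative only)
RULES = [
    ("evaluate", re.compile(r"\b(should i buy|vs\.?|better than|recommend)\b", re.I)),
    ("purchase", re.compile(r"\b(just bought|ordered|purchased)\b", re.I)),
    ("experience", re.compile(r"\b(love|hate|broke|works great)\b", re.I)),
    ("awareness", re.compile(r"\b(heard about|saw an ad|new)\b", re.I)),
]

def classify_stage(text):
    for stage, pattern in RULES:      # first matching rule wins
        if pattern.search(text):
            return stage
    return "unknown"

print(classify_stage("Just bought the new X200!"))   # -> purchase
```

A real system would learn or curate far richer bilingual patterns per stage; the point here is only the rule-ordering mechanism.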
A Vision-Based Counting and Recognition System for Flying Insects in Intelligent Agriculture.
Zhong, Yuanhong; Gao, Junyuan; Lei, Qilun; Zhou, Yao
2018-05-09
Rapid and accurate counting and recognition of flying insects are of great importance, especially for pest control. Traditional manual identification and counting of flying insects is labor intensive and inefficient. In this study, a vision-based counting and classification system for flying insects is designed and implemented. The system is constructed as follows: firstly, a yellow sticky trap is installed in the surveillance area to trap flying insects and a camera is set up to collect real-time images. Then the detection and coarse counting method based on You Only Look Once (YOLO) object detection, the classification method and fine counting based on Support Vector Machines (SVM) using global features are designed. Finally, the insect counting and recognition system is implemented on Raspberry PI. Six species of flying insects including bee, fly, mosquito, moth, chafer and fruit fly are selected to assess the effectiveness of the system. Compared with the conventional methods, the test results show promising performance. The average counting accuracy is 92.50% and average classifying accuracy is 90.18% on Raspberry PI. The proposed system is easy-to-use and provides efficient and accurate recognition data, therefore, it can be used for intelligent agriculture applications.
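The fine-counting step, tallying per-species totals from detections that a trained classifier has labelled, can be sketched as follows; the stand-in classifier and species list are illustrative only.

```python
from collections import Counter

def fine_count(detections, classify):
    """Tally species counts over detected insects; `classify` labels one detection."""
    return Counter(classify(d) for d in detections)

species_of = {0: "bee", 1: "fly", 2: "mosquito"}   # stand-in classifier lookup
detections = [0, 1, 1, 2, 0, 1]                    # dummy features from detector crops
counts = fine_count(detections, lambda d: species_of[d])
print(counts["fly"], counts["bee"])   # -> 3 2
```

In the paper's pipeline, the coarse count is simply the number of YOLO detections, while the SVM supplies the per-detection label consumed here.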
Macaluso, P J
2011-02-01
Digital photogrammetric methods were used to collect diameter, area, and perimeter data of the acetabulum for a twentieth-century skeletal sample from France (Georges Olivier Collection, Musée de l'Homme, Paris) consisting of 46 males and 36 females. The measurements were then subjected to both discriminant function and logistic regression analyses in order to develop osteometric standards for sex assessment. Univariate discriminant functions and logistic regression equations yielded overall correct classification accuracy rates for both the left and the right acetabula ranging from 84.1% to 89.6%. The multivariate models developed in this study did not provide increased accuracy over those using only a single variable. Classification sex bias ratios ranged between 1.1% and 7.3% for the majority of models. The results of this study, therefore, demonstrate that metric analysis of acetabular size provides a highly accurate, and easily replicable, method of discriminating sex in this documented skeletal collection. The results further suggest that the addition of area and perimeter data derived from digital images may provide a more effective method of sex assessment than that offered by traditional linear measurements alone. Copyright © 2010 Elsevier GmbH. All rights reserved.
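The univariate logistic-regression approach to sex assessment used above can be illustrated with a minimal sketch: a single acetabular measurement predicts the probability of one sex. The measurement values and the gradient-descent fitting procedure below are illustrative assumptions, not the study's data or software.

```python
# Minimal sketch: P(male | x) = sigmoid(a*x + b) fitted to one
# measurement (e.g. acetabular diameter). Values are illustrative.
import numpy as np

def fit_logistic_1d(x, y, lr=0.1, steps=5000):
    """Fit a univariate logistic regression by gradient descent.

    x: measurements; y: 1 = male, 0 = female. Returns a predictor
    mapping new measurements to P(male)."""
    mu, sd = x.mean(), x.std()
    z = (x - mu) / sd                      # standardise for stable steps
    a, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(a * z + b)))
        a -= lr * np.mean((p - y) * z)     # d(log-loss)/da
        b -= lr * np.mean(p - y)           # d(log-loss)/db
    return lambda new: 1.0 / (1.0 + np.exp(-(a * (np.asarray(new) - mu) / sd + b)))
```

Classification accuracy is then read off by thresholding the predicted probability at 0.5, as in the discriminant-function comparison above.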
Arvanitoyannis, Ioannis S; Vlachos, Antonios
2007-01-01
The authenticity of products labeled as olive oils, and in particular as virgin olive oils, is a very important issue in terms of both health and commerce. In view of the continuously increasing interest in the therapeutic properties of virgin olive oil, the traditional methods of characterization and physical and sensory analysis have been enriched with more advanced and sophisticated methods such as HPLC-MS, HPLC-GC/C/IRMS, RPLC-GC, DEPT, and CSIA, among others. The results of both traditional and "novel" methods were treated by means of classical multivariate analysis (cluster, principal component, correspondence, canonical, and discriminant) and artificial intelligence methods, showing that nowadays the adulteration of virgin olive oil with seed oil is detectable at very low percentages, sometimes even below 1%. Furthermore, detection of the geographical origin of olive oil is equally feasible and much more accurate in countries like Italy and Spain, where databases of physical/chemical properties exist. However, this geographical origin classification can also be accomplished in the absence of such databases, provided that an adequate number of oil samples is used and the parameters studied have "discriminating power."
Fuzzy ontologies for semantic interpretation of remotely sensed images
NASA Astrophysics Data System (ADS)
Djerriri, Khelifa; Malki, Mimoun
2015-10-01
Object-based image classification consists in assigning objects that share similar attributes to object categories. To perform such a task, the remote sensing expert uses his or her personal knowledge, which is rarely formalized. Ontologies have been proposed as a solution for representing domain knowledge agreed upon by domain experts in a formal and machine-readable language. Classical ontology languages are not appropriate for dealing with imprecision or vagueness in knowledge. Fortunately, Description Logics for the semantic web have been enhanced by various approaches to handle such knowledge. This paper presents an extension of traditional ontology-based interpretation with a fuzzy ontology of the main land-cover classes (vegetation, built-up areas, water bodies, shadow, clouds, forests) for objects in Landsat-8 OLI scenes. A good classification of image objects was obtained, and the results highlight the potential of the method to be replicated over time and space in the perspective of transferability of the procedure.
Hamit, Murat; Yun, Weikang; Yan, Chuanbo; Kutluk, Abdugheni; Fang, Yang; Alip, Elzat
2015-06-01
Image feature extraction is an important part of image processing and an important field of research and application of image processing technology. Uygur medicine is a branch of traditional Chinese medicine that is receiving growing research attention, but large amounts of Uygur medicine data have not been fully utilized. In this study, we extracted the color histogram features of images of herbal and zooid medicines of Xinjiang Uygur. First, we performed preprocessing, including image color enhancement, size normalization and color space transformation. Then we extracted color histogram features and analyzed them with statistical methods. Finally, we evaluated the classification ability of the features by Bayes discriminant analysis. Experimental results showed that high accuracy for Uygur medicine image classification was obtained by using color histogram features. This study may assist content-based medical image retrieval for Xinjiang Uygur medicine.
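A minimal sketch of the color histogram feature described above: per-channel normalised histograms concatenated into one vector. The bin count of 16 per channel is an assumed parameter, not a value stated in the abstract.

```python
# Sketch of a colour-histogram feature: one normalised histogram per
# RGB channel, concatenated. Bin count is an assumption.
import numpy as np

def color_histogram(image, bins=16):
    """image: H x W x 3 uint8 array -> length 3*bins feature vector."""
    feats = []
    for c in range(3):                       # R, G, B channels
        h, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        feats.append(h / h.sum())            # normalise to unit mass
    return np.concatenate(feats)
```

Vectors of this form would then feed the statistical analysis and the Bayes discriminant classifier.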
NASA Technical Reports Server (NTRS)
Sabol, Donald E., Jr.; Roberts, Dar A.; Adams, John B.; Smith, Milton O.
1993-01-01
An important application of remote sensing is to map and monitor changes over large areas of the land surface. This is particularly significant given the current interest in monitoring vegetation communities. Most traditional methods for mapping different types of plant communities are based upon statistical classification techniques (e.g., parallelepiped, nearest-neighbor) applied to uncalibrated multispectral data. Classes from these techniques are typically difficult to interpret (particularly for a field ecologist/botanist). Also, classes derived from one image can be very different from those derived from another image of the same area, making interpretation of observed temporal changes nearly impossible. More recently, neural networks have been applied to classification. Neural network classification, based upon spectral matching, is weak in dealing with spectral mixtures (a condition prevalent in images of natural surfaces). Another approach to mapping vegetation communities is based on spectral mixture analysis, which can provide a consistent framework for image interpretation. Roberts et al. (1990) mapped vegetation using the band residuals from a simple mixing model (the same spectral endmembers applied to all image pixels). Sabol et al. (1992b) and Roberts et al. (1992) used different methods to apply the most appropriate spectral endmembers to each image pixel, thereby allowing mapping of vegetation based upon the different endmember spectra. In this paper, we describe a new approach to classification of vegetation communities based upon the spectral fractions derived from spectral mixture analysis. This approach was applied to three 1992 AVIRIS images of Jasper Ridge, California to observe seasonal changes in surface composition.
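The spectral mixture analysis underlying the approach above models each pixel spectrum as a sum-to-one linear mixture of endmember spectra. A minimal sketch, assuming a least-squares solver with the sum-to-one constraint appended as a heavily weighted extra row (toy endmembers; the authors' actual solver and constraints are not given in the abstract):

```python
# Hedged sketch of linear spectral unmixing with a soft sum-to-one
# constraint; the weighted-row trick is an assumption of this sketch.
import numpy as np

def unmix(pixel, endmembers, weight=100.0):
    """pixel: (bands,), endmembers: (bands, n_end) -> fractions (n_end,)."""
    A = np.vstack([endmembers, weight * np.ones(endmembers.shape[1])])
    b = np.append(pixel, weight)             # enforce sum(f) ~= 1
    f, *_ = np.linalg.lstsq(A, b, rcond=None)
    return f

def residual(pixel, endmembers, f):
    """Band residual, used above to judge endmember appropriateness."""
    return pixel - endmembers @ f
```

Classification would then operate on the fraction vectors `f` rather than on raw spectra.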
Fatigue crack sizing in rail steel using crack closure-induced acoustic emission waves
NASA Astrophysics Data System (ADS)
Li, Dan; Kuang, Kevin Sze Chiang; Ghee Koh, Chan
2017-06-01
The acoustic emission (AE) technique is a promising approach for detecting and locating fatigue cracks in metallic structures such as rail tracks. However, it is still a challenge to quantify the crack size accurately using this technique. AE waves can be generated by either crack propagation (CP) or crack closure (CC) processes and classification of these two types of AE waves is necessary to obtain more reliable crack sizing results. As the pre-processing step, an index based on wavelet power (WP) of AE signal is initially established in this paper in order to distinguish between the CC-induced AE waves and their CP-induced counterparts. Here, information embedded within the AE signal was used to perform the AE wave classification, which is preferred to the use of real-time load information, typically adopted in other studies. With the proposed approach, it renders the AE technique more amenable to practical implementation. Following the AE wave classification, a novel method to quantify the fatigue crack length was developed by taking advantage of the CC-induced AE waves, the count rate of which was observed to be positively correlated with the crack length. The crack length was subsequently determined using an empirical model derived from the AE data acquired during the fatigue tests of the rail steel specimens. The performance of the proposed method was validated by experimental data and compared with that of the traditional crack sizing method, which is based on CP-induced AE waves. As a significant advantage over other AE crack sizing methods, the proposed novel method is able to estimate the crack length without prior knowledge of the initial crack length, integration of AE data or real-time load amplitude. It is thus applicable to the health monitoring of both new and existing structures.
General tensor discriminant analysis and gabor features for gait recognition.
Tao, Dacheng; Li, Xuelong; Wu, Xindong; Maybank, Stephen J
2007-10-01
The traditional image representations are not suited to conventional classification methods, such as linear discriminant analysis (LDA), because of the undersample problem (USP): the dimensionality of the feature space is much higher than the number of training samples. Motivated by the successes of the two-dimensional LDA (2DLDA) for face recognition, we develop a general tensor discriminant analysis (GTDA) as a preprocessing step for LDA. The benefits of GTDA compared with existing preprocessing methods, e.g., principal component analysis (PCA) and 2DLDA, include 1) the USP is reduced in subsequent classification by, for example, LDA; 2) the discriminative information in the training tensors is preserved; and 3) GTDA provides stable recognition rates because the alternating projection optimization algorithm used to obtain a solution of GTDA converges, while that of 2DLDA does not. We use human gait recognition to validate the proposed GTDA. The averaged gait images are utilized for gait representation. Given the popularity of Gabor-function-based image decompositions for image understanding and object recognition, we develop three different Gabor-function-based image representations: 1) the GaborD representation is the sum of Gabor filter responses over directions, 2) GaborS is the sum of Gabor filter responses over scales, and 3) GaborSD is the sum of Gabor filter responses over scales and directions. The GaborD, GaborS and GaborSD representations are applied to the problem of recognizing people from their averaged gait images. A large number of experiments were carried out to evaluate the effectiveness (recognition rate) of gait recognition based on first obtaining a Gabor, GaborD, GaborS or GaborSD image representation, then using GTDA to extract features and finally using LDA for classification. The proposed methods achieved good performance for gait recognition based on image sequences from the USF HumanID Database. Experimental comparisons are made with nine state-of-the-art classification methods in gait recognition.
Yang, Lingjian; Ainali, Chrysanthi; Tsoka, Sophia; Papageorgiou, Lazaros G
2014-12-05
Applying machine learning methods to microarray gene expression profiles for disease classification problems is a popular way to derive biomarkers, i.e. sets of genes that can predict disease state or outcome. Traditional approaches, in which the expression of genes was treated independently, suffer from low prediction accuracy and difficulty of biological interpretation. Current research efforts focus on integrating information on protein interactions, through biochemical pathway datasets, with expression profiles to propose pathway-based classifiers that can enhance disease diagnosis and prognosis. As most of the pathway activity inference methods in the literature are either unsupervised or applied only to two-class datasets, there is good scope to address such limitations by proposing novel methodologies. A supervised multiclass pathway activity inference method using optimisation techniques is reported. For each pathway expression dataset, the patterns of its constituent genes are summarised into one composite feature, termed pathway activity, and a novel mathematical programming model is proposed to infer this feature as a weighted linear summation of the expression of its constituent genes. Gene weights are determined by the optimisation model in such a way that the resulting pathway activity has optimal discriminative power with regard to disease phenotypes. Classification is then performed on the resulting low-dimensional pathway activity profile. The model was evaluated on a variety of published gene expression profiles covering different types of disease. We show that not only does it improve classification accuracy, but it can also perform well on multiclass disease datasets, a limitation of other approaches from the literature. Desirable features of the model include the ability to control the maximum number of genes that may participate in determining pathway activity, which may be pre-specified by the user.
Overall, this work highlights the potential of building pathway-based multi-phenotype classifiers for accurate disease diagnosis and prognosis problems.
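The weighted-linear-summation idea can be illustrated with a simple two-class sketch. Here a Fisher-style weight vector (per-gene mean difference over pooled variance) stands in for the paper's mathematical programming model; that substitution is an assumption of this sketch, made only to show the composite-feature construction.

```python
# Sketch: summarise a pathway's genes into one composite feature
# a = X @ w. The Fisher-style w is a stand-in for the paper's
# optimisation model (an assumption of this sketch).
import numpy as np

def pathway_activity_weights(X, y):
    """X: samples x genes expression, y: binary labels -> gene weights."""
    X0, X1 = X[y == 0], X[y == 1]
    pooled_var = (X0.var(axis=0) + X1.var(axis=0)) / 2 + 1e-9
    w = (X1.mean(axis=0) - X0.mean(axis=0)) / pooled_var
    return w / np.linalg.norm(w)

def pathway_activity(X, w):
    """One-dimensional activity profile used for classification."""
    return X @ w
```

Classification then runs on the low-dimensional activity profile instead of the full expression matrix.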
Perinatal mortality classification: an analysis of 112 cases of stillbirth.
Reis, Ana Paula; Rocha, Ana; Lebre, Andrea; Ramos, Umbelina; Cunha, Ana
2017-10-01
This was a retrospective cohort analysis of stillbirths that occurred from January 2004 to December 2013 in our institution. We compared the Tulip and Wigglesworth classification systems on a cohort of stillbirths and analysed the main differences between these two classifications. In this period, there were 112 stillbirths out of a total of 31,758 births (a stillbirth rate of 3.5 per 1000 births). There were 99 antepartum deaths and 13 intrapartum deaths. Foetal autopsy was performed in 99 cases and placental histopathological examination in all cases. The Wigglesworth classification found 'unknown' causes in 47 cases, and the Tulip classification allocated 33 of these; fourteen cases remained in the group of 'unknown' causes. The Wigglesworth classification of stillbirths therefore results in a higher proportion of unexplained stillbirths. We suggest that the traditional Wigglesworth classification should be replaced by a classification that manages the available information.
NASA Astrophysics Data System (ADS)
Baumstark, René; Duffey, Renee; Pu, Ruiliang
2016-11-01
The offshore extent of seagrass habitat along the West Florida (USA) coast represents an important corridor for inshore-offshore migration of economically important fish and shellfish. Surviving at the fringe of light requirements, offshore seagrass beds are sensitive to changes in water clarity. Beyond and intermingled with the offshore seagrass areas are large swaths of colonized hard bottom. These offshore habitats of the West Florida coast have lacked mapping efforts needed for status and trends monitoring. The objective of this study was to propose an object-based classification method for mapping offshore habitats and to compare results to traditional photo-interpreted maps. Benthic maps were created from WorldView-2 satellite imagery using an Object Based Image Analysis (OBIA) method and a visual photo-interpretation method. A logistic regression analysis identified depth and distance from shore as significant parameters for discriminating spectrally similar seagrass and colonized hard bottom features. Seagrass, colonized hard bottom and unconsolidated sediment (sand) were mapped with 78% overall accuracy using the OBIA method compared to 71% overall accuracy using the photo-interpretation method. This study suggests an alternative for mapping deeper, offshore habitats capable of producing higher thematic and spatial resolution maps compared to those created with the traditional photo-interpretation method.
Designing boosting ensemble of relational fuzzy systems.
Scherer, Rafał
2010-10-01
A method frequently used in classification systems to improve classification accuracy is to combine the outputs of several classifiers. Among the various types of classifiers, fuzzy ones are tempting because they use intelligible fuzzy if-then rules. In this paper we build an AdaBoost ensemble of relational neuro-fuzzy classifiers. Relational fuzzy systems bond input and output fuzzy linguistic values by a binary relation; thus, compared with traditional fuzzy systems, fuzzy rules have additional weights: the elements of a fuzzy relation matrix. Thanks to this, the system is better adaptable to the data during learning. In the paper an ensemble of relational fuzzy systems is proposed. The problem is that such an ensemble contains separate rule bases which cannot be directly merged: as the systems are separate, we cannot treat fuzzy rules coming from different systems as rules from the same (single) system. The problem is addressed by a novel design of the fuzzy systems constituting the ensemble, resulting in normalization of the individual rule bases during learning. The method described in the paper is tested on several known benchmarks and compared with other machine learning solutions from the literature.
Qiu, Shanshan; Wang, Jun; Gao, Liping
2014-07-09
An electronic nose (E-nose) and an electronic tongue (E-tongue) have been used to characterize five types of strawberry juices based on processing approaches (i.e., microwave pasteurization, steam blanching, high-temperature short-time pasteurization, frozen-thawed, and freshly squeezed). Juice quality parameters (vitamin C, pH, total soluble solids, total acid, and sugar/acid ratio) were measured by traditional methods. Multivariate statistical methods (linear discriminant analysis (LDA) and partial least squares regression (PLSR)) and machine learning methods (Random Forest (RF) and Support Vector Machines) were employed for qualitative classification and quantitative regression. The E-tongue system reached higher accuracy rates than the E-nose, and their simultaneous utilization did have an advantage in LDA classification and PLSR regression. According to cross-validation, RF showed outstanding and indisputable performance in both the qualitative and quantitative analyses. This work indicates that the simultaneous utilization of E-nose and E-tongue can discriminate processed fruit juices and successfully predict quality parameters for the beverage industry.
Current application of chemometrics in traditional Chinese herbal medicine research.
Huang, Yipeng; Wu, Zhenwei; Su, Rihui; Ruan, Guihua; Du, Fuyou; Li, Gongke
2016-07-15
Traditional Chinese herbal medicines (TCHMs) are a promising approach for the treatment of various diseases and have attracted increasing attention all over the world. Chemometric techniques are highly useful tools in the quality control of TCHMs, harnessing mathematics, statistics and other methods to extract the maximum information from the data obtained by various analytical approaches. This feature article focuses on recent studies that evaluate the pharmacological efficacy and quality of TCHMs by determining, identifying and discriminating the bioactive or marker components in different samples with the help of chemometric techniques. In this work, the application of chemometric techniques to the classification of TCHMs based on their efficacy and usage is introduced, and recent advances of chemometrics applied to the chemical analysis of TCHMs are reviewed in detail. Copyright © 2015 Elsevier B.V. All rights reserved.
A signature dissimilarity measure for trabecular bone texture in knee radiographs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Woloszynski, T.; Podsiadlo, P.; Stachowiak, G. W.
Purpose: The purpose of this study is to develop a dissimilarity measure for the classification of trabecular bone (TB) texture in knee radiographs. Problems associated with the traditional extraction and selection of texture features, and with invariance to imaging conditions such as image size, anisotropy, noise, blur, exposure, magnification, and projection angle, were addressed. Methods: In the method developed, called a signature dissimilarity measure (SDM), a sum of earth mover's distances calculated for roughness and orientation signatures is used to quantify dissimilarities between textures. Scale-space theory was used to ensure scale and rotation invariance. The effects of image size, anisotropy, noise, and blur on the SDM were studied using computer-generated fractal texture images. The invariance of the measure to image exposure, magnification, and projection angle was studied using x-ray images of human tibia head. For these studies, Mann-Whitney tests with a significance level of 0.01 were used. A comparison study was conducted between the performance of an SDM-based classification system and two other systems in the classification of Brodatz textures and the detection of knee osteoarthritis (OA). The other systems are based on weighted neighbor distance using compound hierarchy of algorithms representing morphology (WND-CHARM) and local binary patterns (LBP). Results: The results obtained indicate that the SDM is invariant to image exposure (2.5-30 mA s), magnification (x1.00-x1.35), noise associated with film graininess and quantum mottle (<25%), blur generated by a sharp film screen, and image size (>64x64 pixels). However, the measure is sensitive to changes in projection angle (>5 deg.), image anisotropy (>30 deg.), and blur generated by a regular film screen. For the classification of Brodatz textures, the SDM-based system produced results comparable to the LBP system. For the detection of knee OA, the SDM-based system achieved 78.8% classification accuracy and outperformed the WND-CHARM system (64.2%). Conclusions: The SDM is well suited for the classification of TB texture images in knee OA detection and may be useful for the texture classification of medical images in general.
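The core of the SDM, an earth mover's distance between normalised signatures summed over the two signature types, can be sketched for one-dimensional histograms. On a shared 1-D bin grid, the EMD reduces to the summed absolute difference of cumulative distributions; the unit bin spacing below is an assumption of this sketch.

```python
# Sketch of the SDM core: 1-D earth mover's distance between two
# normalised signatures, summed over roughness and orientation.
import numpy as np

def emd_1d(p, q):
    """EMD between two histograms on the same bins (unit bin spacing)."""
    p = np.asarray(p, float)
    q = np.asarray(q, float)
    p, q = p / p.sum(), q / q.sum()
    return np.abs(np.cumsum(p - q)).sum()    # CDF-difference form

def signature_dissimilarity(rough_a, rough_b, orient_a, orient_b):
    """SDM-style score: EMD on roughness plus EMD on orientation."""
    return emd_1d(rough_a, rough_b) + emd_1d(orient_a, orient_b)
```

Identical signatures give zero dissimilarity; moving histogram mass farther along the bins increases the score linearly, which is what makes the measure robust to small signature perturbations.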
Estimating of Soil Texture Using Landsat Imagery: a Case Study in Thatta Tehsil, Sindh
NASA Astrophysics Data System (ADS)
Khalil, Zahid
2016-07-01
Soil texture is considered an important environmental factor for agricultural growth and is the most essential component of large-scale soil classification. Today, precise large-scale soil information is in great demand from various stakeholders, including soil scientists, environmental managers, land use planners and traditional agricultural users. The increasing demand for soil properties at fine spatial resolution has made traditional laboratory methods inadequate; in addition, soil analysis with precision agriculture systems is more expensive than traditional methods. In this regard, geo-spatial techniques can be used as an alternative for soil analysis. This study aims to examine the ability of geo-spatial techniques to identify the spatial patterns of soil attributes at fine scale. Around 28 soil samples were collected from different areas of Thatta Tehsil, Sindh, Pakistan for analyzing soil texture. An Ordinary Least Squares (OLS) regression analysis was used to relate the reflectance values of Landsat-8 OLI imagery to the soil variables. The analysis showed a significant relationship (p<0.05) of bands 2 and 5 with silt% (R2 = 0.52), and of bands 4 and 6 with clay% (R2 = 0.40). The equation derived from the OLS analysis was then applied to the whole study area to derive soil attributes, and the USDA textural classification triangle was implemented to derive the soil texture map in a GIS environment. The outcome revealed that 'sandy loam' was the most abundant class, followed by loam, sandy clay loam and clay loam. These results show that geo-spatial techniques can be used efficiently to map the soil texture of a large area at fine scale; this technology helps decrease cost and time and provides more detailed information by reducing field work to a considerable level.
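The OLS step relating band reflectances to a soil attribute can be sketched as follows. The two-band design and the coefficients in the usage example are illustrative assumptions, not the study's fitted values.

```python
# Sketch of the OLS step: regress a soil attribute (e.g. silt %) on
# selected band reflectances, then apply the equation image-wide.
import numpy as np

def fit_ols(bands, attribute):
    """bands: samples x n_bands reflectances -> (coeffs, intercept)."""
    A = np.column_stack([bands, np.ones(len(bands))])  # add intercept
    beta, *_ = np.linalg.lstsq(A, attribute, rcond=None)
    return beta[:-1], beta[-1]

def predict_attribute(bands, coeffs, intercept):
    """Apply the fitted equation across an image's pixels."""
    return bands @ coeffs + intercept
```

After predicting silt% and clay% per pixel, sand% follows by difference, and the USDA triangle assigns each pixel a texture class.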
Traditional Values/Contemporary Pressures: The Conflicting Needs of America's Rural Women.
ERIC Educational Resources Information Center
Dunne, Faith
Rural American women number well over 25 million and represent all socio-economic and ethnic classifications, yet they share a conservative orientation towards sex roles and appropriate life styles, characteristic social and geographic isolation, and the dilemma of how to manage the traditional demands of rural culture and the contemporary…
ERIC Educational Resources Information Center
Schultze-Krumbholz, Anja; Göbel, Kristin; Scheithauer, Herbert; Brighi, Antonella; Guarini, Annalisa; Tsorbatzoudis, Haralambos; Barkoukis, Vassilis; Pyzalski, Jacek; Plichta, Piotr; Del Rey, Rosario; Casas, José A.; Thompson, Fran; Smith, Peter K.
2015-01-01
In recently published studies on cyberbullying, students are frequently categorized into distinct (cyber)bully and (cyber)victim clusters based on theoretical assumptions and arbitrary cut-off scores adapted from traditional bullying research. The present study identified involvement classes empirically using latent class analysis (LCA), to…
Near ground level sensing for spatial analysis of vegetation
NASA Technical Reports Server (NTRS)
Sauer, Tom; Rasure, John; Gage, Charlie
1991-01-01
Measured changes in vegetation indicate the dynamics of ecological processes and can identify the impacts from disturbances. Traditional methods of vegetation analysis tend to be slow because they are labor intensive; as a result, these methods are often confined to small local area measurements. Scientists need new algorithms and instruments that will allow them to efficiently study environmental dynamics across a range of different spatial scales. A new methodology that addresses this problem is presented. This methodology includes the acquisition, processing, and presentation of near ground level image data and its corresponding spatial characteristics. The systematic approach taken encompasses a feature extraction process, a supervised and unsupervised classification process, and a region labeling process yielding spatial information.
Objectively classifying Southern Hemisphere extratropical cyclones
NASA Astrophysics Data System (ADS)
Catto, Jennifer
2016-04-01
There has been a long tradition of attempts to separate extratropical cyclones into different classes depending on their cloud signatures, airflows, synoptic precursors, or upper-level flow features. Depending on these features, the cyclones may have different impacts, for example in their precipitation intensity. It is important, therefore, to understand how the distribution of different cyclone classes may change in the future. Many of the previous classifications have been performed manually. In order to evaluate climate models and understand how extratropical cyclones might change in the future, we need an automated method to classify cyclones. Extratropical cyclones have been identified in the Southern Hemisphere from the ERA-Interim reanalysis dataset with a commonly used identification and tracking algorithm that employs 850 hPa relative vorticity. A clustering method applied to large-scale fields from ERA-Interim at the time of cyclone genesis (when the cyclone is first detected) has been used to objectively classify the identified cyclones. The results are compared to the manual classification of Sinclair and Revell (2000), and the four objectively identified classes shown in this presentation are found to match well. The relative importance of diabatic heating in the clusters is investigated, as well as the differing precipitation characteristics. The success of the objective classification shows its utility in climate model evaluation and climate change studies.
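The objective clustering step can be illustrated with a minimal k-means sketch over genesis-time fields flattened into feature vectors. The choice of k-means itself, the deterministic farthest-point initialisation, and k = 4 are assumptions of this sketch (the abstract names only "a clustering method" and four resulting classes).

```python
# Hedged k-means sketch: cluster flattened genesis-time fields so each
# cyclone is assigned a class. Farthest-point init is an assumption
# made for determinism.
import numpy as np

def kmeans(X, k=4, iters=50):
    # deterministic farthest-point initialisation
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        # assign each sample to its nearest center, then update centers
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return labels, centers
```

Each cluster's composite fields could then be compared against the manual classes, as done above with Sinclair and Revell (2000).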
Analysis of spatial distribution of land cover maps accuracy
NASA Astrophysics Data System (ADS)
Khatami, R.; Mountrakis, G.; Stehman, S. V.
2017-12-01
Land cover maps have become one of the most important products of remote sensing science. However, classification errors exist in any classified map and affect the reliability of subsequent map usage. Moreover, classification accuracy often varies over different regions of a classified map, and these variations affect the reliability of subsequent analyses of different regions based on the classified maps. The traditional approach to map accuracy assessment, based on an error matrix, does not capture this spatial variation in classification accuracy. Here, per-pixel accuracy prediction methods are proposed based on interpolating accuracy values from a test sample to produce wall-to-wall accuracy maps. Different accuracy prediction methods were developed based on four factors: predictive domain (spatial versus spectral), interpolation function (constant, linear, Gaussian, and logistic), incorporation of class information (interpolating each class separately versus grouping them together), and sample size. This research is the first to use the spectral domain as the explanatory feature space for interpolating classification accuracy. Performance of the prediction methods was evaluated using 26 test blocks, with 10 km × 10 km dimensions, dispersed throughout the United States. The performance of the predictions was evaluated using the area under the curve (AUC) of the receiver operating characteristic. Relative to existing accuracy prediction methods, our proposed methods resulted in improvements of AUC of 0.15 or greater.
Evaluation of the four factors comprising the accuracy prediction methods demonstrated that: i) interpolations should be done separately for each class instead of grouping all classes together; ii) if an all-classes approach is used, the spectral domain will result in substantially greater AUC than the spatial domain; iii) for the smaller sample size and per-class predictions, the spectral and spatial domain yielded similar AUC; iv) for the larger sample size (i.e., very dense spatial sample) and per-class predictions, the spatial domain yielded larger AUC; v) increasing the sample size improved accuracy predictions with a greater benefit accruing to the spatial domain; and vi) the function used for interpolation had the smallest effect on AUC.
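The AUC used to score the predictions above can be sketched via the rank (Mann-Whitney) formulation: the probability that a correctly classified pixel receives a higher predicted accuracy than a misclassified one, with ties counting half.

```python
# Sketch of the evaluation metric: ROC AUC via the Mann-Whitney U
# statistic over (predicted accuracy, correctness) pairs.
import numpy as np

def auc(scores, correct):
    """scores: predicted accuracy per pixel; correct: 0/1 truth."""
    scores = np.asarray(scores, float)
    correct = np.asarray(correct, bool)
    pos, neg = scores[correct], scores[~correct]
    # pairwise comparisons; ties count half
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))
```

A perfect accuracy map scores 1.0, an inverted one 0.0, and an uninformative one 0.5, which is the scale on which the reported 0.15 improvements are measured.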
Yapese land classification and use in relation to agroforests
Pius Liyagel
1993-01-01
Traditional land use classification on Yap Island, especially in regard to agroforestry, is described. Today there is a need to classify land on Yap to protect culturally significant areas and to make the best possible use of the land to support a rapidly growing population. Any new uses of land should be evaluated to assure that actions in one area, even private...
Winter, R; Gudmundsson, P; Ericsson, G; Willenheimer, R
2001-06-01
To study the clinical value of the colour M-mode slope of the early diastolic left ventricular filling phase (Vp) and the early diastolic downward M-mode slope of the left atrioventricular plane displacement (EDS), compared with diastolic function assessed by traditional Doppler evaluation. In 65 consecutive patients, EDS and Vp were compared with a four-degree traditional diastolic function classification based on pulsed Doppler assessment of the early to atrial transmitral flow ratio (E/A), the E-wave deceleration time (Edt), and the systolic to diastolic (S/D) pulmonary venous inflow ratio. Vp (P=0.006) and EDS (P=0.045) were related to traditional diastolic function (Kruskal-Wallis analysis). EDS showed a trend break between the moderate and severe diastolic dysfunction groups by traditional Doppler evaluation. Vp and EDS correlated weakly in simple linear regression analysis (r=0.33), and both discriminated poorly between normal and highly abnormal diastolic function. Vp and EDS were significantly related to diastolic function by traditional Doppler evaluation. They were, however, not useful as single parameters of left ventricular diastolic function, owing to the small difference between normal and highly abnormal values, which allows for little between-measurement variability. Consequently, these methods for the evaluation of left ventricular diastolic function do not add significantly to traditional Doppler evaluation.
Rubio, Carlos A
2017-12-01
Recent studies have disclosed novel histological phenotypes of colon tumours in carcinogen-treated rats. The aim of this study was to update the current histological classification of colonic neoplasias in Sprague-Dawley (SD) rats. Archival sections from 398 SD rats having 408 neoplasias in previous experiments were re-evaluated. Of the 408 colonic neoplasias, 11% (44/408) were adenomas without invasive growth and 89% (364/408) invasive carcinomas. Out of the 44 adenomas, 82% were conventional (tubular or villous), 14% traditional serrated (TSA; with unlocked serrations or with closed microtubules) and 5% gut-associated lymphoid tissue (GALT)-associated adenomas. Out of 364 carcinomas, 57% were conventional carcinomas, 26% GALT carcinomas, 8% undifferentiated, 6% signet-ring cell carcinomas, and 4% traditional serrated carcinomas (TSC). Thus, conventional adenomas, conventional carcinomas and GALT-associated carcinomas predominated (p<0.05). The updated classification of colonic tumours in SD rats includes conventional adenomas, TSA, GALT-associated adenomas, conventional carcinomas, TSC, GALT-associated carcinomas, signet-ring cell carcinomas and undifferentiated carcinomas. Several of the histological phenotypes reported here are not included in any of the current classifications of colonic tumours in rodents. This updated classification fulfils the requirements for an animal model of human disease, inasmuch as similar histological phenotypes of colon neoplasias have been documented in humans. Copyright© 2017, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.
Observation versus classification in supervised category learning.
Levering, Kimery R; Kurtz, Kenneth J
2015-02-01
The traditional supervised classification paradigm encourages learners to acquire only the knowledge needed to predict category membership (a discriminative approach). An alternative that aligns with important aspects of real-world concept formation is learning with a broader focus to acquire knowledge of the internal structure of each category (a generative approach). Our work addresses the impact of a particular component of the traditional classification task: the guess-and-correct cycle. We compare classification learning to a supervised observational learning task in which learners are shown labeled examples but make no classification response. The goals of this work sit at two levels: (1) testing for differences in the nature of the category representations that arise from two basic learning modes; and (2) evaluating the generative/discriminative continuum as a theoretical tool for understanding learning modes and their outcomes. Specifically, we view the guess-and-correct cycle as consistent with a more discriminative approach and therefore expected it to lead to narrower category knowledge. Across two experiments, the observational mode led to greater sensitivity to distributional properties of features and correlations between features. We conclude that a relatively subtle procedural difference in supervised category learning substantially impacts what learners come to know about the categories. The results demonstrate the value of the generative/discriminative continuum as a tool for advancing the psychology of category learning and also provide a valuable constraint for formal models and associated theories.
Carvajal, Gonzalo; Figueroa, Miguel
2014-07-01
Typical image recognition systems operate in two stages: feature extraction to reduce the dimensionality of the input space, and classification based on the extracted features. Analog Very Large Scale Integration (VLSI) is an attractive technology to achieve compact and low-power implementations of these computationally intensive tasks for portable embedded devices. However, device mismatch limits the resolution of the circuits fabricated with this technology. Traditional layout techniques to reduce the mismatch aim to increase the resolution at the transistor level, without considering the intended application. Relating mismatch parameters to specific effects at the application level would allow designers to apply focused mismatch compensation techniques according to predefined performance/cost tradeoffs. This paper models, analyzes, and evaluates the effects of mismatched analog arithmetic in both feature extraction and classification circuits. For the feature extraction, we propose analog adaptive linear combiners with on-chip learning for both the Least Mean Square (LMS) and Generalized Hebbian Algorithm (GHA). Using mathematical abstractions of analog circuits, we identify mismatch parameters that are naturally compensated during the learning process, and propose cost-effective guidelines to reduce the effect of the rest. For the classification, we derive analog models for the circuits necessary to implement the Nearest Neighbor (NN) approach and Radial Basis Function (RBF) networks, and use them to emulate analog classifiers with standard databases of faces and handwritten digits. Formal analysis and experiments show how we can exploit adaptive structures and properties of the input space to compensate for the effects of device mismatch at the application level, thus reducing the design overhead of traditional layout techniques. Results are also directly extensible to multiple application domains using linear subspace methods. Copyright © 2014 Elsevier Ltd.
NASA Astrophysics Data System (ADS)
Zhang, Ming; Xie, Fei; Zhao, Jing; Sun, Rui; Zhang, Lei; Zhang, Yue
2018-04-01
The progress of license plate recognition technology has made a great contribution to the development of Intelligent Transport Systems (ITS). In this paper, a robust and efficient license plate recognition method is proposed, based on a combined feature extraction model and a BPNN (Back Propagation Neural Network) algorithm. Firstly, a method for detecting and segmenting the candidate license plate region is developed. Secondly, a new feature extraction model is designed that combines three sets of features. Thirdly, the license plate classification and recognition method using the combined feature model and the BPNN algorithm is presented. Finally, the experimental results indicate that both license plate segmentation and recognition can be achieved effectively by the proposed algorithm. Compared with three traditional methods, the recognition accuracy of the proposed method increased to 95.7% and the processing time decreased to 51.4 ms.
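The abstract does not give the exact feature model or network layout, so as a rough illustration of the BPNN classification stage, the sketch below trains a back-propagation neural network (a multi-layer perceptron) on hypothetical flattened character glyphs. The data shapes, hidden-layer size, and iteration count are all assumptions for illustration only.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier  # an MLP trained by back-propagation

# Hypothetical input: 200 flattened 16x16 character glyphs, 10 character classes.
rng = np.random.default_rng(0)
X = rng.random((200, 256))
y = rng.integers(0, 10, size=200)

# A BPNN in the sense used above: one hidden layer, back-propagation training.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
clf.fit(X, y)
pred = clf.predict(X[:5])
print(len(pred))  # 5 predicted character classes
```

In a real pipeline the random arrays would be replaced by the combined feature vectors extracted from segmented plate characters.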
Mathematical decision theory applied to land capability: a case study in the community of madrid.
Antón, J M; Saa-Requejo, A; Grau, J B; Gallardo, J; Díaz, M C; Andina, Diego; Sanchez, M E; Tarquis, A M
2014-03-01
In land evaluation science, a standard data set is obtained for each land unit to determine the land capability class for various uses, such as different farming systems, forestry, or the conservation or suitability of a specific crop. In this study, we used mathematical decision theory (MDT) methods to address this task. Mathematical decision theory has been used in areas such as management, finance, industrial design, rural development, the environment, and projects for future welfare to study quality and aptness problems using several criteria. We also review MDT applications in soil science and discuss the suitability of MDT methods for dealing simultaneously with a number of problems. The aim of the work was to show how MDT can be used to obtain a valid land quality index and to compare this with a traditional land capability method. Therefore, an additive classification method was applied to obtain a land quality index for 122 land units that were compiled for a case study of the Community of Madrid, Spain, and the results were compared with a previously assigned land capability class using traditional methods based on the minimum requirements for land attributes. Copyright © by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America, Inc.
Kianmehr, Keivan; Alhajj, Reda
2008-09-01
In this study, we aim at building a classification framework, namely the CARSVM model, which integrates association rule mining and the support vector machine (SVM). The goal is to benefit from the advantages of both: the discriminative knowledge represented by class association rules and the classification power of the SVM algorithm. Together they construct an efficient and accurate classifier model that addresses the interpretability problem of SVM as a traditional machine learning technique and overcomes the efficiency issues of associative classification algorithms. In our proposed framework, instead of using the original training set, a set of rule-based feature vectors, generated from the discriminative ability of class association rules over the training samples, is presented to the learning component of the SVM algorithm. We show that rule-based feature vectors are a high-quality source of discrimination knowledge that can substantially improve the prediction power of SVM and associative classification techniques, while also offering users better understandability and interpretability. We used four datasets from the UCI ML repository to evaluate the performance of the developed system against five well-known existing classification methods. Because of the importance and popularity of gene expression analysis as a real-world application of the classification model, we present an extension of CARSVM combined with feature selection for gene expression data, and describe how this combination provides biologists with an efficient and understandable classifier model. The reported test results and their biological interpretation demonstrate the applicability, efficiency and effectiveness of the proposed model.
From the results, it can be concluded that a considerable increase in classification accuracy can be obtained when the rule-based feature vectors are integrated in the learning process of the SVM algorithm. In the context of applicability, according to the results obtained from gene expression analysis, we can conclude that the CARSVM system can be utilized in a variety of real world applications with some adjustments.
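The core idea above, replacing raw attributes with rule-match indicators before SVM training, can be sketched minimally as follows. The three class association rules, written here as boolean predicates, and the toy data are invented for illustration and are not taken from the paper.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Toy training set: two numeric attributes and a binary class label.
X = np.array([[1.0, 5.0], [2.0, 6.0], [8.0, 1.0], [9.0, 2.0]])
y = np.array([0, 0, 1, 1])

# Hypothetical class association rules, expressed as predicates on a sample.
rules = [
    lambda s: s[0] < 5.0,   # "attr0 is low  -> class 0"
    lambda s: s[1] > 4.0,   # "attr1 is high -> class 0"
    lambda s: s[0] > 5.0,   # "attr0 is high -> class 1"
]

# Rule-based feature vector: one match indicator per rule, replacing raw attributes.
X_rules = np.array([[float(r(s)) for r in rules] for s in X])

clf = LinearSVC().fit(X_rules, y)
print(clf.predict(X_rules))  # separable toy data: recovers [0 0 1 1]
```

The learned SVM weights then refer to human-readable rules rather than opaque raw attributes, which is the interpretability gain the abstract describes.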
ERIC Educational Resources Information Center
Wolf, Patrick J.; Lasserre-Cortez, Shannon
2018-01-01
Charter schools are public schools authorized to operate with some independence from district or state public school regulations, while still being held accountable for student outcomes. Like traditional schools operated by school districts, charter schools are free and are intended to be open to all students who desire to attend. This study…
Breast masses in mammography classification with local contour features.
Li, Haixia; Meng, Xianjing; Wang, Tingwen; Tang, Yuchun; Yin, Yilong
2017-04-14
Mammography is one of the most popular tools for early detection of breast cancer. The contour of a breast mass in mammography is important information for distinguishing benign from malignant masses: a benign mass has a smooth, round or oval contour, while a malignant mass has an irregular shape and a spiculated contour. Several studies have shown that a 1D signature translated from the 2D contour can describe the contour features well. In this paper, we propose a new method to translate the 2D contour of a breast mass in mammography into a 1D signature. The method can describe not only the contour features but also the regularity of the breast mass. We then segment the whole 1D signature into subsections and extract four local features, including a new contour descriptor, from them. The new contour descriptor is the root mean square (RMS) slope, which describes the roughness of the contour. KNN, SVM, and ANN classifiers are used to classify benign and malignant breast masses. The proposed method is tested on a set of 323 contours, including 143 benign masses and 180 malignant ones, from the Digital Database for Screening Mammography (DDSM). The best classification accuracy is 99.66%, using the RMS slope feature with the SVM classifier. The performance of the proposed method is better than that of the traditional method. In addition, RMS slope is an effective feature comparable to most existing features.
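The pipeline described above, translating a 2D contour into a 1D signature and then measuring roughness with the RMS slope, can be sketched as below. The radial-distance signature and the synthetic contours are illustrative assumptions; the paper's exact translation method may differ.

```python
import numpy as np

def radial_signature(contour, n_samples=64):
    """Translate a closed 2D contour into a 1D radial-distance signature."""
    contour = np.asarray(contour, dtype=float)
    center = contour.mean(axis=0)
    d = np.linalg.norm(contour - center, axis=1)
    # Resample to a fixed length so signatures are comparable across masses.
    idx = np.linspace(0, len(d) - 1, n_samples)
    return np.interp(idx, np.arange(len(d)), d)

def rms_slope(signature):
    """Root-mean-square of the first difference: rougher contour -> larger value."""
    return float(np.sqrt(np.mean(np.diff(signature) ** 2)))

theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
smooth = np.c_[np.cos(theta), np.sin(theta)]                 # circle: benign-like
spiky = smooth * (1 + 0.3 * np.sign(np.sin(8 * theta)))[:, None]  # spiculated-like
print(rms_slope(radial_signature(smooth)) < rms_slope(radial_signature(spiky)))  # True
```

A smooth contour yields a nearly constant signature (RMS slope near zero), while a spiculated one produces large point-to-point jumps, matching the benign/malignant distinction drawn in the abstract.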
Doborjeh, Maryam Gholami; Wang, Grace Y; Kasabov, Nikola K; Kydd, Robert; Russell, Bruce
2016-09-01
This paper introduces a method utilizing spiking neural networks (SNN) for learning, classification, and comparative analysis of brain data. As a case study, the method was applied to electroencephalography (EEG) data collected during a GO/NOGO cognitive task performed by untreated opiate addicts, those undergoing methadone maintenance treatment (MMT) for opiate dependence, and a healthy control group. The method is based on an SNN architecture called NeuCube, trained on spatiotemporal EEG data. NeuCube was used to classify EEG data across subject groups and across GO versus NOGO trials, but also facilitated a deeper comparative analysis of the dynamic brain processes. This analysis results in a better understanding of human brain functioning across subject groups when performing a cognitive task. In terms of the EEG data classification, a NeuCube model obtained better results (the maximum obtained accuracy: 90.91%) when compared with traditional statistical and artificial intelligence methods (the maximum obtained accuracy: 50.55%). More importantly, new information about the effects of MMT on cognitive brain functions is revealed through the analysis of the SNN model connectivity and its dynamics. This paper presents a new method for EEG data modeling and reveals new knowledge on brain functions associated with mental activity that is different from the brain activity observed in a resting state of the same subjects.
Hou, Bin; Wang, Yunhong; Liu, Qingjie
2016-01-01
Characterizing up-to-date information about the Earth's surface is an important application, providing insights for urban planning, resources monitoring and environmental studies. A large number of change detection (CD) methods have been developed to address this task by utilizing remote sensing (RS) images. The advent of high resolution (HR) remote sensing images further provides challenges to traditional CD methods and opportunities to object-based CD methods. While several kinds of geospatial objects are recognized, this manuscript mainly focuses on buildings. Specifically, we propose a novel automatic approach combining pixel-based strategies with object-based ones for detecting building changes with HR remote sensing images. A multiresolution contextual morphological transformation called extended morphological attribute profiles (EMAPs) allows the extraction of geometrical features related to the structures within the scene at different scales. Pixel-based post-classification is executed on EMAPs using hierarchical fuzzy clustering. Subsequently, hierarchical fuzzy frequency vector histograms are formed based on the image objects acquired by simple linear iterative clustering (SLIC) segmentation. Then, saliency and the morphological building index (MBI), extracted on difference images, are used to generate a pseudo training set. Ultimately, object-based semi-supervised classification is implemented on this training set by applying random forest (RF). Most of the important changes are detected by the proposed method in our experiments. This study was checked for effectiveness using visual evaluation and numerical evaluation.
Guo, Hao; Qin, Mengna; Chen, Junjie; Xu, Yong; Xiang, Jie
2017-01-01
High-order functional connectivity networks are rich in time information that can reflect dynamic changes in functional connectivity between brain regions. Accordingly, such networks are widely used to classify brain diseases. However, traditional methods for processing high-order functional connectivity networks generally include the clustering method, which reduces data dimensionality. As a result, such networks cannot be effectively interpreted in the context of neurology. Additionally, due to the large scale of high-order functional connectivity networks, it can be computationally very expensive to use complex network or graph theory to calculate certain topological properties. Here, we propose a novel method of generating a high-order minimum spanning tree functional connectivity network. This method increases the neurological significance of the high-order functional connectivity network, reduces network computing consumption, and produces a network scale that is conducive to subsequent network analysis. To ensure the quality of the topological information in the network structure, we used frequent subgraph mining technology to capture the discriminative subnetworks as features and combined this with quantifiable local network features. Then we applied a multikernel learning technique to the corresponding selected features to obtain the final classification results. We evaluated our proposed method using a data set containing 38 patients with major depressive disorder and 28 healthy controls. The experimental results showed a classification accuracy of up to 97.54%.
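The minimum-spanning-tree step can be sketched minimally as follows, assuming correlation-based functional connectivity and SciPy's csgraph routine (the paper's construction may differ in detail): strong connections become short edges, and the MST keeps exactly N-1 of them, which is the network-scale reduction the abstract describes.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

rng = np.random.default_rng(0)
ts = rng.standard_normal((120, 10))   # toy data: 120 time points, 10 brain regions
conn = np.corrcoef(ts.T)              # functional connectivity matrix (10 x 10)

# Strong correlation = short edge: convert similarity to distance before the MST.
dist = 1.0 - np.abs(conn)
np.fill_diagonal(dist, 0.0)
mst = minimum_spanning_tree(dist).toarray()

# A spanning tree over N nodes keeps exactly N - 1 edges of the dense network.
print(np.count_nonzero(mst))  # 9
```

The resulting sparse backbone is small enough that topological properties can be computed cheaply, addressing the computational-cost concern raised above.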
Computational intelligence techniques for biological data mining: An overview
NASA Astrophysics Data System (ADS)
Faye, Ibrahima; Iqbal, Muhammad Javed; Said, Abas Md; Samir, Brahim Belhaouari
2014-10-01
Computational techniques have been successfully utilized for a highly accurate analysis and modeling of multifaceted and raw biological data gathered from various genome sequencing projects. These techniques are proving much more effective to overcome the limitations of the traditional in-vitro experiments on the constantly increasing sequence data. However, most critical problems that caught the attention of the researchers may include, but not limited to these: accurate structure and function prediction of unknown proteins, protein subcellular localization prediction, finding protein-protein interactions, protein fold recognition, analysis of microarray gene expression data, etc. To solve these problems, various classification and clustering techniques using machine learning have been extensively used in the published literature. These techniques include neural network algorithms, genetic algorithms, fuzzy ARTMAP, K-Means, K-NN, SVM, Rough set classifiers, decision tree and HMM based algorithms. Major difficulties in applying the above algorithms include the limitations found in the previous feature encoding and selection methods while extracting the best features, increasing classification accuracy and decreasing the running time overheads of the learning algorithms. The application of this research would be potentially useful in the drug design and in the diagnosis of some diseases. This paper presents a concise overview of the well-known protein classification techniques.
Using deep learning in image hyper spectral segmentation, classification, and detection
NASA Astrophysics Data System (ADS)
Zhao, Xiuying; Su, Zhenyu
2018-02-01
Recent years have shown that deep learning neural networks are a valuable tool in the field of computer vision. Deep learning methods can be used in remote sensing applications such as land cover classification, detection of vehicles in satellite images, and hyperspectral image classification. This paper addresses the use of deep learning artificial neural networks in satellite image segmentation. Image segmentation plays an important role in image processing. Remote sensing images often exhibit large hue differences, which result in poor display of the images in the VR environment. Image segmentation is a preprocessing technique applied to the original images that splits an image into parts of differing hue so that the color can be unified. Several computational models based on supervised, unsupervised, parametric, and probabilistic region-based image segmentation techniques have been proposed. Recently, one machine learning technique, deep learning with convolutional neural networks, has been widely used for the development of efficient and automatic image segmentation models. In this paper, we focus on the study of deep convolutional neural networks and their variants for automatic image segmentation rather than traditional image segmentation strategies.
Pietsch, Torsten; Haberler, Christine
2016-01-01
The revised WHO classification of tumors of the CNS 2016 has introduced the concept of the integrated diagnosis. The definition of medulloblastoma entities now requires a combination of the traditional histological information with additional molecular/genetic features. For definition of the histopathological component of the medulloblastoma diagnosis, the tumors should be assigned to one of the four entities classic, desmoplastic/nodular (DNMB), extensive nodular (MBEN), or large cell/anaplastic (LC/A) medulloblastoma. The genetically defined component comprises the four entities WNT-activated, SHH-activated and TP53 wildtype, SHH-activated and TP53 mutant, or non-WNT/non-SHH medulloblastoma. Robust and validated methods are available to allow a precise diagnosis of these medulloblastoma entities according to the updated WHO classification, and for differential diagnostic purposes. A combination of immunohistochemical markers including β-catenin, Yap1, p75-NGFR, Otx2, and p53, in combination with targeted sequencing and copy number assessment such as FISH analysis for MYC genes allows a precise assignment of patients for risk-adapted stratification. It also allows comparison to results of study cohorts in the past and provides a robust basis for further treatment refinement.
Liu, Xiao; Shi, Jun; Zhou, Shichong; Lu, Minhua
2014-01-01
Dimensionality reduction is an important step in ultrasound-image-based computer-aided diagnosis (CAD) for breast cancer. A newly proposed l2,1-regularized correntropy algorithm for robust feature selection (CRFS) has achieved good performance on noise-corrupted data. Therefore, it has the potential to reduce the dimensions of ultrasound image features. However, in clinical practice, the collection of labeled instances is usually expensive and time-consuming, while it is relatively easy to acquire unlabeled or undetermined instances. Therefore, semi-supervised learning is very suitable for clinical CAD. Iterated Laplacian regularization (Iter-LR) is a new regularization method that has been proved to outperform traditional graph Laplacian regularization in semi-supervised classification and ranking. In this study, to augment the classification accuracy of breast ultrasound CAD based on texture features, we propose an Iter-LR-based semi-supervised CRFS (Iter-LR-CRFS) algorithm, and then apply it to reduce the feature dimensions of ultrasound images for breast CAD. We compared Iter-LR-CRFS with LR-CRFS, the original supervised CRFS, and principal component analysis. The experimental results indicate that the proposed Iter-LR-CRFS significantly outperforms all other algorithms.
"Kalila and Dimna" as One of the Traditional Antecedents of Modern Classifications of Values
ERIC Educational Resources Information Center
Aktas, Elif; Beldag, Adem
2017-01-01
"Kalila and Dimna," which is considered as one of the classic works of the Eastern literature, is a political morality and advice book that is still in effect thanks to the knowledge of wisdom it offers. The aim of this study is to examine this work according to the Western classifications of values (UNESCO, Rokeach, Schwartz, Spranger)…
Significance of clustering and classification applications in digital and physical libraries
NASA Astrophysics Data System (ADS)
Triantafyllou, Ioannis; Koulouris, Alexandros; Zervos, Spiros; Dendrinos, Markos; Giannakopoulos, Georgios
2015-02-01
Applications of clustering and classification techniques can prove very significant in both digital and physical (paper-based) libraries. The most essential application, document classification and clustering, is crucial for the content that is produced and maintained in digital libraries, repositories, databases, social media, blogs etc., based on various tags and ontology elements, transcending the traditional library-oriented classification schemes. Other applications with a very useful and beneficial role in the new digital library environment involve document routing, summarization and query expansion. Paper-based libraries can benefit as well, since classification combined with advanced material characterization techniques such as FTIR (Fourier Transform InfraRed spectroscopy) can be vital for the study and prevention of material deterioration. An improved two-level self-organizing clustering architecture is proposed in order to enhance the discrimination capacity of the learning space prior to classification, yielding promising results when applied to the above-mentioned library tasks.
Wang, Shuang; Qi, Pengcheng; Zhou, Na; Zhao, Minmin; Ding, Weijing; Li, Song; Liu, Minyan; Wang, Qiao; Jin, Shumin
2016-10-01
Traditional Chinese Medicines (TCMs) have gained increasing popularity in modern society. However, the profiles of TCMs in vivo are still unclear owing to their complexity and low levels in vivo. In this study, UPLC-Triple-TOF techniques were employed for data acquisition, and a novel pre-classification strategy was developed to rapidly and systematically screen and identify the absorbed constituents and metabolites of TCMs in vivo, using Radix glehniae as the research object. In this strategy, pre-classification of absorbed constituents is first performed according to the similarity of their structures. Representative constituents are then selected from every class and analyzed separately to screen non-target absorbed constituents and metabolites in biosamples. This pre-classification strategy is based on target (known) constituents to screen non-target (unknown) constituents from the massive data acquired by mass spectrometry. Finally, the screened candidate compounds were interpreted and identified based on a predicted metabolic pathway, well-studied fragmentation rules, the polarity and retention time of the compounds, and some related literature. With this method, a total of 111 absorbed constituents and metabolites of Radix glehniae in rats' urine, plasma, and bile samples were screened and identified or tentatively characterized. This strategy provides an approach for the screening and identification of the metabolites of other TCMs.
Cardiorespiratory fitness and classification of risk of cardiovascular disease mortality.
Gupta, Sachin; Rohatgi, Anand; Ayers, Colby R; Willis, Benjamin L; Haskell, William L; Khera, Amit; Drazner, Mark H; de Lemos, James A; Berry, Jarett D
2011-04-05
Cardiorespiratory fitness (fitness) is associated with cardiovascular disease (CVD) mortality. However, the extent to which fitness improves risk classification when added to traditional risk factors is unclear. Fitness was measured by the Balke protocol in 66 371 subjects without prior CVD enrolled in the Cooper Center Longitudinal Study between 1970 and 2006; follow-up was extended through 2006. Cox proportional hazards models were used to estimate the risk of CVD mortality with a traditional risk factor model (age, sex, systolic blood pressure, diabetes mellitus, total cholesterol, and smoking) with and without the addition of fitness. The net reclassification improvement and integrated discrimination improvement were calculated at 10 and 25 years. Ten-year risk estimates for CVD mortality were categorized as <1%, 1% to <5%, and ≥5%, and 25-year risk estimates were categorized as <8%, 8% to 30%, and ≥30%. During a median follow-up period of 16 years, there were 1621 CVD deaths. The addition of fitness to the traditional risk factor model resulted in reclassification of 10.7% of the men, with significant net reclassification improvement at both 10 years (net reclassification improvement=0.121) and 25 years (net reclassification improvement=0.041) (P<0.001 for both). The integrated discrimination improvement was 0.010 at 10 years (P<0.001), and the relative integrated discrimination improvement was 29%. Similar findings were observed for women at 25 years. A single measurement of fitness significantly improves classification of both short-term (10-year) and long-term (25-year) risk for CVD mortality when added to traditional risk factors.
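The net reclassification improvement reported above can be computed, for categorical risk strata, as in the sketch below. The cut points and toy data are illustrative assumptions, not the study's values.

```python
import numpy as np

def categorical_nri(risk_old, risk_new, event, cuts=(0.01, 0.05)):
    """Net reclassification improvement for categorical risk strata.

    NRI = (up - down)/n_events + (down - up)/n_nonevents, where "up"/"down"
    count moves to a higher/lower risk category under the new model.
    """
    old_cat = np.digitize(risk_old, cuts)
    new_cat = np.digitize(risk_new, cuts)
    move = np.sign(new_cat - old_cat)
    ev, nev = move[event == 1], move[event == 0]
    nri_events = (np.sum(ev > 0) - np.sum(ev < 0)) / len(ev)
    nri_nonevents = (np.sum(nev < 0) - np.sum(nev > 0)) / len(nev)
    return nri_events + nri_nonevents

# Toy example: the new model moves one event up and one non-event down.
risk_old = np.array([0.004, 0.03, 0.03, 0.02])
risk_new = np.array([0.004, 0.06, 0.004, 0.02])
event    = np.array([0, 1, 0, 0])
print(categorical_nri(risk_old, risk_new, event))  # 1.0 + 1/3
```

A positive NRI, as in the study's 0.121 at 10 years, means the added predictor (fitness) moves events toward higher and non-events toward lower risk categories on balance.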
Wright, C.; Gallant, Alisa L.
2007-01-01
The U.S. Fish and Wildlife Service uses the term palustrine wetland to describe vegetated wetlands traditionally identified as marsh, bog, fen, swamp, or wet meadow. Landsat TM imagery was combined with image texture and ancillary environmental data to model probabilities of palustrine wetland occurrence in Yellowstone National Park using classification trees. Model training and test locations were identified from National Wetlands Inventory maps, and classification trees were built for seven years spanning a range of annual precipitation. At a coarse level, palustrine wetland was separated from upland. At a finer level, five palustrine wetland types were discriminated: aquatic bed (PAB), emergent (PEM), forested (PFO), scrub–shrub (PSS), and unconsolidated shore (PUS). TM-derived variables alone were relatively accurate at separating wetland from upland, but model error rates dropped incrementally as image texture, DEM-derived terrain variables, and other ancillary GIS layers were added. For classification trees making use of all available predictors, average overall test error rates were 7.8% for palustrine wetland/upland models and 17.0% for palustrine wetland type models, with consistent accuracies across years. However, models were prone to wetland over-prediction. While the predominant PEM class was classified with omission and commission error rates less than 14%, we had difficulty identifying the PAB and PSS classes. Ancillary vegetation information greatly improved PSS classification and moderately improved PFO discrimination. Association with geothermal areas distinguished PUS wetlands. Wetland over-prediction was exacerbated by class imbalance in likely combination with spatial and spectral limitations of the TM sensor. Wetland probability surfaces may be more informative than hard classification, and appear to respond to climate-driven wetland variability. 
The developed method is portable, relatively easy to implement, and should be applicable in other settings and over larger extents.
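Classification-tree models of the kind described above amount to nested threshold rules on the predictor layers. The sketch below is a hand-written, hypothetical two-split tree for the coarse wetland/upland level; the feature names and thresholds are invented for illustration and are not the paper's fitted model.

```python
# Illustrative (hypothetical) classification-tree rules for separating
# palustrine wetland from upland. Thresholds and features are invented.

def classify_pixel(ndwi, slope_deg, texture_var):
    """Return 'wetland' or 'upland' for one pixel's predictor values."""
    # Split 1: a TM-derived wetness index does most of the work.
    if ndwi > 0.1:
        return "wetland"
    # Split 2: flat terrain (DEM-derived) with low image texture can
    # still be wetland, mirroring how ancillary layers reduce error.
    if slope_deg < 2.0 and texture_var < 0.05:
        return "wetland"
    return "upland"

pixels = [
    {"ndwi": 0.25, "slope_deg": 1.0, "texture_var": 0.02},   # open marsh
    {"ndwi": -0.10, "slope_deg": 0.5, "texture_var": 0.01},  # wet meadow
    {"ndwi": -0.30, "slope_deg": 12.0, "texture_var": 0.30}, # forested slope
]
print([classify_pixel(**p) for p in pixels])  # → ['wetland', 'wetland', 'upland']
```

A fitted tree learns such splits from training data; the paper's trees additionally predict class probabilities, which the authors note may be more informative than a hard label.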
Ben Chaabane, Salim; Fnaiech, Farhat
2014-01-23
Color image segmentation has been applied in many areas, and many different techniques have been developed and proposed. In medical imaging, segmentation can assist doctors in following up a patient's disease from processed breast cancer images. The main objective of this work is to rebuild and enhance each cell in the three component images provided by an input image. Starting from an initial segmentation obtained using statistical features and histogram threshold techniques, the resulting segmentation can accurately represent incomplete and merged cells and enhance them, so that the cells become clear and easy to count, providing real help to doctors. A novel method for color edge extraction based on statistical features and automatic thresholding is presented. The traditional edge detector, based on the first- and second-order neighborhood describing the relationship between the current pixel and its neighbors, is extended to the statistical domain. Color edges in an image are thus obtained by combining statistical features with automatic threshold techniques. Finally, on the obtained color edges with specific primitive colors, a combination rule is used to integrate the edge results over the three color components. Breast cancer cell images were used to evaluate the performance of the proposed method both quantitatively and qualitatively: a visual and a numerical assessment based on the probability of correct classification (PC), the probability of false classification (Pf), and the classification accuracy (Sens(%)) are presented and compared with existing techniques. The proposed method shows its superiority in detecting points that really belong to the cells and makes counting the processed cells easier.
Computer simulations show that the proposed method substantially enhances the segmented image, with smaller error rates than other existing algorithms under the same settings (patterns and parameters). Moreover, it achieves high classification accuracy, reaching 97.94%. The segmentation method may also be extended to other medical imaging modalities with similar properties.
Multi-font printed Mongolian document recognition system
NASA Astrophysics Data System (ADS)
Peng, Liangrui; Liu, Changsong; Ding, Xiaoqing; Wang, Hua; Jin, Jianming
2009-01-01
Mongolian is one of the major ethnic languages in China, and large numbers of printed Mongolian documents need to be digitized for digital libraries and various applications. Traditional Mongolian script has a unique writing style and multi-font-type variations, which pose challenges for Mongolian OCR research. Because traditional Mongolian script has characteristics such as one character possibly being part of another, we define the character set for recognition according to the segmented components; the components are then combined into characters by a rule-based post-processing module. For character recognition, a method based on visual directional features and multi-level classifiers is presented. For character segmentation, a scheme is used to find segmentation points by analyzing the properties of projections and connected components. As Mongolian font types fall into two major groups, the segmentation parameters are adjusted for each group, and a font-type classification method for the two groups is introduced. For recognition of Mongolian text mixed with Chinese and English, language identification and the relevant character recognition kernels are integrated. Experiments show that the presented methods are effective: the text recognition rate is 96.9% on test samples from practical documents with multiple font types and mixed scripts.
Moving towards the goals of FP2020 - classifying contraceptives.
Festin, Mario Philip R; Kiarie, James; Solo, Julie; Spieler, Jeffrey; Malarcher, Shawn; Van Look, Paul F A; Temmerman, Marleen
2016-10-01
With the renewed focus on family planning, a clear and transparent understanding is needed for the consistent classification of contraceptives, especially in the commonly used modern/traditional system. The World Health Organization Department of Reproductive Health and Research and the United States Agency for International Development (USAID) therefore convened a technical consultation in January 2015 to address issues related to classifying contraceptives. The consultation defined modern contraceptive methods as having a sound basis in reproductive biology, a precise protocol for correct use, and evidence of efficacy under various conditions based on appropriately designed studies. Methods in country programs such as fertility awareness-based methods [e.g., the Standard Days Method (SDM) and TwoDay Method], the Lactational Amenorrhea Method (LAM), and emergency contraception should be reported as modern. Herbs, charms, and vaginal douching are not counted as contraceptive methods, as they have no scientific basis in preventing pregnancy and are not included in country programs. More research is needed on defining and measuring the use of emergency contraceptive methods to reflect their contribution to reducing unmet need. The ideal contraceptive classification system should be simple, easy to use, clear, and consistent, with greater parsimony. Measurement challenges remain but should not be the driving force in determining which methods are counted or reported as modern. Family planning programs should consider multiple attributes of contraceptive methods (e.g., level of effectiveness, need for program support, duration of labeled use, hormonal or nonhormonal) to ensure they provide a variety of methods to meet the needs of women and men. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Deep Molecular Diversity of Mammalian Synapses: Why It Matters and How to Measure It
O’Rourke, Nancy A.; Weiler, Nick C.; Micheva, Kristina D.; Smith, Stephen J
2013-01-01
Summary Pioneering studies during the middle of the twentieth century revealed substantial diversity amongst mammalian chemical synapses and led to a widely accepted synapse type classification based on neurotransmitter molecule identity. Subsequently, powerful new physiological, genetic and structural methods have enabled the discovery of much deeper functional and molecular diversity within each traditional neurotransmitter type. Today, this deep diversity continues to pose both daunting challenges and exciting new opportunities for neuroscience. Our growing understanding of deep synapse diversity may transform how we think about and study neural circuit development, structure and function. PMID:22573027
Teixeira da Silva, Jaime A; Jin, Xiaohua; Dobránszki, Judit; Lu, Jiangjie; Wang, Huizhong; Zotz, Gerhard; Cardoso, Jean Carlos; Zeng, Songjun
2016-02-01
Orchids of the genus Dendrobium are of great economic importance in global horticultural trade and in Asian traditional medicine. For both areas, research yielding solid information on the taxonomy, phylogeny, and breeding of this genus is essential. Traditional morphological and cytological characterization is used in combination with molecular results for classification and identification. Markers may be useful when used alone but are not always reliable for identification, and the number of species studied and identified by molecular markers is currently small. Conventional breeding methods are time-consuming and laborious. In the past two decades, promising advances have been made in the taxonomy, phylogeny, and breeding of Dendrobium species owing to the intensive use of molecular markers. In this review, we focus on the main molecular techniques used in 121 published studies and discuss their importance and potential for speeding up the breeding of new cultivars and hybrids. Copyright © 2015 Elsevier Inc. All rights reserved.
Supervised machine learning and active learning in classification of radiology reports.
Nguyen, Dung H M; Patrick, Jon D
2014-01-01
This paper presents an automated system for classifying the results of imaging examinations (CT, MRI, positron emission tomography) into reportable and non-reportable cancer cases. This system is part of an industrial-strength processing pipeline built to extract content from radiology reports for use in the Victorian Cancer Registry. In addition to traditional supervised learning methods such as conditional random fields and support vector machines, active learning (AL) approaches were investigated to optimize training production and further improve classification performance. The project involved two pilot sites in Victoria, Australia (Lake Imaging (Ballarat) and Peter MacCallum Cancer Centre (Melbourne)) and, in collaboration with the NSW Central Registry, one pilot site at Westmead Hospital (Sydney). The reportability classifier performance achieved 98.25% sensitivity and 96.14% specificity on the cancer registry's held-out test set. Up to 92% of training data needed for supervised machine learning can be saved by AL. AL is a promising method for optimizing the supervised training production used in classification of radiology reports. When an AL strategy is applied during the data selection process, the cost of manual classification can be reduced significantly. The most important practical application of the reportability classifier is that it can dramatically reduce human effort in identifying relevant reports from the large imaging pool for further investigation of cancer. The classifier is built on a large real-world dataset and can achieve high performance in filtering relevant reports to support cancer registries. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
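Pool-based active learning commonly queries the example the current model is least certain about. The abstract does not specify the exact AL strategy used, so the sketch below illustrates one standard rule (least-confident sampling) on invented probability estimates.

```python
# Minimal sketch of pool-based active learning with least-confident
# sampling; this is illustrative only, not the paper's query strategy.

def least_confident(probabilities):
    """Pick the index of the unlabeled example whose top class
    probability is lowest (i.e., the model is least confident)."""
    return min(range(len(probabilities)),
               key=lambda i: max(probabilities[i]))

# Predicted class probabilities for an unlabeled pool of reports
# (reportable vs non-reportable), from some current classifier.
pool_probs = [
    [0.95, 0.05],  # confident
    [0.55, 0.45],  # uncertain: most informative to label next
    [0.80, 0.20],
]
print(least_confident(pool_probs))  # → 1
```

Iterating this select-label-retrain loop is what lets AL reach comparable performance with a fraction of the manually labeled training data, consistent with the reported savings of up to 92%.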
Can We Speculate Running Application With Server Power Consumption Trace?
Li, Yuanlong; Hu, Han; Wen, Yonggang; Zhang, Jun
2018-05-01
In this paper, we propose to detect the running applications in a server by classifying the observed power consumption series, for the purpose of data center energy consumption monitoring and analysis. The time series classification problem has been extensively studied and various distance measurements have been developed; recently, deep learning-based sequence models have also proved promising. In this paper, we propose a novel distance measurement and build a time series classification algorithm hybridizing nearest neighbor and long short-term memory (LSTM) neural networks. More specifically, we first propose a new distance measurement termed local time warping (LTW), which utilizes a user-specified index set for local warping and is designed to be noncommutative and to avoid dynamic programming. Second, we hybridize the 1-nearest neighbor (1NN)-LTW and LSTM classifiers; in particular, we combine the prediction probability vectors of 1NN-LTW and LSTM to determine the label of the test cases. Finally, using power consumption data from a real data center, we show that the proposed LTW can improve the classification accuracy of dynamic time warping (DTW) from about 84% to 90%. Our experimental results show that the proposed LTW is competitive on our data set compared with existing DTW variants and that its noncommutative feature is indeed beneficial. We also test a linear version of LTW and find that it can perform similarly to state-of-the-art DTW-based methods while running as fast as linear-runtime lower-bound methods like LB_Keogh on our problem. With the hybrid algorithm, we achieve an accuracy of up to about 93% on the power series classification task. Our research can inspire more studies on time series distance measurement and on hybridizing deep learning models with traditional models.
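For context, classic dynamic time warping (the DTW baseline that LTW is compared against) can be implemented with a simple dynamic program. The sketch below is illustrative; LTW itself is the paper's novel measure and is not reproduced here.

```python
# Minimal dynamic-programming implementation of classic DTW.

def dtw(a, b):
    """Return the DTW distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

# Two power-trace-like series: the second is a time-shifted copy,
# so DTW is zero even though pointwise distance would be large.
s1 = [1, 1, 2, 3, 2, 1]
s2 = [1, 2, 3, 2, 1, 1]
print(dtw(s1, s2))  # → 0.0
```

This quadratic-time formulation is exactly what lower-bound methods like LB_Keogh, and the paper's linear LTW variant, aim to avoid in 1NN search.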
Endometrial cancer surgery costs: robot vs laparoscopy.
Holtz, David O; Miroshnichenko, Gennady; Finnegan, Mark O; Chernick, Michael; Dunton, Charles J
2010-01-01
To compare surgical costs for endometrial cancer staging between robotic-assisted and traditional laparoscopic methods. Retrospective chart review from November 2005 to July 2006 (Canadian Task Force classification II-3). Non-university-affiliated teaching hospital. Thirty-three women with diagnosed endometrial cancer undergoing hysterectomy, bilateral salpingo-oophorectomy, and pelvic and paraaortic lymph node resection. Patients underwent either robotic or traditional laparoscopic surgery without randomization. Hospital cost data were obtained for operating room time, instrument use, and disposable items from hospital billing records and provided by the finance department. Separate overall hospital stay costs were also obtained. Mean operative costs were higher for robotic procedures ($3323 vs $2029; p<.001), due in part to longer operating room time ($1549 vs $1335; p=.03). The more significant cost difference was due to disposable instrumentation ($1755 vs $672; p<.001). Total hospital costs were also higher for robotic-assisted procedures ($5084 vs $ 3615; p=.002). Robotic surgery costs were significantly higher than traditional laparoscopy costs for staging of endometrial cancer in this small cohort of patients. Copyright (c) 2010 AAGL. Published by Elsevier Inc. All rights reserved.
[Design and implementation of Chinese materia medica resources survey results display system].
Wang, Hui; Zhang, Xiao-Bo; Ge, Xiao-Guang; Jin, Yan; Wang, Ling; Zhao, Yan-Ping; Jing, Zhi-Xian; Guo, Lan-Ping; Huang, Lu-Qi
2017-11-01
From the beginning of the fourth national census of traditional Chinese medicine resources in 2011, a large amount of data has been collected and compiled, including wild medicinal plant resource data, cultivated medicinal plant information, traditional knowledge, and specimen information. The traditional paper-based recording method is inconvenient for query and application. The B/S architecture, Java Web framework, and SOA are used to design and develop the fourth national census results display platform. Through data integration and sorting, users are provided with integrated data services and data query and display solutions. The platform implements fine-grained data classification and offers simple data retrieval and universal statistical analysis functions. The platform uses ECharts components, GeoServer, OpenLayers, and other technologies to provide a variety of data display forms, such as charts, maps, and other visualizations, intuitively reflecting the number, distribution, and types of Chinese materia medica resources. It meets the data mapping requirements of users at different levels and provides support for management decision-making. Copyright© by the Chinese Pharmaceutical Association.
Simms, Leonard J; Calabrese, William R
2016-02-01
Traditional personality disorders (PDs) are associated with significant psychosocial impairment. DSM-5 Section III includes an alternative hybrid personality disorder (PD) classification approach, with both type and trait elements, but relatively little is known about the impairments associated with Section III traits. Our objective was to study the incremental validity of Section III traits--compared to normal-range traits, traditional PD criterion counts, and common psychiatric symptomatology--in predicting psychosocial impairment. To that end, 628 current/recent psychiatric patients completed measures of PD traits, normal-range traits, traditional PD criteria, psychiatric symptomatology, and psychosocial impairments. Hierarchical regressions revealed that Section III PD traits incrementally predicted psychosocial impairment over normal-range personality traits, PD criterion counts, and common psychiatric symptomatology. In contrast, the incremental effects for normal-range traits, PD symptom counts, and common psychiatric symptomatology were substantially smaller than for PD traits. These findings have implications for PD classification and the impairment literature more generally.
Lu, Huijuan; Wei, Shasha; Zhou, Zili; Miao, Yanzi; Lu, Yi
2015-01-01
The main purpose of traditional classification algorithms in bioinformatics applications is to achieve better classification accuracy. However, these algorithms cannot meet the requirement of minimising the average misclassification cost. In this paper, a new algorithm, cost-sensitive regularised extreme learning machine (CS-RELM), is proposed, using probability estimation and misclassification costs to reconstruct the classification results. By improving the classification accuracy of small-sample groups with higher misclassification costs, the new CS-RELM can minimise the classification cost. A 'rejection cost' was integrated into the CS-RELM algorithm to further reduce the average misclassification cost. Using the Colon Tumour dataset and the SRBCT (Small Round Blue Cell Tumour) dataset, CS-RELM was compared with other cost-sensitive algorithms such as extreme learning machine (ELM), cost-sensitive extreme learning machine, regularised extreme learning machine, and cost-sensitive support vector machine (SVM). The experimental results show that CS-RELM with an embedded rejection cost can reduce the average misclassification cost and make more credible classification decisions than the others.
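Cost-sensitive classifiers of this kind ultimately rest on the expected-misclassification-cost decision rule: predict the class that minimises expected cost under the estimated class probabilities, rather than the most probable class. A minimal sketch, with an invented cost matrix:

```python
# Expected-misclassification-cost decision rule (illustrative only;
# the cost matrix and probabilities below are not from the paper).

def min_expected_cost_class(probs, cost):
    """probs[j]: estimated P(true class = j); cost[j][k]: cost of
    predicting k when the truth is j. Returns the best prediction k."""
    n = len(probs)
    expected = [sum(probs[j] * cost[j][k] for j in range(n))
                for k in range(n)]
    return min(range(n), key=lambda k: expected[k])

# Class 1 (e.g., tumour) is rare but expensive to miss:
cost = [[0, 1],     # true class 0: false positive costs 1
        [10, 0]]    # true class 1: false negative costs 10
probs = [0.7, 0.3]  # an accuracy-maximising rule would choose class 0
print(min_expected_cost_class(probs, cost))  # → 1
```

A rejection option, as used in CS-RELM, extends the same rule with an extra "reject" action whose fixed cost is compared against the expected costs of the class predictions.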
Chen, Hong-Lin; Cao, Ying-Juan; Wang, Jing; Huai, Bao-Sha
2015-09-01
The Braden Scale is the most widely used pressure ulcer risk assessment tool in the world, but the currently used 5 risk classification groups do not accurately discriminate among their risk categories. To optimize risk classification based on Braden Scale scores, a retrospective analysis of all consecutively admitted patients in an acute care facility who were at risk for pressure ulcer development was performed between January 2013 and December 2013. Predicted pressure ulcer incidence was first calculated by a logistic regression model based on the original Braden score. The risk classification was then modified based on the predicted pressure ulcer incidence, and observed incidence was compared between risk categories in the modified (3-group) classification and the traditional (5-group) classification using the chi-square test. Two thousand, six hundred, twenty-five (2,625) patients (mean age 59.8 ± 16.5, range 1 month to 98 years, 1,601 of whom were men) were included in the study; 81 patients (3.1%) developed a pressure ulcer. The predicted pressure ulcer incidence ranged from 0.1% to 49.7%. When the predicted pressure ulcer incidence was greater than 10.0% (high risk), the corresponding Braden scores were less than 11; when the predicted incidence ranged from 1.0% to 10.0% (moderate risk), the corresponding Braden scores ranged from 12 to 16; and when the predicted incidence was less than 1.0% (mild risk), the corresponding Braden scores were greater than 17. In the modified classification, observed pressure ulcer incidence was significantly different between each of the 3 risk categories (P less than 0.05). However, in the traditional classification, the observed incidence was not significantly different between the high-risk and moderate-risk categories (P greater than 0.05) or between the mild-risk and no-risk categories (P greater than 0.05).
If future studies confirm the validity of these findings, pressure ulcer prevention protocols of care based on Braden Scale scores can be simplified.
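The banding described above (predicted incidence greater than 10% high, 1% to 10% moderate, less than 1% mild) can be sketched as follows. The logistic-model coefficients below are invented for illustration; only the band thresholds follow the abstract.

```python
# Sketch of mapping a logistic-regression risk estimate onto the
# modified 3-group classification. Coefficients are hypothetical.

import math

def predicted_incidence(braden_score, intercept=4.0, slope=-0.45):
    """Toy logistic model: lower Braden scores imply higher risk."""
    logit = intercept + slope * braden_score
    return 1.0 / (1.0 + math.exp(-logit))

def risk_group(p):
    if p > 0.10:
        return "high"
    if p >= 0.01:
        return "moderate"
    return "mild"

for score in (9, 14, 20):
    p = predicted_incidence(score)
    print(score, round(p, 3), risk_group(p))
```

With these toy coefficients, a score of 9 lands in the high band, 14 in the moderate band, and 20 in the mild band, mirroring the abstract's correspondence of Braden scores below 11, 12 to 16, and above 17.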
Hip and Wrist Accelerometer Algorithms for Free-Living Behavior Classification.
Ellis, Katherine; Kerr, Jacqueline; Godbole, Suneeta; Staudenmayer, John; Lanckriet, Gert
2016-05-01
Accelerometers are a valuable tool for objective measurement of physical activity (PA). Wrist-worn devices may improve compliance over standard hip placement, but more research is needed to evaluate their validity for measuring PA in free-living settings. Traditional cut-point methods for accelerometers can be inaccurate and need testing in free living with wrist-worn devices. In this study, we developed and tested the performance of machine learning (ML) algorithms for classifying PA types from both hip and wrist accelerometer data. Forty overweight or obese women (mean age = 55.2 ± 15.3 yr; BMI = 32.0 ± 3.7) wore two ActiGraph GT3X+ accelerometers (right hip, nondominant wrist; ActiGraph, Pensacola, FL) for seven free-living days. Wearable cameras captured ground truth activity labels. A classifier consisting of a random forest and hidden Markov model classified the accelerometer data into four activities (sitting, standing, walking/running, and riding in a vehicle). Free-living wrist and hip ML classifiers were compared with each other, with traditional accelerometer cut points, and with an algorithm developed in a laboratory setting. The ML classifier obtained average values of 89.4% and 84.6% balanced accuracy over the four activities using the hip and wrist accelerometer, respectively. In our data set with average values of 28.4 min of walking or running per day, the ML classifier predicted average values of 28.5 and 24.5 min of walking or running using the hip and wrist accelerometer, respectively. Intensity-based cut points and the laboratory algorithm significantly underestimated walking minutes. Our results demonstrate the superior performance of our PA-type classification algorithm, particularly in comparison with traditional cut points. Although the hip algorithm performed better, additional compliance achieved with wrist devices might justify using a slightly lower performing algorithm.
Multiscale CNNs for Brain Tumor Segmentation and Diagnosis.
Zhao, Liya; Jia, Kebin
2016-01-01
Early brain tumor detection and diagnosis are critical in the clinic; segmentation of the focused tumor area therefore needs to be accurate, efficient, and robust. In this paper, we propose an automatic brain tumor segmentation method based on Convolutional Neural Networks (CNNs). Traditional CNNs focus only on local features and ignore global region features, yet both are important for pixel classification and recognition. Moreover, brain tumors can appear anywhere in the brain and can be of any size and shape. We design a three-stream framework, named multiscale CNNs, which automatically detects the optimum top three scales of the image sizes and combines information from the different scales of the regions around each pixel. Datasets provided by the Multimodal Brain Tumor Image Segmentation Benchmark (BRATS), organized by MICCAI 2013, are used for both training and testing. The multiscale CNNs framework also combines multimodal features from T1, T1-enhanced, T2, and FLAIR MRI images. Compared with traditional CNNs and the best two methods in BRATS 2012 and 2013, our framework shows improvements in brain tumor segmentation accuracy and robustness.
Neural Network Based Sensory Fusion for Landmark Detection
NASA Technical Reports Server (NTRS)
Kumbla, Kishan -K.; Akbarzadeh, Mohammad R.
1997-01-01
NASA is planning to send numerous unmanned planetary missions to explore space. This requires autonomous robotic vehicles that can navigate in an unstructured, unknown, and uncertain environment. Landmark-based navigation is a new area of research which differs from traditional goal-oriented navigation, where a mobile robot starts from an initial point and reaches a destination along a pre-planned path. Landmark-based navigation has the advantage of allowing the robot to find its way without communication with the mission control station and without exact knowledge of its coordinates. Current landmark-navigation algorithms, however, have several limitations. First, they require large memories to store the images. Second, comparing the images using traditional methods is computationally intensive, so real-time implementation is difficult. The method proposed here consists of three stages. The first stage uses a heuristic-based algorithm to identify significant objects. The second stage uses a neural network (NN) to efficiently classify images of the identified objects. The third stage combines distance information with the classification results of the neural networks for efficient and intelligent navigation.
Hierarchical minutiae matching for fingerprint and palmprint identification.
Chen, Fanglin; Huang, Xiaolin; Zhou, Jie
2013-12-01
Fingerprints and palmprints are the most common authentic biometrics for personal identification, especially in forensic security. Previous approaches have been proposed to speed up the searching process in fingerprint and palmprint identification systems, such as those based on classification or indexing, but a deterioration of identification accuracy is hard to avert. In this paper, a novel hierarchical minutiae matching algorithm for fingerprint and palmprint identification systems is proposed. This method decomposes the matching step into several stages and rejects many false fingerprints or palmprints at the different stages, so it saves much time while preserving a high identification rate. Experimental results show that the proposed algorithm can save almost 50% of the searching time compared with traditional methods, illustrating its effectiveness.
Liang, Wenyi; Chen, Wenjing; Wu, Lingfang; Li, Shi; Qi, Qi; Cui, Yaping; Liang, Linjin; Ye, Ting; Zhang, Lanzhen
2017-03-17
Danshen, the dried root of Salvia miltiorrhiza Bge., is a widely used, commercially available herbal drug, and the unstable quality of different samples is a current issue. This study focused on a comprehensive and systematic method combining fingerprints and chemical identification with chemometrics for the discrimination and quality assessment of Danshen samples. Twenty-five samples were analyzed by HPLC-PAD and HPLC-MSn. Forty-nine components were identified, and characteristic fragmentation regularities were summarized for further interpretation of bioactive components. Chemometric analyses, including hierarchical cluster analysis, principal component analysis, and partial least squares discriminant analysis, were employed to differentiate samples and clarify the quality differences of Danshen. The consistent result was that the samples were divided into three categories, reflecting the differences in quality of the Danshen samples. Analysis of the reasons for the sample classification revealed that the processing method had a more obvious impact on classification than the geographical origin: it induced different contents of bioactive compounds and finally led to different qualities. Cryptotanshinone, trijuganone B, and 15,16-dihydrotanshinone I were screened out as markers to distinguish samples processed by different methods. The developed strategy could provide a reference for the evaluation and discrimination of other traditional herbal medicines.
NASA Astrophysics Data System (ADS)
Baumstark, R. D.; Duffey, R.; Pu, R.
2016-12-01
The offshore extent of seagrass habitat along the West Florida (USA) coast represents an important corridor for inshore-offshore migration of economically important fish and shellfish. Surviving at the fringe of light requirements, offshore seagrass beds are sensitive to changes in water clarity. Beyond and intermingled with the offshore seagrass areas are large swaths of colonized hard bottom. These offshore habitats of the West Florida coast have lacked mapping efforts needed for status and trends monitoring. The objective of this study was to propose an object-based classification method for mapping offshore habitats and to compare results to traditional photo-interpreted maps. Benthic maps depicting the spatial distribution and percent biological cover were created from WorldView-2 satellite imagery using Object Based Image Analysis (OBIA) method and a visual photo-interpretation method. A logistic regression analysis identified depth and distance from shore as significant parameters for discriminating spectrally similar seagrass and colonized hard bottom features. Seagrass, colonized hard bottom and unconsolidated sediment (sand) were mapped with 78% overall accuracy using the OBIA method compared to 71% overall accuracy using the photo-interpretation method. This study presents an alternative for mapping deeper, offshore habitats capable of producing higher thematic (percent biological cover) and spatial resolution maps compared to those created with the traditional photo-interpretation method.
Characterization and delineation of caribou habitat on Unimak Island using remote sensing techniques
NASA Astrophysics Data System (ADS)
Atkinson, Brain M.
The assessment of herbivore habitat quality is traditionally based on quantifying the forages available to the animal across its home range through ground-based techniques. While these methods are highly accurate, they can be time-consuming and very expensive, especially for herbivores that occupy vast landscapes. The Unimak Island caribou herd has been decreasing over the last decade at rates that have prompted discussion of management intervention, and frequent inclement weather in this region of Alaska has provided little opportunity to study the caribou forage habitat on Unimak Island. The objectives of this study were two-fold: 1) to assess the feasibility of using high-resolution color and near-infrared aerial imagery to map the forage distribution of caribou habitat on Unimak Island, and 2) to assess the use of a new high-resolution multispectral satellite imagery platform, RapidEye, and of its "red-edge" spectral band for vegetation classification accuracy. Maximum likelihood classification algorithms were used to create land cover maps from the aerial and satellite imagery. Accuracy assessments and transformed divergence values were produced to assess vegetative spectral information and classification accuracy. By using RapidEye and aerial digital imagery in a hierarchical supervised classification technique, we were able to produce a high-resolution land cover map of Unimak Island. We obtained an overall accuracy rate of 71.4 percent, which is comparable to other land cover maps using RapidEye imagery. The "red-edge" spectral band included in the RapidEye imagery provides additional spectral information that allows for a more accurate overall classification, raising overall accuracy by 5.2 percent.
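Maximum likelihood classification, as used above, models each land cover class as a multivariate Gaussian over the spectral bands and assigns each pixel to the most likely class. The sketch below uses a diagonal covariance and invented class statistics (including a "red-edge" band) purely for illustration.

```python
# Toy maximum likelihood classifier with per-band Gaussian class models
# (diagonal covariance). Class statistics are invented for the sketch.

import math

CLASS_STATS = {
    # class: per-band (mean, variance) for (red, red-edge, NIR)
    "shrub":  [(0.10, 0.002), (0.25, 0.004), (0.40, 0.010)],
    "grass":  [(0.08, 0.002), (0.35, 0.004), (0.55, 0.010)],
    "barren": [(0.30, 0.010), (0.32, 0.010), (0.35, 0.010)],
}

def log_likelihood(pixel, stats):
    """Sum of per-band Gaussian log-densities for one pixel."""
    ll = 0.0
    for x, (mean, var) in zip(pixel, stats):
        ll += -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
    return ll

def classify(pixel):
    """Assign the pixel to the class with the highest likelihood."""
    return max(CLASS_STATS, key=lambda c: log_likelihood(pixel, CLASS_STATS[c]))

print(classify((0.09, 0.34, 0.53)))  # → grass
```

In practice the class means and covariances are estimated from training polygons, and measures such as transformed divergence (used in the study) quantify how separable these class distributions are.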
[Road Extraction in Remote Sensing Images Based on Spectral and Edge Analysis].
Zhao, Wen-zhi; Luo, Li-qun; Guo, Zhou; Yue, Jun; Yu, Xue-ying; Liu, Hui; Wei, Jing
2015-10-01
Roads are typically man-made objects in urban areas. Road extraction from high-resolution images has important applications for urban planning and transportation development. However, due to the confusion of spectral characteristics, it is difficult to distinguish roads from other objects by merely using traditional classification methods that mainly depend on spectral information. Edge is an important feature for the identification of linear objects (e.g., roads). The distribution patterns of edges vary greatly among different objects. It is crucial to merge edge statistical information into spectral information. In this study, a new method that combines spectral information and edge statistical features is proposed. First, edge detection is conducted using a self-adaptive mean-shift algorithm on the panchromatic band, which can greatly reduce pseudo-edges and noise effects. Then, edge statistical features are obtained from the edge statistical model, which measures the length and angle distribution of edges. Finally, by integrating the spectral and edge statistical features, the SVM algorithm is used to classify the image and roads are ultimately extracted. A series of experiments shows that the overall accuracy of the proposed method is 93%, compared with only 78% for the traditional approach. The results demonstrate that the proposed method is efficient and valuable for road extraction, especially on high-resolution images.
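The edge statistical model measures length and angle distributions of edges; a minimal sketch of that feature step is below. The two summary statistics chosen (mean edge length, angle-histogram entropy) are illustrative stand-ins for the paper's actual features, and the SVM stage is omitted.

```python
import math
from collections import Counter

def edge_statistics(edges, n_angle_bins=8):
    """Summarize detected edges, each given as (length, angle_deg).
    Returns mean edge length plus the entropy of the angle histogram:
    roads tend to produce long edges concentrated in few directions
    (low entropy), while clutter spreads across many directions."""
    lengths = [l for l, _ in edges]
    mean_len = sum(lengths) / len(lengths)
    bins = Counter(int(a % 180 / (180 / n_angle_bins)) for _, a in edges)
    total = sum(bins.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in bins.values())
    return [mean_len, entropy]

def fuse_features(spectral, edges):
    """Concatenate per-segment spectral values with edge statistics before
    feeding a classifier (an SVM in the paper)."""
    return list(spectral) + edge_statistics(edges)
```

Long, direction-aligned edges yield zero angle entropy; short edges scattered over four directions yield the maximum for that count.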
ERIC Educational Resources Information Center
Sanches-Ferreira, Manuela; Silveira-Maia, Mónica; Alves, Sílvia
2014-01-01
Portugal was the first country decreeing the mandatory use of the International Classification of Functioning, Disability and Health: Child and Youth (ICF-CY) framework for guiding special education assessment process and to base eligibility decision-making on students' functioning profiles--in contrast with traditional approaches centred on…
Fusing Laser Reflectance and Image Data for Terrain Classification for Small Autonomous Robots
2014-12-01
limit us to low power, lightweight sensors, and a maximum range of approximately 5 meters. Contrast these robot characteristics to typical terrain... classification work which uses large autonomous ground vehicles with sensors mounted high above the ground. Terrain classification for small autonomous... into predefined classes [10], [11]. However, wheeled vehicles offer the ability to use non-traditional sensors such as vibration sensors [12] and
Zhao, Dehua; Jiang, Hao; Yang, Tangwu; Cai, Ying; Xu, Delin; An, Shuqing
2012-03-01
Classification trees (CT) have been used successfully in the past to classify aquatic vegetation from spectral indices (SI) obtained from remotely-sensed images. However, applying CT models developed for certain image dates to other time periods within the same year or among different years can reduce the classification accuracy. In this study, we developed CT models with modified thresholds using extreme SI values (CT(m)) to improve the stability of the models when applying them to different time periods. A total of 903 ground-truth samples were obtained in September of 2009 and 2010 and classified as emergent, floating-leaf, or submerged vegetation or other cover types. Classification trees were developed for 2009 (Model-09) and 2010 (Model-10) using field samples and a combination of two images from winter and summer. Overall accuracies of these models were 92.8% and 94.9%, respectively, which confirmed the ability of CT analysis to map aquatic vegetation in Taihu Lake. However, Model-10 had only 58.9-71.6% classification accuracy and 31.1-58.3% agreement (i.e., pixels classified the same in the two maps) for aquatic vegetation when it was applied to image pairs from both a different time period in 2010 and a similar time period in 2009. We developed a method to estimate the effects of extrinsic (EF) and intrinsic (IF) factors on model uncertainty using Modis images. Results indicated that 71.1% of the instability in classification between time periods was due to EF, which might include changes in atmospheric conditions, sun-view angle and water quality. The remainder was due to IF, such as phenological and growth status differences between time periods. The modified version of Model-10 (i.e. CT(m)) performed better than traditional CT with different image dates. 
When applied to 2009 images, the CT(m) version of Model-10 had very similar thresholds and performance as Model-09, with overall accuracies of 92.8% and 90.5% for Model-09 and the CT(m) version of Model-10, respectively. CT(m) decreased the variability related to EF and IF and thereby improved the applicability of the models to different time periods. In both practice and theory, our results suggested that CT(m) was more stable than traditional CT models and could be used to map aquatic vegetation in time periods other than the one for which the model was developed. Copyright © 2011 Elsevier Ltd. All rights reserved.
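The CT(m) idea of rescaling fitted thresholds by the extreme SI values of a new image date can be sketched as below. The toy classification tree and all thresholds are illustrative placeholders, not those fitted for Taihu Lake.

```python
def classify_pixel(ndvi_winter, ndvi_summer, t_emerg=0.45, t_float=0.25):
    """Toy two-level classification tree on winter/summer spectral indices.
    Thresholds are illustrative, not the study's fitted values."""
    if ndvi_summer >= t_emerg:
        return "emergent"
    if ndvi_summer >= t_float:
        return "floating-leaf" if ndvi_winter < 0.1 else "other"
    return "submerged" if ndvi_winter < 0.05 else "other"

def modified_threshold(t, si_min, si_max, si_min_new, si_max_new):
    """CT(m)-style modification: rescale a fitted threshold so its relative
    position within the image's extreme SI values is preserved when the
    model is applied to an image from a different date."""
    rel = (t - si_min) / (si_max - si_min)
    return si_min_new + rel * (si_max_new - si_min_new)
```

Anchoring thresholds to the per-image extremes is one way to absorb extrinsic shifts (atmosphere, sun-view angle, water quality) between image dates.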
Searching bioremediation patents through Cooperative Patent Classification (CPC).
Prasad, Rajendra
2016-03-01
Patent classification systems have traditionally evolved independently at each patent jurisdiction to classify patents handled by their examiners to be able to search previous patents while dealing with new patent applications. As patent databases maintained by them went online for free access to public as also for global search of prior art by examiners, the need arose for a common platform and uniform structure of patent databases. The diversity of different classification, however, posed problems of integrating and searching relevant patents across patent jurisdictions. To address this problem of comparability of data from different sources and searching patents, WIPO in the recent past developed what is known as International Patent Classification (IPC) system which most countries readily adopted to code their patents with IPC codes along with their own codes. The Cooperative Patent Classification (CPC) is the latest patent classification system based on IPC/European Classification (ECLA) system, developed by the European Patent Office (EPO) and the United States Patent and Trademark Office (USPTO) which is likely to become a global standard. This paper discusses this new classification system with reference to patents on bioremediation.
Lambert, Thomas; Nahler, Alexander; Rohla, Miklos; Reiter, Christian; Grund, Michael; Kammler, Jürgen; Blessberger, Hermann; Kypta, Alexander; Kellermair, Jörg; Schwarz, Stefan; Starnawski, Jennifer A; Lichtenauer, Michael; Weiss, Thomas W; Huber, Kurt; Steinwender, Clemens
2016-10-01
Defining an adequate endpoint for renal denervation trials represents a major challenge. A high inter-individual and intra-individual variability of blood pressure levels as well as a partial or total non-adherence on antihypertensive drugs hamper treatment evaluations after renal denervation. Blood pressure measurements at a single point in time as used as primary endpoint in most clinical trials on renal denervation, might not be sufficient to discriminate between patients who do or do not respond to renal denervation. We compared the traditional responder classification (defined as systolic 24-hour blood pressure reduction of -5mmHg six months after renal denervation) with a novel definition of an ideal respondership (based on a 24h blood pressure reduction at no point in time, one, or all follow-up timepoints). We were able to re-classify almost a quarter of patients. Blood pressure variability was substantial in patients traditionally defined as responders. On the other hand, our novel classification of an ideal respondership seems to be clinically superior in discriminating sustained from pseudo-response to renal denervation. Based on our observations, we recommend that the traditional response classification should be reconsidered and possibly strengthened by using a composite endpoint of 24h-BP reductions at different follow-up-visits. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
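The two responder definitions contrasted in this abstract can be sketched as follows. The grading of the composite definition (none / some / all follow-up visits) is a reading of the abstract, not the trial's exact protocol, and the labels are illustrative.

```python
def traditional_responder(bp_reduction_6mo, threshold=-5.0):
    """Traditional definition: systolic 24h blood pressure change at the
    six-month visit of at least 5 mmHg reduction."""
    return bp_reduction_6mo <= threshold

def respondership(bp_reductions, threshold=-5.0):
    """Composite definition sketched from the abstract: grade response by
    how many follow-up visits show the reduction (none / some / all)."""
    hits = sum(1 for r in bp_reductions if r <= threshold)
    if hits == 0:
        return "non-responder"
    if hits == len(bp_reductions):
        return "ideal responder"
    return "partial responder"
```

A patient whose single six-month reading happens to dip below the threshold would pass the traditional test yet be only a partial responder under the composite one, which is the re-classification the study reports.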
An ensemble predictive modeling framework for breast cancer classification.
Nagarajan, Radhakrishnan; Upreti, Meenakshi
2017-12-01
Molecular changes often precede clinical presentation of diseases and can be useful surrogates with potential to assist in informed clinical decision making. Recent studies have demonstrated the usefulness of modeling approaches such as classification that can predict the clinical outcomes from molecular expression profiles. While useful, a majority of these approaches implicitly use all molecular markers as features in the classification process often resulting in sparse high-dimensional projection of the samples often comparable to that of the sample size. In this study, a variant of the recently proposed ensemble classification approach is used for predicting good and poor-prognosis breast cancer samples from their molecular expression profiles. In contrast to traditional single and ensemble classifiers, the proposed approach uses multiple base classifiers with varying feature sets obtained from two-dimensional projection of the samples in conjunction with a majority voting strategy for predicting the class labels. In contrast to our earlier implementation, base classifiers in the ensembles are chosen based on maximal sensitivity and minimal redundancy by choosing only those with low average cosine distance. The resulting ensemble sets are subsequently modeled as undirected graphs. Performance of four different classification algorithms is shown to be better within the proposed ensemble framework in contrast to using them as traditional single classifier systems. Significance of a subset of genes with high-degree centrality in the network abstractions across the poor-prognosis samples is also discussed. Copyright © 2017 Elsevier Inc. All rights reserved.
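Two primitives of the ensemble framework, the cosine distance used to screen redundant base classifiers and the majority vote over their predictions, can be sketched directly. The base-classifier construction itself (two-dimensional sample projections) is omitted.

```python
import math
from collections import Counter

def cosine_distance(u, v):
    """1 - cosine similarity; small values mean two classifiers (or their
    prediction/score vectors) are nearly redundant."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

def majority_vote(predictions):
    """predictions: one label list per base classifier, aligned by sample.
    Returns the per-sample majority label."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*predictions)]
```

In the paper, base classifiers with low average cosine distance to the rest are the ones retained as minimally redundant; the vote then aggregates their good/poor-prognosis calls.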
Robust gene selection methods using weighting schemes for microarray data analysis.
Kang, Suyeon; Song, Jongwoo
2017-09-02
A common task in microarray data analysis is to identify informative genes that are differentially expressed between two different states. Owing to the high-dimensional nature of microarray data, identification of significant genes has been essential in analyzing the data. However, the performances of many gene selection techniques are highly dependent on the experimental conditions, such as the presence of measurement error or a limited number of sample replicates. We have proposed new filter-based gene selection techniques, by applying a simple modification to significance analysis of microarrays (SAM). To prove the effectiveness of the proposed method, we considered a series of synthetic datasets with different noise levels and sample sizes along with two real datasets. The following findings were made. First, our proposed methods outperform conventional methods for all simulation set-ups. In particular, our methods are much better when the given data are noisy and sample size is small. They showed relatively robust performance regardless of noise level and sample size, whereas the performance of SAM became significantly worse as the noise level became high or sample size decreased. When sufficient sample replicates were available, SAM and our methods showed similar performance. Finally, our proposed methods are competitive with traditional methods in classification tasks for microarrays. The results of simulation study and real data analysis have demonstrated that our proposed methods are effective for detecting significant genes and classification tasks, especially when the given data are noisy or have few sample replicates. By employing weighting schemes, we can obtain robust and reliable results for microarray data analysis.
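A SAM-style relative difference statistic, the starting point the paper modifies, can be sketched as below. The small-sample weight shown is an invented illustration of the idea of a weighting scheme, not the authors' actual scheme.

```python
import math

def sam_statistic(group1, group2, s0=0.1):
    """Simplified SAM-style score for one gene: mean difference divided by
    the standard error plus a fudge constant s0 that damps genes with tiny
    variance. The multiplicative weight here is an illustrative shrinkage
    for small sample sizes, standing in for the paper's weighting schemes."""
    n1, n2 = len(group1), len(group2)
    m1 = sum(group1) / n1
    m2 = sum(group2) / n2
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    se = math.sqrt(v1 / n1 + v2 / n2)
    weight = (n1 + n2) / (n1 + n2 + 2)  # illustrative small-sample shrinkage
    return weight * (m1 - m2) / (se + s0)
```

Genes are then ranked by the absolute value of this score; a strongly differential gene scores far from zero while a flat gene scores near zero.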
NASA Astrophysics Data System (ADS)
Tarando, Sebastian Roberto; Fetita, Catalin; Brillet, Pierre-Yves
2017-03-01
The infiltrative lung diseases are a class of irreversible, non-neoplastic lung pathologies requiring regular follow-up with CT imaging. Quantifying the evolution of the patient status requires the development of automated classification tools for lung texture. Traditionally, such classification relies on a two-dimensional analysis of axial CT images. This paper proposes a cascade of existing CNN-based CAD systems, specifically tuned up. The advantage of using a deep learning approach is a better regularization of the classification output. In a preliminary evaluation, the combined approach was tested on a 13-patient database of various lung pathologies, showing an increase of 10% in True Positive Rate (TPR) with respect to the best-suited state-of-the-art CNN for this task.
Commission 45: Spectral Classification
NASA Astrophysics Data System (ADS)
Giridhar, Sunetra; Gray, Richard O.; Corbally, Christopher J.; Bailer-Jones, Coryn A. L.; Eyer, Laurent; Irwin, Michael J.; Kirkpatrick, J. Davy; Majewski, Steven; Minniti, Dante; Nordström, Birgitta
This report gives an update of developments (since the last General Assembly at Prague) in the areas that are of relevance to the commission. In addition to numerous papers, a new monograph entitled Stellar Spectral Classification with Richard Gray and Chris Corbally as leading authors will be published by Princeton University Press as part of their Princeton Series in Astrophysics in April 2009. This book is an up-to-date and encyclopedic review of stellar spectral classification across the H-R diagram, including the traditional MK system in the blue-violet, recent extensions into the ultraviolet and infrared, the newly defined L-type and T-type spectral classes, as well as spectral classification of carbon stars, S-type stars, white dwarfs, novae, supernovae and Wolf-Rayet stars.
Ludwig, Simone A; Kong, Jun
2017-12-01
Innovative methods and new technologies have significantly improved the quality of our daily life. However, disabled people, for example those who cannot use their arms and legs anymore, often cannot benefit from these developments, since they cannot use their hands to interact with traditional interaction methods (such as mouse or keyboard) to communicate with a computer system. A brain-computer interface (BCI) system allows such a disabled person to control an external device via brain waves. Past research mostly dealt with static interfaces, which limit users to a stationary location. However, since we are living in a world that is highly mobile, this paper evaluates a speller interface on a mobile phone used in a moving condition. The spelling experiments were conducted with 14 able-bodied subjects using visual flashes as the stimulus to spell 47 alphanumeric characters (38 letters and 9 numbers). This data was then used for the classification experiments. In particular, two research directions are pursued. The first investigates the impact of different classification algorithms, and the second direction looks at the channel configuration, i.e., which channels are most beneficial in terms of achieving the highest classification accuracy. The evaluation results indicate that the Bayesian Linear Discriminant Analysis algorithm achieves the best accuracy. Also, the findings of the investigation on the channel configuration, which can potentially reduce the amount of data processing on a mobile device with limited computing capacity, are especially useful in mobile BCIs.
Nonlinear Classification of AVO Attributes Using SVM
NASA Astrophysics Data System (ADS)
Zhao, B.; Zhou, H.
2005-05-01
A key research topic in reservoir characterization is the detection of the presence of fluids using seismic and well-log data. In particular, partial gas discrimination is very challenging because low and high gas saturation can result in similar anomalies in terms of Amplitude Variation with Offset (AVO), bright spot, and velocity sag. Hence, a successful fluid detection will require a good understanding of the seismic signatures of the fluids, high-quality data, and good detection methodology. Traditional attempts of partial gas discrimination employ the Neural Network algorithm. A new approach is to use the Support Vector Machine (SVM) (Vapnik, 1995; Liu and Sacchi, 2003). While the potential of the SVM has not been fully explored for reservoir fluid detection, the current nonlinear methods classify seismic attributes without the use of rock physics constraints. The objective of this study is to improve the capability of distinguishing a fizz-water reservoir from a commercial gas reservoir by developing a new detection method using AVO attributes and rock physics constraints. This study will first test the SVM classification with synthetic data, and then apply the algorithm to field data from the King-Kong and Lisa-Anne fields in Gulf of Mexico. While both field areas have high amplitude seismic anomalies, King-Kong field produces commercial gas but Lisa-Anne field does not. We expect that the new SVM-based nonlinear classification of AVO attributes may be able to separate commercial gas from fizz-water in these two fields.
Optoelectronic methods in potential application in monitoring of environmental conditions
NASA Astrophysics Data System (ADS)
Mularczyk-Oliwa, Monika; Bombalska, Aneta; Kwaśny, Mirosław; Kopczyński, Krzysztof; Włodarski, Maksymilian; Kaliszewski, Miron; Kostecki, Jerzy
2016-12-01
Allergic rhinitis, also known as hay fever, is a type of inflammation which occurs when the immune system overreacts to allergens in the air. It has become the most common disease among people, so it is important to monitor air content for the presence of particular types of allergens. For the purposes of environmental monitoring there is a need to supplement the traditional methods of pollen identification with faster and more accurate systems. The aim of the work was the characterization and classification of certain types of plant pollens by using laser optical methods, supported by chemometrics. Several species of pollen were examined, for which a database of spectral characteristics was created, using LIF, Raman scattering and FTIR methods. The spectral database contains characteristics of both common allergens and pollen of minor importance. Based on the registered spectra, statistical analysis was made, which allows the classification of the tested pollen species. For the study of the emission spectra an Nd:YAG laser with fourth-harmonic generation (266 nm) and a GaN diode laser (375 nm) were used. For Raman scattering spectra a Nicolet IS-50 spectrometer with an excitation wavelength of 1064 nm was used. The FTIR spectra, recorded in the mid-infrared range (4000-650 cm-1), were collected using transmission mode (KBr pellet), ATR and DRIFT.
NASA Astrophysics Data System (ADS)
Schmalz, M.; Ritter, G.
Accurate multispectral or hyperspectral signature classification is key to the nonimaging detection and recognition of space objects. Additionally, signature classification accuracy depends on accurate spectral endmember determination [1]. Previous approaches to endmember computation and signature classification were based on linear operators or neural networks (NNs) expressed in terms of the algebra (R, +, x) [1,2]. Unfortunately, class separation in these methods tends to be suboptimal, and the number of signatures that can be accurately classified often depends linearly on the number of NN inputs. This can lead to poor endmember distinction, as well as potentially significant classification errors in the presence of noise or densely interleaved signatures. In contrast to traditional NNs, autoassociative morphological memories (AMMs) are a construct similar to Hopfield autoassociative memories defined on the (R, +, ∨, ∧) lattice algebra [3]. Unlimited storage and perfect recall of noiseless real-valued patterns has been proven for AMMs [4]. However, AMMs suffer from sensitivity to specific noise models, which can be characterized as erosive and dilative noise. On the other hand, the prior definition of a set of endmembers corresponds to material spectra lying on vertices of the minimum convex region covering the image data. These vertices can be characterized as morphologically independent patterns. It has further been shown that AMMs can be based on dendritic computation [3,6]. These techniques yield improved accuracy and class segmentation/separation ability in the presence of highly interleaved signature data. In this paper, we present a procedure for endmember determination based on AMM noise sensitivity, which employs morphological dendritic computation.
We show that detected endmembers can be exploited by AMM based classification techniques, to achieve accurate signature classification in the presence of noise, closely spaced or interleaved signatures, and simulated camera optical distortions. In particular, we examine two critical cases: (1) classification of multiple closely spaced signatures that are difficult to separate using distance measures, and (2) classification of materials in simulated hyperspectral images of spaceborne satellites. In each case, test data are derived from a NASA database of space material signatures. Additional analysis pertains to computational complexity and noise sensitivity, which are superior to classical NN based techniques.
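The autoassociative morphological memory at the core of this approach admits a very small sketch: the W memory stores pairwise min-differences of the patterns and recalls with a max-plus product. The tiny patterns below are illustrative; perfect recall of noiseless stored patterns and robustness to erosive noise are the properties cited in the abstract.

```python
def amm_train(patterns):
    """Build the W autoassociative morphological memory over the lattice
    algebra: W[i][j] = min over stored patterns x of x[i] - x[j]."""
    n = len(patterns[0])
    return [[min(p[i] - p[j] for p in patterns) for j in range(n)]
            for i in range(n)]

def amm_recall(W, x):
    """Recall with the max-plus product: y[i] = max_j (W[i][j] + x[j])."""
    n = len(W)
    return [max(W[i][j] + x[j] for j in range(n)) for i in range(n)]
```

Stored patterns are recalled exactly, and an input whose components have been eroded (decreased) can still be restored by the max-plus recall, matching the erosive-noise tolerance of the min-based memory.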
Automated texture-based identification of ovarian cancer in confocal microendoscope images
NASA Astrophysics Data System (ADS)
Srivastava, Saurabh; Rodriguez, Jeffrey J.; Rouse, Andrew R.; Brewer, Molly A.; Gmitro, Arthur F.
2005-03-01
The fluorescence confocal microendoscope provides high-resolution, in-vivo imaging of cellular pathology during optical biopsy. There are indications that the examination of human ovaries with this instrument has diagnostic implications for the early detection of ovarian cancer. The purpose of this study was to develop a computer-aided system to facilitate the identification of ovarian cancer from digital images captured with the confocal microendoscope system. To achieve this goal, we modeled the cellular-level structure present in these images as texture and extracted features based on first-order statistics, spatial gray-level dependence matrices, and spatial-frequency content. Selection of the best features for classification was performed using traditional feature selection techniques including stepwise discriminant analysis, forward sequential search, a non-parametric method, principal component analysis, and a heuristic technique that combines the results of these methods. The best set of features selected was used for classification, and performance of various machine classifiers was compared by analyzing the areas under their receiver operating characteristic curves. The results show that it is possible to automatically identify patients with ovarian cancer based on texture features extracted from confocal microendoscope images and that the machine performance is superior to that of the human observer.
Plenis, Alina; Rekowska, Natalia; Bączek, Tomasz
2016-01-01
This article focuses on correlating the column classification obtained from the method created at the Katholieke Universiteit Leuven (KUL), with the chromatographic resolution attained in biomedical separation. In the KUL system, each column is described with four parameters, which enables estimation of the FKUL value characterising similarity of those parameters to the selected reference stationary phase. Thus, a ranking list based on the FKUL value can be calculated for the chosen reference column, then correlated with the results of the column performance test. In this study, the column performance test was based on analysis of moclobemide and its two metabolites in human plasma by liquid chromatography (LC), using 18 columns. The comparative study was performed using traditional correlation of the FKUL values with the retention parameters of the analytes describing the column performance test. In order to deepen the comparative assessment of both data sets, factor analysis (FA) was also used. The obtained results indicated that the stationary phase classes, closely related according to the KUL method, yielded comparable separation for the target substances. Therefore, the column ranking system based on the FKUL-values could be considered supportive in the choice of the appropriate column for biomedical analysis. PMID:26805819
Application of diffusion maps to identify human factors of self-reported anomalies in aviation.
Andrzejczak, Chris; Karwowski, Waldemar; Mikusinski, Piotr
2012-01-01
A study investigating what factors are present leading to pilots submitting voluntary anomaly reports regarding their flight performance was conducted. Diffusion Maps (DM) were selected as the method of choice for performing dimensionality reduction on text records for this study. Diffusion Maps have seen successful use in other domains such as image classification and pattern recognition. High-dimensionality data in the form of narrative text reports from the NASA Aviation Safety Reporting System (ASRS) were clustered and categorized by way of dimensionality reduction. Supervised analyses were performed to create a baseline document clustering system. Dimensionality reduction techniques identified concepts or keywords within records, and allowed the creation of a framework for an unsupervised document classification system. Results from the unsupervised clustering algorithm performed similarly to the supervised methods outlined in the study. The dimensionality reduction was performed on 100 of the most commonly occurring words within 126,000 text records describing commercial aviation incidents. This study demonstrates that unsupervised machine clustering and organization of incident reports is possible based on unbiased inputs. Findings from this study reinforced traditional views on what factors contribute to civil aviation anomalies, however, new associations between previously unrelated factors and conditions were also found.
Iravani, B; Towhidkhah, F; Roghani, M
2014-12-01
Parkinson's disease (PD) is one of the most common neurological disorders worldwide. Different treatments such as medication and deep brain stimulation (DBS) have been proposed to minimize and control Parkinson's symptoms. DBS has been recognized as an effective approach to decrease most movement disorders of PD. In this study, a new method is proposed for feature extraction and separation of treated and untreated Parkinsonian rats. For this purpose, unilateral intrastriatal 6-hydroxydopamine (6-OHDA, 12.5 μg/5 μl of saline-ascorbate)-lesioned rats were treated with DBS. We performed a behavioral experiment and video-tracked the traveled trajectories of the rats. Then, we investigated the effect of deep brain stimulation of the subthalamic nucleus on their behavioral movements. Time, frequency and chaotic features of the traveled trajectories were extracted. These features provide the ability to quantify the behavioral movements of Parkinsonian rats. The results showed that the traveled trajectories of untreated rats were more convoluted, with different time/frequency responses. Compared to the traditional features used before to quantify the animals' behavior, the new features improved classification accuracy up to 80% for untreated and treated rats.
Stringlike Pulse Quantification Study by Pulse Wave in 3D Pulse Mapping
Chung, Yu-Feng; Yeh, Cheng-Chang; Si, Xiao-Chen; Chang, Chien-Chen; Hu, Chung-Shing; Chu, Yu-Wen
2012-01-01
Background: A stringlike pulse is highly related to hypertension, and many classification approaches have been proposed in which the differentiation pulse wave (dPW) can effectively classify the stringlike pulse indicating hypertension. Unfortunately, the dPW method cannot distinguish the spring stringlike pulse from the stringlike pulse so labeled by physicians in clinics. Design: By using a Bi-Sensing Pulse Diagnosis Instrument (BSPDI), this study proposed a novel Plain Pulse Wave (PPW) to classify a stringlike pulse based on an array of pulse signals, mimicking a Traditional Chinese Medicine physician's finger-reading skill. Results: In comparison to PPWs at different pulse-taking positions, phase delay Δθ and correlation coefficient r can be elucidated as the quantification parameters of stringlike pulse. As a result, the recognition rates of a hypertensive stringlike pulse, spring stringlike pulse, and non-stringlike pulse are 100%, 100%, 77% for PPW and 70%, 0%, 59% for dPW, respectively. Conclusions: Integrating dPW and PPW can unify the classification of stringlike pulse, including hypertensive stringlike pulse and spring stringlike pulse. Hence, the proposed novel method, PPW, enhances quantification of stringlike pulse. PMID:23057481
Measures of native and non-native rhythm in a quantity language.
Stockmal, Verna; Markus, Dace; Bond, Dzintra
2005-01-01
The traditional phonetic classification of language rhythm as stress-timed or syllable-timed is attributed to Pike. Recently, two different proposals have been offered for describing the rhythmic structure of languages from acoustic-phonetic measurements. Ramus has suggested a metric based on the proportion of vocalic intervals and the variability (SD) of consonantal intervals. Grabe has proposed Pairwise Variability Indices (nPVI, rPVI) calculated from the differences in vocalic and consonantal durations between successive syllables. We have calculated both the Ramus and Grabe metrics for Latvian, traditionally considered a syllable rhythm language, and for Latvian as spoken by Russian learners. Native speakers and proficient learners were very similar whereas low-proficiency learners showed high variability on some properties. The metrics did not provide an unambiguous classification of Latvian.
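The two Pairwise Variability Indices mentioned here have simple closed forms and can be computed directly from interval durations (in ms); the sketch below follows the standard definitions.

```python
def rPVI(durations):
    """Raw Pairwise Variability Index: mean absolute difference between
    successive interval durations (used for consonantal intervals)."""
    pairs = list(zip(durations, durations[1:]))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

def nPVI(durations):
    """Normalized PVI: each successive difference is divided by the mean of
    the pair, damping overall speech-rate effects (used for vocalic
    intervals); scaled by 100 by convention."""
    pairs = list(zip(durations, durations[1:]))
    return 100 * sum(abs(a - b) / ((a + b) / 2) for a, b in pairs) / len(pairs)
```

Perfectly even durations give an nPVI of zero (syllable-timed extreme), while alternating long and short intervals drive it up (stress-timed extreme), which is how the metric separates rhythm classes.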
Multiscale Persistent Functions for Biomolecular Structure Characterization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xia, Kelin; Li, Zhiming; Mu, Lin
In this paper, we introduce multiscale persistent functions for biomolecular structure characterization. The essential idea is to combine our multiscale rigidity functions (MRFs) with persistent homology analysis, so as to construct a series of multiscale persistent functions, particularly multiscale persistent entropies, for structure characterization. To clarify the fundamental idea of our method, the multiscale persistent entropy (MPE) model is discussed in great detail. Mathematically, unlike the previous persistent entropy (Chintakunta et al. in Pattern Recognit 48(2):391–401, 2015; Merelli et al. in Entropy 17(10):6872–6892, 2015; Rucco et al. in: Proceedings of ECCS 2014, Springer, pp 117–128, 2016), a special resolution parameter is incorporated into our model. Various scales can be achieved by tuning its value. Physically, our MPE can be used in conformational entropy evaluation. More specifically, it is found that our method incorporates in it a natural classification scheme. This is achieved through a density filtration of an MRF built from angular distributions. To further validate our model, a systematical comparison with the traditional entropy evaluation model is done. Additionally, it is found that our model is able to preserve the intrinsic topological features of biomolecular data much better than traditional approaches, particularly for resolutions in the intermediate range. Moreover, by comparing with traditional entropies from various grid sizes, bond angle-based methods and a persistent homology-based support vector machine method (Cang et al. in Mol Based Math Biol 3:140–162, 2015), we find that our MPE method gives the best results in terms of average true positive rate in a classic protein structure classification test. More interestingly, all-alpha and all-beta protein classes can be clearly separated from each other with zero error only in our model.
Finally, a special protein structure index (PSI) is proposed, for the first time, to describe the "regularity" of protein structures. Basically, a protein structure is deemed regular if it has a consistent and orderly configuration. Our PSI model is tested on a database of 110 proteins; we find that structures with larger portions of loops and intrinsically disordered regions are always associated with a larger PSI, indicating an irregular configuration, while proteins with larger portions of secondary structures, i.e., alpha-helix or beta-sheet, have a smaller PSI. Essentially, PSI can be used to describe the "regularity" of any system.
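The persistent entropy that the multiscale model builds on can be sketched directly from the cited Merelli et al. definition: normalize the bar lengths of a persistence barcode into a probability distribution and take its Shannon entropy. This minimal sketch implements that standard (non-multiscale) quantity only; the authors' resolution parameter and MRF construction are not reproduced here.

```python
import math

def persistent_entropy(barcode):
    """Shannon entropy of the normalized bar lengths of a persistence barcode.

    barcode: iterable of (birth, death) pairs; zero-length bars are ignored.
    """
    lengths = [death - birth for birth, death in barcode if death > birth]
    total = sum(lengths)
    probs = [l / total for l in lengths]
    return -sum(p * math.log(p) for p in probs)
```

For example, a barcode of two equal bars gives entropy log 2, the maximum for two features, while one dominant bar drives the entropy toward zero.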
Automatic extraction of blocks from 3D point clouds of fractured rock
NASA Astrophysics Data System (ADS)
Chen, Na; Kemeny, John; Jiang, Qinghui; Pan, Zhiwen
2017-12-01
This paper presents a new method for extracting blocks and calculating block size automatically from rock surface 3D point clouds. Block size is an important rock mass characteristic and forms the basis for several rock mass classification schemes. The proposed method consists of four steps: 1) the automatic extraction of discontinuities using an improved RANSAC Shape Detection method, 2) the calculation of discontinuity intersections based on plane geometry, 3) the extraction of block candidates based on three discontinuities intersecting one another to form corners, and 4) the identification of "true" blocks using an improved Floodfill algorithm. The calculated block sizes were compared with manual measurements in two case studies, one with fabricated cardboard blocks and the other from an actual rock mass outcrop. The results demonstrate that the proposed method is accurate and overcomes the inaccuracies, safety hazards, and biases of traditional techniques.
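Steps 2 and 3 of the pipeline above reduce to a standard geometric computation: a block corner is the intersection point of three discontinuity planes. A minimal sketch, assuming each fitted plane is stored as a unit normal n and offset d with n·x = d (the function names and representation are illustrative, not the authors'):

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def corner_from_planes(planes, eps=1e-9):
    """Intersection point of three planes n.x = d via Cramer's rule.

    planes: list of three (normal, d) pairs. Returns None when the normals
    are (nearly) coplanar, i.e., no unique corner exists.
    """
    normals = [n for n, _ in planes]
    rhs = [d for _, d in planes]
    denom = det3(normals)
    if abs(denom) < eps:
        return None
    point = []
    for col in range(3):
        m = [row[:] for row in normals]
        for i in range(3):
            m[i][col] = rhs[i]  # replace one column with the right-hand side
        point.append(det3(m) / denom)
    return point
```

In the full method, every triple of nearby discontinuities would be tested this way, and only corners consistent with the flood-fill step would survive as true block vertices.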
Factor complexity of crash occurrence: An empirical demonstration using boosted regression trees.
Chung, Yi-Shih
2013-12-01
Factor complexity is a characteristic of traffic crashes. This paper proposes a novel method, namely boosted regression trees (BRT), to investigate the complex and nonlinear relationships in high-variance traffic crash data. The Taiwanese 2004-2005 single-vehicle motorcycle crash data are used to demonstrate the utility of BRT. Traditional logistic regression and classification and regression tree (CART) models are also used to compare their estimation results and external validities. Both the in-sample cross-validation and out-of-sample validation results show that an increase in tree complexity provides improved, although declining, classification performance, indicating a limited factor complexity of single-vehicle motorcycle crashes. Crucial variables, including geographical, temporal, and sociodemographic factors, explain some fatal crashes. Relatively unique fatal crashes are better approximated by interaction terms, especially combinations of behavioral factors. BRT models generally provide better transferability than conventional logistic regression and CART models. This study also discusses the implications of the results for devising safety policies. Copyright © 2012 Elsevier Ltd. All rights reserved.
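The core of boosted regression trees is fitting a sequence of small trees, each to the negative gradient of the loss left by the ensemble so far. This toy sketch boosts depth-1 trees (stumps) under a logistic loss for a binary outcome; it is a didactic illustration of the mechanism only, not the paper's model or data.

```python
import math

def fit_stump(X, residuals):
    """Fit a one-split regression stump to residuals; returns (feature, threshold, left_val, right_val)."""
    best = None
    for j in range(len(X[0])):
        vals = sorted(set(row[j] for row in X))
        for a, b in zip(vals, vals[1:]):
            t = (a + b) / 2.0
            left = [residuals[i] for i, row in enumerate(X) if row[j] <= t]
            right = [residuals[i] for i, row in enumerate(X) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((v - lm) ** 2 for v in left)
                 + sum((v - rm) ** 2 for v in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]

def boost(X, y, rounds=50, lr=0.3):
    """Gradient boosting of stumps under logistic loss; y is 0/1."""
    F = [0.0] * len(X)
    stumps = []
    for _ in range(rounds):
        p = [1.0 / (1.0 + math.exp(-f)) for f in F]
        r = [yi - pi for yi, pi in zip(y, p)]  # negative gradient of log-loss
        j, t, lm, rm = fit_stump(X, r)
        stumps.append((j, t, lm, rm))
        F = [f + lr * (lm if row[j] <= t else rm) for f, row in zip(F, X)]
    return stumps

def predict(stumps, row, lr=0.3):
    score = sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)
    return 1 if score > 0 else 0
```

Raising tree depth beyond stumps is what lets BRT capture the interaction terms the abstract mentions; the paper's finding is that returns from added tree complexity diminish quickly for these crash data.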
Shen, Simon; Syal, Karan; Tao, Nongjian; Wang, Shaopeng
2015-12-01
We present a Single-Cell Motion Characterization System (SiCMoCS) to automatically extract bacterial cell morphological features from microscope images and use those features to automatically classify cell motion for rod-shaped motile bacterial cells. In some imaging-based studies, bacterial cells must be attached to the surface for time-lapse observation of cellular processes such as cell membrane-protein interactions and membrane elasticity. These studies often generate large volumes of images, and extracting accurate bacterial cell morphology features from them is critical for quantitative assessment. Using SiCMoCS, we demonstrated simultaneous and automated motion tracking and classification of hundreds of individual cells in an image sequence of several hundred frames. This is a significant improvement over traditional manual and semi-automated approaches to segmenting bacterial cells based on empirical thresholds, and a first attempt to automatically classify bacterial motion types for motile rod-shaped bacterial cells, which enables rapid and quantitative analysis of various types of bacterial motion.
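A basic building block for this kind of morphology extraction is segmenting a thresholded frame into individual cells and measuring each one's shape. This sketch labels 4-connected components of a binary grid by flood fill and computes a bounding-box elongation, a crude proxy for the "rod-shaped" criterion; it is a generic illustration, not the SiCMoCS pipeline.

```python
from collections import deque

def label_components(grid):
    """4-connected component labeling of a binary grid via BFS flood fill.

    Returns (count, labels) where labels[y][x] is 0 for background
    or a component id 1..count.
    """
    h, w = len(grid), len(grid[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if grid[y][x] and not labels[y][x]:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and grid[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return count, labels

def bbox_elongation(labels, lab):
    """Bounding-box aspect ratio (>= 1.0) of one labeled component."""
    pts = [(y, x) for y, row in enumerate(labels)
           for x, v in enumerate(row) if v == lab]
    ys = [p[0] for p in pts]
    xs = [p[1] for p in pts]
    dy = max(ys) - min(ys) + 1
    dx = max(xs) - min(xs) + 1
    return max(dy, dx) / min(dy, dx)
```

Tracking the centroid of each labeled component across frames, and thresholding elongation to keep only rod-like blobs, gives the kind of per-cell trajectory that motion classification can then operate on.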