Sample records for discriminant analysis classification

  1. Classification accuracy on the family planning participation status using kernel discriminant analysis

    NASA Astrophysics Data System (ADS)

    Kurniawan, Dian; Suparti; Sugito

    2018-05-01

    Population growth in Indonesia has increased every year. According to the population census conducted by the Central Bureau of Statistics (BPS) in 2010, the population of Indonesia had reached 237.6 million people. To control the population growth rate, the government therefore runs the Family Planning, or Keluarga Berencana (KB), program for couples of childbearing age. The purpose of this program is to improve the health of mothers and children and to build a prosperous society by controlling births while keeping population growth in check. The data used in this study are the updated family data of Semarang city in 2016, collected by the National Family Planning Coordinating Board (BKKBN). From these data, classifiers based on kernel discriminant analysis are obtained, along with their classification accuracy. The analysis showed that normal kernel discriminant analysis gives 71.05% classification accuracy with 28.95% classification error, whereas triweight kernel discriminant analysis gives 73.68% classification accuracy with 26.32% classification error. The triweight kernel discriminant can therefore be considered better than the normal kernel discriminant for the data on family planning participation of couples of childbearing age in Semarang City in 2016.
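
    To make the comparison of the two kernels concrete, the following is a minimal sketch of kernel discriminant analysis on a single feature: each class density is estimated with a kernel density estimate, and a point is assigned to the class with the largest prior-weighted density. The toy data, the bandwidth `h`, the class labels and all function names are assumptions for illustration and are not taken from the study.

```python
import math

def normal_kernel(u):
    # Standard Gaussian density.
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def triweight_kernel(u):
    # Triweight kernel: (35/32) * (1 - u^2)^3 on |u| <= 1, zero elsewhere.
    return (35.0 / 32.0) * (1.0 - u * u) ** 3 if abs(u) <= 1.0 else 0.0

def kde(x, sample, h, kernel):
    # Kernel density estimate at x with bandwidth h.
    return sum(kernel((x - xi) / h) for xi in sample) / (len(sample) * h)

def classify(x, classes, h, kernel):
    # Assign x to the class maximizing prior * estimated density.
    # Priors are taken as the class proportions in the training data.
    n_total = sum(len(s) for s in classes.values())
    scores = {
        label: (len(s) / n_total) * kde(x, s, h, kernel)
        for label, s in classes.items()
    }
    return max(scores, key=scores.get)

# Hypothetical one-dimensional training data for two groups.
classes = {
    "participant": [1.0, 1.2, 0.8, 1.1],
    "non-participant": [3.0, 3.2, 2.9, 3.1],
}

pred_normal = classify(1.0, classes, h=1.0, kernel=normal_kernel)
pred_triweight = classify(3.0, classes, h=1.0, kernel=triweight_kernel)
```

    Because the triweight kernel has bounded support, a point farther than `h` from every training sample gets zero density in all classes; a real implementation would need a rule for that case.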

  2. Discriminant forest classification method and system

    DOEpatents

    Chen, Barry Y.; Hanley, William G.; Lemmond, Tracy D.; Hiller, Lawrence J.; Knapp, David A.; Mugge, Marshall J.

    2012-11-06

    A hybrid machine learning methodology and system for classification that combines classical random forest (RF) methodology with discriminant analysis (DA) techniques to provide enhanced classification capability. A DA technique that uses feature measurements of an object to predict its class membership, such as linear discriminant analysis (LDA) or the Anderson-Bahadur linear discriminant technique (AB), is used to split the data at each node in each of its classification trees to train and grow the trees and the forest. When training is finished, a set of n DA-based decision trees of a discriminant forest is produced for use in predicting the classification of new samples of unknown class.
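
    As a rough illustration of a DA-based node split, the sketch below computes a two-class Fisher LDA direction for two features and uses it as the splitting rule of a single tree node. It is not the patented method: bootstrap sampling, random feature selection and tree growth are omitted, the midpoint threshold assumes equal priors, and all names and data are hypothetical.

```python
def mean2(rows):
    # Mean vector of a list of 2-feature samples.
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(2)]

def scatter2(rows, m):
    # 2x2 within-class scatter matrix around mean m.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def lda_split(class0, class1):
    # Fisher direction w = Sw^{-1} (m1 - m0); threshold at the midpoint
    # of the projected class means (equal priors assumed).
    m0, m1 = mean2(class0), mean2(class1)
    s0, s1 = scatter2(class0, m0), scatter2(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]  # must be non-zero
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    proj0 = w[0] * m0[0] + w[1] * m0[1]
    proj1 = w[0] * m1[0] + w[1] * m1[1]
    return w, 0.5 * (proj0 + proj1)

def goes_right(x, w, threshold):
    # Samples whose projection exceeds the threshold go to the right child.
    return w[0] * x[0] + w[1] * x[1] > threshold

# Hypothetical training samples reaching one node.
class0 = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
class1 = [[4.0, 4.0], [5.0, 4.0], [4.0, 5.0], [5.0, 5.0]]
w, threshold = lda_split(class0, class1)
```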

  3. Discriminative Nonlinear Analysis Operator Learning: When Cosparse Model Meets Image Classification.

    PubMed

    Wen, Zaidao; Hou, Biao; Jiao, Licheng

    2017-05-03

    Linear synthesis model based dictionary learning frameworks have achieved remarkable performance in image classification in the last decade. Behaving as a generative feature model, however, this approach suffers from some intrinsic deficiencies. In this paper, we propose a novel parametric nonlinear analysis cosparse model (NACM) with which a unique feature vector can be extracted much more efficiently. Additionally, we provide a deeper insight demonstrating that NACM is capable of simultaneously learning the task-adapted feature transformation and regularization to encode our preferences, domain prior knowledge and task-oriented supervised information into the features. The proposed NACM is devoted to the classification task as a discriminative feature model and yields a novel discriminative nonlinear analysis operator learning framework (DNAOL). The theoretical analysis and experimental results demonstrate that DNAOL not only achieves better, or at least competitive, classification accuracy compared with state-of-the-art algorithms, but can also dramatically reduce the time complexity of both the training and testing phases.

  4. Spectral Regression Discriminant Analysis for Hyperspectral Image Classification

    NASA Astrophysics Data System (ADS)

    Pan, Y.; Wu, J.; Huang, H.; Liu, J.

    2012-08-01

    Dimensionality reduction algorithms, which aim to select a small set of efficient and discriminant features, have attracted great attention for Hyperspectral Image Classification. Manifold learning methods such as Locally Linear Embedding, Isomap, and Laplacian Eigenmap are popular for dimensionality reduction. However, a disadvantage of many manifold learning methods is that their computations usually involve eigen-decomposition of dense matrices, which is expensive in both time and memory. In this paper, we introduce a new dimensionality reduction method, called Spectral Regression Discriminant Analysis (SRDA). SRDA casts the problem of learning an embedding function into a regression framework, which avoids eigen-decomposition of dense matrices. Also, with the regression-based framework, different kinds of regularizers can be naturally incorporated into our algorithm, which makes it more flexible. It can make efficient use of data points to discover the intrinsic discriminant structure in the data. Experimental results on the Washington DC Mall and AVIRIS Indian Pines hyperspectral data sets demonstrate the effectiveness of the proposed method.

  5. Application of Linear Discriminant Analysis in Dimensionality Reduction for Hand Motion Classification

    NASA Astrophysics Data System (ADS)

    Phinyomark, A.; Hu, H.; Phukpattaranont, P.; Limsakul, C.

    2012-01-01

    The classification of upper-limb movements based on surface electromyography (EMG) signals is an important issue in the control of assistive devices and rehabilitation systems. Increasing the number of EMG channels and features in order to increase the number of control commands can yield a high-dimensional feature vector. To cope with the accuracy and computation problems associated with high dimensionality, it is commonplace to apply a processing step that transforms the data to a space of significantly lower dimension with only a limited loss of useful information. Linear discriminant analysis (LDA) has been successfully applied as an EMG feature projection method. Recently, a number of extended LDA-based algorithms have been proposed, which are more competitive than classical LDA in terms of both classification accuracy and computational cost and time. This paper presents the findings of a comparative study of classical LDA and five extended LDA methods. From a quantitative comparison based on seven multi-feature sets, three extended LDA-based algorithms, consisting of uncorrelated LDA, orthogonal LDA and orthogonal fuzzy neighborhood discriminant analysis, produce better class separability when compared with a baseline system (without feature projection), principal component analysis (PCA), and classical LDA. Based on 7-dimensional time-domain and time-scale feature vectors, these methods achieved 95.2% and 93.2% classification accuracy, respectively, using a linear discriminant classifier.

  6. Chance-corrected classification for use in discriminant analysis: Ecological applications

    USGS Publications Warehouse

    Titus, K.; Mosher, J.A.; Williams, B.K.

    1984-01-01

    A method for evaluating the classification table from a discriminant analysis is described. The statistic, kappa, is useful to ecologists in that it removes the effects of chance. It is useful even with equal group sample sizes although the need for a chance-corrected measure of prediction becomes greater with more dissimilar group sample sizes. Examples are presented.
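
    A short sketch of how a chance-corrected statistic can be computed from a classification table, assuming the statistic is Cohen's kappa in its usual form (observed agreement corrected by the agreement expected from the row and column totals). The table values below are invented for illustration.

```python
def cohen_kappa(table):
    # table[i][j] = number of samples from actual group i
    # classified into predicted group j.
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical two-group classification table (70% correct overall).
table = [[40, 10],
         [20, 30]]
kappa = cohen_kappa(table)
```

    Here 70% of samples are classified correctly, but half of that agreement is expected by chance from the marginals, so kappa is well below the raw rate.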

  7. Stability and bias of classification rates in biological applications of discriminant analysis

    USGS Publications Warehouse

    Williams, B.K.; Titus, K.; Hines, J.E.

    1990-01-01

    We assessed the sampling stability of classification rates in discriminant analysis by using a factorial design with factors for multivariate dimensionality, dispersion structure, configuration of group means, and sample size. A total of 32,400 discriminant analyses were conducted, based on data from simulated populations with appropriate underlying statistical distributions. Simulation results indicated strong bias in correct classification rates when group sample sizes were small and when overlap among groups was high. We also found that stability of the correct classification rates was influenced by these factors, indicating that the number of samples required for a given level of precision increases with the amount of overlap among groups. In a review of 60 published studies, we found that 57% of the articles presented results on classification rates, though few of them mentioned potential biases in their results. Wildlife researchers should choose the total number of samples per group to be at least 2 times the number of variables to be measured when overlap among groups is low. Substantially more samples are required as the overlap among groups increases.

  8. Local classification: Locally weighted-partial least squares-discriminant analysis (LW-PLS-DA).

    PubMed

    Bevilacqua, Marta; Marini, Federico

    2014-08-01

    The possibility of devising a simple, flexible and accurate non-linear classification method, by extending the locally weighted partial least squares (LW-PLS) approach to the cases where the algorithm is used in a discriminant way (partial least squares discriminant analysis, PLS-DA), is presented. In particular, to assess which category an unknown sample belongs to, the proposed algorithm operates by identifying which training objects are most similar to the one to be predicted and building a PLS-DA model using these calibration samples only. Moreover, the influence of the selected training samples on the local model can be further modulated by adopting a non-uniform distance-based weighting scheme, which allows the farthest calibration objects to have less impact than the closest ones. The performance of the proposed locally weighted-partial least squares-discriminant analysis (LW-PLS-DA) algorithm has been tested on three simulated data sets characterized by a varying degree of non-linearity: in all cases, a classification accuracy higher than 99% on external validation samples was achieved. Moreover, when also applied to a real data set (classification of rice varieties), characterized by a high extent of non-linearity, the proposed method provided an average correct classification rate of about 93% on the test set. According to the preliminary results shown in this paper, the performance of the proposed LW-PLS-DA approach has proved to be comparable to, and in some cases better than, that obtained by other non-linear methods (k nearest neighbors, kernel-PLS-DA and, in the case of rice, counterpropagation neural networks). Copyright © 2014 Elsevier B.V. All rights reserved.

  9. Classification of electroencephalograph signals using time-frequency decomposition and linear discriminant analysis

    NASA Astrophysics Data System (ADS)

    Szuflitowska, B.; Orlowski, P.

    2017-08-01

    An automated detection system consists of two key steps: extraction of features from EEG signals and classification for detection of pathological activity. The EEG sequences were analyzed using the Short-Time Fourier Transform, and classification was performed using Linear Discriminant Analysis. The accuracy of the technique was tested on three sets of EEG signals: epilepsy, healthy and Alzheimer's Disease. A classification error below 10% was considered a success. Higher accuracy was obtained for new data of unknown classes than for the testing data. The methodology can be helpful in differentiating epileptic seizures from disturbances in the EEG signal in Alzheimer's Disease.

  10. Liquid contrabands classification based on energy dispersive X-ray diffraction and hybrid discriminant analysis

    NASA Astrophysics Data System (ADS)

    YangDai, Tianyi; Zhang, Li

    2016-02-01

    Energy dispersive X-ray diffraction (EDXRD) combined with hybrid discriminant analysis (HDA) has been utilized for classifying the liquid materials for the first time. The XRD spectra of 37 kinds of liquid contrabands and daily supplies were obtained using an EDXRD test bed facility. The unique spectra of different samples reveal XRD's capability to distinguish liquid contrabands from daily supplies. In order to create a system to detect liquid contrabands, the diffraction spectra were subjected to HDA which is the combination of principal components analysis (PCA) and linear discriminant analysis (LDA). Experiments based on the leave-one-out method demonstrate that HDA is a practical method with higher classification accuracy and lower noise sensitivity than the other methods in this application. The study shows the great capability and potential of the combination of XRD and HDA for liquid contrabands classification.
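
    The leave-one-out evaluation mentioned above can be sketched generically: each sample is held out once, a model is fitted on the rest, and the held-out sample is predicted. The classifier below is a placeholder nearest-mean rule on one feature, not HDA (PCA followed by LDA); the data and function names are assumptions for illustration only.

```python
def leave_one_out_accuracy(samples, labels, fit, predict):
    # Hold out each sample in turn, train on the remainder,
    # and count how often the held-out sample is predicted correctly.
    hits = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        if predict(model, samples[i]) == labels[i]:
            hits += 1
    return hits / len(samples)

def fit_nearest_mean(xs, ys):
    # Placeholder classifier: store the mean of each class.
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {y: sum(v) / len(v) for y, v in groups.items()}

def predict_nearest_mean(model, x):
    # Predict the class whose mean is closest to x.
    return min(model, key=lambda y: abs(x - model[y]))

# Hypothetical one-feature data for two classes.
samples = [1.0, 1.2, 0.9, 3.0, 3.1, 2.8]
labels = ["a", "a", "a", "b", "b", "b"]
accuracy = leave_one_out_accuracy(
    samples, labels, fit_nearest_mean, predict_nearest_mean
)
```

    Passing `fit` and `predict` as arguments keeps the validation loop independent of the classifier, so a PCA-plus-LDA pipeline could be substituted without changing it.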

  11. Real-Time Classification of Exercise Exertion Levels Using Discriminant Analysis of HRV Data.

    PubMed

    Jeong, In Cheol; Finkelstein, Joseph

    2015-01-01

    Heart rate variability (HRV) has been shown to reflect activation of the sympathetic nervous system; however, it is not clear which set of HRV parameters is optimal for real-time classification of exercise exertion levels. No studies have compared the potential of two types of HRV parameters (time-domain and frequency-domain) in predicting exercise exertion level using discriminant analysis. The main goal of this study was to compare the potential of HRV time-domain parameters versus HRV frequency-domain parameters in classifying exercise exertion level. Rest, exercise, and recovery categories were used in classification models. An overall classification agreement of 79.5% for the time-domain parameters, compared with 52.8% for the frequency-domain parameters, demonstrated that the time-domain parameters had higher potential in classifying exercise exertion levels.

  12. Classification and prediction of pilot weather encounters: A discriminant function analysis.

    PubMed

    O'Hare, David; Hunter, David R; Martinussen, Monica; Wiggins, Mark

    2011-05-01

    Flight into adverse weather continues to be a significant hazard for General Aviation (GA) pilots. Weather-related crashes have a significantly higher fatality rate than other GA crashes. Previous research has identified lack of situational awareness, risk perception, and risk tolerance as possible explanations for why pilots would continue into adverse weather. However, very little is known about the nature of these encounters or the differences between pilots who avoid adverse weather and those who do not. Visitors to a web site described an experience with adverse weather and completed a range of measures of personal characteristics. The resulting data from 364 pilots were carefully screened and subject to a discriminant function analysis. Two significant functions were found. The first, accounting for 69% of the variance, reflected measures of risk awareness and pilot judgment while the second differentiated pilots in terms of their experience levels. The variables measured in this study enabled us to correctly discriminate between the three groups of pilots considerably better (53% correct classifications) than would have been possible by chance (33% correct classifications). The implications of these findings for targeting safety interventions are discussed.
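
    The comparison with chance can be put on a common scale. The sketch below assumes that chance for k equally likely groups is 1/k (consistent with the 33% quoted for three groups) and expresses the observed rate as the fraction of the possible improvement over chance that was achieved. This rescaling is an illustration and is not a statistic reported in the abstract.

```python
def improvement_over_chance(accuracy, n_groups):
    # Chance rate under equal group probabilities (an assumption).
    chance = 1.0 / n_groups
    # Fraction of the gap between chance and perfect accuracy that is closed.
    return (accuracy - chance) / (1.0 - chance)

# 53% correct classifications across three groups of pilots.
gain = improvement_over_chance(0.53, 3)
```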

  13. Study on bayes discriminant analysis of EEG data.

    PubMed

    Shi, Yuan; He, DanDan; Qin, Fang

    2014-01-01

    In this paper, we apply Bayes discriminant analysis to objectively recorded EEG data of experimental subjects in order to arrive at a relatively accurate method for feature extraction and classification decisions. According to the strength of the α wave, the head electrodes are divided into four classes. Using part of the 21-electrode EEG data of 63 people, we apply Bayes discriminant analysis to the EEG data of six subjects. Results: using part of the EEG data of 63 people, Bayes discriminant analysis gives an electrode classification accuracy rate of 64.4%. Bayes discriminant analysis has higher prediction accuracy, and EEG features (mainly the α wave) are extracted more accurately. Bayes discriminant analysis is therefore well suited to the feature extraction and classification decisions of EEG data.

  14. The NWRA Classification Infrastructure: description and extension to the Discriminant Analysis Flare Forecasting System (DAFFS)

    NASA Astrophysics Data System (ADS)

    Leka, K. D.; Barnes, Graham; Wagner, Eric

    2018-04-01

    A classification infrastructure built upon Discriminant Analysis (DA) has been developed at NorthWest Research Associates for examining the statistical differences between samples of two known populations. Originating to examine the physical differences between flare-quiet and flare-imminent solar active regions, we describe herein some details of the infrastructure including: parametrization of large datasets, schemes for handling "null" and "bad" data in multi-parameter analysis, application of non-parametric multi-dimensional DA, an extension through Bayes' theorem to probabilistic classification, and methods invoked for evaluating classifier success. The classifier infrastructure is applicable to a wide range of scientific questions in solar physics. We demonstrate its application to the question of distinguishing flare-imminent from flare-quiet solar active regions, updating results from the original publications that were based on different data and much smaller sample sizes. Finally, as a demonstration of "Research to Operations" efforts in the space-weather forecasting context, we present the Discriminant Analysis Flare Forecasting System (DAFFS), a near-real-time operationally-running solar flare forecasting tool that was developed from the research-directed infrastructure.

  15. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression.

    PubMed

    Shayan, Zahra; Mohammad Gholi Mezerji, Naser; Shayan, Leila; Naseri, Parisa

    2015-11-03

    Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, the LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.

  16. Discriminant analysis for fast multiclass data classification through regularized kernel function approximation.

    PubMed

    Ghorai, Santanu; Mukherjee, Anirban; Dutta, Pranab K

    2010-06-01

    In this brief we propose multiclass data classification by computationally inexpensive discriminant analysis through vector-valued regularized kernel function approximation (VVRKFA). VVRKFA, being an extension of fast regularized kernel function approximation (FRKFA), provides the vector-valued response in a single step. The VVRKFA finds a linear operator and a bias vector by using a reduced kernel that maps a pattern from feature space into the low-dimensional label space. The classification of patterns is carried out in this low-dimensional label subspace. A test pattern is classified depending on its proximity to class centroids. The effectiveness of the proposed method is experimentally verified and compared with multiclass support vector machine (SVM) on several benchmark data sets as well as on gene microarray data for multi-category cancer classification. The results indicate a significant improvement in both training and testing time compared to that of multiclass SVM, with comparable testing accuracy, principally in large data sets. Experiments in this brief also serve as a comparison of the performance of VVRKFA with stratified random sampling and sub-sampling.
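
    The final step described above, classifying a test pattern by its proximity to class centroids, can be sketched as follows. The points are assumed to be already mapped into the label space (the kernel mapping itself is not shown), and Euclidean distance is an assumption made here for simplicity; the abstract does not say which distance is used.

```python
import math

def class_centroids(points, labels):
    # Average the mapped training points of each class.
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for k, v in enumerate(p):
            acc[k] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def nearest_centroid(p, centroids):
    # Assign p to the class whose centroid is closest (Euclidean distance).
    return min(centroids, key=lambda y: math.dist(p, centroids[y]))

# Hypothetical training patterns already mapped to a 2-D label space.
points = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
labels = ["a", "a", "b", "b"]
centroids = class_centroids(points, labels)
prediction = nearest_centroid([1.0, 1.0], centroids)
```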

  17. Classification of debtor credit status and determination amount of credit risk by using linier discriminant function

    NASA Astrophysics Data System (ADS)

    Aidi, Muhammad Nur; Sari, Resty Indah

    2012-05-01

    A credit decision made by a bank or another creditor always carries a risk, called credit risk. Credit risk is an investor's risk of loss arising from a borrower who does not make payments as promised. Substantial credit risk can lead to losses for both the bank and the debtor. To minimize this problem, further study is needed to identify a potential new customer before a decision is made. Debtors can be identified using various analytical approaches, one of which is discriminant analysis. Discriminant analysis is used in this study to classify whether a debtor belongs to the good-credit or the bad-credit group. Before the discriminant function is built, explanatory variables should be selected. The purpose of selecting independent variables is to choose the variables that discriminate between the groups maximally. Variable selection in this study uses different tests: the chi-square test of proportions for categorical variables and stepwise discriminant analysis for numeric variables. The result of this study is two discriminant functions that can identify a new debtor. The selected variables that discriminate the two groups of debtors maximally are status of existing checking account, credit history, credit amount, installment rate as a percentage of disposable income, sex, age in years, other installment plans, and number of people being liable to provide maintenance. The resulting classification accuracy rate is good enough, at 74.70%. Debtor classification using discriminant analysis has a sufficiently small risk level, ranging between 14.992% and 17.608%. Based on that credit risk rate, discriminant analysis can be used effectively to classify credit status.

  18. Multi-level discriminative dictionary learning with application to large scale image classification.

    PubMed

    Shen, Li; Sun, Gang; Huang, Qingming; Wang, Shuhui; Lin, Zhouchen; Wu, Enhua

    2015-10-01

    The sparse coding technique has shown flexibility and capability in image representation and analysis. It is a powerful tool in many visual applications. Some recent work has shown that incorporating the properties of task (such as discrimination for classification task) into dictionary learning is effective for improving the accuracy. However, the traditional supervised dictionary learning methods suffer from high computation complexity when dealing with large number of categories, making them less satisfactory in large scale applications. In this paper, we propose a novel multi-level discriminative dictionary learning method and apply it to large scale image classification. Our method takes advantage of hierarchical category correlation to encode multi-level discriminative information. Each internal node of the category hierarchy is associated with a discriminative dictionary and a classification model. The dictionaries at different layers are learnt to capture the information of different scales. Moreover, each node at lower layers also inherits the dictionary of its parent, so that the categories at lower layers can be described with multi-scale information. The learning of dictionaries and associated classification models is jointly conducted by minimizing an overall tree loss. The experimental results on challenging data sets demonstrate that our approach achieves excellent accuracy and competitive computation cost compared with other sparse coding methods for large scale image classification.

  19. Discriminant analysis of cardiovascular and respiratory variables for classification of road cyclists by specialty.

    PubMed

    Nikolić, Biljana; Martinović, Jelena; Matić, Milan; Stefanović, Đorđe

    2018-05-29

    Different variables determine the performance of cyclists, which raises the question of how these parameters may help in their classification by specialty. The aim of the study was to determine differences in cardiorespiratory parameters of male cyclists according to their specialty, flat rider (N=21), hill rider (N=35) and sprinter (N=20), and to obtain a multivariate model for further classification of cyclists by specialty, based on selected variables. Seventeen variables were measured at submaximal and maximum load on the cycle ergometer Cosmed E 400HK (Cosmed, Rome, Italy) (initial 100W with 25W increase, 90-100 rpm). Multivariate discriminant analysis was used to determine which variables group cyclists within their specialty, and to predict which variables can direct cyclists to a particular specialty. Among nine variables that statistically contribute to the discriminant power of the model, achieved power on the anaerobic threshold and the produced CO2 had the biggest impact. The obtained discriminatory model correctly classified 91.43% of flat riders and 85.71% of hill riders, while sprinters were classified completely correctly (100%); i.e., 92.10% of examinees were correctly classified, which points to the strength of the discriminatory model. Respiratory indicators contribute most to the discriminant power of the model, which may significantly contribute to training practice and laboratory tests in the future.

  20. Polarimetric SAR image classification based on discriminative dictionary learning model

    NASA Astrophysics Data System (ADS)

    Sang, Cheng Wei; Sun, Hong

    2018-03-01

    Polarimetric SAR (PolSAR) image classification is one of the important applications of PolSAR remote sensing. It is a difficult high-dimensional nonlinear mapping problem, and sparse representations based on learning an overcomplete dictionary have shown great potential for solving it. The overcomplete dictionary plays an important role in PolSAR image classification; however, for complex PolSAR scenes, features shared by different classes weaken the discrimination of the learned dictionary and thus degrade classification performance. In this paper, we propose a novel overcomplete dictionary learning model to enhance the discrimination of the dictionary. The overcomplete dictionary learned by the proposed model is more discriminative and well suited to PolSAR classification.

  1. Linear Discriminant Analysis Achieves High Classification Accuracy for the BOLD fMRI Response to Naturalistic Movie Stimuli

    PubMed Central

    Mandelkow, Hendrik; de Zwart, Jacco A.; Duyn, Jeff H.

    2016-01-01

    Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI). However, conventional fMRI analysis based on statistical parametric mapping (SPM) and the general linear model (GLM) is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA), have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past, this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms known as Nearest Neighbor (NN), Gaussian Naïve Bayes (GNB), and (regularized) Linear Discriminant Analysis (LDA) in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie. Results show that LDA regularized by principal component analysis (PCA) achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2 s apart during a 300 s movie (chance level 0.7% = 2 s/300 s). The largest source of classification errors were autocorrelations in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. In light of these

  2. Comparing Linear Discriminant Function with Logistic Regression for the Two-Group Classification Problem.

    ERIC Educational Resources Information Center

    Fan, Xitao; Wang, Lin

    The Monte Carlo study compared the performance of predictive discriminant analysis (PDA) and that of logistic regression (LR) for the two-group classification problem. Prior probabilities were used for classification, but the cost of misclassification was assumed to be equal. The study used a fully crossed three-factor experimental design (with…

  3. Fast-HPLC Fingerprinting to Discriminate Olive Oil from Other Edible Vegetable Oils by Multivariate Classification Methods.

    PubMed

    Jiménez-Carvelo, Ana M; González-Casado, Antonio; Pérez-Castaño, Estefanía; Cuadros-Rodríguez, Luis

    2017-03-01

    A new analytical method for the differentiation of olive oil from other vegetable oils using reversed-phase LC and applying chemometric techniques was developed. A 3 cm short column was used to obtain the chromatographic fingerprint of the methyl-transesterified fraction of each vegetable oil. The chromatographic analysis took only 4 min. The multivariate classification methods used were k-nearest neighbors, partial least-squares (PLS) discriminant analysis, one-class PLS, support vector machine classification, and soft independent modeling of class analogies. The discrimination of olive oil from other vegetable edible oils was evaluated by several classification quality metrics. Several strategies for the classification of the olive oil were used: one input-class, two input-class, and pseudo two input-class.

  4. Influence of variable selection on partial least squares discriminant analysis models for explosive residue classification

    NASA Astrophysics Data System (ADS)

    De Lucia, Frank C., Jr.; Gottfried, Jennifer L.

    2011-02-01

    Using a series of thirteen organic materials that includes novel high-nitrogen energetic materials, conventional organic military explosives, and benign organic materials, we have demonstrated the importance of variable selection for maximizing residue discrimination with partial least squares discriminant analysis (PLS-DA). We built several PLS-DA models using different variable sets based on laser induced breakdown spectroscopy (LIBS) spectra of the organic residues on an aluminum substrate under an argon atmosphere. The model classification results for each sample are presented and the influence of the variables on these results is discussed. We found that using the whole spectra as the data input for the PLS-DA model gave the best results. However, variables due to the surrounding atmosphere and the substrate contribute to discrimination when the whole spectra are used, indicating this may not be the most robust model. Further iterative testing with additional validation data sets is necessary to determine the most robust model.

  5. Discriminative Bayesian Dictionary Learning for Classification.

    PubMed

    Akhtar, Naveed; Shafait, Faisal; Mian, Ajmal

    2016-12-01

    We propose a Bayesian approach to learn discriminative dictionaries for sparse representation of data. The proposed approach infers probability distributions over the atoms of a discriminative dictionary using a finite approximation of the Beta Process. It also computes sets of Bernoulli distributions that associate class labels to the learned dictionary atoms. This association signifies the selection probabilities of the dictionary atoms in the expansion of class-specific data. Furthermore, the non-parametric character of the proposed approach allows it to infer the correct size of the dictionary. We exploit the aforementioned Bernoulli distributions in separately learning a linear classifier. The classifier uses the same hierarchical Bayesian model as the dictionary, which we present along with the analytical inference solution for Gibbs sampling. For classification, a test instance is first sparsely encoded over the learned dictionary and the codes are fed to the classifier. We performed experiments for face and action recognition, and for object and scene-category classification, using five public datasets and compared the results with state-of-the-art discriminative sparse representation approaches. Experiments show that the proposed Bayesian approach consistently outperforms the existing approaches.

  6. Spatial-temporal discriminant analysis for ERP-based brain-computer interface.

    PubMed

    Zhang, Yu; Zhou, Guoxu; Zhao, Qibin; Jin, Jing; Wang, Xingyu; Cichocki, Andrzej

    2013-03-01

    Linear discriminant analysis (LDA) has been widely adopted to classify event-related potentials (ERPs) in brain-computer interfaces (BCIs). Good classification performance of an ERP-based BCI usually requires sufficient data recordings for effective training of the LDA classifier, and hence a long system calibration time, which may reduce the practicality of the system and cause user resistance to it. In this study, we introduce spatial-temporal discriminant analysis (STDA) to ERP classification. As a multiway extension of LDA, the STDA method tries to maximize the discriminant information between target and nontarget classes by finding two projection matrices from the spatial and temporal dimensions collaboratively, which effectively reduces the feature dimensionality in the discriminant analysis and hence significantly decreases the number of required training samples. The proposed STDA method was validated with dataset II of the BCI Competition III and a dataset recorded from our own experiments, and compared to the state-of-the-art algorithms for ERP classification. Online experiments were additionally implemented for the validation. The superior classification performance when using few training samples shows that STDA is effective in reducing the system calibration time and improving the classification accuracy, thereby enhancing the practicability of ERP-based BCI.

  7. Discriminative clustering on manifold for adaptive transductive classification.

    PubMed

    Zhang, Zhao; Jia, Lei; Zhang, Min; Li, Bing; Zhang, Li; Li, Fanzhang

    2017-10-01

    In this paper, we mainly propose a novel adaptive transductive label propagation approach by joint discriminative clustering on manifolds for representing and classifying high-dimensional data. Our framework seamlessly combines the unsupervised manifold learning, discriminative clustering and adaptive classification into a unified model. Also, our method incorporates the adaptive graph weight construction with label propagation. Specifically, our method is capable of propagating label information using adaptive weights over low-dimensional manifold features, which is different from most existing studies that usually predict the labels and construct the weights in the original Euclidean space. For transductive classification by our formulation, we first perform the joint discriminative K-means clustering and manifold learning to capture the low-dimensional nonlinear manifolds. Then, we construct the adaptive weights over the learnt manifold features, where the adaptive weights are calculated through performing the joint minimization of the reconstruction errors over features and soft labels so that the graph weights can be joint-optimal for data representation and classification. Using the adaptive weights, we can easily estimate the unknown labels of samples. After that, our method returns the updated weights for further updating the manifold features. Extensive simulations on image classification and segmentation show that our proposed algorithm can deliver the state-of-the-art performance on several public datasets. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Independent components analysis to increase efficiency of discriminant analysis methods (FDA and LDA): Application to NMR fingerprinting of wine.

    PubMed

    Monakhova, Yulia B; Godelmann, Rolf; Kuballa, Thomas; Mushtakova, Svetlana P; Rutledge, Douglas N

    2015-08-15

    Discriminant analysis (DA) methods, such as linear discriminant analysis (LDA) or factorial discriminant analysis (FDA), are well-known chemometric approaches for solving classification problems in chemistry. In most applications, principal component analysis (PCA) is used as the first step to generate orthogonal eigenvectors and the corresponding sample scores are utilized to generate discriminant features for the discrimination. Independent components analysis (ICA) based on the minimization of mutual information can be used as an alternative to PCA as a preprocessing tool for LDA and FDA classification. To illustrate the performance of this ICA/DA methodology, four representative nuclear magnetic resonance (NMR) data sets of wine samples were used. The classification was performed regarding grape variety, year of vintage and geographical origin. The average increase for ICA/DA in comparison with PCA/DA in the percentage of correct classification varied between 6±1% and 8±2%. The maximum increase in classification efficiency of 11±2% was observed for discrimination of the year of vintage (ICA/FDA) and geographical origin (ICA/LDA). The procedure to determine the number of extracted features (PCs, ICs) for the optimum DA models was discussed. The use of independent components (ICs) instead of principal components (PCs) resulted in improved classification performance of DA methods. The ICA/LDA method is preferable to ICA/FDA for recognition tasks based on NMR spectroscopic measurements. Copyright © 2015 Elsevier B.V. All rights reserved.

  9. Nearest clusters based partial least squares discriminant analysis for the classification of spectral data.

    PubMed

    Song, Weiran; Wang, Hui; Maguire, Paul; Nibouche, Omar

    2018-06-07

    Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time. Copyright © 2018 Elsevier B.V. All rights reserved.
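
    The query-time step described above (keep only the clusters whose centres lie nearest to the query, then fit PLS-DA on them) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the clusters have already been produced by some hierarchical clustering, the names `centroid` and `nearest_cluster_samples` and the parameter `k` are hypothetical, and the local PLS-DA fit itself is left out.

```python
import math


def centroid(points):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def nearest_cluster_samples(clusters, query, k=2):
    """Pool the samples of the k clusters whose centres are closest to `query`.

    `clusters` is a list of dicts with keys "X" (feature vectors) and "y"
    (class labels). The cluster assignment is assumed to be given; how many
    clusters to keep (k) is an assumption of this sketch.
    """
    ranked = sorted(clusters, key=lambda c: math.dist(centroid(c["X"]), query))
    X_local, y_local = [], []
    for cluster in ranked[:k]:
        X_local.extend(cluster["X"])
        y_local.extend(cluster["y"])
    return X_local, y_local


# Toy data: three clusters in 2-D; the middle one is far from the query.
clusters = [
    {"X": [[0.0, 0.0], [1.0, 1.0]], "y": [0, 0]},
    {"X": [[10.0, 10.0], [11.0, 11.0]], "y": [1, 1]},
    {"X": [[2.0, 0.0], [2.0, 2.0]], "y": [1, 1]},
]
X_local, y_local = nearest_cluster_samples(clusters, query=[0.0, 0.0], k=2)
print(X_local, y_local)
```

    The pooled `X_local`, `y_local` would then be the training set for a local PLS-DA model that classifies the query point.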

  10. MATRIX DISCRIMINANT ANALYSIS WITH APPLICATION TO COLORIMETRIC SENSOR ARRAY DATA

    PubMed Central

    Suslick, Kenneth S.

    2014-01-01

    With the rapid development of nano-technology, a “colorimetric sensor array” (CSA) which is referred to as an optical electronic nose has been developed for the identification of toxicants. Unlike traditional sensors which rely on a single chemical interaction, CSA can measure multiple chemical interactions by using chemo-responsive dyes. The color changes of the chemo-responsive dyes are recorded before and after exposure to toxicants and serve as a template for classification. The color changes are digitalized in the form of a matrix with rows representing dye effects and columns representing the spectrum of colors. Thus, matrix-classification methods are highly desirable. In this article, we develop a novel classification method, matrix discriminant analysis (MDA), which is a generalization of linear discriminant analysis (LDA) for the data in matrix form. By incorporating the intrinsic matrix-structure of the data in discriminant analysis, the proposed method can improve CSA’s sensitivity and more importantly, specificity. A penalized MDA method, PMDA, is also introduced to further incorporate sparsity structure in discriminant function. Numerical studies suggest that the proposed MDA and PMDA methods outperform LDA and other competing discriminant methods for matrix predictors. The asymptotic consistency of MDA is also established. R code and data are available online as supplementary material. PMID:26783371

  11. Classification of passive auditory event-related potentials using discriminant analysis and self-organizing feature maps.

    PubMed

    Schönweiler, R; Wübbelt, P; Tolloczko, R; Rose, C; Ptok, M

    2000-01-01

    Discriminant analysis (DA) and self-organizing feature maps (SOFM) were used to classify passively evoked auditory event-related potentials (ERP) P(1), N(1), P(2) and N(2). Responses from 16 children with severe behavioral auditory perception deficits, 16 children with marked behavioral auditory perception deficits, and 14 controls were examined. Eighteen ERP amplitude parameters were selected for examination of statistical differences between the groups. Different DA methods and SOFM configurations were trained to the values. SOFM had better classification results than DA methods. Subsequently, measures on another 37 subjects that were unknown for the trained SOFM were used to test the reliability of the system. With 10-dimensional vectors, reliable classifications were obtained that matched behavioral auditory perception deficits in 96%, implying central auditory processing disorder (CAPD). The results also support the assumption that CAPD includes a 'non-peripheral' auditory processing deficit. Copyright 2000 S. Karger AG, Basel.

  12. Analysis and classification of optical tomographic images of rheumatoid fingers with ANOVA and discriminate analysis

    NASA Astrophysics Data System (ADS)

    Montejo, Ludguier D.; Kim, Hyun K.; Häme, Yrjö; Jia, Jingfei; Montejo, Julio D.; Netz, Uwe J.; Blaschke, Sabine; Zwaka, Paul; Müeller, Gerhard A.; Beuthan, Jürgen; Hielscher, Andreas H.

    2011-03-01

    We present a study on the effectiveness of computer-aided diagnosis (CAD) of rheumatoid arthritis (RA) from frequency-domain diffuse optical tomographic (FDOT) images. FDOT is used to obtain the distribution of tissue optical properties. Subsequently, the non-parametric Kruskal-Wallis ANOVA test is employed to verify statistically significant differences between the optical parameters of patients affected by RA and healthy volunteers. Furthermore, quadratic discriminant analysis (QDA) of the absorption (μa) and scattering (μ's) distributions is used to classify subjects as affected or not affected by RA. We evaluate the classification efficiency by determining the sensitivity (Se), specificity (Sp), and the Youden index (Y). We find that combining features extracted from μa and μ's images allows for more accurate classification than when μa or μ's features are considered on their own. Combining μa and μ's features yields values of up to Y = 0.75 (Se = 0.84 and Sp = 0.91). The best results when μa or μ's features are considered individually are Y = 0.65 (Se = 0.85 and Sp = 0.80) and Y = 0.70 (Se = 0.80 and Sp = 0.90), respectively.
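
    The Youden index quoted in this abstract follows the standard definition Y = Se + Sp - 1. The short sketch below recomputes it for the three reported (Se, Sp) pairs; the function name `youden_index` and the dictionary labels are illustrative and do not come from the paper.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: Y = Se + Sp - 1."""
    return sensitivity + specificity - 1.0


# (Se, Sp) pairs as reported in the abstract.
reported = {
    "combined features": (0.84, 0.91),
    "first single-feature result": (0.85, 0.80),
    "second single-feature result": (0.80, 0.90),
}

for label, (se, sp) in reported.items():
    print(f"{label}: Y = {youden_index(se, sp):.2f}")
```

    The recomputed values agree with the reported Y = 0.75, 0.65, and 0.70.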

  13. Gene features selection for three-class disease classification via multiple orthogonal partial least square discriminant analysis and S-plot using microarray data.

    PubMed

    Yang, Mingxing; Li, Xiumin; Li, Zhibin; Ou, Zhimin; Liu, Ming; Liu, Suhuan; Li, Xuejun; Yang, Shuyu

    2013-01-01

    DNA microarray analysis is characterized by obtaining a large number of gene variables from a small number of observations. Cluster analysis is widely used to analyze DNA microarray data to make classification and diagnosis of disease. Because there are so many irrelevant and insignificant genes in a dataset, a feature selection approach must be employed in data analysis. The performance of cluster analysis of this high-throughput data depends on whether the feature selection approach chooses the most relevant genes associated with disease classes. Here we proposed a new method using multiple Orthogonal Partial Least Squares-Discriminant Analysis (mOPLS-DA) models and S-plots to select the most relevant genes to conduct three-class disease classification and prediction. We tested our method using Golub's leukemia microarray data. For three classes with subtypes, we proposed hierarchical orthogonal partial least squares-discriminant analysis (OPLS-DA) models and S-plots to select features for two main classes and their subtypes. For three classes in parallel, we employed three OPLS-DA models and S-plots to choose marker genes for each class. The power of feature selection to classify and predict three-class disease was evaluated using cluster analysis. Further, the general performance of our method was tested using four public datasets and compared with those of four other feature selection methods. The results revealed that our method effectively selected the most relevant features for disease classification and prediction, and its performance was better than that of the other methods.

  14. The Effect of Unequal Samples, Heterogeneity of Covariance Matrices, and Number of Variables on Discriminant Analysis Classification Tables and Related Statistics.

    ERIC Educational Resources Information Center

    Spearing, Debra; Woehlke, Paula

    To assess the effect on discriminant analysis in terms of correct classification into two groups, the following parameters were systematically altered using Monte Carlo techniques: sample sizes; proportions of one group to the other; number of independent variables; and covariance matrices. The pairing of the off diagonals (or covariances) with…

  15. Ranking procedure for partial discriminant analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beckman, R.J.; Johnson, M.E.

    1981-09-01

    A rank procedure developed by Broffitt, Randles, and Hogg (1976) is modified to control the conditional probability of misclassification given that classification has been attempted. This modification leads to a useful solution to the two-population partial discriminant analysis problem for even moderately sized training sets.

  16. Discrimination Enhancement with Transient Feature Analysis of a Graphene Chemical Sensor.

    PubMed

    Nallon, Eric C; Schnee, Vincent P; Bright, Collin J; Polcha, Michael P; Li, Qiliang

    2016-01-19

    A graphene chemical sensor is subjected to a set of structurally and chemically similar hydrocarbon compounds consisting of toluene, o-xylene, p-xylene, and mesitylene. The fractional change in resistance of the sensor upon exposure to these compounds exhibits a similar response magnitude among compounds, whereas large variation is observed within repetitions for each compound, causing a response overlap. Therefore, traditional features depending on maximum response change will cause confusion during further discrimination and classification analysis. More robust features that are less sensitive to concentration, sampling, and drift variability would provide higher quality information. In this work, we have explored the advantage of using transient-based exponential fitting coefficients to enhance the discrimination of similar compounds. The advantage of such feature analysis for discriminating each compound is evaluated using principal component analysis (PCA). In addition, machine learning-based classification algorithms were used to compare the prediction accuracies when using fitting coefficients as features. The additional features greatly enhanced the discrimination between compounds while performing PCA and also improved the prediction accuracy by 34% when using linear discriminant analysis.

  17. Thyroid nodule classification using ultrasound elastography via linear discriminant analysis.

    PubMed

    Luo, Si; Kim, Eung-Hun; Dighe, Manjiri; Kim, Yongmin

    2011-05-01

    The non-surgical diagnosis of thyroid nodules is currently made via a fine needle aspiration (FNA) biopsy. It is estimated that somewhere between 250,000 and 300,000 thyroid FNA biopsies are performed in the United States annually. However, a large percentage (approximately 70%) of these biopsies turn out to be benign. Since the aggressive FNA management of thyroid nodules is costly, quantitative risk assessment and stratification of a nodule's malignancy is of value in triage and more appropriate healthcare resources utilization. In this paper, we introduce a new method for classifying the thyroid nodules based on the ultrasound (US) elastography features. Unlike approaches to assess the stiffness of a thyroid nodule by visually inspecting the pseudo-color pattern in the strain image, we use a classification algorithm to stratify the nodule by using the power spectrum of strain rate waveform extracted from the US elastography image sequence. Pulsation from the carotid artery was used to compress the thyroid nodules. Ultrasound data previously acquired from 98 thyroid nodules were used in this retrospective study to evaluate our classification algorithm. A classifier was developed based on the linear discriminant analysis (LDA) and used to differentiate the thyroid nodules into two types: (I) no FNA (observation-only) and (II) FNA. Using our method, 62 nodules were classified as type I, all of which were benign, while 36 nodules were classified as Type-II, 16 malignant and 20 benign, resulting in a sensitivity of 100% and specificity of 75.6% in detecting malignant thyroid nodules. This indicates that our triage method based on US elastography has the potential to substantially reduce the number of FNA biopsies (63.3%) by detecting benign nodules and managing them via follow-up observations rather than an FNA biopsy. Published by Elsevier B.V.
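
    The figures in this abstract can be re-derived from the reported counts. The sketch below treats "malignant" as the positive class and a type II (FNA) assignment as a positive call; the function name `triage_metrics` and the key names are illustrative only.

```python
def triage_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, and share of nodules routed to observation.

    tp/fn: malignant nodules sent to FNA / to observation.
    fp/tn: benign nodules sent to FNA / to observation.
    """
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fna_avoided": (tn + fn) / total,
    }


# Counts from the abstract: 62 type I nodules (all benign),
# 36 type II nodules (16 malignant, 20 benign).
metrics = triage_metrics(tp=16, fn=0, tn=62, fp=20)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

    With these counts the sketch gives 100.0% sensitivity, 75.6% specificity, and 63.3% of nodules spared an FNA biopsy, matching the abstract.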

  18. [Study on the classification of dominant pathogens related to febrile respiratory syndrome, based on the method of Bayes discriminant analysis].

    PubMed

    Li, X C; Li, J S; Meng, L; Bai, Y N; Yu, D S; Liu, X N; Liu, X F; Jiang, X J; Ren, X W; Yang, X T; Shen, X P; Zhang, J W

    2017-08-10

    Objective: To understand the dominant pathogens of febrile respiratory syndrome (FRS) patients in Gansu province and to establish the Bayes discriminant function in order to identify the patients infected with the dominant pathogens. Methods: FRS patients were collected in various sentinel hospitals of Gansu province from 2009 to 2015 and the dominant pathogens were determined by describing the composition of the pathogenic profile. Significant clinical variables were selected by stepwise discriminant analysis to establish the Bayes discriminant function. Results: In the detection of pathogens for FRS, influenza virus and rhinovirus showed higher positive rates (13.79% and 8.63%) than other viruses, accounting for 54.38% and 13.73% of all virus-positive patients, respectively. The most frequently detected bacteria were Streptococcus pneumoniae and Haemophilus influenzae (44.41% and 18.07%), accounting for 66.21% and 24.55% of the bacteria-positive patients, respectively. The original-validated rate of the discriminant function, established with 11 clinical variables, was 73.1%, and the cross-validated rate was 70.6%. Conclusion: Influenza virus, rhinovirus, Streptococcus pneumoniae and Haemophilus influenzae were the dominant pathogens of FRS in Gansu province. Results from the Bayes discriminant analysis showed both high accuracy in the classification of dominant pathogens and practical value for FRS.

  19. Graphical methods for the sensitivity analysis in discriminant analysis

    DOE PAGES

    Kim, Youngil; Anderson-Cook, Christine M.; Dae-Heung, Jang

    2015-09-30

    Similar to regression, many measures to detect influential data points in discriminant analysis have been developed. Many follow similar principles as the diagnostic measures used in linear regression in the context of discriminant analysis. Here we focus on the impact on the predicted classification posterior probability when a data point is omitted. The new method is intuitive and easily interpretable compared to existing methods. We also propose a graphical display to show the individual movement of the posterior probability of other data points when a specific data point is omitted. This enables the summaries to capture the overall pattern of the change.

  20. EXTRACTING PRINCIPLE COMPONENTS FOR DISCRIMINANT ANALYSIS OF FMRI IMAGES.

    PubMed

    Liu, Jingyu; Xu, Lai; Caprihan, Arvind; Calhoun, Vince D

    2008-05-12

    This paper presents an approach for selecting optimal components for discriminant analysis. Such an approach is useful when further detailed analyses for discrimination or characterization require dimensionality reduction. Our approach can accommodate a categorical variable such as diagnosis (e.g. schizophrenic patient or healthy control), or a continuous variable like severity of the disorder. This information is utilized as a reference for measuring a component's discriminant power after principal component decomposition. After sorting each component according to its discriminant power, we extract the best components for discriminant analysis. An application of our reference selection approach is shown using a functional magnetic resonance imaging data set in which the sample size is much less than the dimensionality. The results show that the reference selection approach provides an improved discriminant component set as compared to other approaches. Our approach is general and provides a solid foundation for further discrimination and classification studies.

  1. Parametric Time-Frequency Analysis and Its Applications in Music Classification

    NASA Astrophysics Data System (ADS)

    Shen, Ying; Li, Xiaoli; Ma, Ngok-Wah; Krishnan, Sridhar

    2010-12-01

    Analysis of nonstationary signals, such as music signals, is a challenging task. The purpose of this study is to explore an efficient and powerful technique to analyze and classify music signals in a higher frequency range (44.1 kHz). The pursuit methods are good tools for this purpose, but they aim at representing the signals rather than classifying them, as in Y. Paragakin et al., 2009. Among the pursuit methods, matching pursuit (MP), an adaptive true nonstationary time-frequency signal analysis tool, is applied for music classification. First, MP decomposes the sample signals into time-frequency functions or atoms. Atom parameters are then analyzed and manipulated, and discriminant features are extracted from atom parameters. Besides the parameters obtained using MP, an additional feature, central energy, is also derived. Linear discriminant analysis and the leave-one-out method are used to evaluate the classification accuracy rate for different feature sets. The study is one of the very few works that analyze atoms statistically and extract discriminant features directly from the parameters. From our experiments, it is evident that the MP algorithm with the Gabor dictionary decomposes nonstationary signals, such as music signals, into atoms in which the parameters contain strong discriminant information sufficient for accurate and efficient signal classifications.

  2. Semi-supervised learning for ordinal Kernel Discriminant Analysis.

    PubMed

    Pérez-Ortiz, M; Gutiérrez, P A; Carbonero-Ruz, M; Hervás-Martínez, C

    2016-12-01

    Ordinal classification considers those classification problems where the labels of the variable to predict follow a given order. Naturally, labelled data is scarce or difficult to obtain in this type of problems because, in many cases, ordinal labels are given by a user or expert (e.g. in recommendation systems). Firstly, this paper develops a new strategy for ordinal classification where both labelled and unlabelled data are used in the model construction step (a scheme which is referred to as semi-supervised learning). More specifically, the ordinal version of kernel discriminant learning is extended for this setting considering the neighbourhood information of unlabelled data, which is proposed to be computed in the feature space induced by the kernel function. Secondly, a new method for semi-supervised kernel learning is devised in the context of ordinal classification, which is combined with our developed classification strategy to optimise the kernel parameters. The experiments conducted compare 6 different approaches for semi-supervised learning in the context of ordinal classification in a battery of 30 datasets, showing (1) the good synergy of the ordinal version of discriminant analysis and the use of unlabelled data and (2) the advantage of computing distances in the feature space induced by the kernel function. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. EXTRACTING PRINCIPLE COMPONENTS FOR DISCRIMINANT ANALYSIS OF FMRI IMAGES

    PubMed Central

    Liu, Jingyu; Xu, Lai; Caprihan, Arvind; Calhoun, Vince D.

    2009-01-01

    This paper presents an approach for selecting optimal components for discriminant analysis. Such an approach is useful when further detailed analyses for discrimination or characterization require dimensionality reduction. Our approach can accommodate a categorical variable such as diagnosis (e.g. schizophrenic patient or healthy control), or a continuous variable like severity of the disorder. This information is utilized as a reference for measuring a component’s discriminant power after principal component decomposition. After sorting each component according to its discriminant power, we extract the best components for discriminant analysis. An application of our reference selection approach is shown using a functional magnetic resonance imaging data set in which the sample size is much less than the dimensionality. The results show that the reference selection approach provides an improved discriminant component set as compared to other approaches. Our approach is general and provides a solid foundation for further discrimination and classification studies. PMID:20582334

  4. Using complex networks for text classification: Discriminating informative and imaginative documents

    NASA Astrophysics Data System (ADS)

    de Arruda, Henrique F.; Costa, Luciano da F.; Amancio, Diego R.

    2016-01-01

    Statistical methods have been widely employed in recent years to grasp many language properties. The application of such techniques has allowed an improvement of several linguistic applications, such as machine translation and document classification. In the latter, many approaches have emphasised the semantic content of texts, as is the case of bag-of-words language models. These approaches have certainly yielded reasonable performance. However, some potential features such as the structural organization of texts have been used only in a few studies. In this context, we probe how features derived from textual structure analysis can be effectively employed in a classification task. More specifically, we performed a supervised classification aiming at discriminating informative from imaginative documents. Using a networked model that describes the local topological/dynamical properties of function words, we achieved an accuracy rate of up to 95%, which is much higher than similar networked approaches. A systematic analysis of feature relevance revealed that symmetry and accessibility measurements are among the most prominent network measurements. Our results suggest that these measurements could be used in related language applications, as they play a complementary role in characterising texts.

  5. Discriminative illumination: per-pixel classification of raw materials based on optimal projections of spectral BRDF.

    PubMed

    Liu, Chao; Gu, Jinwei

    2014-01-01

    Classifying raw, unpainted materials--metal, plastic, ceramic, fabric, and so on--is an important yet challenging task for computer vision. Previous works measure subsets of surface spectral reflectance as features for classification. However, acquiring the full spectral reflectance is time consuming and error-prone. In this paper, we propose to use coded illumination to directly measure discriminative features for material classification. Optimal illumination patterns--which we call "discriminative illumination"--are learned from training samples, after projecting to which the spectral reflectance of different materials are maximally separated. This projection is automatically realized by the integration of incident light for surface reflection. While a single discriminative illumination is capable of linear, two-class classification, we show that multiple discriminative illuminations can be used for nonlinear and multiclass classification. We also show theoretically that the proposed method has higher signal-to-noise ratio than previous methods due to light multiplexing. Finally, we construct an LED-based multispectral dome and use the discriminative illumination method for classifying a variety of raw materials, including metal (aluminum, alloy, steel, stainless steel, brass, and copper), plastic, ceramic, fabric, and wood. Experimental results demonstrate its effectiveness.

  6. Automated aural classification used for inter-species discrimination of cetaceans.

    PubMed

    Binder, Carolyn M; Hines, Paul C

    2014-04-01

    Passive acoustic methods are in widespread use to detect and classify cetacean species; however, passive acoustic systems often suffer from large false detection rates resulting from numerous transient sources. To reduce the acoustic analyst workload, automatic recognition methods may be implemented in a two-stage process. First, a general automatic detector is implemented that produces many detections to ensure cetacean presence is noted. Then an automatic classifier is used to significantly reduce the number of false detections and classify the cetacean species. This process requires development of a robust classifier capable of performing inter-species classification. Because human analysts can aurally discriminate species, an automated aural classifier that uses perceptual signal features was tested on a cetacean data set. The classifier successfully discriminated between four species of cetaceans (bowhead, humpback, North Atlantic right, and sperm whales) with 85% accuracy. It also performed well (100% accuracy) for discriminating sperm whale clicks from right whale gunshots. An accuracy of 92% and area under the receiver operating characteristic curve of 0.97 were obtained for the relatively challenging bowhead and humpback recognition case. These results demonstrated that the perceptual features employed by the aural classifier provided powerful discrimination cues for inter-species classification of cetaceans.

  7. Discriminative least squares regression for multiclass classification and feature selection.

    PubMed

    Xiang, Shiming; Nie, Feiping; Meng, Gaofeng; Pan, Chunhong; Zhang, Changshui

    2012-11-01

    This paper presents a framework of discriminative least squares regression (LSR) for multiclass classification and feature selection. The core idea is to enlarge the distance between different classes under the conceptual framework of LSR. First, a technique called ε-dragging is introduced to force the regression targets of different classes to move in opposite directions so that the distances between classes are enlarged. Then, the ε-draggings are integrated into the LSR model for multiclass classification. Our learning framework, referred to as discriminative LSR, has a compact model form, where there is no need to train two-class machines that are independent of each other. With its compact form, this model can be naturally extended for feature selection. This goal is achieved by means of the L2,1 norm of a matrix, generating a sparse learning model for feature selection. The model for multiclass classification and its extension for feature selection are finally solved elegantly and efficiently. Experimental evaluation over a range of benchmark datasets indicates the validity of our method.
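
    The ε-dragging idea described above can be illustrated with a minimal sketch. It assumes one-hot regression targets and nonnegative dragging values: the target entry of a sample's own class is pushed above 1 and the entries of the other classes are pushed below 0. The function name and the fixed ε values are hypothetical; in the method itself the dragging values would presumably be optimized together with the regression rather than fixed by hand.

```python
def dragged_targets(labels, n_classes, epsilons):
    """Build regression targets with epsilon-dragging applied.

    labels:   class index of each sample
    epsilons: one row of nonnegative dragging values per sample
              (hypothetical fixed numbers, for illustration only)
    """
    targets = []
    for label, eps_row in zip(labels, epsilons):
        row = []
        for c in range(n_classes):
            base = 1.0 if c == label else 0.0        # one-hot target
            direction = 1.0 if c == label else -1.0  # own class up, others down
            row.append(base + direction * eps_row[c])
        targets.append(row)
    return targets


# Two samples (classes 0 and 2), three classes, every dragging value set to 0.5.
example = dragged_targets([0, 2], 3, [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]])
```

    With these values the gap between the own-class target and any other target grows from 1 to 2, which is the enlarged class distance the abstract refers to.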

  8. Histopathological Image Classification using Discriminative Feature-oriented Dictionary Learning

    PubMed Central

    Vu, Tiep Huu; Mousavi, Hojjat Seyed; Monga, Vishal; Rao, Ganesh; Rao, UK Arvind

    2016-01-01

    In histopathological image analysis, feature extraction for classification is a challenging task due to the diversity of histology features suitable for each problem as well as presence of rich geometrical structures. In this paper, we propose an automatic feature discovery framework via learning class-specific dictionaries and present a low-complexity method for classification and disease grading in histopathology. Essentially, our Discriminative Feature-oriented Dictionary Learning (DFDL) method learns class-specific dictionaries such that under a sparsity constraint, the learned dictionaries allow representing a new image sample parsimoniously via the dictionary corresponding to the class identity of the sample. At the same time, the dictionary is designed to be poorly capable of representing samples from other classes. Experiments on three challenging real-world image databases: 1) histopathological images of intraductal breast lesions, 2) mammalian kidney, lung and spleen images provided by the Animal Diagnostics Lab (ADL) at Pennsylvania State University, and 3) brain tumor images from The Cancer Genome Atlas (TCGA) database, reveal the merits of our proposal over state-of-the-art alternatives. Moreover, we demonstrate that DFDL exhibits a more graceful decay in classification accuracy against the number of training images which is highly desirable in practice where generous training is often not available. PMID:26513781

  9. A Discriminant Distance Based Composite Vector Selection Method for Odor Classification

    PubMed Central

    Choi, Sang-Il; Jeong, Gu-Min

    2014-01-01

    We present a composite vector selection method for an effective electronic nose system that performs well even in noisy environments. Each composite vector generated from an electronic nose data sample is evaluated by computing the discriminant distance. By quantitatively measuring the amount of discriminative information in each composite vector, composite vectors containing informative variables can be distinguished and the final composite features for odor classification are extracted using the selected composite vectors. Using only the informative composite vectors, instead of all the generated composite vectors, can also help to extract better composite features. Experimental results with different volatile organic compound data show that the proposed system has good classification performance even in a noisy environment compared to other methods. PMID:24747735

  10. Penalized discriminant analysis for the detection of wild-grown and cultivated Ganoderma lucidum using Fourier transform infrared spectroscopy

    NASA Astrophysics Data System (ADS)

    Zhu, Ying; Tan, Tuck Lee

    2016-04-01

    An effective and simple analytical method using Fourier transform infrared (FTIR) spectroscopy to distinguish wild-grown high-quality Ganoderma lucidum (G. lucidum) from the cultivated one is of essential importance for its quality assurance and medicinal value estimation. Commonly used chemical and analytical methods using full spectrum are not so effective for the detection and interpretation due to the complex system of the herbal medicine. In this study, two penalized discriminant analysis models, penalized linear discriminant analysis (PLDA) and elastic net (Elnet), using FTIR spectroscopy have been explored for the purpose of discrimination and interpretation. The classification performances of the two penalized models have been compared with two widely used multivariate methods, principal component discriminant analysis (PCDA) and partial least squares discriminant analysis (PLSDA). The Elnet model involving a combination of L1 and L2 norm penalties enabled an automatic selection of a small number of informative spectral absorption bands and gave an excellent classification accuracy of 99% for discrimination between spectra of wild-grown and cultivated G. lucidum. Its classification performance was superior to that of the PLDA model in a pure L1 setting and outperformed the PCDA and PLSDA models using full wavelength. The well-performed selection of informative spectral features leads to substantial reduction in model complexity and improvement of classification accuracy, and it is particularly helpful for the quantitative interpretations of the major chemical constituents of G. lucidum regarding its anti-cancer effects.

  11. Penalized discriminant analysis for the detection of wild-grown and cultivated Ganoderma lucidum using Fourier transform infrared spectroscopy.

    PubMed

    Zhu, Ying; Tan, Tuck Lee

    2016-04-15

    An effective and simple analytical method using Fourier transform infrared (FTIR) spectroscopy to distinguish wild-grown high-quality Ganoderma lucidum (G. lucidum) from the cultivated one is of essential importance for its quality assurance and medicinal value estimation. Commonly used chemical and analytical methods using full spectrum are not so effective for the detection and interpretation due to the complex system of the herbal medicine. In this study, two penalized discriminant analysis models, penalized linear discriminant analysis (PLDA) and elastic net (Elnet), using FTIR spectroscopy have been explored for the purpose of discrimination and interpretation. The classification performances of the two penalized models have been compared with two widely used multivariate methods, principal component discriminant analysis (PCDA) and partial least squares discriminant analysis (PLSDA). The Elnet model involving a combination of L1 and L2 norm penalties enabled an automatic selection of a small number of informative spectral absorption bands and gave an excellent classification accuracy of 99% for discrimination between spectra of wild-grown and cultivated G. lucidum. Its classification performance was superior to that of the PLDA model in a pure L1 setting and outperformed the PCDA and PLSDA models using full wavelength. The well-performed selection of informative spectral features leads to substantial reduction in model complexity and improvement of classification accuracy, and it is particularly helpful for the quantitative interpretations of the major chemical constituents of G. lucidum regarding its anti-cancer effects. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Discrimination among populations of sockeye salmon fry with Fourier analysis of otolith banding patterns formed during incubation

    USGS Publications Warehouse

    Finn, James E.; Burger, Carl V.; Holland-Bartels, Leslie E.

    1997-01-01

    We used otolith banding patterns formed during incubation to discriminate among hatchery- and wild-incubated fry of sockeye salmon Oncorhynchus nerka from Tustumena Lake, Alaska. Fourier analysis of otolith luminance profiles was used to describe banding patterns: the amplitudes of individual Fourier harmonics were discriminant variables. Correct classification of otoliths to either hatchery or wild origin was 83.1% (cross-validation) and 72.7% (test data) with the use of quadratic discriminant function analysis on 10 Fourier amplitudes. Overall classification rates among the six test groups (one hatchery and five wild groups) were 46.5% (cross-validation) and 39.3% (test data) with the use of linear discriminant function analysis on 16 Fourier amplitudes. Although classification rates for wild-incubated fry from any one site never exceeded 67% (cross-validation) or 60% (test data), location-specific information was evident for all groups because the probability of classifying an individual to its true incubation location was significantly greater than chance. Results indicate phenotypic differences in otolith microstructure among incubation sites separated by less than 10 km. Analysis of otolith luminance profiles is a potentially useful technique for discriminating among and between various populations of hatchery and wild fish.
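
    As a rough illustration of how Fourier harmonic amplitudes can serve as discriminant variables, the sketch below computes the amplitudes of the first few harmonics of a one-dimensional profile with a plain discrete Fourier transform. The function name, the synthetic profile, and the choice to subtract the mean first are assumptions made for illustration; the abstract does not describe how the luminance profiles were sampled or preprocessed.

```python
import cmath
import math


def harmonic_amplitudes(profile, n_harmonics):
    """Amplitudes of harmonics 1..n_harmonics of a sampled profile.

    Valid for n_harmonics below half the number of samples.
    """
    n = len(profile)
    mean = sum(profile) / n
    centered = [value - mean for value in profile]  # drop the constant offset
    amplitudes = []
    for k in range(1, n_harmonics + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        amplitudes.append(2 * abs(coeff) / n)
    return amplitudes


# Synthetic profile (not study data): three bands of amplitude 10 over 64 samples.
profile = [100 + 10 * math.cos(2 * math.pi * 3 * i / 64) for i in range(64)]
amps = harmonic_amplitudes(profile, 10)
```

    Each profile then yields a fixed-length vector of amplitudes (10 or 16 in the abstract) that can be passed to a linear or quadratic discriminant function.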

  13. Online Learning for Classification of Alzheimer Disease based on Cortical Thickness and Hippocampal Shape Analysis.

    PubMed

    Lee, Ga-Young; Kim, Jeonghun; Kim, Ju Han; Kim, Kiwoong; Seong, Joon-Kyung

    2014-01-01

    Mobile healthcare applications are becoming a growing trend. Also, the prevalence of dementia in modern society is showing a steady growing trend. Among degenerative brain diseases that cause dementia, Alzheimer disease (AD) is the most common. The purpose of this study was to identify AD patients using magnetic resonance imaging in the mobile environment. We propose an incremental classification for mobile healthcare systems. Our classification method is based on incremental learning for AD diagnosis and AD prediction using the cortical thickness data and hippocampus shape. We constructed a classifier based on principal component analysis and linear discriminant analysis. We performed initial learning and mobile subject classification. Initial learning is the group learning part in our server. Our smartphone agent implements the mobile classification and shows various results. With use of cortical thickness data analysis alone, the discrimination accuracy was 87.33% (sensitivity 96.49% and specificity 64.33%). When cortical thickness data and hippocampal shape were analyzed together, the achieved accuracy was 87.52% (sensitivity 96.79% and specificity 63.24%). In this paper, we presented a classification method based on online learning for AD diagnosis by employing both cortical thickness data and hippocampal shape analysis data. Our method was implemented on smartphone devices and discriminated AD patients from the normal group.
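
    The abstract reports accuracy together with sensitivity and specificity. A minimal sketch of how the three relate to confusion-matrix counts is given here; the counts are hypothetical and are not taken from the study, whose group sizes are not stated in the abstract.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: patients classified correctly / incorrectly
    tn/fp: controls classified correctly / incorrectly
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # fraction of patients detected
    specificity = tn / (tn + fp)  # fraction of controls correctly cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts: 100 patients and 50 controls.
accuracy, sensitivity, specificity = classification_metrics(tp=90, fn=10, tn=30, fp=20)
```

    When patients outnumber controls, as in this made-up example, overall accuracy can stay high even though specificity is much lower than sensitivity, which is the pattern visible in the reported figures.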

  14. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

    PubMed Central

    2011-01-01

    Background: Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but presently has limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results: Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and area under the ROC (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). 
The remaining classifiers showed

  15. Comparison of discriminant analysis methods: Application to occupational exposure to particulate matter

    NASA Astrophysics Data System (ADS)

    Ramos, M. Rosário; Carolino, E.; Viegas, Carla; Viegas, Sandra

    2016-06-01

    Health effects associated with occupational exposure to particulate matter have been studied by several authors. In this study, six industries from five different areas were selected: Cork company 1, Cork company 2, poultry, slaughterhouse for cattle, riding arena and production of animal feed. The measurement tool was a portable direct-reading device. This tool provides information on the particle number concentration for six different diameters, namely 0.3 µm, 0.5 µm, 1 µm, 2.5 µm, 5 µm and 10 µm. The focus is on these features because they might be more closely related to adverse health effects. The aim is to identify the particles that better discriminate the industries, with the ultimate goal of classifying industries regarding potential negative effects on workers' health. Several methods of discriminant analysis were applied to data of occupational exposure to particulate matter and compared with respect to classification accuracy. The selected methods were linear discriminant analysis (LDA); quadratic discriminant analysis (QDA); robust linear discriminant analysis with selected estimators (MLE (Maximum Likelihood Estimators), MVE (Minimum Volume Ellipsoid), "t", MCD (Minimum Covariance Determinant), MCD-A, MCD-B); multinomial logistic regression; and artificial neural networks (ANN). The predictive accuracy of the methods was assessed through a simulation study. ANN yielded the highest rate of classification accuracy in the data set under study. Results indicate that the particle number concentration of diameter size 0.5 µm is the parameter that better discriminates industries.

  16. Aberrant functional connectivity for diagnosis of major depressive disorder: a discriminant analysis.

    PubMed

    Cao, Longlong; Guo, Shuixia; Xue, Zhimin; Hu, Yong; Liu, Haihong; Mwansisya, Tumbwene E; Pu, Weidan; Yang, Bo; Liu, Chang; Feng, Jianfeng; Chen, Eric Y H; Liu, Zhening

    2014-02-01

    Aberrant brain functional connectivity patterns have been reported in major depressive disorder (MDD). It is unknown whether they can be used in discriminant analysis for diagnosis of MDD. In the present study we examined the efficiency of discriminant analysis of MDD by individualized computer-assisted diagnosis. Based on resting-state functional magnetic resonance imaging data, a new approach was adopted to investigate functional connectivity changes in 39 MDD patients and 37 well-matched healthy controls. By using the proposed feature selection method, we identified significant altered functional connections in patients. They were subsequently applied to our analysis as discriminant features using a support vector machine classification method. Furthermore, the relative contribution of functional connectivity was estimated. After subset selection of high-dimension features, the support vector machine classifier reached up to approximately 84% with leave-one-out training during the discrimination process. Through summarizing the classification contribution of functional connectivities, we obtained four obvious contribution modules: inferior orbitofrontal module, supramarginal gyrus module, inferior parietal lobule-posterior cingulated gyrus module and middle temporal gyrus-inferior temporal gyrus module. The experimental results demonstrated that the proposed method is effective in discriminating MDD patients from healthy controls. Functional connectivities might be useful as new biomarkers to assist clinicians in computer auxiliary diagnosis of MDD. © 2013 The Authors. Psychiatry and Clinical Neurosciences © 2013 Japanese Society of Psychiatry and Neurology.

  17. Multi-Site Diagnostic Classification of Schizophrenia Using Discriminant Deep Learning with Functional Connectivity MRI.

    PubMed

    Zeng, Ling-Li; Wang, Huaning; Hu, Panpan; Yang, Bo; Pu, Weidan; Shen, Hui; Chen, Xingui; Liu, Zhening; Yin, Hong; Tan, Qingrong; Wang, Kai; Hu, Dewen

    2018-04-01

    A lack of a sufficiently large sample at single sites causes poor generalizability in automatic diagnosis classification of heterogeneous psychiatric disorders such as schizophrenia based on brain imaging scans. Advanced deep learning methods may be capable of learning subtle hidden patterns from high dimensional imaging data, overcome potential site-related variation, and achieve reproducible cross-site classification. However, deep learning-based cross-site transfer classification, despite less imaging site-specificity and more generalizability of diagnostic models, has not been investigated in schizophrenia. A large multi-site functional MRI sample (n = 734, including 357 schizophrenic patients from seven imaging resources) was collected, and a deep discriminant autoencoder network, aimed at learning imaging site-shared functional connectivity features, was developed to discriminate schizophrenic individuals from healthy controls. Accuracies of approximately 85.0% and 81.0% were obtained in multi-site pooling classification and leave-site-out transfer classification, respectively. The learned functional connectivity features revealed dysregulation of the cortical-striatal-cerebellar circuit in schizophrenia, and the most discriminating functional connections were primarily located within and across the default, salience, and control networks. The findings imply that dysfunctional integration of the cortical-striatal-cerebellar circuit across the default, salience, and control networks may play an important role in the "disconnectivity" model underlying the pathophysiology of schizophrenia. The proposed discriminant deep learning method may be capable of learning reliable connectome patterns and help in understanding the pathophysiology and achieving accurate prediction of schizophrenia across multiple independent imaging sites. Copyright © 2018 German Center for Neurodegenerative Diseases (DZNE). Published by Elsevier B.V. All rights reserved.

  18. Multi-class ERP-based BCI data analysis using a discriminant space self-organizing map.

    PubMed

    Onishi, Akinari; Natsume, Kiyohisa

    2014-01-01

    Emotional or non-emotional image stimuli have recently been applied to event-related potential (ERP) based brain computer interfaces (BCI). Though the classification performance is over 80% in a single trial, discrimination between those ERPs has not been considered. In this research we tried to clarify the discriminability of four-class ERP-based BCI target data elicited by desk, seal, spider images and letter intensifications. A conventional self organizing map (SOM) and a newly proposed discriminant space SOM (ds-SOM) were applied, then the discriminabilities were visualized. We also classified all pairs of those ERPs by stepwise linear discriminant analysis (SWLDA) and verified the visualization of discriminabilities. As a result, the ds-SOM showed understandable visualization of the data with a shorter computational time than the traditional SOM. We also confirmed the clear boundary between the letter cluster and the other clusters. The result was coherent with the classification performances by SWLDA. The method might be helpful not only for developing a new BCI paradigm, but also for big data analysis.

  19. Automated Hand-Held UXO Detection, Classification & Discrimination Sensor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bell, Thomas H.

    2000-06-12

    The research focused on procedures for target discrimination and classification using hand-held EMI sensors. The idea is to have a small, portable sensor that can be operated in a sweep or similar pattern in front of the operator, and that is capable of distinguishing between buried UXO and clutter on the spot. During Phase 1, we developed the processing techniques for distinguishing between buried UXO and clutter using the EM61-HH hand-held metal detector.

  20. Comparison of cranial sex determination by discriminant analysis and logistic regression.

    PubMed

    Amores-Ampuero, Anabel; Alemán, Inmaculada

    2016-04-05

    Various methods have been proposed for estimating dimorphism. The objective of this study was to compare sex determination results from cranial measurements using discriminant analysis or logistic regression. The study sample comprised 130 individuals (70 males) of known sex, age, and cause of death from San José cemetery in Granada (Spain). Measurements of 19 neurocranial dimensions and 11 splanchnocranial dimensions were subjected to discriminant analysis and logistic regression, and the percentages of correct classification were compared between the sex functions obtained with each method. The discriminant capacity of the selected variables was evaluated with a cross-validation procedure. The percentage accuracy with discriminant analysis was 78.2% for the neurocranium (82.4% in females and 74.6% in males) and 73.7% for the splanchnocranium (79.6% in females and 68.8% in males). These percentages were higher with logistic regression analysis: 85.7% for the neurocranium (in both sexes) and 94.1% for the splanchnocranium (100% in females and 91.7% in males).

  1. Analysis and Classification of Entering Freshmen Mathematic Students Using Multiple Discriminate Function Analysis.

    ERIC Educational Resources Information Center

    Ahrens, Steve

    Predictor variables that could be used effectively to place entering freshmen mathematics students into courses of instruction in mathematics were investigated at West Virginia University. Multiple discriminant analysis was used with nearly 6,000 student records collected over a three-year period, and a series of predictive equations were…

  2. Some observations on the use of discriminant analysis in ecology

    USGS Publications Warehouse

    Williams, B.K.

    1983-01-01

    The application of discriminant analysis in ecological investigations is discussed. The appropriate statistical assumptions for discriminant analysis are illustrated, and both classification and group separation approaches are outlined. Three assumptions that are crucial in ecological studies are discussed at length, and the consequences of their violation are developed. These assumptions are: equality of dispersions, identifiability of prior probabilities, and precise and accurate estimation of means and dispersions. The use of discriminant functions for purposes of interpreting ecological relationships is also discussed. It is suggested that the common practice of imputing ecological 'meaning' to the signs and magnitudes of coefficients be replaced by an assessment of 'structure coefficients.' Finally, the potential and limitations of representation of data in canonical space are considered, and some cautionary points are made concerning ecological interpretation of patterns in canonical space.

  3. Discriminant function analysis as tool for subsurface geologist

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chesser, K.

    1987-05-01

    Sedimentary structures such as cross-bedding control porosity, permeability, and other petrophysical properties in sandstone reservoirs. Understanding the distribution of such structures in the subsurface not only aids in the prediction of reservoir properties but also provides information about depositional environments. Discriminant function analysis (DFA) is a simple yet powerful method incorporating petrophysical data from wireline logs, core analyses, or other sources into groups that have been previously defined through direct observation of sedimentary structures in cores. Once data have been classified into meaningful groups, the geologist can predict the distribution of specific sedimentary structures or important reservoir properties in areas where cores are unavailable. DFA is efficient. Given several variables, DFA will choose the best combination to discriminate among groups. The initial classification function can be computed from relatively few observations, and additional data may be included as necessary. Furthermore, DFA provides quantitative goodness-of-fit estimates for each observation. Such estimates can be used as mapping parameters or to assess risk in petroleum ventures. Petrophysical data from the Skinner sandstone of Strauss field in southeastern Kansas tested the ability of DFA to discriminate between cross-bedded and ripple-bedded sandstones. Petroleum production in Strauss field is largely restricted to the more permeable cross-bedded sandstones. DFA based on permeability correctly placed 80% of samples into cross-bedded or ripple-bedded groups. Addition of formation factor to the discriminant function increased correct classifications to 83% - a small but statistically significant gain.

  4. Local kernel nonparametric discriminant analysis for adaptive extraction of complex structures

    NASA Astrophysics Data System (ADS)

    Li, Quanbao; Wei, Fajie; Zhou, Shenghan

    2017-05-01

    Linear discriminant analysis (LDA) is one of the popular means of linear feature extraction. It usually performs well when the global data structure is consistent with the local data structure. Other frequently used approaches to feature extraction usually require linearity, independence, or large-sample conditions. However, in real-world applications, these assumptions are not always satisfied or cannot be tested. In this paper, we introduce an adaptive method, local kernel nonparametric discriminant analysis (LKNDA), which integrates conventional discriminant analysis with nonparametric statistics. LKNDA is adept at identifying both complex nonlinear structures and the ad hoc rule. Six simulation cases demonstrate that LKNDA has the advantages of both parametric and nonparametric algorithms and higher classification accuracy. The quartic unilateral kernel function may provide better robustness of prediction than other functions. LKNDA gives an alternative solution for discriminant cases of complex nonlinear feature extraction or unknown feature extraction. Finally, the application of LKNDA to the complex feature extraction of financial market activities is proposed.

  5. General tensor discriminant analysis and gabor features for gait recognition.

    PubMed

    Tao, Dacheng; Li, Xuelong; Wu, Xindong; Maybank, Stephen J

    2007-10-01

    The traditional image representations are not suited to conventional classification methods, such as the linear discriminant analysis (LDA), because of the under sample problem (USP): the dimensionality of the feature space is much higher than the number of training samples. Motivated by the successes of the two dimensional LDA (2DLDA) for face recognition, we develop a general tensor discriminant analysis (GTDA) as a preprocessing step for LDA. The benefits of GTDA compared with existing preprocessing methods, e.g., principal component analysis (PCA) and 2DLDA, include 1) the USP is reduced in subsequent classification by, for example, LDA; 2) the discriminative information in the training tensors is preserved; and 3) GTDA provides stable recognition rates because the alternating projection optimization algorithm to obtain a solution of GTDA converges, while that of 2DLDA does not. We use human gait recognition to validate the proposed GTDA. The averaged gait images are utilized for gait representation. Given the popularity of Gabor function based image decompositions for image understanding and object recognition, we develop three different Gabor function based image representations: 1) the GaborD representation is the sum of Gabor filter responses over directions, 2) GaborS is the sum of Gabor filter responses over scales, and 3) GaborSD is the sum of Gabor filter responses over scales and directions. The GaborD, GaborS and GaborSD representations are applied to the problem of recognizing people from their averaged gait images. A large number of experiments were carried out to evaluate the effectiveness (recognition rate) of gait recognition based on first obtaining a Gabor, GaborD, GaborS or GaborSD image representation, then using GTDA to extract features and finally using LDA for classification. The proposed methods achieved good performance for gait recognition based on image sequences from the USF HumanID Database. 
Experimental comparisons are made with nine

  6. Full-motion video analysis for improved gender classification

    NASA Astrophysics Data System (ADS)

    Flora, Jeffrey B.; Lochtefeld, Darrell F.; Iftekharuddin, Khan M.

    2014-06-01

    The ability of computer systems to perform gender classification using the dynamic motion of the human subject has important applications in medicine, human factors, and human-computer interface systems. Previous works in motion analysis have used data from sensors (including gyroscopes, accelerometers, and force plates), radar signatures, and video. However, full-motion video, motion capture, and range data provide a higher resolution time and spatial dataset for the analysis of dynamic motion. Works using motion capture data have been limited by small datasets in a controlled environment. In this paper, we apply machine learning techniques to a new dataset that has a larger number of subjects. Additionally, these subjects move unrestricted through a capture volume, representing a more realistic, less controlled environment. We conclude that existing linear classification methods are insufficient for gender classification on a larger dataset captured in a relatively uncontrolled environment. A method based on a nonlinear support vector machine classifier is proposed to obtain gender classification for the larger dataset. In experimental testing with a dataset consisting of 98 trials (49 subjects, 2 trials per subject), classification rates using leave-one-out cross-validation are improved from 73% using linear discriminant analysis to 88% using the nonlinear support vector machine classifier.
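
    Leave-one-out cross-validation, the evaluation scheme named in the abstract, can be sketched in a few lines. The classifier used here is a simple nearest-class-mean rule chosen only to keep the example self-contained; it is a stand-in, not the linear discriminant analysis or support vector machine from the study, and the data points are made up.

```python
def nearest_mean_predict(train_x, train_y, sample):
    """Assign the sample to the class whose training mean is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        mean = [sum(col) / len(members) for col in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, mean))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(xs, ys, predict):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Made-up, well-separated two-feature data for two classes.
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [2.0, 2.1], [2.2, 1.9], [1.9, 2.3]]
ys = ["a", "a", "a", "b", "b", "b"]
loo_accuracy = leave_one_out_accuracy(xs, ys, nearest_mean_predict)
```

    One design point worth noting: when a dataset contains several trials per subject, holding out a single trial leaves that subject's other trial in the training set, so a leave-one-subject-out variant may give a more conservative estimate.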

  7. Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.

    PubMed

    Chen, Shizhi; Yang, Xiaodong; Tian, Yingli

    2015-09-01

    A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. The learning-based classifiers achieve the state-of-the-art accuracies, but have been criticized for the computational complexity that grows linearly with the number of classes. The nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is an order of magnitude less than that of the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves the state-of-the-art accuracies, with significantly lower computation cost and memory requirement.

  8. Intra-regional classification of grape seeds produced in Mendoza province (Argentina) by multi-elemental analysis and chemometrics tools.

    PubMed

    Canizo, Brenda V; Escudero, Leticia B; Pérez, María B; Pellerano, Roberto G; Wuilloud, Rodolfo G

    2018-03-01

    The feasibility of the application of chemometric techniques associated with multi-element analysis for the classification of grape seeds according to their provenance vineyard soil was investigated. Grape seed samples from different localities of Mendoza province (Argentina) were evaluated. Inductively coupled plasma mass spectrometry (ICP-MS) was used for the determination of twenty-nine elements (Ag, As, Ce, Co, Cs, Cu, Eu, Fe, Ga, Gd, La, Lu, Mn, Mo, Nb, Nd, Ni, Pr, Rb, Sm, Te, Ti, Tl, Tm, U, V, Y, Zn and Zr). Once the analytical data were collected, supervised pattern recognition techniques such as linear discriminant analysis (LDA), partial least square discriminant analysis (PLS-DA), k-nearest neighbors (k-NN), support vector machine (SVM) and Random Forest (RF) were applied to construct classification/discrimination rules. The results indicated that nonlinear methods, RF and SVM, perform best with up to 98% and 93% accuracy rate, respectively, and therefore are excellent tools for classification of grapes. Copyright © 2017 Elsevier Ltd. All rights reserved.

  9. Rapid classification of enzymes in cleaning products by hydrolysis, mass spectrometry and linear discriminant analysis.

    PubMed

    Beneito-Cambra, Miriam; Herrero-Martínez, José Manuel; Simó-Alfonso, Ernesto F; Ramis-Ramos, Guillermo

    2008-11-01

    A method for the rapid classification of proteases, lipases, amylases and cellulases used as enhancers in cleaning products, based on precipitation with acetone, hydrolysis with HCl, dilution of the hydrolysates with ethanol, and direct infusion into the electrospray ion source of an ion-trap mass spectrometer, has been developed. The abundances of the [M+H]+ ions of the amino acids, from the hydrolysates of both the enzyme industrial concentrates and the detergent bases spiked with them, were used to construct linear discriminant analysis models, capable of distinguishing between the enzyme classes. For this purpose, the variables were normalized as follows: (A) the ion abundance of each amino acid was divided by the sum of the ion abundances of all the amino acids in the corresponding mass spectrum; (B) the ratios of pairs of ion abundances were obtained by dividing the ion abundance of each amino acid by each one of the ion abundances of the other 17 amino acids in the corresponding mass spectrum. Using normalization procedure B, excellent class-resolution between proteases, lipases, amylases and cellulases was achieved. In all cases, enzymes in industrial concentrates and manufactured cleaning products were correctly classified with >98% assignment probability.
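
    The two normalization procedures (A) and (B) described above can be sketched as below. The function names and the three-entry toy spectrum are assumptions for illustration; the abstract itself works with 18 amino acids, for which procedure B yields 18 × 17 = 306 ordered ratios.

```python
def normalize_by_total(abundances):
    """Procedure A: divide each ion abundance by the sum over all amino acids in the spectrum."""
    total = sum(abundances.values())
    return {name: value / total for name, value in abundances.items()}


def pairwise_ratios(abundances):
    """Procedure B: ratio of each ion abundance to each of the other abundances.

    Assumes every abundance is non-zero; a zero denominator would raise ZeroDivisionError.
    """
    return {
        (num, den): abundances[num] / abundances[den]
        for num in abundances
        for den in abundances
        if num != den
    }


# Toy spectrum (invented values): amino acid -> [M+H]+ ion abundance
spectrum = {"Ala": 2.0, "Gly": 6.0, "Ser": 2.0}
fractions = normalize_by_total(spectrum)
ratios = pairwise_ratios(spectrum)
```

    Either output could then serve as the predictor variables of a linear discriminant model.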

  10. An electroglottographical analysis-based discriminant function model differentiating multiple sclerosis patients from healthy controls.

    PubMed

    Vavougios, George D; Doskas, Triantafyllos; Konstantopoulos, Kostas

    2018-05-01

    Dysarthrophonia is a predominant symptom in many neurological diseases, affecting the quality of life of the patients. In this study, we produced a discriminant function equation that can differentiate MS patients from healthy controls, using electroglottographic variables not analyzed in a previous study. We applied stepwise linear discriminant function analysis in order to produce a function and score derived from electroglottographic variables extracted from a previous study. The derived discriminant function's statistical significance was determined via Wilks' λ test (and the associated p value). Finally, a 2 × 2 confusion matrix was used to determine the function's predictive accuracy, whereas the cross-validated predictive accuracy is estimated via the "leave-one-out" classification process. Discriminant function analysis (DFA) was used to create a linear function of continuous predictors. DFA produced the following model (Wilks' λ = 0.043, χ2 = 388.588, p < 0.0001, Tables 3 and 4): D (MS vs controls) = 0.728*DQx1 mean monologue + 0.325*CQx monologue + 0.298*DFx1 90% range monologue + 0.443*DQx1 90% range reading - 1.490*DQx1 90% range monologue. The derived discriminant score (S1) was used subsequently in order to form the coordinates of a ROC curve. Thus, a cutoff score of -0.788 for S1 corresponded to a perfect classification (100% sensitivity and 100% specificity, p = 1.67e-22). Consistent with previous findings, electroglottographic evaluation represents an easy to implement and potentially important assessment in MS patients, achieving adequate classification accuracy. Further evaluation is needed to determine its use as a biomarker.
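
    As a worked illustration of the discriminant function quoted above, the score can be evaluated as a weighted sum. The coefficients and the cutoff are taken from the abstract; the dictionary keys, the input values, and the helper names are hypothetical. The abstract does not say whether the predictors are raw or standardized, nor which side of the cutoff corresponds to MS, so the sketch only reports whether a score falls below the cutoff.

```python
# Coefficients as quoted in the abstract; key names are hypothetical identifiers.
COEFFS = {
    "DQx1_mean_monologue": 0.728,
    "CQx_monologue": 0.325,
    "DFx1_90_range_monologue": 0.298,
    "DQx1_90_range_reading": 0.443,
    "DQx1_90_range_monologue": -1.490,
}
CUTOFF = -0.788  # cutoff score for S1 reported in the abstract


def discriminant_score(features):
    """Weighted sum of the five electroglottographic variables."""
    return sum(COEFFS[name] * features[name] for name in COEFFS)


def below_cutoff(features):
    """True when the score is below the reported cutoff (class assignment is not stated in the source)."""
    return discriminant_score(features) < CUTOFF


# Invented input: every variable set to 1.0, purely to show the arithmetic.
example = {name: 1.0 for name in COEFFS}
score = discriminant_score(example)  # 0.728 + 0.325 + 0.298 + 0.443 - 1.490
```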

  11. Comparative study on fast classification of brick samples by combination of principal component analysis and linear discriminant analysis using stand-off and table-top laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Vítková, Gabriela; Prokeš, Lubomír; Novotný, Karel; Pořízka, Pavel; Novotný, Jan; Všianský, Dalibor; Čelko, Ladislav; Kaiser, Jozef

    2014-11-01

    From a historical perspective, during archeological excavation or restoration of buildings and other structures built from bricks, it is important to determine, preferably in situ and in real time, the locality of the bricks' origin. Fast classification of bricks on the basis of Laser-Induced Breakdown Spectroscopy (LIBS) spectra is possible using multivariate statistical methods. A combination of principal component analysis (PCA) and linear discriminant analysis (LDA) was applied in this case. LIBS was used to classify a total of 29 brick samples from 7 different localities. A comparative study using two different LIBS setups, stand-off and table-top, shows that stand-off LIBS has great potential for archeological in-field measurements.

  12. Progress toward the determination of correct classification rates in fire debris analysis.

    PubMed

    Waddell, Erin E; Song, Emma T; Rinke, Caitlin N; Williams, Mary R; Sigman, Michael E

    2013-07-01

    Principal components analysis (PCA), linear discriminant analysis (LDA), and quadratic discriminant analysis (QDA) were used to develop a multistep classification procedure for determining the presence of ignitable liquid residue in fire debris and assigning any ignitable liquid residue present into the classes defined under the American Society for Testing and Materials (ASTM) E 1618-10 standard method. A multistep classification procedure was tested by cross-validation based on model data sets comprised of the time-averaged mass spectra (also referred to as total ion spectra) of commercial ignitable liquids and pyrolysis products from common building materials and household furnishings (referred to simply as substrates). Fire debris samples from laboratory-scale and field test burns were also used to test the model. The optimal model's true-positive rate was 81.3% for cross-validation samples and 70.9% for fire debris samples. The false-positive rate was 9.9% for cross-validation samples and 8.9% for fire debris samples. © 2013 American Academy of Forensic Sciences.
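
    The true-positive and false-positive rates reported above follow from confusion-matrix counts. A minimal sketch, with invented counts and a hypothetical function name (the abstract gives only the resulting percentages):

```python
def classification_rates(tp, fn, fp, tn):
    """Return (true-positive rate, false-positive rate) from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # fraction of actual positives that were detected
    fpr = fp / (fp + tn)  # fraction of actual negatives that were flagged
    return tpr, fpr


# Invented counts for illustration only.
tpr, fpr = classification_rates(tp=8, fn=2, fp=1, tn=9)
```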

  13. DISCRIMINATION OF GRANITOIDS AND MINERALIZED GRANITOIDS IN THE MIDYAN REGION, NORTHWESTERN ARABIAN SHIELD, SAUDI ARABIA, BY LANDSAT MSS DATA-ANALYSIS.

    USGS Publications Warehouse

    Davis, Philip A.; Grolier, Maurice J.

    1984-01-01

    Landsat multispectral scanner (MSS) band and band-ratio databases of two scenes covering the Midyan region of northwestern Saudi Arabia were examined quantitatively and qualitatively to determine which databases best discriminate the geologic units of this semi-arid and arid region. Unsupervised, linear-discriminant cluster-analysis was performed on these two band-ratio combinations and on the MSS bands for both scenes. The results for granitoid-rock discrimination indicated that the classification images using the MSS bands are superior to the band-ratio classification images for two reasons, discussed in the paper. Yet, the effects of topography and material type (including desert varnish) on the MSS-band data produced ambiguities in the MSS-band classification results. However, these ambiguities were clarified by using a simulated natural-color image in conjunction with the MSS-band classification image.

  14. Discriminant analysis of resting-state functional connectivity patterns on the Grassmann manifold

    NASA Astrophysics Data System (ADS)

    Fan, Yong; Liu, Yong; Jiang, Tianzi; Liu, Zhening; Hao, Yihui; Liu, Haihong

    2010-03-01

    The functional networks, extracted from fMRI images using independent component analysis, have been shown to be informative for distinguishing brain states of cognitive functions and neurological diseases. In this paper, we propose a novel algorithm for discriminant analysis of functional networks encoded by spatial independent components. The functional networks of each individual are used as bases for a linear subspace, referred to as a functional connectivity pattern, which facilitates a comprehensive characterization of temporal signals of fMRI data. The functional connectivity patterns of different individuals are analyzed on the Grassmann manifold by adopting a principal angle based subspace distance. In conjunction with a support vector machine classifier, a forward component selection technique is proposed to select independent components for constructing the most discriminative functional connectivity pattern. The discriminant analysis method has been applied to an fMRI based schizophrenia study with 31 schizophrenia patients and 31 healthy individuals. The experimental results demonstrate that the proposed method not only achieves a promising classification performance for distinguishing schizophrenia patients from healthy controls, but also identifies discriminative functional networks that are informative for schizophrenia diagnosis.
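
    A principal angle based distance between two linear subspaces can be sketched with NumPy as below. The abstract does not specify which function of the principal angles is used, so the Euclidean norm of the angle vector (the geodesic distance on the Grassmann manifold) is an assumption, as are the function names. Each input matrix is assumed to have full column rank, with its columns spanning the subspace.

```python
import numpy as np


def principal_angles(a, b):
    """Principal angles (radians) between the column spaces of a and b."""
    qa, _ = np.linalg.qr(np.asarray(a, dtype=float))  # orthonormal basis for span(a)
    qb, _ = np.linalg.qr(np.asarray(b, dtype=float))  # orthonormal basis for span(b)
    cosines = np.linalg.svd(qa.T @ qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))  # clip guards against rounding above 1


def subspace_distance(a, b):
    """Assumed distance: Euclidean norm of the principal-angle vector."""
    return float(np.linalg.norm(principal_angles(a, b)))


# Toy subspaces of R^3 (invented): the x-axis and the y-axis.
x_axis = [[1.0], [0.0], [0.0]]
y_axis = [[0.0], [1.0], [0.0]]
d_orthogonal = subspace_distance(x_axis, y_axis)
d_identical = subspace_distance(x_axis, x_axis)
```

    The resulting pairwise distances could feed a kernel for a support vector machine, which is one common way to classify points on a Grassmann manifold.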

  15. Using Discrete Loss Functions and Weighted Kappa for Classification: An Illustration Based on Bayesian Network Analysis

    ERIC Educational Resources Information Center

    Zwick, Rebecca; Lenaburg, Lubella

    2009-01-01

    In certain data analyses (e.g., multiple discriminant analysis and multinomial log-linear modeling), classification decisions are made based on the estimated posterior probabilities that individuals belong to each of several distinct categories. In the Bayesian network literature, this type of classification is often accomplished by assigning…

  16. Feature extraction with deep neural networks by a generalized discriminant analysis.

    PubMed

    Stuhlsatz, André; Lippel, Jens; Zielke, Thomas

    2012-04-01

    We present an approach to feature extraction that is a generalization of the classical linear discriminant analysis (LDA) on the basis of deep neural networks (DNNs). As for LDA, discriminative features generated from independent Gaussian class conditionals are assumed. This modeling has the advantages that the intrinsic dimensionality of the feature space is bounded by the number of classes and that the optimal discriminant function is linear. Unfortunately, linear transformations are insufficient to extract optimal discriminative features from arbitrarily distributed raw measurements. The generalized discriminant analysis (GerDA) proposed in this paper uses nonlinear transformations that are learnt by DNNs in a semisupervised fashion. We show that the feature extraction based on our approach displays excellent performance on real-world recognition and detection tasks, such as handwritten digit recognition and face detection. In a series of experiments, we evaluate GerDA features with respect to dimensionality reduction, visualization, classification, and detection. Moreover, we show that GerDA DNNs can preprocess truly high-dimensional input data to low-dimensional representations that facilitate accurate predictions even if simple linear predictors or measures of similarity are used.

  17. Classification of Fusarium-Infected Korean Hulled Barley Using Near-Infrared Reflectance Spectroscopy and Partial Least Squares Discriminant Analysis

    PubMed Central

    Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Oh, Kyoungmin; Yoo, Hyeonchae; Ham, Hyeonheui; Kim, Moon S.

    2017-01-01

    The purpose of this study is to use near-infrared reflectance (NIR) spectroscopy equipment to nondestructively and rapidly discriminate Fusarium-infected hulled barley. Both normal hulled barley and Fusarium-infected hulled barley were scanned by using a NIR spectrometer with a wavelength range of 1175 to 2170 nm. Multiple mathematical pretreatments were applied to the reflectance spectra obtained for Fusarium discrimination and the multivariate analysis method of partial least squares discriminant analysis (PLS-DA) was used for discriminant prediction. The PLS-DA prediction model developed by applying the second-order derivative pretreatment to the reflectance spectra obtained from the side of hulled barley without crease achieved 100% accuracy in discriminating the normal hulled barley and the Fusarium-infected hulled barley. These results demonstrated the feasibility of rapid discrimination of the Fusarium-infected hulled barley by combining multivariate analysis with the NIR spectroscopic technique, which is utilized as a nondestructive detection method. PMID:28974012

  18. An improved discriminative filter bank selection approach for motor imagery EEG signal classification using mutual information.

    PubMed

    Kumar, Shiu; Sharma, Alok; Tsunoda, Tatsuhiko

    2017-12-28

    Common spatial pattern (CSP) has been an effective technique for feature extraction in electroencephalography (EEG) based brain computer interfaces (BCIs). However, motor imagery EEG signal feature extraction using CSP generally depends on the selection of the frequency bands to a great extent. In this study, we propose a mutual information based frequency band selection approach. The idea of the proposed method is to utilize the information from all the available channels for effectively selecting the most discriminative filter banks. CSP features are extracted from multiple overlapping sub-bands. An additional sub-band has been introduced that covers the wide frequency band (7-30 Hz), and two different types of features are extracted using CSP and common spatio-spectral pattern techniques, respectively. Mutual information is then computed from the extracted features of each of these bands and the top filter banks are selected for further processing. Linear discriminant analysis is applied to the features extracted from each of the filter banks. The scores are fused together, and classification is done using a support vector machine. The proposed method is evaluated using BCI Competition III dataset IVa, BCI Competition IV dataset I and BCI Competition IV dataset IIb, and it outperformed all other competing methods, achieving the lowest misclassification rate and the highest kappa coefficient on all three datasets. By introducing a wide sub-band and using mutual information to select the most discriminative sub-bands, the proposed method shows improvement in motor imagery EEG signal classification.

  19. Pattern classification of fMRI data: applications for analysis of spatially distributed cortical networks.

    PubMed

    Yourganov, Grigori; Schmah, Tanya; Churchill, Nathan W; Berman, Marc G; Grady, Cheryl L; Strother, Stephen C

    2014-08-01

    The field of fMRI data analysis is rapidly growing in sophistication, particularly in the domain of multivariate pattern classification. However, the interaction between the properties of the analytical model and the parameters of the BOLD signal (e.g. signal magnitude, temporal variance and functional connectivity) is still an open problem. We addressed this problem by evaluating a set of pattern classification algorithms on simulated and experimental block-design fMRI data. The set of classifiers consisted of linear and quadratic discriminants, linear support vector machine, and linear and nonlinear Gaussian naive Bayes classifiers. For linear discriminant, we used two methods of regularization: principal component analysis, and ridge regularization. The classifiers were used (1) to classify the volumes according to the behavioral task that was performed by the subject, and (2) to construct spatial maps that indicated the relative contribution of each voxel to classification. Our evaluation metrics were: (1) accuracy of out-of-sample classification and (2) reproducibility of spatial maps. In simulated data sets, we performed an additional evaluation of spatial maps with ROC analysis. We varied the magnitude, temporal variance and connectivity of simulated fMRI signal and identified the optimal classifier for each simulated environment. Overall, the best performers were linear and quadratic discriminants (operating on principal components of the data matrix) and, in some rare situations, a nonlinear Gaussian naïve Bayes classifier. The results from the simulated data were supported by within-subject analysis of experimental fMRI data, collected in a study of aging. This is the first study that systematically characterizes interactions between analysis model and signal parameters (such as magnitude, variance and correlation) on the performance of pattern classifiers for fMRI. Copyright © 2014 Elsevier Inc. All rights reserved.

  20. Discrimination of red and white rice bran from Indonesia using HPLC fingerprint analysis combined with chemometrics.

    PubMed

    Sabir, Aryani; Rafi, Mohamad; Darusman, Latifah K

    2017-04-15

    HPLC fingerprint analysis combined with chemometrics was developed to discriminate between the red and the white rice bran grown in Indonesia. The major component in rice bran is γ-oryzanol, which consists of 4 main compounds, namely cycloartenol ferulate, cyclobranol ferulate, campesterol ferulate and β-sitosterol ferulate. Separation of these four compounds along with other compounds was performed using a C18 column and methanol-acetonitrile with a gradient elution system. By using these intensity variations, principal component and discriminant analysis were performed to discriminate the two samples. Discriminant analysis successfully discriminated the red from the white rice bran, and the predictive ability of the model showed a satisfactory classification for the test samples. The results of this study indicated that the developed method was suitable as a quality control method for rice bran in terms of identification and discrimination of the red and the white rice bran. Copyright © 2016 Elsevier Ltd. All rights reserved.

  1. Classification of Antibiotic Resistance Patterns of Indicator Bacteria by Discriminant Analysis: Use in Predicting the Source of Fecal Contamination in Subtropical Waters

    PubMed Central

    Harwood, Valerie J.; Whitlock, John; Withington, Victoria

    2000-01-01

    The antibiotic resistance patterns of fecal streptococci and fecal coliforms isolated from domestic wastewater and animal feces were determined using a battery of antibiotics (amoxicillin, ampicillin, cephalothin, chlortetracycline, oxytetracycline, tetracycline, erythromycin, streptomycin, and vancomycin) at four concentrations each. The sources of animal feces included wild birds, cattle, chickens, dogs, pigs, and raccoons. Antibiotic resistance patterns of fecal streptococci and fecal coliforms from known sources were grouped into two separate databases, and discriminant analysis of these patterns was used to establish the relationship between the antibiotic resistance patterns and the bacterial source. The fecal streptococcus and fecal coliform databases classified isolates from known sources with similar accuracies. The average rate of correct classification for the fecal streptococcus database was 62.3%, and that for the fecal coliform database was 63.9%. The sources of fecal streptococci and fecal coliforms isolated from surface waters were identified by discriminant analysis of their antibiotic resistance patterns. Both databases identified the source of indicator bacteria isolated from surface waters directly impacted by septic tank discharges as human. At sample sites selected for relatively low anthropogenic impact, the dominant sources of indicator bacteria were identified as various animals. The antibiotic resistance analysis technique promises to be a useful tool in assessing sources of fecal contamination in subtropical waters, such as those in Florida. PMID:10966379

  2. Application of otolith shape analysis for stock discrimination and species identification of five goby species (Perciformes: Gobiidae) in the northern Chinese coastal waters

    NASA Astrophysics Data System (ADS)

    Yu, Xin; Cao, Liang; Liu, Jinhu; Zhao, Bo; Shan, Xiujuan; Dou, Shuozeng

    2014-09-01

    We tested the use of otolith shape analysis to discriminate between species and stocks of five goby species (Ctenotrypauchen chinensis, Odontamblyopus lacepedii, Amblychaeturichthys hexanema, Chaeturichthys stigmatias, and Acanthogobius hasta) found in northern Chinese coastal waters. The five species were well differentiated with high overall classification success using shape indices (83.7%), elliptic Fourier coefficients (98.6%), or the combination of both methods (94.9%). However, shape analysis alone was only moderately successful at discriminating among the four stocks (Liaodong Bay, LD; Bohai Bay, BH; Huanghe (Yellow) River estuary, HRE; and Jiaozhou Bay, JZ) of A. hasta (50%-54%) and C. stigmatias (65.7%-75.8%). For these two species, shape analysis was moderately successful at discriminating the HRE or JZ stocks from other stocks, but failed to effectively identify the LD and BH stocks. A large number of otoliths were misclassified between the HRE and JZ stocks, which are geographically well separated. The classification success for stock discrimination was higher using elliptic Fourier coefficients alone (70.2%) or in combination with shape indices (75.8%) than using only shape indices (65.7%) in C. stigmatias, whereas there was little difference among the three methods for A. hasta. Our results supported the common belief that otolith shape analysis is generally more effective for interspecific identification than intraspecific discrimination. Moreover, compared with shape indices analysis, Fourier analysis improves classification success during inter- and intra-species discrimination by otolith shape analysis, although this did not necessarily always occur in all fish species.

  3. An Initial Analysis of LANDSAT-4 Thematic Mapper Data for the Discrimination of Agricultural, Forested Wetland, and Urban Land Covers

    NASA Technical Reports Server (NTRS)

    Quattrochi, D. A.

    1984-01-01

    An initial analysis of LANDSAT 4 Thematic Mapper (TM) data for the discrimination of agricultural, forested wetland, and urban land covers is conducted using a scene of data collected over Arkansas and Tennessee. A classification of agricultural lands derived from multitemporal LANDSAT Multispectral Scanner (MSS) data is compared with a classification of TM data for the same area. Results from this comparative analysis show that the multitemporal MSS classification produced an overall accuracy of 80.91% while the TM classification yields an overall classification accuracy of 97.06% correct.

  4. Robust linear discriminant analysis with distance based estimators

    NASA Astrophysics Data System (ADS)

    Lim, Yai-Fung; Yahaya, Sharipah Soaad Syed; Ali, Hazlina

    2017-11-01

    Linear discriminant analysis (LDA) is a supervised classification technique concerned with the relationship between a categorical variable and a set of continuous variables. The main objective of LDA is to create a function that distinguishes between populations and allocates future observations to previously defined populations. Under the assumptions of normality and homoscedasticity, LDA yields the optimal linear discriminant rule (LDR) between two or more groups. However, the optimality of LDA relies heavily on the sample mean and pooled sample covariance matrix, which are known to be sensitive to outliers. To alleviate this problem, a new robust LDA using distance based estimators known as minimum variance vector (MVV) has been proposed in this study. The MVV estimators were used to substitute the classical sample mean and classical sample covariance to form a robust linear discriminant rule (RLDR). Simulation and real data studies were conducted to examine the performance of the proposed RLDR, measured in terms of misclassification error rates. The computational results showed that the proposed RLDR is better than the classical LDR and comparable with the existing robust LDR.
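
    The classical two-group linear discriminant rule that this abstract takes as its baseline can be sketched with NumPy as follows. This is the textbook rule built from sample means and the pooled sample covariance with equal priors assumed; it is not the MVV-based robust rule, whose estimators the abstract does not define. Function names and the toy data are assumptions.

```python
import numpy as np


def fit_ldr(group1, group2):
    """Classical LDR: returns the direction w and the midpoint between the group means."""
    x1 = np.asarray(group1, dtype=float)
    x2 = np.asarray(group2, dtype=float)
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
    n1, n2 = len(x1), len(x2)
    # Pooled sample covariance, weighting each group by its degrees of freedom.
    pooled = ((n1 - 1) * np.cov(x1, rowvar=False)
              + (n2 - 1) * np.cov(x2, rowvar=False)) / (n1 + n2 - 2)
    w = np.linalg.solve(pooled, m1 - m2)
    return w, 0.5 * (m1 + m2)


def classify(x, w, midpoint):
    """Assign to group 1 when w'(x - midpoint) >= 0, else group 2 (equal priors assumed)."""
    return 1 if float(np.dot(w, np.asarray(x, dtype=float) - midpoint)) >= 0.0 else 2


# Toy two-dimensional data (invented for illustration).
g1 = [[0, 0], [1, 0], [0, 1], [1, 1]]
g2 = [[4, 4], [5, 4], [4, 5], [5, 5]]
w, midpoint = fit_ldr(g1, g2)
```

    A robust variant would keep `classify` unchanged and swap the mean and covariance estimates inside `fit_ldr` for outlier-resistant ones.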

  5. Classification and discrimination of pediatric patients undergoing open heart surgery with and without methylprednisolone treatment by cytomics

    NASA Astrophysics Data System (ADS)

    Bocsi, Jozsef; Mittag, Anja; Pierzchalski, Arkadiusz; Osmancik, Pavel; Dähnert, Ingo; Tárnok, Attila

    2011-02-01

    Introduction: Methylprednisolone (MP) is frequently administered preoperatively in children undergoing open heart surgery. The aim of this medication is to inhibit overshooting immune responses. Earlier studies demonstrated cellular and humoral immunological changes in pediatric patients undergoing heart surgeries with and without MP administration. Here, in a retrospective study, we investigated the modulation of the cellular immune response by MP. The aim was to identify suitable parameters characterizing MP effects by cluster analysis. Methods: Blood samples were analysed from two age-matched groups with surgical correction of septum defects. The group without MP treatment consisted of 10 patients; MP was administered to 21 patients (median dose: 11 mg/kg) before cardiopulmonary bypass (CPB). EDTA anticoagulated blood was obtained 24 h preoperatively, after anesthesia, at CPB begin and end (CPB2), 4 h, 24 h, and 48 h after surgery, at discharge and at out-patient follow-up (8.2; 3.3-12.2 months after surgery; median and IQR). Flow cytometry showed the biggest MP relevant changes at CPB2 and 4 h postoperatively. They were used for clustering analysis. Classification was made by discriminant analysis and cluster analysis by means of Genes@work software. Results & conclusion: 146 parameters were obtained from analysis. Cross-validation revealed several parameters being able to discriminate between MP groups and to identify immune modulation. MP administration resulted in a delayed activation of monocytes, an increased ratio of neutrophils, and reduced T-lymphocyte counts. Cluster analysis demonstrated that classification of patients is possible based on the identified cytomics parameters. Further investigation of these parameters might help to understand the MP effects in pediatric open heart surgery.

  6. Gaussian Discriminant Analysis for Optimal Delineation of Mild Cognitive Impairment in Alzheimer's Disease.

    PubMed

    Fang, Chen; Li, Chunfei; Cabrerizo, Mercedes; Barreto, Armando; Andrian, Jean; Rishe, Naphtali; Loewenstein, David; Duara, Ranjan; Adjouadi, Malek

    2018-04-12

    Over the past few years, several approaches have been proposed to assist in the early diagnosis of Alzheimer's disease (AD) and its prodromal stage of mild cognitive impairment (MCI). Using multimodal biomarkers for this high-dimensional classification problem, the widely used algorithms include Support Vector Machines (SVM), Sparse Representation-based classification (SRC), Deep Belief Networks (DBN) and Random Forest (RF). These widely used algorithms continue to yield unsatisfactory performance for delineating the MCI participants from the cognitively normal control (CN) group. A novel Gaussian discriminant analysis-based algorithm is thus introduced to achieve a more effective and accurate classification performance than the aforementioned state-of-the-art algorithms. This study makes use of magnetic resonance imaging (MRI) data uniquely as input to two separate high-dimensional decision spaces that reflect the structural measures of the two brain hemispheres. The data used include 190 CN, 305 MCI and 133 AD subjects as part of the AD Big Data DREAM Challenge #1. Using 80% of the data for a 10-fold cross-validation, the proposed algorithm achieved an average F1 score of 95.89% and an accuracy of 96.54% for discriminating AD from CN; and more importantly, an average F1 score of 92.08% and an accuracy of 90.26% for discriminating MCI from CN. Then, a true test was implemented on the remaining 20% held-out test data. For discriminating MCI from CN, an accuracy of 80.61%, a sensitivity of 81.97% and a specificity of 78.38% were obtained. These results show significant improvement over existing algorithms for discriminating the subtle differences between MCI participants and the CN group.

  7. Physical activity classification with dynamic discriminative methods.

    PubMed

    Ray, Evan L; Sasaki, Jeffer E; Freedson, Patty S; Staudenmayer, John

    2018-06-19

    A person's physical activity has important health implications, so it is important to be able to measure aspects of physical activity objectively. One approach to doing that is to use data from an accelerometer to classify physical activity according to activity type (e.g., lying down, sitting, standing, or walking) or intensity (e.g., sedentary, light, moderate, or vigorous). This can be formulated as a labeled classification problem, where the model relates a feature vector summarizing the accelerometer signal in a window of time to the activity type or intensity in that window. These data exhibit two key characteristics: (1) the activity classes in different time windows are not independent, and (2) the accelerometer features have moderately high dimension and follow complex distributions. Through a simulation study and applications to three datasets, we demonstrate that a model's classification performance is related to how it addresses these aspects of the data. Dynamic methods that account for temporal dependence achieve better performance than static methods that do not. Generative methods that explicitly model the distribution of the accelerometer signal features do not perform as well as methods that take a discriminative approach to establishing the relationship between the accelerometer signal and the activity class. Specifically, Conditional Random Fields consistently have better performance than commonly employed methods that ignore temporal dependence or attempt to model the accelerometer features. © 2018, The International Biometric Society.

  8. [Prognostic parameters in liver cirrhosis, varicose bleeding and sclerosing therapy. Prospective comparison of a prognostic system with the Child classification obtained by discriminant analysis].

    PubMed

    Sauerbruch, T; Ansari, H; Wotzka, R; Soehendra, N; Köpcke, W

    1988-01-08

    Prospective prognosis systems for predicting half-year death-rate after bleeding from oesophageal varices and sclerotherapy were tested on 129 patients. The receiver-operating-characteristic curves of three discriminant scores were compared with the Child-Pugh classification. It was found that the latter is still the best for prognosticating the course of the disease. A simplified discriminant score which contains as its only factors bilirubin and the Quick value does, however, give nearly as good information.

  9. Discrimination between Alzheimer's Disease and Late Onset Bipolar Disorder Using Multivariate Analysis.

    PubMed

    Besga, Ariadna; Gonzalez, Itxaso; Echeburua, Enrique; Savio, Alexandre; Ayerdi, Borja; Chyzhyk, Darya; Madrigal, Jose L M; Leza, Juan C; Graña, Manuel; Gonzalez-Pinto, Ana Maria

    2015-01-01

    Late onset bipolar disorder (LOBD) is often difficult to distinguish from degenerative dementias, such as Alzheimer disease (AD), due to comorbidities and common cognitive symptoms. Moreover, LOBD prevalence in the elderly population is not negligible and it is increasing. Both pathologies share pathophysiological neuroinflammation features. Improvements in differential diagnosis of LOBD and AD will help to select the best personalized treatment. The aim of this study is to assess the relative significance of clinical observations, neuropsychological tests, and specific blood plasma biomarkers (inflammatory and neurotrophic), separately and combined, in the differential diagnosis of LOBD versus AD. This was done by evaluating the accuracy achieved by classification-based computer-aided diagnosis (CAD) systems based on these variables. A sample of healthy controls (HC) (n = 26), AD patients (n = 37), and LOBD patients (n = 32) was recruited at the Alava University Hospital. Clinical observations, neuropsychological tests, and plasma biomarkers were measured at recruitment time. We applied multivariate machine learning classification methods to discriminate subjects from HC, AD, and LOBD populations in the study. We analyzed, for each classification contrast, feature sets combining clinical observations, neuropsychological measures, and biological markers, including inflammation biomarkers. Furthermore, we analyzed reduced feature sets containing variables with significant differences determined by a Welch's t-test. In addition, a battery of classifier architectures was applied, encompassing linear and non-linear Support Vector Machines (SVM), Random Forests (RF), and Classification and regression trees (CART), and their performance was evaluated in a leave-one-out (LOO) cross-validation scheme. Post hoc analysis of the Gini index in CART classifiers provided a measure of each variable's importance. Welch's t-test found one biomarker (Malondialdehyde) with

  10. Unsupervised Wishart Classification of Wetlands in Newfoundland, Canada Using Polsar Data Based on Fisher Linear Discriminant Analysis

    NASA Astrophysics Data System (ADS)

    Mohammadimanesh, F.; Salehi, B.; Mahdianpari, M.; Homayouni, S.

    2016-06-01

    Polarimetric Synthetic Aperture Radar (PolSAR) imagery is a complex multi-dimensional dataset, which is an important source of information for various natural resources and environmental classification and monitoring applications. PolSAR imagery produces valuable information by observing scattering mechanisms from different natural and man-made objects. Land cover mapping using PolSAR data classification is one of the most important applications of SAR remote sensing earth observations, which have gained increasing attention in the recent years. However, one of the most challenging aspects of classification is selecting features with maximum discrimination capability. To address this challenge, a statistical approach based on the Fisher Linear Discriminant Analysis (FLDA) and the incorporation of physical interpretation of PolSAR data into classification is proposed in this paper. After pre-processing of PolSAR data, including the speckle reduction, the H/α classification is used in order to classify the basic scattering mechanisms. Then, a new method for feature weighting, based on the fusion of FLDA and physical interpretation, is implemented. This method proves to increase the classification accuracy as well as increasing between-class discrimination in the final Wishart classification. The proposed method was applied to a full polarimetric C-band RADARSAT-2 data set from Avalon area, Newfoundland and Labrador, Canada. This imagery has been acquired in June 2015, and covers various types of wetlands including bogs, fens, marshes and shallow water. The results were compared with the standard Wishart classification, and an improvement of about 20% was achieved in the overall accuracy. This method provides an opportunity for operational wetland classification in northern latitude with high accuracy using only SAR polarimetric data.

  11. Kernel PLS-SVC for Linear and Nonlinear Discrimination

    NASA Technical Reports Server (NTRS)

    Rosipal, Roman; Trejo, Leonard J.; Matthews, Bryan

    2003-01-01

    A new methodology for discrimination is proposed. It is based on kernel orthonormalized partial least squares (PLS) dimensionality reduction of the original data space followed by support vector machines for classification. The close connection of orthonormalized PLS with Fisher's approach to linear discrimination, or equivalently with canonical correlation analysis, is described. This connection favors orthonormalized PLS over principal component analysis. Good behavior of the proposed method is demonstrated on 13 different benchmark data sets and on the real-world problem of classifying finger movement periods versus non-movement periods based on electroencephalograms.

  12. Ultrahigh-Dimensional Multiclass Linear Discriminant Analysis by Pairwise Sure Independence Screening

    PubMed Central

    Pan, Rui; Wang, Hansheng; Li, Runze

    2016-01-01

    This paper is concerned with the problem of feature screening for multi-class linear discriminant analysis under ultrahigh dimensional setting. We allow the number of classes to be relatively large. As a result, the total number of relevant features is larger than usual. This makes the related classification problem much more challenging than the conventional one, where the number of classes is small (very often two). To solve the problem, we propose a novel pairwise sure independence screening method for linear discriminant analysis with an ultrahigh dimensional predictor. The proposed procedure is directly applicable to the situation with many classes. We further prove that the proposed method is screening consistent. Simulation studies are conducted to assess the finite sample performance of the new procedure. We also demonstrate the proposed methodology via an empirical analysis of a real life example on handwritten Chinese character recognition. PMID:28127109

  13. Examining the Effectiveness of Discriminant Function Analysis and Cluster Analysis in Species Identification of Male Field Crickets Based on Their Calling Songs

    PubMed Central

    Jaiswara, Ranjana; Nandi, Diptarup; Balakrishnan, Rohini

    2013-01-01

    Traditional taxonomy based on morphology has often failed in accurate species identification owing to the occurrence of cryptic species, which are reproductively isolated but morphologically identical. Molecular data have thus been used to complement morphology in species identification. The sexual advertisement calls in several groups of acoustically communicating animals are species-specific and can thus complement molecular data as non-invasive tools for identification. Several statistical tools and automated identifier algorithms have been used to investigate the efficiency of acoustic signals in species identification. Despite a plethora of such methods, there is a general lack of knowledge regarding the appropriate usage of these methods in specific taxa. In this study, we investigated the performance of two commonly used statistical methods, discriminant function analysis (DFA) and cluster analysis, in identification and classification based on acoustic signals of field cricket species belonging to the subfamily Gryllinae. Using a comparative approach we evaluated the optimal number of species and calling song characteristics for both the methods that lead to most accurate classification and identification. The accuracy of classification using DFA was high and was not affected by the number of taxa used. However, a constraint in using discriminant function analysis is the need for a priori classification of songs. Accuracy of classification using cluster analysis, which does not require a priori knowledge, was maximum for 6–7 taxa and decreased significantly when more than ten taxa were analysed together. We also investigated the efficacy of two novel derived acoustic features in improving the accuracy of identification. Our results show that DFA is a reliable statistical tool for species identification using acoustic signals. Our results also show that cluster analysis of acoustic signals in crickets works effectively for species classification and

  14. Examining the effectiveness of discriminant function analysis and cluster analysis in species identification of male field crickets based on their calling songs.

    PubMed

    Jaiswara, Ranjana; Nandi, Diptarup; Balakrishnan, Rohini

    2013-01-01

    Traditional taxonomy based on morphology has often failed in accurate species identification owing to the occurrence of cryptic species, which are reproductively isolated but morphologically identical. Molecular data have thus been used to complement morphology in species identification. The sexual advertisement calls in several groups of acoustically communicating animals are species-specific and can thus complement molecular data as non-invasive tools for identification. Several statistical tools and automated identifier algorithms have been used to investigate the efficiency of acoustic signals in species identification. Despite a plethora of such methods, there is a general lack of knowledge regarding the appropriate usage of these methods in specific taxa. In this study, we investigated the performance of two commonly used statistical methods, discriminant function analysis (DFA) and cluster analysis, in identification and classification based on acoustic signals of field cricket species belonging to the subfamily Gryllinae. Using a comparative approach we evaluated the optimal number of species and calling song characteristics for both the methods that lead to most accurate classification and identification. The accuracy of classification using DFA was high and was not affected by the number of taxa used. However, a constraint in using discriminant function analysis is the need for a priori classification of songs. Accuracy of classification using cluster analysis, which does not require a priori knowledge, was maximum for 6-7 taxa and decreased significantly when more than ten taxa were analysed together. We also investigated the efficacy of two novel derived acoustic features in improving the accuracy of identification. Our results show that DFA is a reliable statistical tool for species identification using acoustic signals. Our results also show that cluster analysis of acoustic signals in crickets works effectively for species classification and

  15. Alteration mapping at Goldfield, Nevada, by cluster and discriminant analysis of LANDSAT digital data

    NASA Technical Reports Server (NTRS)

    Ballew, G.

    1977-01-01

    The ability of Landsat multispectral digital data to differentiate among 62 combinations of rock and alteration types at the Goldfield mining district of Western Nevada was investigated by using statistical techniques of cluster and discriminant analysis. Multivariate discriminant analysis was not effective in classifying each of the 62 groups, with classification results essentially the same whether data of four channels alone or combined with six ratios of channels were used. Bivariate plots of group means revealed a cluster of three groups including mill tailings, basalt and all other rock and alteration types. Automatic hierarchical clustering based on the fourth dimensional Mahalanobis distance between group means of 30 groups having five or more samples was performed. The results of the cluster analysis revealed hierarchies of mill tailings vs. natural materials, basalt vs. non-basalt, highly reflectant rocks vs. other rocks and exclusively unaltered rocks vs. predominantly altered rocks. The hierarchies were used to determine the order in which sets of multiple discriminant analyses were to be performed, and the resulting discriminant functions were used to produce a map of geology and alteration which has an overall accuracy of 70 percent for discriminating exclusively unaltered rocks from predominantly altered rocks.

  16. Discriminative Dictionary Learning With Two-Level Low Rank and Group Sparse Decomposition for Image Classification.

    PubMed

    Wen, Zaidao; Hou, Biao; Jiao, Licheng

    2017-11-01

    Discriminative dictionary learning (DDL) framework has been widely used in image classification which aims to learn some class-specific feature vectors as well as a representative dictionary according to a set of labeled training samples. However, interclass similarities and intraclass variances among input samples and learned features will generally weaken the representability of dictionary and the discrimination of feature vectors so as to degrade the classification performance. Therefore, how to explicitly represent them becomes an important issue. In this paper, we present a novel DDL framework with two-level low rank and group sparse decomposition model. In the first level, we learn a class-shared and several class-specific dictionaries, where a low rank and a group sparse regularization are, respectively, imposed on the corresponding feature matrices. In the second level, the class-specific feature matrix will be further decomposed into a low rank and a sparse matrix so that intraclass variances can be separated to concentrate the corresponding feature vectors. Extensive experimental results demonstrate the effectiveness of our model. Compared with the other state-of-the-arts on several popular image databases, our model can achieve a competitive or better performance in terms of the classification accuracy.

  17. Hierarchical Discriminant Analysis.

    PubMed

    Lu, Di; Ding, Chuntao; Xu, Jinliang; Wang, Shangguang

    2018-01-18

    The Internet of Things (IoT) generates lots of high-dimensional sensor intelligent data. The processing of high-dimensional data (e.g., data visualization and data classification) is very difficult, so it requires excellent subspace learning algorithms to learn a latent subspace to preserve the intrinsic structure of the high-dimensional data, and abandon the least useful information in the subsequent processing. In this context, many subspace learning algorithms have been presented. However, in the process of transforming the high-dimensional data into the low-dimensional space, the huge difference between the sum of inter-class distance and the sum of intra-class distance for distinct data may cause a bias problem. That means that the impact of intra-class distance is overwhelmed. To address this problem, we propose a novel algorithm called Hierarchical Discriminant Analysis (HDA). It minimizes the sum of intra-class distance first, and then maximizes the sum of inter-class distance. This proposed method balances the bias from the inter-class and that from the intra-class to achieve better performance. Extensive experiments are conducted on several benchmark face datasets. The results reveal that HDA obtains better performance than other dimensionality reduction algorithms.

  18. Evaluation of linear discriminant analysis for automated Raman histological mapping of esophageal high-grade dysplasia

    NASA Astrophysics Data System (ADS)

    Hutchings, Joanne; Kendall, Catherine; Shepherd, Neil; Barr, Hugh; Stone, Nicholas

    2010-11-01

    Rapid Raman mapping has the potential to be used for automated histopathology diagnosis, providing an adjunct technique to histology diagnosis. The aim of this work is to evaluate the feasibility of automated and objective pathology classification of Raman maps using linear discriminant analysis. Raman maps of esophageal tissue sections are acquired. Principal component (PC)-fed linear discriminant analysis (LDA) is carried out using subsets of the Raman map data (6483 spectra). An overall (validated) training classification model performance of 97.7% (sensitivity 95.0 to 100% and specificity 98.6 to 100%) is obtained. The remainder of the map spectra (131,672 spectra) are projected onto the classification model resulting in Raman images, demonstrating good correlation with contiguous hematoxylin and eosin (HE) sections. Initial results suggest that LDA has the potential to automate pathology diagnosis of esophageal Raman images, but since the classification of test spectra is forced into existing training groups, further work is required to optimize the training model. A small pixel size is advantageous for developing the training datasets using mapping data, despite lengthy mapping times, due to additional morphological information gained, and could facilitate differentiation of further tissue groups, such as the basal cells/lamina propria, in the future, but larger pixel sizes (and faster mapping) may be more feasible for clinical application.

  19. Investigating the limitations of tree species classification using the Combined Cluster and Discriminant Analysis method for low density ALS data from a dense forest region in Aggtelek (Hungary)

    NASA Astrophysics Data System (ADS)

    Koma, Zsófia; Deák, Márton; Kovács, József; Székely, Balázs; Kelemen, Kristóf; Standovár, Tibor

    2016-04-01

    Airborne Laser Scanning (ALS) is a widely used technology for forestry classification applications. However, single tree detection and species classification from a low density ALS point cloud is limited in a dense forest region. In this study we investigate the division of a forest into homogeneous groups at stand level. The study area is located in the Aggtelek karst region (Northeast Hungary) with a complex relief topography. The ALS dataset contained only 4 discrete echoes (at 2-4 pt/m2 density) from the study area during leaf-on season. Ground-truth measurements of canopy closure and proportion of tree species cover are available for every 70 meters in 500 square meter circular plots. In the first step, ALS data were processed, and geometrical and intensity based features were calculated into a 5×5 meter raster based grid. The derived features contained: basic statistics of relative height, canopy RMS, echo ratio, openness, pulse penetration ratio, and basic statistics of radiometric features. In the second step the data were investigated using Combined Cluster and Discriminant Analysis (CCDA, Kovács et al., 2014). The CCDA method first determines a basic grouping for the multiple circle shaped sampling locations using hierarchical clustering, and then for the arising grouping possibilities a core cycle is executed comparing the goodness of the investigated groupings with random ones. Out of these comparisons difference values arise, yielding information about the optimal grouping out of the investigated ones. If sub-groups are then further investigated, one might even find homogeneous groups. We found that classification of low density ALS data into homogeneous groups is highly dependent on canopy closure and the proportion of the dominant tree species. The presented results show high potential for using CCDA to determine homogeneous separable groups in LiDAR based tree species classification.

  20. Rapid discrimination of different Apiaceae species based on HPTLC fingerprints and targeted flavonoids determination using multivariate image analysis.

    PubMed

    Shawky, Eman; Abou El Kheir, Rasha M

    2018-02-11

    Species of Apiaceae are used in folk medicine as spices and in officinal medicinal preparations of drugs. They are an excellent source of phenolics exhibiting antioxidant activity, which are of great benefit to human health. Discrimination among Apiaceae medicinal herbs remains an intricate challenge due to their morphological similarity. In this study, a combined "untargeted" and "targeted" approach to investigate different Apiaceae plants species was proposed by using the merging of high-performance thin layer chromatography (HPTLC)-image analysis and pattern recognition methods which were used for fingerprinting and classification of 42 different Apiaceae samples collected from Egypt. Software for image processing was applied for fingerprinting and data acquisition. HPTLC fingerprint assisted by principal component analysis (PCA) and hierarchical cluster analysis (HCA)-heat maps resulted in a reliable untargeted approach for discrimination and classification of different samples. The "targeted" approach was performed by developing and validating an HPTLC method allowing the quantification of eight flavonoids. The combination of quantitative data with PCA and HCA-heat-maps allowed the different samples to be discriminated from each other. The use of chemometrics tools for evaluation of fingerprints reduced expense and analysis time. The proposed method can be adopted for routine discrimination and evaluation of the phytochemical variability in different Apiaceae species extracts. Copyright © 2018 John Wiley & Sons, Ltd.

  1. The contribution of cluster and discriminant analysis to the classification of complex aquifer systems.

    PubMed

    Panagopoulos, G P; Angelopoulou, D; Tzirtzilakis, E E; Giannoulopoulos, P

    2016-10-01

    This paper presents an innovative method for the discrimination of groundwater samples into common groups representing the hydrogeological units from which they have been pumped. This method proved very efficient even in areas with complex hydrogeological regimes. The proposed method requires chemical analyses of water samples only for major ions, meaning that it is applicable to most cases worldwide. Another benefit of the method is that it gives further insight into the aquifer hydrogeochemistry, as it provides the ions that are responsible for the discrimination of the group. The procedure begins with cluster analysis of the dataset in order to classify the samples in the corresponding hydrogeological unit. The feasibility of the method is shown by the fact that the samples of volcanic origin were separated into two different clusters, namely the lava units and the pyroclastic-ignimbritic aquifer. The second step is the discriminant analysis of the data, which provides the functions that distinguish the groups from each other and the most significant variables that define the hydrochemical composition of the aquifer. The whole procedure was highly successful, as 94.7% of the samples were classified to the correct aquifer system. Finally, the resulting functions can be safely used to categorize samples of either unknown or doubtful origin, thus improving the quality and the size of existing hydrochemical databases.

  2. An investigation of the use of discriminant analysis for the classification of blade edge type from cut marks made by metal and bamboo blades.

    PubMed

    Bonney, Heather

    2014-08-01

    Analysis of cut marks in bone is largely limited to two dimensional qualitative description. Development of morphological classification methods using measurements from cut mark cross sections could have multiple uses across palaeoanthropological and archaeological disciplines, where cutting edge types are used to investigate and reconstruct behavioral patterns. An experimental study was undertaken, using porcine bone, to determine the usefulness of discriminant function analysis in classifying cut marks by blade edge type, from a number of measurements taken from their cross-sectional profile. The discriminant analysis correctly classified 86.7% of the experimental cut marks into serrated, non-serrated and bamboo blade types. The technique was then used to investigate a series of cut marks of unknown origin from a collection of trophy skulls from the Torres Strait Islands, to investigate whether they were made by bamboo or metal blades. Nineteen out of twenty of the cut marks investigated were classified as bamboo which supports the non-contemporaneous ethnographic accounts of the knives used for trophy taking and defleshing remains. With further investigation across a variety of blade types, this technique could prove a valuable tool in the interpretation of cut mark evidence from a wide variety of contexts, particularly in forensic anthropology where the requirement for presentation of evidence in a statistical format is becoming increasingly important. © 2014 Wiley Periodicals, Inc.

  3. A Comparison of Two-Group Classification Methods

    ERIC Educational Resources Information Center

    Holden, Jocelyn E.; Finch, W. Holmes; Kelley, Ken

    2011-01-01

    The statistical classification of "N" individuals into "G" mutually exclusive groups when the actual group membership is unknown is common in the social and behavioral sciences. The results of such classification methods often have important consequences. Among the most common methods of statistical classification are linear discriminant analysis,…

  4. Discriminative analysis of early Alzheimer's disease based on two intrinsically anti-correlated networks with resting-state fMRI.

    PubMed

    Wang, Kun; Jiang, Tianzi; Liang, Meng; Wang, Liang; Tian, Lixia; Zhang, Xinqing; Li, Kuncheng; Liu, Zhening

    2006-01-01

    In this work, we proposed a discriminative model of Alzheimer's disease (AD) on the basis of multivariate pattern classification and functional magnetic resonance imaging (fMRI). This model used the correlation/anti-correlation coefficients of two intrinsically anti-correlated networks in resting brains, which have been suggested by two recent studies, as the feature of classification. Pseudo-Fisher Linear Discriminative Analysis (pFLDA) was then performed on the feature space and a linear classifier was generated. Using leave-one-out (LOO) cross validation, our results showed a correct classification rate of 83%. We also compared the proposed model with another one based on the whole brain functional connectivity. Our proposed model outperformed the other one significantly, and this implied that the two intrinsically anti-correlated networks may be a more susceptible part of the whole brain network in the early stage of AD.
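
    The leave-one-out (LOO) scheme mentioned in the abstract above can be sketched as follows. This is only an illustration: the nearest-class-mean classifier stands in for the pFLDA classifier (which is not reproduced here), and the feature values and labels are hypothetical, not data from the study.

```python
def nearest_mean_predict(train_x, train_y, x):
    """Assign x to the class whose training mean is closest (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [row for row, y in zip(train_x, train_y) if y == label]
        mean = [sum(col) / len(members) for col in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(x, mean))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, score the held-out prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_mean_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical two-feature data: the last "patient" sits among the controls.
features = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8],
            [0.8, 0.9], [1.0, 1.0], [0.9, 1.1], [0.25, 0.2]]
labels = ["control", "control", "control", "patient",
          "patient", "patient", "patient", "patient"]

accuracy = leave_one_out_accuracy(features, labels)
print(f"LOO correct classification rate: {accuracy:.3f}")
```

    Because every sample is scored by a model that never saw it, the resulting rate is a less optimistic estimate than accuracy measured on the training data, which matters for small samples.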

  5. Shift-invariant discrete wavelet transform analysis for retinal image classification.

    PubMed

    Khademi, April; Krishnan, Sridhar

    2007-12-01

    This work involves retinal image classification, for which a novel analysis system was developed. From the compressed domain, the proposed scheme extracts textural features from wavelet coefficients, which describe the relative homogeneity of localized areas of the retinal images. Since the discrete wavelet transform (DWT) is shift-variant, a shift-invariant DWT was explored to ensure that a robust feature set was extracted. To combat the small database size, linear discriminant analysis classification was used with the leave-one-out method. 38 normal and 48 abnormal images (exudates, large drusens, fine drusens, choroidal neovascularization, central vein and artery occlusion, histoplasmosis, arteriosclerotic retinopathy, hemi-central retinal vein occlusion and more) were used, and a specificity of 79% and sensitivity of 85.4% were achieved (the average classification rate is 82.2%). The success of the system can be attributed to the highly robust feature set, which included translation, scale and semi-rotational features. Additionally, this technique is database independent since the features were specifically tuned to the pathologies of the human eye.
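
    The figures reported above can be checked with a few lines of arithmetic. The class sizes (48 abnormal, 38 normal) come from the abstract; the counts of correctly classified images (41 and 30) are not stated there and are inferred here from the percentages, so they should be read as an assumption.

```python
# Class sizes are from the abstract; the correct counts are inferred
# from the reported percentages and are an assumption.
n_abnormal, n_normal = 48, 38
true_positives = 41   # assumed: 41/48 is about 85.4%
true_negatives = 30   # assumed: 30/38 is about 79%

sensitivity = true_positives / n_abnormal
specificity = true_negatives / n_normal

# Unweighted mean of the two per-class rates.
mean_of_rates = (sensitivity + specificity) / 2

# Fraction of all images classified correctly.
overall_accuracy = (true_positives + true_negatives) / (n_abnormal + n_normal)

print(f"sensitivity      {100 * sensitivity:.1f}%")
print(f"specificity      {100 * specificity:.1f}%")
print(f"mean of rates    {100 * mean_of_rates:.1f}%")
print(f"overall accuracy {100 * overall_accuracy:.1f}%")
```

    Under these assumed counts, the reported 82.2% matches the unweighted mean of sensitivity and specificity rather than the fraction of all 86 images classified correctly, which would be slightly higher because the abnormal class is larger.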

  6. Singularity and Nonnormality in the Classification of Compositional Data

    USGS Publications Warehouse

    Bohling, Geoffrey C.; Davis, J.C.; Olea, R.A.; Harff, Jan

    1998-01-01

    Geologists may want to classify compositional data and express the classification as a map. Regionalized classification is a tool that can be used for this purpose, but it incorporates discriminant analysis, which requires the computation and inversion of a covariance matrix. Covariance matrices of compositional data always will be singular (noninvertible) because of the unit-sum constraint. Fortunately, discriminant analyses can be calculated using a pseudo-inverse of the singular covariance matrix; this is done automatically by some statistical packages such as SAS. Granulometric data from the Darss Sill region of the Baltic Sea is used to explore how the pseudo-inversion procedure influences discriminant analysis results, comparing the algorithm used by SAS to the more conventional Moore-Penrose algorithm. Logratio transforms have been recommended to overcome problems associated with analysis of compositional data, including singularity. A regionalized classification of the Darss Sill data after logratio transformation is different only slightly from one based on raw granulometric data, suggesting that closure problems do not influence severely regionalized classification of compositional data.
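
    The singularity described above follows from the unit-sum constraint: each row of the covariance matrix sums to cov(x_i, 1) = 0, so the vector of ones lies in its null space. The sketch below illustrates this on hypothetical three-part compositions (not the Darss Sill data) and shows that an additive logratio transform, used here as one common choice of logratio transform, yields an invertible covariance matrix. It does not reproduce the pseudo-inverse algorithms compared in the paper.

```python
import math


def covariance(rows):
    """Sample covariance matrix (list of lists) of row-wise observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def alr(row):
    """Additive logratio transform: log of each part relative to the last part."""
    return [math.log(x / row[-1]) for x in row[:-1]]


# Hypothetical sand/silt/clay fractions; every row sums to 1.
samples = [
    [0.70, 0.20, 0.10],
    [0.55, 0.30, 0.15],
    [0.40, 0.35, 0.25],
    [0.25, 0.45, 0.30],
    [0.10, 0.50, 0.40],
]

cov_raw = covariance(samples)
row_sums = [sum(row) for row in cov_raw]            # each is zero up to rounding
cov_alr = covariance([alr(r) for r in samples])

print("row sums of raw covariance:", row_sums)
print("det of raw covariance:     ", det3(cov_raw))  # zero up to rounding
print("det of logratio covariance:", det2(cov_alr))  # strictly positive
```

    The logratio transform drops one dimension, which removes the exact linear dependence among the parts; a pseudo-inverse instead keeps all parts and inverts the matrix only on the subspace where it has nonzero variance.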

  7. Sex determination of the Acadian Flycatcher using discriminant analysis

    USGS Publications Warehouse

    Wilson, R.R.

    1999-01-01

    I used five morphometric variables from 114 individuals captured in Arkansas to develop a discriminant model to predict the sex of Acadian Flycatchers (Empidonax virescens). Stepwise discriminant function analyses selected wing chord and tail length as the most parsimonious subset of variables for discriminating sex. This two-variable model correctly classified 80% of females and 97% of males used to develop the model. Validation of the model using 19 individuals from Louisiana and Virginia resulted in 100% correct classification of males and females. This model provides criteria for sexing monomorphic Acadian Flycatchers during the breeding season and possibly during the winter.
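
    A two-variable discriminant model of the kind described above can be sketched with Fisher's two-group linear discriminant. The measurements below are hypothetical values in millimetres, not the study's data, and the cutoff assumes equal prior probabilities for the two sexes; the published model's coefficients are not given in the abstract and are not reproduced here.

```python
def mean_vector(rows):
    """Column-wise mean of a list of [wing_chord, tail_length] rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_covariance(group_a, group_b):
    """Pooled within-group 2x2 covariance matrix."""
    pooled = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (group_a, group_b):
        m = mean_vector(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    pooled[i][j] += d[i] * d[j]
    dof = len(group_a) + len(group_b) - 2
    return [[v / dof for v in row] for row in pooled]


def fit_discriminant(group_a, group_b):
    """Return (weights, cutoff) for Fisher's two-group linear discriminant."""
    (a, b), (c, d) = pooled_covariance(group_a, group_b)
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # weights = inverse pooled covariance times the difference of group means
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # cutoff = discriminant score at the midpoint between the group means
    mid = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]
    cutoff = w[0] * mid[0] + w[1] * mid[1]
    return w, cutoff


def classify(x, w, cutoff, label_a="male", label_b="female"):
    """Score a bird and compare against the cutoff."""
    score = w[0] * x[0] + w[1] * x[1]
    return label_a if score > cutoff else label_b


# Hypothetical (wing chord, tail length) measurements in mm.
males = [[74.0, 60.0], [75.5, 61.0], [73.5, 59.0], [76.0, 61.5], [74.5, 60.5]]
females = [[69.0, 56.0], [70.5, 57.5], [68.5, 55.5], [70.0, 56.5], [71.0, 57.0]]

w, cutoff = fit_discriminant(males, females)
print(classify([75.0, 60.5], w, cutoff))
print(classify([69.5, 56.0], w, cutoff))
```

    As in the study, a model fitted this way should be checked on birds that were not used to fit it, since the classification rate on the fitting sample tends to be optimistic.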

  8. Discrimination of irradiated MOX fuel from UOX fuel by multivariate statistical analysis of simulated activities of gamma-emitting isotopes

    NASA Astrophysics Data System (ADS)

    Åberg Lindell, M.; Andersson, P.; Grape, S.; Hellesen, C.; Håkansson, A.; Thulin, M.

    2018-03-01

    This paper investigates how concentrations of certain fission products and their related gamma-ray emissions can be used to discriminate between uranium oxide (UOX) and mixed oxide (MOX) type fuel. Discrimination of irradiated MOX fuel from irradiated UOX fuel is important in nuclear facilities and for transport of nuclear fuel, for purposes of both criticality safety and nuclear safeguards. Although facility operators keep records on the identity and properties of each fuel, tools for nuclear safeguards inspectors that enable independent verification of the fuel are critical in the recovery of continuity of knowledge, should it be lost. A discrimination methodology for classification of UOX and MOX fuel, based on passive gamma-ray spectroscopy data and multivariate analysis methods, is presented. Nuclear fuels and their gamma-ray emissions were simulated in the Monte Carlo code Serpent, and the resulting data was used as input to train seven different multivariate classification techniques. The trained classifiers were subsequently implemented and evaluated with respect to their capabilities to correctly predict the classes of unknown fuel items. The best results concerning successful discrimination of UOX and MOX-fuel were acquired when using non-linear classification techniques, such as the k nearest neighbors method and the Gaussian kernel support vector machine. For fuel with cooling times up to 20 years, when it is considered that gamma-rays from the isotope 134Cs can still be efficiently measured, success rates of 100% were obtained. A sensitivity analysis indicated that these methods were also robust.

  9. Application of texture analysis method for mammogram density classification

    NASA Astrophysics Data System (ADS)

    Nithya, R.; Santhi, B.

    2017-07-01

    Mammographic density is considered a major risk factor for developing breast cancer. This paper proposes an automated approach to classify breast tissue types in digital mammogram. The main objective of the proposed Computer-Aided Diagnosis (CAD) system is to investigate various feature extraction methods and classifiers to improve the diagnostic accuracy in mammogram density classification. Texture analysis methods are used to extract the features from the mammogram. Texture features are extracted by using histogram, Gray Level Co-Occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Difference Matrix (GLDM), Local Binary Pattern (LBP), Entropy, Discrete Wavelet Transform (DWT), Wavelet Packet Transform (WPT), Gabor transform and trace transform. These extracted features are selected using Analysis of Variance (ANOVA). The features selected by ANOVA are fed into the classifiers to characterize the mammogram into two-class (fatty/dense) and three-class (fatty/glandular/dense) breast density classification. This work has been carried out by using the mini-Mammographic Image Analysis Society (MIAS) database. Five classifiers are employed namely, Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), Naive Bayes (NB), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Experimental results show that ANN provides better performance than LDA, NB, KNN and SVM classifiers. The proposed methodology has achieved 97.5% accuracy for three-class and 99.37% for two-class density classification.

  10. Shape classification of wear particles by image boundary analysis using machine learning algorithms

    NASA Astrophysics Data System (ADS)

    Yuan, Wei; Chin, K. S.; Hua, Meng; Dong, Guangneng; Wang, Chunhui

    2016-05-01

    The shape features of wear particles generated from a wear track usually contain plenty of information about the wear state of a machine under its operating conditions. Techniques to quickly identify types of wear particles, so as to respond to the machine's operation and prolong its life, appear to be lacking and are yet to be established. To bridge rapid off-line feature recognition with on-line wear mode identification, this paper presents a new radial concave deviation (RCD) method that mainly involves the use of the particle boundary signal to analyze wear particle features. Signal output from the RCDs subsequently facilitates the determination of several other feature parameters, typically relevant to the shape and size of the wear particle. Debris feature and type are identified through the use of various classification methods, such as linear discriminant analysis, quadratic discriminant analysis, naïve Bayesian method, and classification and regression tree method (CART). The average training and test errors obtained via ten-fold cross validation suggest CART is a highly suitable approach for classifying and analyzing particle features. Furthermore, the results of the wear debris analysis enable the maintenance team to diagnose faults appropriately.

  11. Enamel surface topography analysis for diet discrimination. A methodology to enhance and select discriminative parameters

    NASA Astrophysics Data System (ADS)

    Francisco, Arthur; Blondel, Cécile; Brunetière, Noël; Ramdarshan, Anusha; Merceron, Gildas

    2018-03-01

    Tooth wear and, more specifically, dental microwear texture is a dietary proxy that has been used for years in vertebrate paleoecology and ecology. DMTA, dental microwear texture analysis, relies on a few parameters related to the surface complexity, anisotropy and heterogeneity of the enamel facets at the micrometric scale. Working with few but physically meaningful parameters helps in comparing published results and in defining levels for classification purposes. Other dental microwear approaches are based on ISO parameters and coupled with statistical tests to find the more relevant ones. The present study roughly utilizes most of the aforementioned parameters in their more or less modified form. But more than parameters, we here propose a new approach: instead of a single parameter characterizing the whole surface, we sample the surface and thus generate 9 derived parameters in order to broaden the parameter set. The identification of the most discriminative parameters is performed with an automated procedure which is an extended and refined version of the workflows encountered in some studies. The procedure in its initial form includes the most common tools, like the ANOVA and the correlation analysis, along with the required mathematical tests. The discrimination results show that a simplified form of the procedure is able to more efficiently identify the desired number of discriminative parameters. Also highlighted are some trends like the relevance of working with both height and spatial parameters, as well as the potential benefits of dimensionless surfaces. On a set of 45 surfaces issued from 45 specimens of three modern ruminants with differences in feeding preferences (grazing, leaf-browsing and fruit-eating), it is clearly shown that the level of wear discrimination is improved with the new methodology compared to the other ones.

  12. A simple randomisation procedure for validating discriminant analysis: a methodological note.

    PubMed

    Wastell, D G

    1987-04-01

    Because the goal of discriminant analysis (DA) is to optimise classification, it designedly exaggerates between-group differences. This bias complicates validation of DA. Jack-knifing has been used for validation but is inappropriate when stepwise selection (SWDA) is employed. A simple randomisation test is presented which is shown to give correct decisions for SWDA. The general superiority of randomisation tests over orthodox significance tests is discussed. Current work on non-parametric methods of estimating the error rates of prediction rules is briefly reviewed.
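
    The randomisation idea in this abstract can be sketched as a generic label-shuffling test. This is a minimal illustration, not necessarily the exact procedure of the paper: the nearest-class-mean rule is a simple stand-in for discriminant analysis, and the names `nearest_mean_accuracy` and `randomisation_p_value` as well as the toy data are hypothetical. The key point is that the whole fitting procedure is rerun on every shuffle, so any optimism built into the fit is present in the null distribution too.

```python
import random


def nearest_mean_accuracy(X, y):
    """Resubstitution accuracy of a nearest-class-mean rule (stand-in for DA)."""
    classes = sorted(set(y))
    dims = len(X[0])
    means = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        means[c] = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]

    def predict(x):
        # Assign x to the class whose mean is closest in squared distance.
        return min(classes,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, means[c])))

    return sum(predict(x) == label for x, label in zip(X, y)) / len(y)


def randomisation_p_value(X, y, statistic, n_perm=999, seed=0):
    """Compare the observed statistic with its distribution under shuffled labels."""
    rng = random.Random(seed)
    observed = statistic(X, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)              # break any link between X and y
        if statistic(X, shuffled) >= observed:
            hits += 1
    # Count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: two well-separated groups of five points each.
X = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4], [0.0, 0.3], [0.4, 0.2],
     [2.1, 2.0], [1.9, 2.3], [2.2, 1.8], [2.0, 2.1], [1.8, 1.9]]
y = [0] * 5 + [1] * 5

observed, p_value = randomisation_p_value(X, y, nearest_mean_accuracy)
print(f"observed accuracy = {observed:.2f}, p = {p_value:.3f}")
```

    If a variable-selection step is part of the analysis, it would belong inside the `statistic` function so that it is repeated for each shuffled labelling.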

  13. Hyperspectral image analysis for rapid and accurate discrimination of bacterial infections: A benchmark study.

    PubMed

    Arrigoni, Simone; Turra, Giovanni; Signoroni, Alberto

    2017-09-01

    With the rapid diffusion of Full Laboratory Automation systems, Clinical Microbiology is currently experiencing a new digital revolution. The ability to capture and process large amounts of visual data from microbiological specimen processing enables the definition of completely new objectives. These include the direct identification of pathogens growing on culturing plates, with expected improvements in rapid definition of the right treatment for patients affected by bacterial infections. In this framework, the synergies between light spectroscopy and image analysis, offered by hyperspectral imaging, are of prominent interest. This leads us to assess the feasibility of a reliable and rapid discrimination of pathogens through the classification of their spectral signatures extracted from hyperspectral image acquisitions of bacteria colonies growing on blood agar plates. We designed and implemented the whole data acquisition and processing pipeline and performed a comprehensive comparison among 40 combinations of different data preprocessing and classification techniques. High discrimination performance has been achieved also thanks to improved colony segmentation and spectral signature extraction. Experimental results reveal the high accuracy and suitability of the proposed approach, driving the selection of most suitable and scalable classification pipelines and stimulating clinical validations. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Comparative study of wine tannin classification using Fourier transform mid-infrared spectrometry and sensory analysis.

    PubMed

    Fernández, Katherina; Labarca, Ximena; Bordeu, Edmundo; Guesalaga, Andrés; Agosin, Eduardo

    2007-11-01

    Wine tannins are fundamental to the determination of wine quality. However, the chemical and sensorial analysis of these compounds is not straightforward and a simple and rapid technique is necessary. We analyzed the mid-infrared spectra of white, red, and model wines spiked with known amounts of skin or seed tannins, collected using Fourier transform mid-infrared (FT-MIR) transmission spectroscopy (400-4000 cm(-1)). The spectral data were classified according to their tannin source, skin or seed, and tannin concentration by means of discriminant analysis (DA) and soft independent modeling of class analogy (SIMCA) to obtain a probabilistic classification. Wines were also classified sensorially by a trained panel and compared with FT-MIR. SIMCA models gave the most accurate classification (over 97%) and prediction (over 60%) among the wine samples. The prediction was increased (over 73%) using the leave-one-out cross-validation technique. Sensory classification of the wines was less accurate than that obtained with FT-MIR and SIMCA. Overall, these results show the potential of FT-MIR spectroscopy, in combination with adequate statistical tools, to discriminate wines with different tannin levels.

  15. A Recurrent Probabilistic Neural Network with Dimensionality Reduction Based on Time-series Discriminant Component Analysis.

    PubMed

    Hayashi, Hideaki; Shibanoki, Taro; Shima, Keisuke; Kurita, Yuichi; Tsuji, Toshio

    2015-12-01

    This paper proposes a probabilistic neural network (NN) developed on the basis of time-series discriminant component analysis (TSDCA) that can be used to classify high-dimensional time-series patterns. TSDCA involves the compression of high-dimensional time series into a lower dimensional space using a set of orthogonal transformations and the calculation of posterior probabilities based on a continuous-density hidden Markov model with a Gaussian mixture model expressed in the reduced-dimensional space. The analysis can be incorporated into an NN, which is named a time-series discriminant component network (TSDCN), so that parameters of dimensionality reduction and classification can be obtained simultaneously as network coefficients according to a backpropagation through time-based learning algorithm with the Lagrange multiplier method. The TSDCN is considered to enable high-accuracy classification of high-dimensional time-series patterns and to reduce the computation time taken for network training. The validity of the TSDCN is demonstrated for high-dimensional artificial data and electroencephalogram signals in the experiments conducted during the study.

  16. Morphological image analysis for classification of gastrointestinal tissues using optical coherence tomography

    NASA Astrophysics Data System (ADS)

    Garcia-Allende, P. Beatriz; Amygdalos, Iakovos; Dhanapala, Hiruni; Goldin, Robert D.; Hanna, George B.; Elson, Daniel S.

    2012-01-01

    Computer-aided diagnosis of ophthalmic diseases using optical coherence tomography (OCT) relies on the extraction of thickness and size measures from the OCT images, but such defined layers are usually not observed in emerging OCT applications aimed at "optical biopsy" such as pulmonology or gastroenterology. Mathematical methods such as Principal Component Analysis (PCA) or textural analyses including both spatial textural analysis derived from the two-dimensional discrete Fourier transform (DFT) and statistical texture analysis obtained independently from center-symmetric auto-correlation (CSAC) and spatial grey-level dependency matrices (SGLDM), as well as quantitative measurements of the attenuation coefficient, have been previously proposed to overcome this problem. We recently proposed an alternative approach consisting of a region segmentation according to the intensity variation along the vertical axis and a purely statistical technique for feature quantification. OCT images were first segmented in the axial direction in an automated manner according to intensity. Afterwards, a morphological analysis of the segmented OCT images was employed for quantifying the features that served for tissue classification. In this study, a PCA processing of the extracted features is accomplished to combine their discriminative power in a lower number of dimensions. Ready discrimination of gastrointestinal surgical specimens is attained, demonstrating that the approach further surpasses the algorithms previously reported and is feasible for tissue classification in the clinical setting.

  17. Origin Discrimination of Osmanthus fragrans var. thunbergii Flowers using GC-MS and UPLC-PDA Combined with Multivariable Analysis Methods.

    PubMed

    Zhou, Fei; Zhao, Yajing; Peng, Jiyu; Jiang, Yirong; Li, Maiquan; Jiang, Yuan; Lu, Baiyi

    2017-07-01

    Osmanthus fragrans flowers are used as folk medicine and additives for teas, beverages and foods. The metabolites of O. fragrans flowers from different geographical origins were inconsistent to some extent. Chromatography and mass spectrometry combined with multivariable analysis methods provide an approach for discriminating the origin of O. fragrans flowers. The objective was to discriminate Osmanthus fragrans var. thunbergii flowers from different origins using the identified metabolites. GC-MS and UPLC-PDA were conducted to analyse the metabolites in O. fragrans var. thunbergii flowers (in total 150 samples). Principal component analysis (PCA), soft independent modelling of class analogy analysis (SIMCA) and random forest (RF) analysis were applied to group the GC-MS and UPLC-PDA data. GC-MS identified 32 compounds common to all samples while UPLC-PDA/QTOF-MS identified 16 common compounds. PCA of the UPLC-PDA data generated a better clustering than PCA of the GC-MS data. Ten metabolites (six from GC-MS and four from UPLC-PDA) were selected as effective compounds for discrimination by PCA loadings. SIMCA and RF analysis were used to build classification models, and the RF model, based on the four effective compounds (caffeic acid derivative, acteoside, ligustroside and compound 15), yielded better results with the classification rate of 100% in the calibration set and 97.8% in the prediction set. GC-MS and UPLC-PDA combined with multivariable analysis methods can discriminate the origin of Osmanthus fragrans var. thunbergii flowers. Copyright © 2017 John Wiley & Sons, Ltd.

  18. Fast classification of hazelnut cultivars through portable infrared spectroscopy and chemometrics

    NASA Astrophysics Data System (ADS)

    Manfredi, Marcello; Robotti, Elisa; Quasso, Fabio; Mazzucco, Eleonora; Calabrese, Giorgio; Marengo, Emilio

    2018-01-01

    The authentication and traceability of hazelnuts is very important for both the consumer and the food industry, to safeguard the protected varieties and the food quality. This study investigates the use of a portable FTIR spectrometer coupled to multivariate statistical analysis for the classification of raw hazelnuts. The method discriminates hazelnuts from different origins/cultivars based on differences of the signal intensities of their IR spectra. The multivariate classification methods, namely principal component analysis (PCA) followed by linear discriminant analysis (LDA) and partial least square discriminant analysis (PLS-DA), with or without variable selection, allowed a very good discrimination among the groups, with PLS-DA coupled to variable selection providing the best results. Due to the fast analysis, high sensitivity, simplicity and no sample preparation, the proposed analytical methodology could be successfully used to verify the cultivar of hazelnuts, and the analysis can be performed quickly and directly on site.

  19. Local linear discriminant analysis framework using sample neighbors.

    PubMed

    Fan, Zizhu; Xu, Yong; Zhang, David

    2011-07-01

    The linear discriminant analysis (LDA) is a very popular linear feature extraction approach. The algorithms of LDA usually perform well under the following two assumptions. The first assumption is that the global data structure is consistent with the local data structure. The second assumption is that the input data classes are Gaussian distributions. However, in real-world applications, these assumptions are not always satisfied. In this paper, we propose an improved LDA framework, the local LDA (LLDA), which can perform well without needing to satisfy the above two assumptions. Our LLDA framework can effectively capture the local structure of samples. According to different types of local data structure, our LLDA framework incorporates several different forms of linear feature extraction approaches, such as the classical LDA and principal component analysis. The proposed framework includes two LLDA algorithms: a vector-based LLDA algorithm and a matrix-based LLDA (MLLDA) algorithm. MLLDA is directly applicable to image recognition, such as face recognition. Our algorithms need to train only a small portion of the whole training set before testing a sample. They are suitable for learning large-scale databases especially when the input data dimensions are very high and can achieve high classification accuracy. Extensive experiments show that the proposed algorithms can obtain good classification results.
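
    For reference, the classical two-class LDA that this framework builds on can be sketched in a few lines. This is a minimal illustration of the standard Fisher direction w = Sw^-1 (mean_b - mean_a) for 2-D points, not the local LDA proposed in the paper; the function names and the toy data are hypothetical.

```python
def fisher_direction(class_a, class_b):
    """Two-class Fisher direction and midpoint threshold for 2-D points."""
    def mean(points):
        n = len(points)
        return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

    def scatter(points, m):
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        return sxx, sxy, syy

    ma, mb = mean(class_a), mean(class_b)
    axx, axy, ayy = scatter(class_a, ma)
    bxx, bxy, byy = scatter(class_b, mb)
    # Pooled within-class scatter matrix [[sxx, sxy], [sxy, syy]].
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    dx, dy = mb[0] - ma[0], mb[1] - ma[1]
    # Multiply the mean difference by the inverse of the 2x2 scatter matrix.
    w = [(syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det]
    # Decision threshold: projection of the midpoint between the class means.
    threshold = 0.5 * (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1]))
    return w, threshold


def classify(point, w, threshold):
    """Project the point onto w and compare with the threshold."""
    return "b" if w[0] * point[0] + w[1] * point[1] > threshold else "a"


# Hypothetical toy data.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (7.0, 6.0)]

w, threshold = fisher_direction(class_a, class_b)
print(classify((2.0, 2.0), w, threshold), classify((7.0, 6.0), w, threshold))
```

    The single pooled scatter matrix is where the global-structure assumption enters: every sample is projected with the same direction regardless of its neighborhood.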

  20. Discrimination of whisky brands and counterfeit identification by UV-Vis spectroscopy and multivariate data analysis.

    PubMed

    Martins, Angélica Rocha; Talhavini, Márcio; Vieira, Maurício Leite; Zacca, Jorge Jardim; Braga, Jez Willian Batista

    2017-08-15

    The discrimination of whisky brands and counterfeit identification were performed by UV-Vis spectroscopy combined with partial least squares for discriminant analysis (PLS-DA). In the proposed method all spectra were obtained with no sample preparation. The discrimination models were built with the employment of seven whisky brands: Red Label, Black Label, White Horse, Chivas Regal (12 years), Ballantine's Finest, Old Parr and Natu Nobilis. The method was validated with an independent test set of authentic samples belonging to the seven selected brands and another eleven brands not included in the training samples. Furthermore, seventy-three counterfeit samples were also used to validate the method. Results showed correct classification rates for genuine and false samples over 98.6% and 93.1%, respectively, indicating that the method can be helpful for the forensic analysis of whisky samples. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. Are psychological measures and actuarial data equally effective in discriminating among the prison population? Analysis by crimes

    PubMed Central

    Burneo-Garcés, Carlos; Marín-Morales, Agar; Pérez-García, Miguel

    2018-01-01

    The ability of a wide range of psychological and actuarial measures to characterize crimes in the prison population has not yet been compared in a single study. Our main objective was to determine if the discriminant capacity of psychological measures (PM) and actuarial data (AD) varies according to the crime. An Ecuadorian sample of 576 men convicted of Robbery, Murder, Rape and Drug Possession crimes was evaluated through an ad hoc questionnaire, prison files and the Spanish adaptation of the Personality Assessment Inventory. Discriminant analysis was used to establish, for each crime, the discriminant capacity and the classification accuracy of a model composed of AD (socio-demographic and judicial measures) and a second model incorporating PM. The AD showed a superior discriminant capacity, whilst the contribution of both types of measures varied according to the crime. The PM generated some increase in the correct classification percentages for Murder, Rape and Drug Possession, but their contribution was zero for the crime of Robbery. Specific profiles of each crime were obtained from the strongest significant correlations between the value of each explanatory variable and the probability of belonging to the crime. The AD model is more robust when these four crimes are characterized. The contribution of AD and PM depends on the crime, and the inclusion of PM in actuarial models moderately optimizes the classification accuracy of Murder, Rape, and Drug Possession crimes. PMID:29874264

  2. Heuristics to Facilitate Understanding of Discriminant Analysis.

    ERIC Educational Resources Information Center

    Van Epps, Pamela D.

    This paper discusses the principles underlying discriminant analysis and constructs a simulated data set to illustrate its methods. Discriminant analysis is a multivariate technique for identifying the best combination of variables to maximally discriminate between groups. Discriminant functions are established on existing groups and used to…

  3. Classification enhancement for post-stroke dementia using fuzzy neighborhood preserving analysis with QR-decomposition.

    PubMed

    Al-Qazzaz, Noor Kamal; Ali, Sawal; Ahmad, Siti Anom; Escudero, Javier

    2017-07-01

    The aim of the present study was to discriminate the electroencephalogram (EEG) of 5 patients with vascular dementia (VaD), 15 patients with stroke-related mild cognitive impairment (MCI), and 15 control normal subjects during a working memory (WM) task. We used independent component analysis (ICA) and wavelet transform (WT) as a hybrid preprocessing approach for EEG artifact removal. Three different features were extracted from the cleaned EEG signals: spectral entropy (SpecEn), permutation entropy (PerEn) and Tsallis entropy (TsEn). Two classification schemes were applied - support vector machine (SVM) and k-nearest neighbors (kNN) - with fuzzy neighborhood preserving analysis with QR-decomposition (FNPAQR) as a dimensionality reduction technique. The FNPAQR dimensionality reduction technique increased the SVM classification accuracy from 82.22% to 90.37% and from 82.6% to 86.67% for kNN. These results suggest that FNPAQR consistently improves the discrimination of VaD, MCI patients and control normal subjects and it could be a useful feature selection to help the identification of patients with VaD and MCI.

  4. Discrimination of inflammatory bowel disease using Raman spectroscopy and linear discriminant analysis methods

    NASA Astrophysics Data System (ADS)

    Ding, Hao; Cao, Ming; DuPont, Andrew W.; Scott, Larry D.; Guha, Sushovan; Singhal, Shashideep; Younes, Mamoun; Pence, Isaac; Herline, Alan; Schwartz, David; Xu, Hua; Mahadevan-Jansen, Anita; Bi, Xiaohong

    2016-03-01

    Inflammatory bowel disease (IBD) is an idiopathic disease that is typically characterized by chronic inflammation of the gastrointestinal tract. Recently much effort has been devoted to the development of novel diagnostic tools that can assist physicians for fast, accurate, and automated diagnosis of the disease. Previous research based on Raman spectroscopy has shown promising results in differentiating IBD patients from normal screening cases. In the current study, we examined IBD patients in vivo through a colonoscope-coupled Raman system. Optical diagnosis for IBD discrimination was conducted based on full-range spectra using multivariate statistical methods. Further, we incorporated several feature selection methods in machine learning into the classification model. The diagnostic performance for disease differentiation was significantly improved after feature selection. Our results showed that improved IBD diagnosis can be achieved using Raman spectroscopy in combination with multivariate analysis and feature selection.

  5. Fast discrimination of hydroxypropyl methyl cellulose using portable Raman spectrometer and multivariate methods

    NASA Astrophysics Data System (ADS)

    Song, Biao; Lu, Dan; Peng, Ming; Li, Xia; Zou, Ye; Huang, Meizhen; Lu, Feng

    2017-02-01

    Raman spectroscopy is developed as a fast and non-destructive method for the discrimination and classification of hydroxypropyl methyl cellulose (HPMC) samples. 44 E series and 41 K series of HPMC samples are measured by a self-developed portable Raman spectrometer (Hx-Raman) which is excited by a 785 nm diode laser; the spectral range is 200-2700 cm-1 with a resolution (FWHM) of 6 cm-1. Multivariate analysis is applied for discrimination of E series from K series. By methods of principal components analysis (PCA) and Fisher discriminant analysis (FDA), a discrimination result with sensitivity of 90.91% and specificity of 95.12% is achieved. The corresponding area under the receiver operating characteristic (ROC) curve is 0.99, indicating the accuracy of the predictive model. This result demonstrates the prospect of portable Raman spectrometer for rapid, non-destructive classification and discrimination of E series and K series samples of HPMC.
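
    The reported sensitivity and specificity can be checked with a short calculation. The counts below are an assumption, not taken from the abstract: they treat the E series as the positive class and suppose the percentages refer to all 85 samples, in which case 40 of 44 E-series and 39 of 41 K-series samples correctly assigned reproduce the published figures.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as fractions."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts; E series is assumed to be the positive class.
tp, fn = 40, 4   # 44 E-series samples
tn, fp = 39, 2   # 41 K-series samples

sens, spec = sensitivity_specificity(tp, fn, tn, fp)
print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")
# prints: sensitivity = 90.91%, specificity = 95.12%
```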

  6. Invariant approach to the character classification

    NASA Astrophysics Data System (ADS)

    Šariri, Kristina; Demoli, Nazif

    2008-04-01

    Image moment analysis is a very useful tool that allows image description invariant to translation, rotation, scale change and some types of image distortion. The aim of this work was the development of a simple method for fast and reliable classification of characters using Hu's and affine moment invariants. The Euclidean distance was used as the discrimination feature, with statistical parameters estimated. The method was tested on the classification of Times New Roman font letters as well as sets of handwritten characters. It is shown that using all of Hu's invariants and three affine invariants as the discrimination set improves the recognition rate by 30%.
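
    A minimal sketch of the idea follows, restricted to the first two Hu invariants for brevity (the paper uses all of Hu's invariants plus affine invariants). The function names and the tiny binary images are hypothetical; the formulas for the normalized central moments and for the two invariants are the standard ones.

```python
def hu_first_two(image):
    """Return the first two Hu invariants of a 2-D list of pixel intensities."""
    pixels = [(x, y, v) for y, row in enumerate(image)
              for x, v in enumerate(row) if v]
    m00 = sum(v for _, _, v in pixels)
    cx = sum(x * v for x, _, v in pixels) / m00   # centroid gives translation invariance
    cy = sum(y * v for _, y, v in pixels) / m00

    def eta(p, q):
        # Normalized central moment of order (p, q).
        mu = sum((x - cx) ** p * (y - cy) ** q * v for x, y, v in pixels)
        return mu / m00 ** (1 + (p + q) / 2)

    eta20, eta02, eta11 = eta(2, 0), eta(0, 2), eta(1, 1)
    return (eta20 + eta02, (eta20 - eta02) ** 2 + 4 * eta11 ** 2)


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5


# Hypothetical test shapes: an L, the same L shifted, the L rotated by 90 degrees, a bar.
letter_l = [[1, 0, 0],
            [1, 0, 0],
            [1, 1, 0]]
shifted_l = [[0, 0, 0, 0, 0],
             [0, 0, 1, 0, 0],
             [0, 0, 1, 0, 0],
             [0, 0, 1, 1, 0]]
rotated_l = [list(row) for row in zip(*letter_l[::-1])]
bar = [[1, 1, 1, 1]]

reference = hu_first_two(letter_l)
for name, img in [("shifted", shifted_l), ("rotated", rotated_l), ("bar", bar)]:
    print(name, euclidean(reference, hu_first_two(img)))
```

    The shifted and rotated copies land at essentially zero distance from the reference, while the bar does not, which is what makes a nearest-distance rule on such features usable for character classification.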

  7. Gender classification of running subjects using full-body kinematics

    NASA Astrophysics Data System (ADS)

    Williams, Christina M.; Flora, Jeffrey B.; Iftekharuddin, Khan M.

    2016-05-01

    This paper proposes novel automated gender classification of subjects while engaged in running activity. The machine learning techniques include preprocessing steps using principal component analysis followed by classification with linear discriminant analysis, and nonlinear support vector machines, and decision-stump with AdaBoost. The dataset consists of 49 subjects (25 males, 24 females, 2 trials each) all equipped with approximately 80 retroreflective markers. The trials are reflective of the subject's entire body moving unrestrained through a capture volume at a self-selected running speed, thus producing highly realistic data. The classification accuracy using leave-one-out cross validation for the 49 subjects is improved from 66.33% using linear discriminant analysis to 86.74% using the nonlinear support vector machine. Results are further improved to 87.76% by means of implementing a nonlinear decision stump with AdaBoost classifier. The experimental findings suggest that the linear classification approaches are inadequate in classifying gender for a large dataset with subjects running in a moderately uninhibited environment.
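
    Leave-one-out cross validation, as used for the accuracies above, can be sketched generically. This is an illustration only: the 1-nearest-neighbor rule is a simple stand-in for the classifiers in the paper, and the function names and toy data are hypothetical.

```python
def leave_one_out_accuracy(X, y, fit_predict):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if fit_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


def one_nn(train_X, train_y, x):
    """Predict the label of the closest training sample (squared distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[dists.index(min(dists))]


# Hypothetical toy data: two compact clusters.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
     [3.0, 3.1], [3.2, 2.9], [2.9, 3.3]]
y = ["a", "a", "a", "b", "b", "b"]

accuracy = leave_one_out_accuracy(X, y, one_nn)
print(f"leave-one-out accuracy = {accuracy:.2f}")
```

    When each subject contributes more than one trial, holding out a single trial leaves that subject's other trial in the training set; the abstract does not say how this was handled, and holding out all trials of a subject together would be the stricter variant.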

  8. Theory and analysis of statistical discriminant techniques as applied to remote sensing data

    NASA Technical Reports Server (NTRS)

    Odell, P. L.

    1973-01-01

    Classification of remote earth resources sensing data according to normed exponential density statistics is reported. The use of density models appropriate for several physical situations provides an exact solution for the probabilities of classifications associated with the Bayes discriminant procedure even when the covariance matrices are unequal.

  9. Similarity-balanced discriminant neighbor embedding and its application to cancer classification based on gene expression data.

    PubMed

    Zhang, Li; Qian, Liqiang; Ding, Chuntao; Zhou, Weida; Li, Fanzhang

    2015-09-01

    The discriminant neighborhood embedding (DNE) methods are typical graph-based methods for dimension reduction and have been successfully applied to face recognition. This paper proposes a new variant of DNE, called similarity-balanced discriminant neighborhood embedding (SBDNE), and applies it to cancer classification using gene expression data. By introducing a novel similarity function, SBDNE treats two data points in the same class differently from two data points in different classes. The homogeneous and heterogeneous neighbors are selected according to the new similarity function instead of the Euclidean distance. SBDNE constructs two adjacent graphs, a between-class adjacent graph and a within-class adjacent graph, using the new similarity function. From these two adjacent graphs, we can generate the local between-class scatter and the local within-class scatter, respectively. Thus, SBDNE can maximize the between-class scatter and simultaneously minimize the within-class scatter to find the optimal projection matrix. Experimental results on six microarray datasets show that SBDNE is a promising method for cancer classification. Copyright © 2015 Elsevier Ltd. All rights reserved.

  10. Conscientious Classification: A Data Scientist's Guide to Discrimination-Aware Classification.

    PubMed

    d'Alessandro, Brian; O'Neil, Cathy; LaGatta, Tom

    2017-06-01

    Recent research has helped to cultivate growing awareness that machine-learning systems fueled by big data can create or exacerbate troubling disparities in society. Much of this research comes from outside of the practicing data science community, leaving its members with little concrete guidance to proactively address these concerns. This article introduces issues of discrimination to the data science community on its own terms. In it, we tour the familiar data-mining process while providing a taxonomy of common practices that have the potential to produce unintended discrimination. We also survey how discrimination is commonly measured, and suggest how familiar development processes can be augmented to mitigate systems' discriminatory potential. We advocate that data scientists should be intentional about modeling and reducing discriminatory outcomes. Without doing so, their efforts will result in perpetuating any systemic discrimination that may exist, but under a misleading veil of data-driven objectivity.

  11. Orthogonal sparse linear discriminant analysis

    NASA Astrophysics Data System (ADS)

    Liu, Zhonghua; Liu, Gang; Pu, Jiexin; Wang, Xiaohong; Wang, Haijun

    2018-03-01

    Linear discriminant analysis (LDA) is a linear feature extraction approach that has received much attention. Researchers have built extensively on LDA, and many variants have been proposed. However, the inherent problems of LDA are not solved very well by these variants. The major disadvantages of the classical LDA are as follows. First, it is sensitive to outliers and noise. Second, only the global discriminant structure is preserved, while the local discriminant information is ignored. In this paper, we present a new orthogonal sparse linear discriminant analysis (OSLDA) algorithm. The k nearest neighbour graph is first constructed to preserve the local discriminant information of sample points. Then, an L2,1-norm constraint on the projection matrix is used as the loss function, which makes the proposed method robust to outliers in the data points. Extensive experiments have been performed on several standard public image databases, and the experimental results demonstrate the performance of the proposed OSLDA algorithm.

  12. Application of linear discriminant analysis and Attenuated Total Reflectance Fourier Transform Infrared microspectroscopy for diagnosis of colon cancer.

    PubMed

    Khanmohammadi, Mohammadreza; Bagheri Garmarudi, Amir; Samani, Simin; Ghasemi, Keyvan; Ashuri, Ahmad

    2011-06-01

    Attenuated Total Reflectance Fourier Transform Infrared (ATR-FTIR) microspectroscopy was applied for detection of colon cancer according to the spectral features of colon tissues. Supervised classification models can be trained to identify the tissue type based on the spectroscopic fingerprint. A total of 78 colon tissues were used in spectroscopy studies. Major spectral differences were observed in the 1,740-900 cm(-1) spectral region. Several chemometric methods such as analysis of variance (ANOVA), cluster analysis (CA) and linear discriminant analysis (LDA) were applied for classification of IR spectra. Utilizing the chemometric techniques, clear and reproducible differences were observed between the spectra of normal and cancer cases, suggesting that infrared microspectroscopy in conjunction with spectral data processing would be useful for diagnostic classification. Using the LDA technique, the spectra were classified into cancer and normal tissue classes with an accuracy of 95.8%. The sensitivity and specificity were 100% and 93.1%, respectively.

  13. Discriminant analysis in wildlife research: Theory and applications

    USGS Publications Warehouse

    Williams, B.K.; Capen, D.E.

    1981-01-01

    Discriminant analysis, a method of analyzing grouped multivariate data, is often used in ecological investigations. It has both a predictive and an explanatory function, the former aiming at classification of individuals of unknown group membership. The goal of the latter function is to exhibit group separation by means of linear transforms, and the corresponding method is called canonical analysis. This discussion focuses on the application of canonical analysis in ecology. In order to clarify its meaning, a parametric approach is taken instead of the usual data-based formulation. For certain assumptions the data-based canonical variates are shown to result from maximum likelihood estimation, thus insuring consistency and asymptotic efficiency. The distorting effects of covariance heterogeneity are examined, as are certain difficulties which arise in interpreting the canonical functions. A 'distortion metric' is defined, by means of which distortions resulting from the canonical transformation can be assessed. Several sampling problems which arise in ecological applications are considered. It is concluded that the method may prove valuable for data exploration, but is of limited value as an inferential procedure.

  14. Comprehensive Chemical Fingerprinting of High-Quality Cocoa at Early Stages of Processing: Effectiveness of Combined Untargeted and Targeted Approaches for Classification and Discrimination.

    PubMed

    Magagna, Federico; Guglielmetti, Alessandro; Liberto, Erica; Reichenbach, Stephen E; Allegrucci, Elena; Gobino, Guido; Bicchi, Carlo; Cordero, Chiara

    2017-08-02

    This study investigates chemical information of volatile fractions of high-quality cocoa (Theobroma cacao L. Malvaceae) from different origins (Mexico, Ecuador, Venezuela, Colombia, Java, Trinidad, and São Tomé) produced for fine chocolate. This study explores the evolution of the entire pattern of volatiles in relation to cocoa processing (raw, roasted, steamed, and ground beans). Advanced chemical fingerprinting (e.g., combined untargeted and targeted fingerprinting) with comprehensive two-dimensional gas chromatography coupled with mass spectrometry allows advanced pattern recognition for classification, discrimination, and sensory-quality characterization. The entire data set is analyzed for 595 reliable two-dimensional peak regions, including 130 known analytes and 13 potent odorants. Multivariate analysis with unsupervised exploration (principal component analysis) and simple supervised discrimination methods (Fisher ratios and linear regression trees) reveal informative patterns of similarities and differences and identify characteristic compounds related to sample origin and manufacturing step.

  15. Legitimating Racial Discrimination: Emotions, Not Beliefs, Best Predict Discrimination in a Meta-Analysis

    PubMed Central

    Talaska, Cara A.; Chaiken, Shelly

    2013-01-01

    Investigations of racial bias have emphasized stereotypes and other beliefs as central explanatory mechanisms and as legitimating discrimination. In recent theory and research, emotional prejudices have emerged as another, more direct predictor of discrimination. A new comprehensive meta-analysis of 57 racial attitude-discrimination studies finds a moderate relationship between overall attitudes and discrimination. Emotional prejudices are twice as closely related to racial discrimination as stereotypes and beliefs are. Moreover, emotional prejudices are closely related to both observed and self-reported discrimination, whereas stereotypes and beliefs are related only to self-reported discrimination. Implications for justifying discrimination are discussed. PMID:24052687

  16. [Study on the genuineness and producing area of Panax notoginseng based on infrared spectroscopy combined with discriminant analysis].

    PubMed

    Liu, Fei; Wang, Yuan-zhong; Yang, Chun-yan; Jin, Hang

    2015-01-01

    The genuineness and producing area of Panax notoginseng were studied using infrared spectroscopy combined with discriminant analysis. The infrared spectra of 136 taproots of P. notoginseng from 13 planting points in 11 counties were collected, and the second derivative spectra were calculated with Omnic 8.0 software. The infrared spectra and their second derivative spectra in the range 1800-700 cm-1 were used to build a model by stepwise discriminant analysis in order to assess the genuineness of P. notoginseng. The model built on the second derivative spectra showed the better recognition of genuineness: the correct rate of returned classification reached 100%, and the prediction accuracy was 93.4%. The stability of the model was tested by cross validation, and the method was also checked by extrapolation validation. The second derivative spectra combined with the same discriminant analysis method were then used to distinguish the producing area of P. notoginseng. Models built on different spectral ranges and different numbers of samples were compared. Recognition was better when the model was built with 8 samples from each planting point as the training set and the spectrum in the range 1500-1200 cm-1, with the correct rate of returned classification reaching 99.0% and a prediction accuracy of 76.5%. The results indicated that infrared spectroscopy combined with discriminant analysis recognized the genuineness of P. notoginseng well and might be a promising new method for identifying genuine P. notoginseng in practice. The method could recognize the producing area of P. notoginseng to some extent and could offer a new approach for identifying the producing area of P. notoginseng.

  17. Detection of non-milk fat in milk fat by gas chromatography and linear discriminant analysis.

    PubMed

    Gutiérrez, R; Vega, S; Díaz, G; Sánchez, J; Coronado, M; Ramírez, A; Pérez, J; González, M; Schettino, B

    2009-05-01

    Gas chromatography was utilized to determine triacylglycerol profiles in milk and non-milk fat. The values of triacylglycerol were subjected to linear discriminant analysis to detect and quantify non-milk fat in milk fat. Two groups of milk fat were analyzed: A) raw milk fat from the central region of Mexico (n = 216) and B) ultrapasteurized milk fat from 3 industries (n = 36), as well as pork lard (n = 2), bovine tallow (n = 2), fish oil (n = 2), peanut (n = 2), corn (n = 2), olive (n = 2), and soy (n = 2). The samples of raw milk fat were adulterated with non-milk fats in proportions of 0, 5, 10, 15, and 20% to form 5 groups. The first function obtained from the linear discriminant analysis allowed the correct classification of 94.4% of the samples with levels <10% of adulteration. The triacylglycerol values of the ultrapasteurized milk fats were evaluated with the discriminant function, demonstrating that one industry added non-milk fat to its product in 80% of the samples analyzed.

  18. Automatic analysis and classification of surface electromyography.

    PubMed

    Abou-Chadi, F E; Nashar, A; Saad, M

    2001-01-01

    In this paper, parametric modeling of surface electromyography (EMG) algorithms that facilitates automatic SEMG feature extraction and artificial neural networks (ANN) are combined for providing an integrated system for the automatic analysis and diagnosis of myopathic disorders. Three paradigms of ANN were investigated: the multilayer backpropagation algorithm, the self-organizing feature map algorithm and a probabilistic neural network model. The performance of the three classifiers was compared with that of the old Fisher linear discriminant (FLD) classifiers. The results have shown that the three ANN models give higher performance. The percentage of correct classification reaches 90%. Poorer diagnostic performance was obtained from the FLD classifier. The system presented here indicates that surface EMG, when properly processed, can be used to provide the physician with a diagnostic assist device.

  19. Discrimination of lymphoma using laser-induced breakdown spectroscopy conducted on whole blood samples

    PubMed Central

    Chen, Xue; Li, Xiaohui; Yang, Sibo; Yu, Xin; Liu, Aichun

    2018-01-01

    Lymphoma is a significant cancer that affects the human lymphatic and hematopoietic systems. In this work, discrimination of lymphoma using laser-induced breakdown spectroscopy (LIBS) conducted on whole blood samples is presented. The whole blood samples collected from lymphoma patients and healthy controls are deposited onto standard quantitative filter papers and ablated with a 1064 nm Q-switched Nd:YAG laser. 16 atomic and ionic emission lines of calcium (Ca), iron (Fe), magnesium (Mg), potassium (K) and sodium (Na) are selected to discriminate the cancer disease. Chemometric methods, including principal component analysis (PCA), linear discriminant analysis (LDA) classification, and k nearest neighbor (kNN) classification are used to build the discrimination models. Both LDA and kNN models have achieved very good discrimination performances for lymphoma, with an accuracy of over 99.7%, a sensitivity of over 0.996, and a specificity of over 0.997. These results demonstrate that the whole-blood-based LIBS technique in combination with chemometric methods can serve as a fast, less invasive, and accurate method for detection and discrimination of human malignancies. PMID:29541503

  20. A latent discriminative model-based approach for classification of imaginary motor tasks from EEG data.

    PubMed

    Saa, Jaime F Delgado; Çetin, Müjdat

    2012-04-01

    We consider the problem of classification of imaginary motor tasks from electroencephalography (EEG) data for brain-computer interfaces (BCIs) and propose a new approach based on hidden conditional random fields (HCRFs). HCRFs are discriminative graphical models that are attractive for this problem because they (1) exploit the temporal structure of EEG; (2) include latent variables that can be used to model different brain states in the signal; and (3) involve learned statistical models matched to the classification task, avoiding some of the limitations of generative models. Our approach involves spatial filtering of the EEG signals and estimation of power spectra based on autoregressive modeling of temporal segments of the EEG signals. Given this time-frequency representation, we select certain frequency bands that are known to be associated with execution of motor tasks. These selected features constitute the data that are fed to the HCRF, parameters of which are learned from training data. Inference algorithms on the HCRFs are used for the classification of motor tasks. We experimentally compare this approach to the best performing methods in BCI competition IV as well as a number of more recent methods and observe that our proposed method yields better classification accuracy.

  1. A multiple maximum scatter difference discriminant criterion for facial feature extraction.

    PubMed

    Song, Fengxi; Zhang, David; Mei, Dayong; Guo, Zhongwei

    2007-12-01

    Maximum scatter difference (MSD) discriminant criterion was a recently presented binary discriminant criterion for pattern classification that utilizes the generalized scatter difference rather than the generalized Rayleigh quotient as a class separability measure, thereby avoiding the singularity problem when addressing small-sample-size problems. MSD classifiers based on this criterion have been quite effective on face-recognition tasks, but as they are binary classifiers, they are not as efficient on large-scale classification tasks. To address the problem, this paper generalizes the classification-oriented binary criterion to its multiple counterpart--multiple MSD (MMSD) discriminant criterion for facial feature extraction. The MMSD feature-extraction method, which is based on this novel discriminant criterion, is a new subspace-based feature-extraction method. Unlike most other subspace-based feature-extraction methods, the MMSD computes its discriminant vectors from both the range of the between-class scatter matrix and the null space of the within-class scatter matrix. The MMSD is theoretically elegant and easy to calculate. Extensive experimental studies conducted on the benchmark database, FERET, show that the MMSD out-performs state-of-the-art facial feature-extraction methods such as null space method, direct linear discriminant analysis (LDA), eigenface, Fisherface, and complete LDA.
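    The contrast between the two class separability measures named in this abstract can be written out as a short sketch. The symbols below are assumptions for illustration and are not taken from the record: S_b and S_w denote the between-class and within-class scatter matrices, w a unit-length projection vector, and c a non-negative balancing constant.

```latex
J_{\mathrm{R}}(w) = \frac{w^{\top} S_b\, w}{w^{\top} S_w\, w},
\qquad
J_{\mathrm{MSD}}(w) = w^{\top} \left( S_b - c\, S_w \right) w,
\quad \lVert w \rVert = 1,\ c \ge 0 .
```

    Under these assumptions, the Rayleigh quotient is undefined for any w in the null space of S_w, which is non-trivial whenever there are fewer training samples than features. The difference form needs no inverse of S_w, which is how a criterion of this kind would avoid the singularity problem.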

  2. Discrimination of genetically modified sugar beets based on terahertz spectroscopy

    NASA Astrophysics Data System (ADS)

    Chen, Tao; Li, Zhi; Yin, Xianhua; Hu, Fangrong; Hu, Cong

    2016-01-01

    The objective of this paper was to apply terahertz (THz) spectroscopy combined with chemometrics techniques for discrimination of genetically modified (GM) and non-GM sugar beets. In this paper, the THz spectra of 84 sugar beet samples (36 GM sugar beets and 48 non-GM ones) were obtained by using terahertz time-domain spectroscopy (THz-TDS) system in the frequency range from 0.2 to 1.2 THz. Three chemometrics methods, principal component analysis (PCA), discriminant analysis (DA) and discriminant partial least squares (DPLS), were employed to classify sugar beet samples into two groups: genetically modified organisms (GMOs) and non-GMOs. The DPLS method yielded the best classification result, and the percentages of successful classification for GM and non-GM sugar beets were both 100%. Results of the present study demonstrate the usefulness of THz spectroscopy together with chemometrics methods as a powerful tool to distinguish GM and non-GM sugar beets.

  3. Identification of wheat varieties with a parallel-plate capacitance sensor using fisher linear discriminant analysis

    USDA-ARS's Scientific Manuscript database

    Fisher’s linear discriminant (FLD) models for wheat variety classification were developed and validated. The inputs to the FLD models were the capacitance (C), impedance (Z), and phase angle ('), measured at two frequencies. Classification of wheat varieties was obtained as output of the FLD mod...

  4. Linear discriminant analysis based on L1-norm maximization.

    PubMed

    Zhong, Fujin; Zhang, Jiashu

    2013-08-01

    Linear discriminant analysis (LDA) is a well-known dimensionality reduction technique, which is widely used for many purposes. However, conventional LDA is sensitive to outliers because its objective function is based on the distance criterion using L2-norm. This paper proposes a simple but effective robust LDA version based on L1-norm maximization, which learns a set of local optimal projection vectors by maximizing the ratio of the L1-norm-based between-class dispersion and the L1-norm-based within-class dispersion. The proposed method is theoretically proved to be feasible and robust to outliers while overcoming the singular problem of the within-class scatter matrix for conventional LDA. Experiments on artificial datasets, standard classification datasets and three popular image databases demonstrate the efficacy of the proposed method.
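    The ratio described in this abstract can be sketched next to the conventional criterion. The notation is an assumption for illustration and may differ from the paper's: there are c classes, class i has N_i samples, index set C_i and mean m_i, m is the overall mean, and w is a projection vector.

```latex
\text{L2 (conventional):}\quad
\max_{w}\ \frac{\sum_{i=1}^{c} N_i \left( w^{\top} (m_i - m) \right)^{2}}
               {\sum_{i=1}^{c} \sum_{j \in C_i} \left( w^{\top} (x_j - m_i) \right)^{2}}
\qquad
\text{L1:}\quad
\max_{w}\ \frac{\sum_{i=1}^{c} N_i \left| w^{\top} (m_i - m) \right|}
               {\sum_{i=1}^{c} \sum_{j \in C_i} \left| w^{\top} (x_j - m_i) \right|}
```

    Because each deviation enters through its absolute value rather than its square, a single distant sample contributes linearly instead of quadratically, which is the usual intuition for the robustness to outliers claimed above.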

  5. Classification of narcotics in solid mixtures using principal component analysis and Raman spectroscopy.

    PubMed

    Ryder, Alan G

    2002-03-01

    Eighty-five solid samples consisting of illegal narcotics diluted with several different materials were analyzed by near-infrared (785 nm excitation) Raman spectroscopy. Principal Component Analysis (PCA) was employed to classify the samples according to narcotic type. The best sample discrimination was obtained by using the first derivative of the Raman spectra. Furthermore, restricting the spectral variables for PCA to 2 or 3% of the original spectral data according to the most intense peaks in the Raman spectrum of the pure narcotic resulted in a rapid discrimination method for classifying samples according to narcotic type. This method allows for the easy discrimination between cocaine, heroin, and MDMA mixtures even when the Raman spectra are complex or very similar. This approach of restricting the spectral variables also decreases the computational time by a factor of 30 (compared to the complete spectrum), making the methodology attractive for rapid automatic classification and identification of suspect materials.

  6. An initial analysis of LANDSAT 4 Thematic Mapper data for the classification of agricultural, forested wetland, and urban land covers

    NASA Technical Reports Server (NTRS)

    Quattrochi, D. A.; Anderson, J. E.; Brannon, D. P.; Hill, C. L.

    1982-01-01

    An initial analysis of LANDSAT 4 thematic mapper (TM) data for the delineation and classification of agricultural, forested wetland, and urban land covers was conducted. A study area in Poinsett County, Arkansas was used to evaluate a classification of agricultural lands derived from multitemporal LANDSAT multispectral scanner (MSS) data in comparison with a classification of TM data for the same area. Data over Reelfoot Lake in northwestern Tennessee were utilized to evaluate the TM for delineating forested wetland species. A classification of the study area was assessed for accuracy in discriminating five forested wetland categories. Finally, the TM data were used to identify urban features within a small city. A computer generated classification of Union City, Tennessee was analyzed for accuracy in delineating urban land covers. An evaluation of digitally enhanced TM data using principal components analysis to facilitate photointerpretation of urban features was also performed.

  7. Discriminative spatial-frequency-temporal feature extraction and classification of motor imagery EEG: A sparse regression and Weighted Naïve Bayesian Classifier-based approach.

    PubMed

    Miao, Minmin; Zeng, Hong; Wang, Aimin; Zhao, Changsen; Liu, Feixiang

    2017-02-15

    Common spatial pattern (CSP) is most widely used in motor imagery based brain-computer interface (BCI) systems. In the conventional CSP algorithm, pairs of the eigenvectors corresponding to both extreme eigenvalues are selected to construct the optimal spatial filter. In addition, an appropriate selection of subject-specific time segments and frequency bands plays an important role in its successful application. This study proposes to optimize spatial-frequency-temporal patterns for discriminative feature extraction. Spatial optimization is implemented by channel selection and finding discriminative spatial filters adaptively on each time-frequency segment. A novel Discernibility of Feature Sets (DFS) criterion is designed for spatial filter optimization. Besides, discriminative features located in multiple time-frequency segments are selected automatically by the proposed sparse time-frequency segment common spatial pattern (STFSCSP) method, which exploits sparse regression for significant feature selection. Finally, a weight determined by the sparse coefficient is assigned to each selected CSP feature and we propose a Weighted Naïve Bayesian Classifier (WNBC) for classification. Experimental results on two public EEG datasets demonstrate that optimizing spatial-frequency-temporal patterns in a data-driven manner for discriminative feature extraction greatly improves the classification performance. The proposed method gives significantly better classification accuracies in comparison with several competing methods in the literature. The proposed approach is a promising candidate for future BCI systems.

  8. Signal peptide discrimination and cleavage site identification using SVM and NN.

    PubMed

    Kazemian, H B; Yusuf, S A; White, K

    2014-02-01

    About 15% of all proteins in a genome contain a signal peptide (SP) sequence, at the N-terminus, that targets the protein to intracellular secretory pathways. Once the protein is targeted correctly in the cell, the SP is cleaved, releasing the mature protein. Accurate prediction of the presence of these short amino-acid SP chains is crucial for modelling the topology of membrane proteins, since SP sequences can be confused with transmembrane domains due to similar composition of hydrophobic amino acids. This paper presents a cascaded Support Vector Machine (SVM)-Neural Network (NN) classification methodology for SP discrimination and cleavage site identification. The proposed method utilises a dual phase classification approach using SVM as a primary classifier to discriminate SP sequences from Non-SP. The methodology further employs NNs to predict the most suitable cleavage site candidates. In phase one, a SVM classification utilises hydrophobic propensities as a primary feature vector extraction using symmetric sliding window amino-acid sequence analysis for discrimination of SP and Non-SP. In phase two, a NN classification uses asymmetric sliding window sequence analysis for prediction of cleavage site identification. The proposed SVM-NN method was tested using Uni-Prot non-redundant datasets of eukaryotic and prokaryotic proteins with SP and Non-SP N-termini. Computer simulation results demonstrate an overall accuracy of 0.90 for SP and Non-SP discrimination based on Matthews Correlation Coefficient (MCC) tests using SVM. For SP cleavage site prediction, the overall accuracy is 91.5% based on cross-validation tests using the novel SVM-NN model.
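    For readers unfamiliar with the Matthews Correlation Coefficient mentioned in this abstract, a minimal sketch of its standard two-class definition follows. The function name and the convention of returning 0.0 when the denominator vanishes are choices made for this illustration, and the counts in the usage lines are made up rather than taken from the paper.

```python
import math

def matthews_corrcoef(tp, fp, tn, fn):
    """MCC from the four cells of a binary confusion matrix.

    Returns a value in [-1, 1]; 0.0 is returned when any marginal
    total is zero (a common convention, assumed here).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0

# Illustrative counts only (hypothetical, not from the study).
perfect = matthews_corrcoef(tp=50, fp=0, tn=50, fn=0)
chance = matthews_corrcoef(tp=25, fp=25, tn=25, fn=25)
print(perfect, chance)
```

    Unlike plain accuracy, the coefficient uses all four cells, so it stays informative when SP and Non-SP sequences are present in unequal numbers.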

  9. A Feature Selection Method Based on Fisher's Discriminant Ratio for Text Sentiment Classification

    NASA Astrophysics Data System (ADS)

    Wang, Suge; Li, Deyu; Wei, Yingjie; Li, Hongxia

    With the rapid growth of e-commerce, product reviews on the Web have become an important information source for customers' decision making when they intend to buy some product. As the reviews are often too many for customers to go through, how to automatically classify them into different sentiment orientation categories (i.e. positive/negative) has become a research problem. In this paper, based on Fisher's discriminant ratio, an effective feature selection method is proposed for product review text sentiment classification. In order to validate the validity of the proposed method, we compared it with other methods respectively based on information gain and mutual information while support vector machine is adopted as the classifier. In this paper, 6 subexperiments are conducted by combining different feature selection methods with 2 kinds of candidate feature sets. Under 1006 review documents of cars, the experimental results indicate that the Fisher's discriminant ratio based on word frequency estimation has the best performance with F value 83.3% while the candidate features are the words which appear in both positive and negative texts.

  10. Otolith shape analysis for stock discrimination of two Collichthys genus croaker (Pisces: Sciaenidae) from the northern Chinese coast

    NASA Astrophysics Data System (ADS)

    Zhao, Bo; Liu, Jinhu; Song, Junjie; Cao, Liang; Dou, Shuozeng

    2017-08-01

    The otolith morphology of two croaker species (Collichthys lucidus and Collichthys niveatus) from three areas (Liaodong Bay, LD; Huanghe (Yellow) River estuary, HRE; Jiaozhou Bay, JZ) along the northern Chinese coast was investigated for species identification and stock discrimination. The otolith contour shapes described by elliptic Fourier coefficients (EFC) were analysed using principal components analysis (PCA) and stepwise canonical discriminant analysis (CDA) to identify species and stocks. The two species were well differentiated, with an overall classification success rate of 97.8%. Variations in the otolith shapes were also significant enough to discriminate among the three geographical samples of C. lucidus (67.7%) or C. niveatus (65.2%). Relatively high mis-assignment occurred between the geographically adjacent LD and HRE samples, which implied that individual mixing may exist between the two samples. This study yielded information complementary to that derived from genetic studies and provided information for assessing the stock structure of C. lucidus and C. niveatus in the Bohai Sea and the Yellow Sea.

  11. Study on nondestructive discrimination of genuine and counterfeit wild ginsengs using NIRS

    NASA Astrophysics Data System (ADS)

    Lu, Q.; Fan, Y.; Peng, Z.; Ding, H.; Gao, H.

    2012-07-01

    A new approach for the nondestructive discrimination between genuine wild ginsengs and counterfeit ones by near infrared spectroscopy (NIRS) was developed. Both discriminant analysis and back propagation artificial neural network (BP-ANN) were applied to the model establishment for discrimination. Optimal modeling wavelengths were determined based on the anomalous spectral information of counterfeit samples. Through principal component analysis (PCA) of various wild ginseng samples, genuine and counterfeit, the cumulative percentages of variance of the principal components were obtained, serving as a reference for principal component (PC) factor determination. Discriminant analysis achieved an identification ratio of 88.46%. With the samples' truth values as its outputs, a three-layer BP-ANN model was built, which yielded a higher discrimination accuracy of 100%. The overall results sufficiently demonstrate that NIRS combined with the BP-ANN classification algorithm performs better on ginseng discrimination than discriminant analysis, and can be used as a rapid and nondestructive method for the detection of counterfeit wild ginsengs in the food and pharmaceutical industries.

  12. Rare earth elements minimal harvest year variation facilitates robust geographical origin discrimination: The case of PDO "Fava Santorinis".

    PubMed

    Drivelos, Spiros A; Danezis, Georgios P; Haroutounian, Serkos A; Georgiou, Constantinos A

    2016-12-15

    This study examines the trace and rare earth elemental (REE) fingerprint variations of PDO (Protected Designation of Origin) "Fava Santorinis" over three consecutive harvesting years (2011-2013). Classification of samples in harvesting years was studied by performing discriminant analysis (DA), k nearest neighbours (κ-NN), partial least squares (PLS) analysis and probabilistic neural networks (PNN) using rare earth elements and trace metals determined using ICP-MS. DA performed better than κ-NN, producing 100% discrimination using trace elements and 79% using REEs. PLS was found to be superior to PNN, achieving 99% and 90% classification for trace and REEs, respectively, while PNN achieved 96% and 71% classification for trace and REEs, respectively. The information obtained using REEs did not enhance classification, indicating that REEs vary minimally per harvesting year, providing robust geographical origin discrimination. The results show that seasonal patterns can occur in the elemental composition of "Fava Santorinis", probably reflecting seasonality of climate.

  13. Physicochemical properties of honey from Marche, Central Italy: classification of unifloral and multifloral honeys by multivariate analysis.

    PubMed

    Truzzi, Cristina; Illuminati, Silvia; Annibaldia, Anna; Finale, Carolina; Rossetti, Monica; Scarponi, Giuseppe

    2014-11-01

    The purpose of this study was the physicochemical characterization and classification of Italian honey from Marche Region with a chemometric approach. A total of 135 honeys of different botanical origins [acacia (Robinia pseudoacacia L.), chestnut (Castanea sativa), coriander (Coriandrum sativum L.), lime (Tilia spp.), sunflower (Helianthus annuus L.), Metcalfa honeydew and multifloral honey] were considered. The average results of electrical conductivity (0.14-1.45 mS cm(-1)), pH (3.89-5.42), free acidity (10.9-39.0 meq(NaOH) kg(-1)), lactones (2.4-4.5 meq(NaOH) kg(-1)), total acidity (14.5-40.9 meq(NaOH) kg(-1)), proline (229-665 mg kg(-1)) and 5-(hydroxy-methyl)-2-furaldehyde (0.6-3.9 mg kg(-1)) content show wide variability among the analysed honey types, with statistically significant differences between the different honey types. Pattern recognition methods such as principal component analysis and discriminant analysis were performed in order to find a relationship between variables and types of honey and to classify honey on the basis of its physicochemical properties. The variables of electrical conductivity, acidity (free, lactones), pH and proline content exhibited higher discriminant power and provided enough information for the classification and distinction of unifloral honey types, but not for the classification of multifloral honey (100% and 85% of samples correctly classified, respectively).

  14. Rapid discrimination of sea buckthorn berries from different H. rhamnoides subspecies by multi-step IR spectroscopy coupled with multivariate data analysis

    NASA Astrophysics Data System (ADS)

    Liu, Yue; Zhang, Ying; Zhang, Jing; Fan, Gang; Tu, Ya; Sun, Suqin; Shen, Xudong; Li, Qingzhu; Zhang, Yi

    2018-03-01

    As an important ethnic medicine, sea buckthorn is widely used to prevent and treat various diseases due to its nutritional and medicinal properties. According to the Chinese Pharmacopoeia, sea buckthorn originates from H. rhamnoides, which includes five subspecies distributed in China. Confusion and misidentification often occur due to their similar morphology, especially in dried and powdered forms. Additionally, these five subspecies differ substantially in quality and physiological efficacy. This paper focused on a quick classification and identification method for sea buckthorn berry powders from five H. rhamnoides subspecies using multi-step IR spectroscopy coupled with multivariate data analysis. The holistic chemical compositions revealed by the FT-IR spectra demonstrated that flavonoids, fatty acids and sugars were the main chemical components. Further, the differences in FT-IR spectra regarding their peaks, positions and intensities were used to identify H. rhamnoides subspecies samples. The discrimination was achieved using principal component analysis (PCA) and partial least square-discriminant analysis (PLS-DA). The results showed that the combination of multi-step IR spectroscopy and chemometric analysis offered a simple, fast and reliable method for the classification and identification of the sea buckthorn berry powders from different H. rhamnoides subspecies.

  15. A manual and an automatic TERS based virus discrimination

    NASA Astrophysics Data System (ADS)

    Olschewski, Konstanze; Kämmer, Evelyn; Stöckel, Stephan; Bocklitz, Thomas; Deckert-Gaudig, Tanja; Zell, Roland; Cialla-May, Dana; Weber, Karina; Deckert, Volker; Popp, Jürgen

    2015-02-01

    Rapid techniques for virus identification are more relevant today than ever. Conventional virus detection and identification strategies generally rest upon various microbiological methods and genomic approaches, which are not suited for the analysis of single virus particles. In contrast, the highly sensitive spectroscopic technique tip-enhanced Raman spectroscopy (TERS) allows the characterisation of biological nano-structures like virions on a single-particle level. In this study, the feasibility of TERS in combination with chemometrics to discriminate two pathogenic viruses, Varicella-zoster virus (VZV) and Porcine teschovirus (PTV), was investigated. In a first step, chemometric methods transformed the spectral data in such a way that a rapid visual discrimination of the two examined viruses was enabled. In a further step, these methods were utilised to perform an automatic quality rating of the measured spectra. Spectra that passed this test were eventually used to calculate a classification model, through which a successful discrimination of the two viral species based on TERS spectra of single virus particles was also realised with a classification accuracy of 91%.

  16. Motor Oil Classification using Color Histograms and Pattern Recognition Techniques.

    PubMed

    Ahmadi, Shiva; Mani-Varnosfaderani, Ahmad; Habibi, Biuck

    2018-04-20

    Motor oil classification is important for quality control and the identification of oil adulteration. In this work, we propose a simple, rapid, inexpensive and nondestructive approach based on image analysis and pattern recognition techniques for the classification of nine different types of motor oils according to their corresponding color histograms. For this, we applied color histograms in different color spaces such as red green blue (RGB), grayscale, and hue saturation intensity (HSI) in order to extract features that can help with the classification procedure. These color histograms and their combinations were used as input for model development and then were statistically evaluated by using linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM) techniques. Here, two common solutions for solving a multiclass classification problem were applied: (1) transformation to a binary classification problem using a one-against-all (OAA) approach and (2) extension from binary classifiers to a single globally optimized multilabel classification model. In the OAA strategy, LDA, QDA, and SVM reached up to 97% in terms of accuracy, sensitivity, and specificity for both the training and test sets. In the extension from the binary case, despite good performances by the SVM classification model, QDA and LDA provided better results, up to 92% for RGB-grayscale-HSI color histograms and up to 93% for the HSI color map, respectively. In order to reduce the number of independent variables for modeling, a principal component analysis algorithm was used. Our results suggest that the proposed method is promising for the identification and classification of different types of motor oils.
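    The one-against-all strategy named in this abstract can be illustrated with a small sketch. This is not the authors' pipeline: the two-class Fisher discriminant, the small ridge term, the midpoint threshold, the function names, and the synthetic three-cluster data are all assumptions made for illustration, and only NumPy is used.

```python
import numpy as np

def fit_fisher(X, y):
    """Two-class Fisher discriminant; returns (w, b) with score = X @ w + b."""
    X1, X0 = X[y == 1], X[y == 0]
    m1, m0 = X1.mean(axis=0), X0.mean(axis=0)
    # Pooled within-class scatter; the ridge term is an assumed safeguard
    # against a singular matrix, not part of the classical formulation.
    Sw = (X1 - m1).T @ (X1 - m1) + (X0 - m0).T @ (X0 - m0)
    Sw += 1e-6 * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, m1 - m0)
    b = -0.5 * w @ (m1 + m0)  # threshold midway between the class means
    return w, b

def fit_one_against_all(X, y):
    """Train one binary discriminant per class (that class versus the rest)."""
    classes = np.unique(y)
    models = [fit_fisher(X, (y == c).astype(int)) for c in classes]
    return classes, models

def predict_one_against_all(classes, models, X):
    """Assign each sample to the class whose discriminant scores highest."""
    scores = np.column_stack([X @ w + b for w, b in models])
    return classes[np.argmax(scores, axis=1)]

# Synthetic stand-in for histogram features: three well-separated clusters.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((30, 2)) for c in centers])
y = np.repeat(np.arange(3), 30)

classes, models = fit_one_against_all(X, y)
accuracy = np.mean(predict_one_against_all(classes, models, X) == y)
print(f"training accuracy: {accuracy:.2f}")
```

    Scores from separately trained binary models are not calibrated against each other, so taking the largest score is a simplification. That limitation is one reason a single globally optimized multiclass model, the second strategy in the abstract, may be preferred.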

  17. Multi-task linear programming discriminant analysis for the identification of progressive MCI individuals.

    PubMed

    Yu, Guan; Liu, Yufeng; Thung, Kim-Han; Shen, Dinggang

    2014-01-01

    Accurately identifying mild cognitive impairment (MCI) individuals who will progress to Alzheimer's disease (AD) is very important for making early interventions. Many classification methods focus on integrating multiple imaging modalities such as magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG-PET). However, the main challenge for MCI classification using multiple imaging modalities is the existence of a lot of missing data in many subjects. For example, in the Alzheimer's Disease Neuroimaging Initiative (ADNI) study, almost half of the subjects do not have PET images. In this paper, we propose a new and flexible binary classification method, namely Multi-task Linear Programming Discriminant (MLPD) analysis, for the incomplete multi-source feature learning. Specifically, we decompose the classification problem into different classification tasks, i.e., one for each combination of available data sources. To solve all different classification tasks jointly, our proposed MLPD method links them together by constraining them to achieve the similar estimated mean difference between the two classes (under classification) for those shared features. Compared with the state-of-the-art incomplete Multi-Source Feature (iMSF) learning method, instead of constraining different classification tasks to choose a common feature subset for those shared features, MLPD can flexibly and adaptively choose different feature subsets for different classification tasks. Furthermore, our proposed MLPD method can be efficiently implemented by linear programming. To validate our MLPD method, we perform experiments on the ADNI baseline dataset with the incomplete MRI and PET images from 167 progressive MCI (pMCI) subjects and 226 stable MCI (sMCI) subjects. We further compared our method with the iMSF method (using incomplete MRI and PET images) and also the single-task classification method (using only MRI or only subjects with both MRI and PET images

  18. Multi-Task Linear Programming Discriminant Analysis for the Identification of Progressive MCI Individuals

    PubMed Central

    Yu, Guan; Liu, Yufeng; Thung, Kim-Han; Shen, Dinggang

    2014-01-01

    Accurately identifying mild cognitive impairment (MCI) individuals who will progress to Alzheimer's disease (AD) is very important for making early interventions. Many classification methods focus on integrating multiple imaging modalities such as magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG-PET). However, the main challenge for MCI classification using multiple imaging modalities is the existence of a lot of missing data in many subjects. For example, in the Alzheimer's Disease Neuroimaging Initiative (ADNI) study, almost half of the subjects do not have PET images. In this paper, we propose a new and flexible binary classification method, namely Multi-task Linear Programming Discriminant (MLPD) analysis, for the incomplete multi-source feature learning. Specifically, we decompose the classification problem into different classification tasks, i.e., one for each combination of available data sources. To solve all different classification tasks jointly, our proposed MLPD method links them together by constraining them to achieve the similar estimated mean difference between the two classes (under classification) for those shared features. Compared with the state-of-the-art incomplete Multi-Source Feature (iMSF) learning method, instead of constraining different classification tasks to choose a common feature subset for those shared features, MLPD can flexibly and adaptively choose different feature subsets for different classification tasks. Furthermore, our proposed MLPD method can be efficiently implemented by linear programming. To validate our MLPD method, we perform experiments on the ADNI baseline dataset with the incomplete MRI and PET images from 167 progressive MCI (pMCI) subjects and 226 stable MCI (sMCI) subjects. We further compared our method with the iMSF method (using incomplete MRI and PET images) and also the single-task classification method (using only MRI or only subjects with both MRI and PET images

  19. Discriminating plant species across California's diverse ecosystems using airborne VSWIR and TIR imagery

    NASA Astrophysics Data System (ADS)

    Meerdink, S.; Roberts, D. A.; Roth, K. L.

    2015-12-01

    Accurate knowledge of the spatial distribution of plant species is required for many research and management agendas that track ecosystem health. Because of this, there is continuous development of research focused on remotely-sensed species classifications for many diverse ecosystems. While plant species have been mapped using airborne imaging spectroscopy, the geographic extent has been limited due to data availability and spectrally similar species continue to be difficult to separate. The proposed Hyperspectral Infrared Imager (HyspIRI) space-borne mission, which includes a visible near infrared/shortwave infrared (VSWIR) imaging spectrometer and thermal infrared (TIR) multi-spectral imager, would present an opportunity to improve species discrimination over a much broader scale. Here we evaluate: 1) the capability of VSWIR and/or TIR spectra to discriminate plant species; 2) the accuracy of species classifications within an ecosystem; and 3) the potential for discriminating among species across a range of ecosystems. Simulated HyspIRI imagery was acquired in spring/summer of 2013 spanning from Santa Barbara to Bakersfield, CA with the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) and the MODIS/ASTER Airborne Simulator (MASTER) instruments. Three spectral libraries were created from these images: AVIRIS (224 bands from 0.4 - 2.5 µm), MASTER (8 bands from 7.5 - 12 µm), and AVIRIS + MASTER. We used canonical discriminant analysis (CDA) as a dimension reduction technique and then classified plant species using linear discriminant analysis (LDA). Our results show the inclusion of TIR spectra improved species discrimination, but only for plant species with emissivities departing from that of a gray body. Ecosystems with species that have high spectral contrast had higher classification accuracies. Mapping plant species across all ecosystems resulted in a classification with lower accuracies than a single ecosystem due to the complex nature of

  20. Classification of Ilex species based on metabolomic fingerprinting using nuclear magnetic resonance and multivariate data analysis.

    PubMed

    Choi, Young Hae; Sertic, Sarah; Kim, Hye Kyong; Wilson, Erica G; Michopoulos, Filippos; Lefeber, Alfons W M; Erkelens, Cornelis; Prat Kricun, Sergio D; Verpoorte, Robert

    2005-02-23

    The metabolomic analysis of 11 Ilex species, I. argentina, I. brasiliensis, I. brevicuspis, I. dumosa var. dumosa, I. dumosa var. guaranina, I. integerrima, I. microdonta, I. paraguariensis var. paraguariensis, I. pseudobuxus, I. taubertiana, and I. theezans, was carried out by NMR spectroscopy and multivariate data analysis. The analysis using principal component analysis and classification of the 1H NMR spectra showed a clear discrimination of those samples based on the metabolites present in the organic and aqueous fractions. The major metabolites that contribute to the discrimination are arbutin, caffeine, phenylpropanoids, and theobromine. Among those metabolites, arbutin, which has not been reported yet as a constituent of Ilex species, was found to be a biomarker for I. argentina, I. brasiliensis, I. brevicuspis, I. integerrima, I. microdonta, I. pseudobuxus, I. taubertiana, and I. theezans. This reliable method based on the determination of a large number of metabolites makes the chemotaxonomical analysis of Ilex species possible.

  1. Multi-resolution analysis using integrated microscopic configuration with local patterns for benign-malignant mass classification

    NASA Astrophysics Data System (ADS)

    Rabidas, Rinku; Midya, Abhishek; Chakraborty, Jayasree; Sadhu, Anup; Arif, Wasim

    2018-02-01

    In this paper, a Curvelet-based local attribute, the Curvelet-Local configuration pattern (C-LCP), is introduced for the characterization of mammographic masses as benign or malignant. Among different anomalies such as microcalcification, bilateral asymmetry, architectural distortion, and masses, mass lesions are targeted because their variation in shape, size, and margin makes the diagnosis a challenging task. The multi-resolution property of the Curvelet transform, which is efficient for classification, is exploited and local information is extracted from the coefficients of each subband using the Local configuration pattern (LCP). The microscopic measures in concatenation with the local textural information provide more discriminating capability than either alone. The measures embody the magnitude information along with the pixel-wise relationships among the neighboring pixels. The performance analysis is conducted with 200 mammograms of the DDSM database containing 100 mass cases each of benign and malignant. The optimal set of features is acquired via a stepwise logistic regression method and the classification is carried out with Fisher linear discriminant analysis. The best area under the receiver operating characteristic curve (0.95) and accuracy (87.55%) are achieved with the proposed method, which is further compared with some of the state-of-the-art competing methods.

  2. Idiopathic interstitial pneumonias and emphysema: detection and classification using a texture-discriminative approach

    NASA Astrophysics Data System (ADS)

    Fetita, C.; Chang-Chien, K. C.; Brillet, P. Y.; Prêteux, F.; Chang, R. F.

    2012-03-01

    Our study aims at developing a computer-aided diagnosis (CAD) system for fully automatic detection and classification of pathological lung parenchyma patterns in idiopathic interstitial pneumonias (IIP) and emphysema using multi-detector computed tomography (MDCT). The proposed CAD system is based on three-dimensional (3-D) mathematical morphology, texture and fuzzy logic analysis, and can be divided into four stages: (1) a multi-resolution decomposition scheme based on a 3-D morphological filter was exploited to discriminate the lung region patterns at different analysis scales. (2) An additional spatial lung partitioning based on the lung tissue texture was introduced to reinforce the spatial separation between patterns extracted at the same resolution level in the decomposition pyramid. Then, (3) a hierarchic tree structure was exploited to describe the relationship between patterns at different resolution levels, and for each pattern, six fuzzy membership functions were established for assigning a probability of association with a normal tissue or a pathological target. Finally, (4) a decision step exploiting the fuzzy-logic assignments selects the target class of each lung pattern among the following categories: normal (N), emphysema (EM), fibrosis/honeycombing (FHC), and ground glass (GDG). According to a preliminary evaluation on an extended database, the proposed method can overcome the drawbacks of a previously developed approach and achieve higher sensitivity and specificity.

  3. High Dimensional Classification Using Features Annealed Independence Rules.

    PubMed

    Fan, Jianqing; Fan, Yingying

    2008-01-01

    Classification using high-dimensional features arises frequently in many contemporary statistical studies such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classification is still poorly understood. In a seminal paper, Bickel and Levina (2004) show that the Fisher discriminant performs poorly due to diverging spectra and they propose to use the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as bad as random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as badly as random guessing. Thus, it is of paramount importance to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently, the threshold value of the test statistics, is proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.
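
    The two ingredients named in this abstract, ranking features by a two-sample t-statistic and then classifying with an independence (diagonal-covariance) rule, can be sketched as below. This is only an illustration under stated assumptions: the number of kept features `m` is passed in by hand rather than chosen from the error bound the paper proposes, the function names are hypothetical, and every feature is assumed to have nonzero sample variance.

```python
from statistics import mean, variance


def t_statistic(a, b):
    """Two-sample t-statistic (unpooled standard error) for one feature."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se


def fit_fair(class0, class1, m):
    """Keep the m features with the largest |t|, then fit an independence rule.

    class0 and class1 are lists of equal-length feature vectors.
    Returns a list of (feature index, midpoint, weight) triples.
    """
    n0, n1 = len(class0), len(class1)
    p = len(class0[0])
    cols0 = [[row[j] for row in class0] for j in range(p)]
    cols1 = [[row[j] for row in class1] for j in range(p)]
    ranked = sorted(range(p), key=lambda j: -abs(t_statistic(cols1[j], cols0[j])))
    model = []
    for j in ranked[:m]:
        mu0, mu1 = mean(cols0[j]), mean(cols1[j])
        # Pooled per-feature variance; covariances between features are ignored.
        pooled = ((n0 - 1) * variance(cols0[j])
                  + (n1 - 1) * variance(cols1[j])) / (n0 + n1 - 2)
        model.append((j, (mu0 + mu1) / 2, (mu1 - mu0) / pooled))
    return model


def predict_fair(model, x):
    """Return 1 if the independence-rule score is positive, else 0."""
    score = sum(w * (x[j] - mid) for j, mid, w in model)
    return 1 if score > 0 else 0
```

    Discarding low-|t| features before summing is what limits the noise accumulation the abstract describes: each uninformative feature would otherwise add estimation error to the score without adding signal.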

  4. Attractor structure discriminates sleep states: recurrence plot analysis applied to infant breathing patterns.

    PubMed

    Terrill, Philip Ian; Wilson, Stephen James; Suresh, Sadasivam; Cooper, David M; Dakin, Carolyn

    2010-05-01

    Breathing patterns are characteristically different between infant active sleep (AS) and quiet sleep (QS), and statistical quantifications of interbreath interval (IBI) data have previously been used to discriminate between infant sleep states. It has also been identified that breathing patterns are governed by a nonlinear controller. This study aims to investigate whether nonlinear quantifications of infant IBI data are characteristically different between AS and QS, and whether they may be used to discriminate between these infant sleep states. Polysomnograms were obtained from 24 healthy infants at six months of age. Periods of AS and QS were identified, and IBI data extracted. Recurrence quantification analysis (RQA) was applied to each period, and recurrence calculated for a fixed radius in the range of 0-8 in steps of 0.02, and embedding dimensions of 4, 6, 8, and 16. When a threshold classifier was trained, the RQA variable recurrence was able to correctly classify 94.3% of periods in a test dataset. It was concluded that RQA of IBI data is able to accurately discriminate between infant sleep states. This is a promising step toward development of a minimal-channel automatic sleep state classification system.
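
    As a rough sketch of the recurrence variable used here: a series is time-delay embedded, and the recurrence rate is the fraction of embedded point pairs that lie within a fixed radius of each other. The delay of 1, the maximum norm, and the exclusion of self-pairs are assumptions made for this illustration; the abstract specifies only the radius range and the embedding dimensions.

```python
def embed(series, dim, delay=1):
    """Time-delay embedding: point i is (x[i], x[i+delay], ..., x[i+(dim-1)*delay])."""
    n = len(series) - (dim - 1) * delay
    return [[series[i + k * delay] for k in range(dim)] for i in range(n)]


def recurrence_rate(series, dim, radius, delay=1):
    """Fraction of distinct embedded point pairs within `radius` (max norm)."""
    points = embed(series, dim, delay)
    n = len(points)
    if n < 2:
        raise ValueError("series too short for this embedding")
    recurrent = 0
    for i in range(n):
        for j in range(i + 1, n):
            dist = max(abs(a - b) for a, b in zip(points[i], points[j]))
            if dist <= radius:
                recurrent += 1
    return recurrent / (n * (n - 1) / 2)


def threshold_classify(rate, threshold):
    """Hypothetical threshold rule: high recurrence is labelled 'regular'."""
    return "regular" if rate >= threshold else "irregular"
```

    A very regular interbreath-interval series yields a rate near 1, while an erratic one yields a much lower rate, which is why a single threshold on this variable can separate two states. Which sleep state corresponds to which side of the threshold is not stated in the abstract, so the labels here are placeholders.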

  5. Alteration mapping at Goldfield, Nevada, by cluster and discriminant analysis of Landsat digital data. [mapping of hydrothermally altered volcanic rocks]

    NASA Technical Reports Server (NTRS)

    Ballew, G.

    1977-01-01

    The ability of Landsat multispectral digital data to differentiate among 62 combinations of rock and alteration types at the Goldfield mining district of Western Nevada was investigated by using statistical techniques of cluster and discriminant analysis. Multivariate discriminant analysis was not effective in classifying each of the 62 groups, with classification results essentially the same whether data of four channels alone or combined with six ratios of channels were used. Bivariate plots of group means revealed a cluster of three groups including mill tailings, basalt and all other rock and alteration types. Automatic hierarchical clustering based on the fourth dimensional Mahalanobis distance between group means of 30 groups having five or more samples was performed using Johnson's HICLUS program. The results of the cluster analysis revealed hierarchies of mill tailings vs. natural materials, basalt vs. non-basalt, highly reflectant rocks vs. other rocks and exclusively unaltered rocks vs. predominantly altered rocks. The hierarchies were used to determine the order in which sets of multiple discriminant analyses were to be performed and the resulting discriminant functions were used to produce a map of geology and alteration which has an overall accuracy of 70 percent for discriminating exclusively altered rocks from predominantly altered rocks.

  6. Pathological speech signal analysis and classification using empirical mode decomposition.

    PubMed

    Kaleem, Muhammad; Ghoraani, Behnaz; Guergachi, Aziz; Krishnan, Sridhar

    2013-07-01

    Automated classification of normal and pathological speech signals can provide an objective and accurate mechanism for pathological speech diagnosis, and is an active area of research. A large part of this research is based on analysis of acoustic measures extracted from sustained vowels. However, sustained vowels do not reflect real-world attributes of voice as effectively as continuous speech, which can take into account important attributes of speech such as rapid voice onset and termination, changes in voice frequency and amplitude, and sudden discontinuities in speech. This paper presents a methodology based on empirical mode decomposition (EMD) for classification of continuous normal and pathological speech signals obtained from a well-known database. EMD is used to decompose randomly chosen portions of speech signals into intrinsic mode functions, which are then analyzed to extract meaningful temporal and spectral features, including true instantaneous features which can capture discriminative information in signals hidden at local time-scales. A total of six features are extracted, and a linear classifier is used with the feature vector to classify continuous speech portions obtained from a database consisting of 51 normal and 161 pathological speakers. A classification accuracy of 95.7 % is obtained, thus demonstrating the effectiveness of the methodology.

  7. Application of texture analysis method for classification of benign and malignant thyroid nodules in ultrasound images.

    PubMed

    Abbasian Ardakani, Ali; Gharbali, Akbar; Mohammadi, Afshin

    2015-01-01

    The aim of this study was to evaluate a computer-aided diagnosis (CAD) system with texture analysis (TA) to improve radiologists' accuracy in identifying thyroid nodules as malignant or benign. A total of 70 cases (26 benign and 44 malignant) were analyzed in this study. We extracted up to 270 statistical texture features as a descriptor for each selected region of interest (ROI) in three normalization schemes (default, 3s and 1%-99%). The features were then reduced to the 10 best and most effective ones by the lowest probability of classification error and average correlation coefficients (POE+ACC) and by the Fisher coefficient (Fisher). These features were analyzed under standard and nonstandard states. For TA of the thyroid nodules, Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Non-Linear Discriminant Analysis (NDA) were applied. A first nearest-neighbour (1-NN) classifier was applied to the features resulting from PCA and LDA. NDA features were classified by an artificial neural network (A-NN). Receiver operating characteristic (ROC) curve analysis was used for examining the performance of TA methods. The best results were obtained with 1%-99% normalization, with features extracted by the POE+ACC algorithm and analyzed by NDA, giving an area under the ROC curve (Az) of 0.9722, which corresponds to a sensitivity of 94.45%, specificity of 100%, and accuracy of 97.14%. Our results indicate that TA is a reliable method that can provide useful information to help radiologists in the detection and classification of benign and malignant thyroid nodules.

  8. Discrimination of Bacillus anthracis from closely related microorganisms by analysis of 16S and 23S rRNA with oligonucleotide microchips

    DOEpatents

    Bavykin, Sergei G.; Mirzabekov, Andrei D.

    2007-10-30

    The present invention is directed to a novel method of discriminating the highly infectious bacterium Bacillus anthracis from a group of closely related microorganisms. Sequence variations in the 16S and 23S rRNA of the B. cereus subgroup including B. anthracis are utilized to construct an array that can detect these sequence variations through selective hybridizations. The identification and analysis of these sequence variations enables positive discrimination of isolates of the B. cereus group that includes B. anthracis. Discrimination of single base differences in rRNA was achieved with a microchip during analysis of B. cereus group isolates from both single and mixed probes, as well as identification of polymorphic sites. Successful use of a microchip to determine the appropriate subgroup classification, using eight reference microorganisms from the B. cereus group as a study set, was demonstrated.

  9. Classification and regression tree (CART) analyses of genomic signatures reveal sets of tetramers that discriminate temperature optima of archaea and bacteria

    PubMed Central

    Dyer, Betsey D.; Kahn, Michael J.; LeBlanc, Mark D.

    2008-01-01

    Classification and regression tree (CART) analysis was applied to genome-wide tetranucleotide frequencies (genomic signatures) of 195 archaea and bacteria. Although genomic signatures have typically been used to classify evolutionary divergence, in this study, convergent evolution was the focus. Temperature optima for most of the organisms examined could be distinguished by CART analyses of tetranucleotide frequencies. This suggests that pervasive (nonlinear) qualities of genomes may reflect certain environmental conditions (such as temperature) in which those genomes evolved. The predominant use of GAGA and AGGA as the discriminating tetramers in CART models suggests that purine-loading and codon biases of thermophiles may explain some of the results. PMID:19054742

  10. Accurate discrimination of Alzheimer's disease from other dementia and/or normal subjects using SPECT specific volume analysis

    NASA Astrophysics Data System (ADS)

    Iyatomi, Hitoshi; Hashimoto, Jun; Yoshii, Fumuhito; Kazama, Toshiki; Kawada, Shuichi; Imai, Yutaka

    2014-03-01

    Discrimination between Alzheimer's disease and other dementia is clinically significant but often difficult. In this study, we developed classification models among Alzheimer's disease (AD), other dementia (OD) and/or normal subjects (NC) using patient factors and indices obtained by brain perfusion SPECT. SPECT is commonly used to assess cerebral blood flow (CBF) and allows the evaluation of the severity of hypoperfusion by introducing statistical parametric mapping (SPM). We investigated a total of 150 cases (50 cases each for AD, OD, and NC) from Tokai University Hospital, Japan. In each case, we obtained a total of 127 candidate parameters from: (A) 2 patient factors (age and sex), (B) 12 CBF parameters and 113 SPM parameters including (C) 3 from specific volume analysis (SVA), and (D) 110 from voxel-based analysis stereotactic extraction estimation (vbSEE). We built linear classifiers with statistical stepwise feature selection and evaluated their performance with a leave-one-out cross validation strategy. Our classifiers achieved very high classification performance with a reasonable number of selected parameters. In the most clinically significant discrimination, namely AD from OD, our classifier achieved both sensitivity (SE) and specificity (SP) of 96%. Similarly, our classifiers achieved an SE of 90% and an SP of 98% for AD from NC, as well as an SE of 88% and an SP of 86% for AD from OD and NC cases. Introducing SPM indices such as SVA and vbSEE improved classification performance by around 7-15%. We confirmed that these SPM factors are quite important for diagnosing Alzheimer's disease.
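
    The evaluation protocol in this abstract (leave-one-out cross validation reported as sensitivity and specificity) can be sketched as follows. The classifier is a stand-in: a nearest-class-mean rule is used here purely because it is a simple linear rule, and it is not the stepwise-selected linear classifier of the study. Function names and the toy data in the usage note are hypothetical.

```python
def nearest_mean_predict(train_x, train_y, x):
    """Assign x to the class whose mean vector is closest (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out(xs, ys, positive=1):
    """Leave-one-out cross validation; returns (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for i in range(len(xs)):
        # Train on every case except case i, then test on case i alone.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        pred = nearest_mean_predict(train_x, train_y, xs[i])
        if ys[i] == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)
```

    For example, `leave_one_out([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]], [0, 0, 0, 1, 1, 1])` holds out each case in turn, so every prediction is made by a model that never saw the case being tested.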

  11. Protein Subcellular Localization with Gaussian Kernel Discriminant Analysis and Its Kernel Parameter Selection.

    PubMed

    Wang, Shunfang; Nie, Bing; Yue, Kun; Fei, Yu; Li, Wenjia; Xu, Dongshu

    2017-12-15

    Kernel discriminant analysis (KDA) is a dimension reduction and classification algorithm based on the nonlinear kernel trick, which can be applied in a novel way to high-dimensional and complex biological data before classification tasks such as protein subcellular localization. Kernel parameters have a great impact on the performance of the KDA model. Specifically, for KDA with the popular Gaussian kernel, selecting the scale parameter is still a challenging problem. Thus, this paper introduces the KDA method and proposes a new method for Gaussian kernel parameter selection depending on the fact that the differences between reconstruction errors of edge normal samples and those of interior normal samples should be maximized for certain suitable kernel parameters. Experiments with various standard data sets of protein subcellular localization show that the overall accuracy of protein classification prediction with KDA is much higher than that without KDA. Meanwhile, the kernel parameter of KDA has a great impact on the efficiency, and the proposed method can produce an optimum parameter, which makes the new algorithm not only perform as effectively as the traditional ones, but also reduce the computational time and thus improve efficiency.
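
    To make the role of the scale parameter concrete, the sketch below evaluates the standard Gaussian kernel, k(x, y) = exp(-||x - y||^2 / (2 sigma^2)), at different values of sigma. It illustrates only the kernel itself; the parameter-selection criterion proposed in the paper (based on reconstruction errors) is not reproduced, and the function names are assumptions.

```python
import math


def gaussian_kernel(x, y, sigma):
    """k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_matrix(points, sigma):
    """Symmetric Gram matrix of pairwise Gaussian kernel values."""
    return [[gaussian_kernel(p, q, sigma) for q in points] for p in points]


if __name__ == "__main__":
    pts = [[0.0], [1.0], [2.0]]  # hypothetical one-dimensional samples
    for sigma in (0.1, 1.0, 10.0):
        print(sigma, kernel_matrix(pts, sigma)[0])
```

    With a very small sigma the Gram matrix approaches the identity (every sample looks unlike every other), and with a very large sigma all entries approach 1 (every sample looks alike). Both extremes discard class structure, which is why the choice of scale matters.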

  12. Geographical classification of apple based on hyperspectral imaging

    NASA Astrophysics Data System (ADS)

    Guo, Zhiming; Huang, Wenqian; Chen, Liping; Zhao, Chunjiang; Peng, Yankun

    2013-05-01

    The geographical origin of an apple is often recognized and appreciated by consumers, and it is usually an important factor in determining the price of the commercial product. In this work, hyperspectral imaging technology and supervised pattern recognition were used to discriminate apples according to geographical origin. Hyperspectral images of 207 Fuji apple samples were collected by a hyperspectral camera (400-1000 nm). Principal component analysis (PCA) was performed on the hyperspectral imaging data to determine the main efficient wavelength images, and then characteristic variables were extracted by texture analysis based on the gray level co-occurrence matrix (GLCM) from the dominant waveband image. All characteristic variables were obtained by fusing the data of images in efficient spectra. A support vector machine (SVM) was used to construct the classification model and showed excellent performance in the classification results. The total classification rate was 92.75% in the training set and 89.86% in the prediction set. The overall results demonstrated that the hyperspectral imaging technique coupled with an SVM classifier can be efficiently utilized to discriminate Fuji apples according to geographical origin.

  13. Quantum Ensemble Classification: A Sampling-Based Learning Control Approach.

    PubMed

    Chen, Chunlin; Dong, Daoyi; Qi, Bo; Petersen, Ian R; Rabitz, Herschel

    2017-06-01

    Quantum ensemble classification (QEC) has significant applications in discrimination of atoms (or molecules), separation of isotopes, and quantum information extraction. However, quantum mechanics forbids deterministic discrimination among nonorthogonal states. The classification of inhomogeneous quantum ensembles is very challenging, since there exist variations in the parameters characterizing the members within different classes. In this paper, we recast QEC as a supervised quantum learning problem. A systematic classification methodology is presented by using a sampling-based learning control (SLC) approach for quantum discrimination. The classification task is accomplished via simultaneously steering members belonging to different classes to their corresponding target states (e.g., mutually orthogonal states). First, a new discrimination method is proposed for two similar quantum systems. Then, an SLC method is presented for QEC. Numerical results demonstrate the effectiveness of the proposed approach for the binary classification of two-level quantum ensembles and the multiclass classification of multilevel quantum ensembles.

  14. Random whole metagenomic sequencing for forensic discrimination of soils.

    PubMed

    Khodakova, Anastasia S; Smith, Renee J; Burgoyne, Leigh; Abarno, Damien; Linacre, Adrian

    2014-01-01

    Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations.

  15. Forensic analysis of explosives using isotope ratio mass spectrometry (IRMS)--discrimination of ammonium nitrate sources.

    PubMed

    Benson, Sarah J; Lennard, Christopher J; Maynard, Philip; Hill, David M; Andrew, Anita S; Roux, Claude

    2009-06-01

    An evaluation was undertaken to determine if isotope ratio mass spectrometry (IRMS) could assist in the investigation of complex forensic cases by providing a level of discrimination not achievable utilising traditional forensic techniques. The focus of the research was on ammonium nitrate (AN), a common oxidiser used in improvised explosive mixtures. The potential value of IRMS to attribute Australian AN samples to the manufacturing source was demonstrated through the development of a preliminary AN classification scheme based on nitrogen isotopes. Although the discrimination utilising nitrogen isotopes alone was limited and only relevant to samples from the three Australian manufacturers during the evaluated time period, the classification scheme has potential as an investigative aid. Combining oxygen and hydrogen stable isotope values permitted the differentiation of AN prills from three different Australian manufacturers. Samples from five different overseas sources could be differentiated utilising a combination of the nitrogen, oxygen and hydrogen isotope values. Limited differentiation between Australian and overseas prills was achieved for the samples analysed. The comparison of nitrogen isotope values from intact AN prill samples with those from post-blast AN prill residues highlighted that the nitrogen isotopic composition of the prills was not maintained post-blast; hence, limiting the technique to analysis of un-reacted explosive material.

  16. Systematic Analysis of Primary Sequence Domain Segments for the Discrimination Between Class C GPCR Subtypes.

    PubMed

    König, Caroline; Alquézar, René; Vellido, Alfredo; Giraldo, Jesús

    2018-03-01

    G-protein-coupled receptors (GPCRs) are a large and diverse super-family of eukaryotic cell membrane proteins that play an important physiological role as transmitters of extracellular signal. In this paper, we investigate Class C, a member of this super-family that has attracted much attention in pharmacology. The limited knowledge about the complete 3D crystal structure of Class C receptors makes necessary the use of their primary amino acid sequences for analytical purposes. Here, we provide a systematic analysis of distinct receptor sequence segments with regard to their ability to differentiate between seven class C GPCR subtypes according to their topological location in the extracellular, transmembrane, or intracellular domains. We build on the results from the previous research that provided preliminary evidence of the potential use of separated domains of complete class C GPCR sequences as the basis for subtype classification. The use of the extracellular N-terminus domain alone was shown to result in a minor decrease in subtype discrimination in comparison with the complete sequence, despite discarding much of the sequence information. In this paper, we describe the use of Support Vector Machine-based classification models to evaluate the subtype-discriminating capacity of the specific topological sequence segments.

  17. Atmospheric pressure chemical ionisation mass spectrometry analysis linked with chemometrics for food classification - a case study: geographical provenance and cultivar classification of monovarietal clarified apple juices.

    PubMed

    Gan, Heng-Hui; Soukoulis, Christos; Fisk, Ian

    2014-03-01

    In the present work, we have evaluated for the first time the feasibility of APCI-MS volatile compound fingerprinting in conjunction with chemometrics (PLS-DA) as a new strategy for rapid and non-destructive food classification. For this purpose, 202 clarified monovarietal juices extracted from apples differing in their botanical and geographical origin were used to evaluate the performance of APCI-MS as a classification tool. For an independent test set, PLS-DA analyses of pre-treated spectral data gave 100% and 94.2% correct classification rates for the classification by cultivar and geographical origin, respectively. Moreover, PLS-DA analysis of APCI-MS in conjunction with GC-MS data revealed that masses within the spectral APCI-MS data set were related to parent ions or fragments of alkyl esters, carbonyl compounds (hexanal, trans-2-hexenal) and alcohols (1-hexanol, 1-butanol, cis-3-hexenol) and had significant discriminating power both in terms of cultivar and geographical origin. Copyright © 2013 The Authors. Published by Elsevier Ltd. All rights reserved.

  18. [Image Feature Extraction and Discriminant Analysis of Xinjiang Uygur Medicine Based on Color Histogram].

    PubMed

    Hamit, Murat; Yun, Weikang; Yan, Chuanbo; Kutluk, Abdugheni; Fang, Yang; Alip, Elzat

    2015-06-01

    Image feature extraction is an important part of image processing and an important field of research and application of image processing technology. Uygur medicine is one of the traditional Chinese medicines and is attracting increasing attention from researchers, but large amounts of Uygur medicine data have not been fully utilized. In this study, we extracted the image color histogram features of herbal and zooid medicines of Xinjiang Uygur. First, we performed preprocessing, including image color enhancement, size normalization and color space transformation. Then we extracted color histogram features and analyzed them with statistical methods. Finally, we evaluated the classification ability of the features by Bayes discriminant analysis. Experimental results showed that high accuracy for Uygur medicine image classification was obtained by using color histogram features. This study should be of some help for content-based medical image retrieval for Xinjiang Uygur medicine.

  19. The differentiation of camel breeds based on meat measurements using discriminant analysis.

    PubMed

    Al-Atiyat, Raed Mahmoud; Suliman, Gamal; AlSuhaibani, Entissar; El-Waziry, Ahmad; Al-Owaimer, Abdullah; Basmaeil, Saeid

    2016-06-01

    The meat productivity of camels in the tropics is still under investigation to identify a better meat breed or type. Therefore, four one-humped Saudi Arabian (SA) camel breeds, Majaheem, Maghateer, Hamrah, and Safrah, were studied in order to differentiate them from each other based on meat measurements. The measurements were biometrical meat traits measured on six intact males from each breed. The results showed higher values for the Majaheem breed than for the other breeds, except in a few cases such as dressing percentage and rib-eye area. In the differentiation analysis, the most discriminating meat variables were myofibrillar protein index, meat color components (L*, a*, and b*), and cooking loss. Consequently, the Safrah and the Majaheem breeds presented the largest dissimilarity, as evidenced by their multivariate means. The canonical discriminant analysis allowed an additional understanding of the differentiation between breeds. Furthermore, two large clusters were found: one was formed by Hamrah and Maghateer in one group along with Safrah. This classification may assign each of these breeds to one cluster on the grounds that they are better meat producers. The Majaheem was clustered alone in another cluster, which might be a result of its being a better milk producer. Nevertheless, the productivity type of the camel breeds of SA needs further morphological and genetic description.

  20. The Color of Health: Skin Color, Ethnoracial Classification, and Discrimination in the Health of Latin Americans

    PubMed Central

    Perreira, Krista M.; Telles, Edward E.

    2014-01-01

    Latin America is one of the most ethnoracially heterogeneous regions of the world. Despite this, health disparities research in Latin America tends to focus on gender, class and regional health differences while downplaying ethnoracial differences. Few scholars have conducted studies of ethnoracial identification and health disparities in Latin America. Research that examines multiple measures of ethnoracial identification is rarer still. Official data on race/ethnicity in Latin America are based on self-identification which can differ from interviewer-ascribed or phenotypic classification based on skin color. We use data from Brazil, Colombia, Mexico, and Peru to examine associations of interviewer-ascribed skin color, interviewer-ascribed race/ethnicity, and self-reported race/ethnicity with self-rated health among Latin American adults (ages 18-65). We also examine associations of observer-ascribed skin color with three additional correlates of health – skin color discrimination, class discrimination, and socio-economic status. We find a significant gradient in self-rated health by skin color. Those with darker skin colors report poorer health. Darker skin color influences self-rated health primarily by increasing exposure to class discrimination and low socio-economic status. PMID:24957692

  1. The color of health: skin color, ethnoracial classification, and discrimination in the health of Latin Americans.

    PubMed

    Perreira, Krista M; Telles, Edward E

    2014-09-01

    Latin America is one of the most ethnoracially heterogeneous regions of the world. Despite this, health disparities research in Latin America tends to focus on gender, class and regional health differences while downplaying ethnoracial differences. Few scholars have conducted studies of ethnoracial identification and health disparities in Latin America. Research that examines multiple measures of ethnoracial identification is rarer still. Official data on race/ethnicity in Latin America are based on self-identification which can differ from interviewer-ascribed or phenotypic classification based on skin color. We use data from Brazil, Colombia, Mexico, and Peru to examine associations of interviewer-ascribed skin color, interviewer-ascribed race/ethnicity, and self-reported race/ethnicity with self-rated health among Latin American adults (ages 18-65). We also examine associations of observer-ascribed skin color with three additional correlates of health - skin color discrimination, class discrimination, and socio-economic status. We find a significant gradient in self-rated health by skin color. Those with darker skin colors report poorer health. Darker skin color influences self-rated health primarily by increasing exposure to class discrimination and low socio-economic status. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. A Discriminative Approach to EEG Seizure Detection

    PubMed Central

    Johnson, Ashley N.; Sow, Daby; Biem, Alain

    2011-01-01

    Seizures are abnormal sudden discharges in the brain with signatures represented in electroencephalograms (EEG). The efficacy of the application of speech processing techniques to discriminate between seizure and non-seizure states in EEGs is reported. The approach accounts for the challenges of unbalanced datasets (seizure and non-seizure), while also showing a system capable of real-time seizure detection. The Minimum Classification Error (MCE) algorithm, which is a discriminative learning algorithm with wide use in speech processing, is applied and compared with conventional classification techniques that have already been applied to the discrimination between seizure and non-seizure states in the literature. The system is evaluated on multi-channel EEG recordings from 22 pediatric patients. Experimental results show that the application of speech processing techniques and MCE compares favorably with conventional classification techniques in terms of classification performance, while requiring less computational overhead. The results strongly suggest the possibility of deploying the designed system at the bedside. PMID:22195192

  3. A discriminant function approach to ecological site classification in northern New England

    Treesearch

    James M. Fincher; Marie-Louise Smith

    1994-01-01

    Describes one approach to ecologically based classification of upland forest community types of the White and Green Mountain physiographic regions. The classification approach is based on an intensive statistical analysis of the relationship between the communities and soil-site factors. Discriminant functions useful in distinguishing between types based on soil-site...

  4. Highly Accurate Classification of Watson-Crick Basepairs on Termini of Single DNA Molecules

    PubMed Central

    Winters-Hilt, Stephen; Vercoutere, Wenonah; DeGuzman, Veronica S.; Deamer, David; Akeson, Mark; Haussler, David

    2003-01-01

    We introduce a computational method for classification of individual DNA molecules measured by an α-hemolysin channel detector. We show classification with better than 99% accuracy for DNA hairpin molecules that differ only in their terminal Watson-Crick basepairs. Signal classification was done in silico to establish performance metrics (i.e., where train and test data were of known type, via single-species data files). It was then performed in solution to assay real mixtures of DNA hairpins. Hidden Markov Models (HMMs) were used with Expectation/Maximization for denoising and for associating a feature vector with the ionic current blockade of the DNA molecule. Support Vector Machines (SVMs) were used as discriminators, and were the focus of off-line training. A multiclass SVM architecture was designed to place less discriminatory load on weaker discriminators, and novel SVM kernels were used to boost discrimination strength. The tuning on HMMs and SVMs enabled biophysical analysis of the captured molecule states and state transitions; structure revealed in the biophysical analysis was used for better feature selection. PMID:12547778

  5. Combining features from ERP components in single-trial EEG for discriminating four-category visual objects.

    PubMed

    Wang, Changming; Xiong, Shi; Hu, Xiaoping; Yao, Li; Zhang, Jiacai

    2012-10-01

    Images containing visual objects can be successfully categorized using single-trial electroencephalograph (EEG) data measured while subjects view the images. Previous studies have shown that task-related information contained in event-related potential (ERP) components could discriminate two or three categories of object images. In this study, we investigated whether four categories of objects (human faces, buildings, cats and cars) could be mutually discriminated using single-trial EEG data. Here, the EEG waveforms acquired while subjects were viewing four categories of object images were segmented into several ERP components (P1, N1, P2a and P2b), and then Fisher linear discriminant analysis (Fisher-LDA) was used to classify EEG features extracted from ERP components. Firstly, we compared the classification results using features from single ERP components, and identified that the N1 component achieved the highest classification accuracies. Secondly, we discriminated four categories of objects using combined features from multiple ERP components, and showed that the combination of ERP components improved four-category classification accuracies by utilizing the complementarity of discriminative information in ERP components. These findings confirmed that four categories of object images could be discriminated with single-trial EEG and could direct us to select effective EEG features for classifying visual objects.

  6. Credit scoring analysis using kernel discriminant

    NASA Astrophysics Data System (ADS)

    Widiharih, T.; Mukid, M. A.; Mustafid

    2018-05-01

    A credit scoring model is an important tool for reducing the risk of wrong decisions when granting credit facilities to applicants. This paper investigates the performance of the kernel discriminant model in assessing customer credit risk. Kernel discriminant analysis is a non-parametric method, which means that it does not require any assumptions about the probability distribution of the input. The main ingredient is a kernel that allows an efficient computation of the Fisher discriminant. We use several kernels: normal, Epanechnikov, biweight, and triweight. The accuracies of the models were compared with each other using data from a financial institution in Indonesia. The results show that the kernel discriminant can be an alternative method for determining who is eligible for a credit loan. For the data we use, the normal kernel is a suitable choice for credit scoring with the kernel discriminant model. Sensitivity and specificity reach 0.5556 and 0.5488, respectively.
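
    The abstract does not spell out the estimator, so the following is only a minimal sketch under one common reading of kernel discriminant analysis: each class density is estimated with a kernel density estimate, and a new observation goes to the class with the largest prior-weighted density. The one-feature setting, the bandwidth, the function names and the toy data are all assumptions made for illustration; only the four kernel names come from the abstract.

```python
import math

# Univariate kernels named in the abstract; each integrates to 1 over its support.
KERNELS = {
    "normal": lambda u: math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi),
    "epanechnikov": lambda u: 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0,
    "biweight": lambda u: (15.0 / 16.0) * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0,
    "triweight": lambda u: (35.0 / 32.0) * (1.0 - u * u) ** 3 if abs(u) <= 1.0 else 0.0,
}


def kernel_density(x, sample, bandwidth, kernel):
    """Kernel density estimate at x from a one-dimensional sample."""
    k = KERNELS[kernel]
    return sum(k((x - xi) / bandwidth) for xi in sample) / (len(sample) * bandwidth)


def classify(x, groups, bandwidth=1.0, kernel="normal"):
    """Assign x to the class with the largest prior-weighted density estimate."""
    total = sum(len(sample) for sample in groups.values())
    scores = {
        label: (len(sample) / total) * kernel_density(x, sample, bandwidth, kernel)
        for label, sample in groups.items()
    }
    return max(scores, key=scores.get)


def sensitivity_specificity(y_true, y_pred, positive):
    """Return (sensitivity, specificity) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical one-feature training data (not from the paper).
groups = {"good": [1.0, 1.2, 0.8, 1.5], "bad": [3.0, 3.3, 2.7, 3.1]}
label = classify(1.1, groups, bandwidth=1.0, kernel="triweight")
```

    With compact-support kernels such as the triweight, a point far from every training value gets a density of zero for all classes, so a practical version would need a rule for that case and a data-driven bandwidth.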

  7. Automotive System for Remote Surface Classification.

    PubMed

    Bystrov, Aleksandr; Hoare, Edward; Tran, Thuy-Yung; Clarke, Nigel; Gashinova, Marina; Cherniakov, Mikhail

    2017-04-01

    In this paper we shall discuss a novel approach to road surface recognition, based on the analysis of backscattered microwave and ultrasonic signals. The novelty of our method is sonar and polarimetric radar data fusion, extraction of features for separate swathes of illuminated surface (segmentation), and the use of a multi-stage artificial neural network for surface classification. The developed system consists of a 24 GHz radar and a 40 kHz ultrasonic sensor. The features are extracted from backscattered signals and then the procedures of principal component analysis and supervised classification are applied to the feature data. Special attention is paid to the multi-stage artificial neural network, which allows an overall increase in classification accuracy. The proposed technique was tested for recognition of a large number of real surfaces in different weather conditions with an average accuracy of correct classification of 95%. The obtained results thereby demonstrate that the use of the proposed system architecture and statistical methods allows for reliable discrimination of various road surfaces in real conditions.

  8. Automotive System for Remote Surface Classification

    PubMed Central

    Bystrov, Aleksandr; Hoare, Edward; Tran, Thuy-Yung; Clarke, Nigel; Gashinova, Marina; Cherniakov, Mikhail

    2017-01-01

    In this paper we shall discuss a novel approach to road surface recognition, based on the analysis of backscattered microwave and ultrasonic signals. The novelty of our method is sonar and polarimetric radar data fusion, extraction of features for separate swathes of illuminated surface (segmentation), and the use of a multi-stage artificial neural network for surface classification. The developed system consists of a 24 GHz radar and a 40 kHz ultrasonic sensor. The features are extracted from backscattered signals and then the procedures of principal component analysis and supervised classification are applied to the feature data. Special attention is paid to the multi-stage artificial neural network, which allows an overall increase in classification accuracy. The proposed technique was tested for recognition of a large number of real surfaces in different weather conditions with an average accuracy of correct classification of 95%. The obtained results thereby demonstrate that the use of the proposed system architecture and statistical methods allows for reliable discrimination of various road surfaces in real conditions. PMID:28368297

  9. Chemometric classification of morphologically similar Umbelliferae medicinal herbs by DART-TOF-MS fingerprint.

    PubMed

    Lee, Sang Min; Kim, Hye-Jin; Jang, Young Pyo

    2012-01-01

    It needs many years of special training to gain expertise on the organoleptic classification of botanical raw materials and, even for those experts, discrimination among Umbelliferae medicinal herbs remains an intricate challenge due to their morphological similarity. To develop a new chemometric classification method using a direct analysis in real time-time of flight-mass spectrometry (DART-TOF-MS) fingerprinting for Umbelliferae medicinal herbs and to provide a platform for its application to the discrimination of other herbal medicines. Angelica tenuissima, Angelica gigas, Angelica dahurica and Cnidium officinale were chosen for this study and ten samples of each species were purchased from various Korean markets. DART-TOF-MS was employed on powdered raw materials to obtain a chemical fingerprint of each sample and the orthogonal partial-least squares method in discriminant analysis (OPLS-DA) was used for multivariate analysis. All samples of collected species were successfully discriminated from each other according to their characteristic DART-TOF-MS fingerprint. Decursin (or decursinol angelate) and byakangelicol were identified as marker molecules for Angelica gigas and A. dahurica, respectively. Using the OPLS method for discriminant analysis, Angelica tenuissima and Cnidium officinale were clearly separated into two groups. Angelica tenuissima was characterised by the presence of ligustilide and unidentified molecular ions of m/z 239 and 283, while senkyunolide A together with signals with m/z 387 and 389 were the marker compounds for Cnidium officinale. Elaborating with chemoinformatics, DART-TOF-MS fingerprinting with chemoinformatic tools results in a powerful method for the classification of morphologically similar Umbelliferae medicinal herbs and quality control of medicinal herbal products, including the extracts of these crude drugs. Copyright © 2012 John Wiley & Sons, Ltd.

  10. Unsupervised learning of discriminative edge measures for vehicle matching between nonoverlapping cameras.

    PubMed

    Shan, Ying; Sawhney, Harpreet S; Kumar, Rakesh

    2008-04-01

    This paper proposes a novel unsupervised algorithm learning discriminative features in the context of matching road vehicles between two non-overlapping cameras. The matching problem is formulated as a same-different classification problem, which aims to compute the probability of vehicle images from two distinct cameras being from the same vehicle or different vehicle(s). We employ a novel measurement vector that consists of three independent edge-based measures and their associated robust measures computed from a pair of aligned vehicle edge maps. The weight of each measure is determined by an unsupervised learning algorithm that optimally separates the same-different classes in the combined measurement space. This is achieved with a weak classification algorithm that automatically collects representative samples from same-different classes, followed by a more discriminative classifier based on Fisher's Linear Discriminants and Gibbs Sampling. The robustness of the match measures and the use of unsupervised discriminant analysis in the classification ensure that the proposed method performs consistently in the presence of missing/false features, temporally and spatially changing illumination conditions, and systematic misalignment caused by different camera configurations. Extensive experiments based on real data of over 200 vehicles at different times of day demonstrate promising results.
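
    To make the Fisher's Linear Discriminant step concrete, here is a minimal sketch of the textbook two-class discriminant in two dimensions, with weight vector w = Sw^-1 (m_a - m_b) and a threshold at the projected midpoint of the class means. It is not the paper's algorithm: the unsupervised sample collection and the Gibbs sampling stage are omitted, and the two-measure points and function names are hypothetical.

```python
def fisher_discriminant(class_a, class_b):
    """Two-class Fisher discriminant for 2-D points.

    Returns (w, threshold); a point x is assigned to class_a when
    w[0] * x[0] + w[1] * x[1] > threshold.
    """
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        return sxx, syy, sxy

    m_a, m_b = mean(class_a), mean(class_b)
    a_xx, a_yy, a_xy = scatter(class_a, m_a)
    b_xx, b_yy, b_xy = scatter(class_b, m_b)
    # Pooled within-class scatter matrix Sw = [[sxx, sxy], [sxy, syy]].
    sxx, syy, sxy = a_xx + b_xx, a_yy + b_yy, a_xy + b_xy
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    d = (m_a[0] - m_b[0], m_a[1] - m_b[1])
    # w = Sw^-1 (m_a - m_b), using the closed-form inverse of a 2x2 matrix.
    w = ((syy * d[0] - sxy * d[1]) / det, (-sxy * d[0] + sxx * d[1]) / det)
    mid = ((m_a[0] + m_b[0]) / 2, (m_a[1] + m_b[1]) / 2)
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold


def is_class_a(x, w, threshold):
    return w[0] * x[0] + w[1] * x[1] > threshold


# Hypothetical match measures for "same" and "different" vehicle pairs.
same = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
different = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)]
w, threshold = fisher_discriminant(same, different)
```

    The midpoint threshold assumes equal priors and similar spread in both classes; an unbalanced same/different problem would normally shift it.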

  11. Score-moment combined linear discrimination analysis (SMC-LDA) as an improved discrimination method.

    PubMed

    Han, Jintae; Chung, Hoeil; Han, Sung-Hwan; Yoon, Moon-Young

    2007-01-01

    A new discrimination method called the score-moment combined linear discrimination analysis (SMC-LDA) has been developed and its performance has been evaluated using three practical spectroscopic datasets. The key concept of SMC-LDA was to use not only the score from principal component analysis (PCA), but also the moment of the spectrum, as inputs for LDA to improve discrimination. Along with conventional score, moment is used in spectroscopic fields as an effective alternative for spectral feature representation. Three different approaches were considered. Initially, the score generated from PCA was projected onto a two-dimensional feature space by maximizing Fisher's criterion function (conventional PCA-LDA). Next, the same procedure was performed using only moment. Finally, both score and moment were utilized simultaneously for LDA. To evaluate discrimination performances, three different spectroscopic datasets were employed: (1) infrared (IR) spectra of normal and malignant stomach tissue, (2) near-infrared (NIR) spectra of diesel and light gas oil (LGO) and (3) Raman spectra of Chinese and Korean ginseng. For each case, the best discrimination results were achieved when both score and moment were used for LDA (SMC-LDA). Since the spectral representation character of moment was different from that of score, inclusion of both score and moment for LDA provided more diversified and descriptive information.
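
    The abstract does not define which moments are used, so the sketch below assumes the ordinary statistical moments of a spectrum whose intensities are normalised to sum to one along the spectral axis: the mean position and the central moments of higher order. The function names and the example numbers are hypothetical; the PCA scores are taken as already computed, and the combined vector is what would be passed on to LDA.

```python
def spectral_moments(axis, intensity, orders=(1, 2, 3)):
    """Moments of a spectrum treated as a distribution over the spectral axis.

    Order 1 is the intensity-weighted mean position; higher orders are
    central moments about that mean. Assumes non-negative intensities
    with a non-zero sum.
    """
    total = sum(intensity)
    weights = [value / total for value in intensity]
    mean = sum(w * x for w, x in zip(weights, axis))
    moments = []
    for order in orders:
        if order == 1:
            moments.append(mean)
        else:
            moments.append(sum(w * (x - mean) ** order for w, x in zip(weights, axis)))
    return moments


def combine_features(pca_scores, moments):
    """Concatenate PCA scores and moments into one feature vector."""
    return list(pca_scores) + list(moments)


# Hypothetical three-point spectrum and two hypothetical PCA scores.
moments = spectral_moments([1.0, 2.0, 3.0], [1.0, 2.0, 1.0])
features = combine_features([0.4, -1.3], moments)
```

    Because scores and moments live on different scales, a real pipeline would usually standardise the combined features before fitting LDA.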

  12. Using the PDD Behavior Inventory as a Level 2 Screener: A Classification and Regression Trees Analysis

    ERIC Educational Resources Information Center

    Cohen, Ira L.; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N. S.; Romanczyk, Raymond G.; Karmel, Bernard Z.; Gardner, Judith M.

    2016-01-01

    In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%,…

  13. Deep feature extraction and combination for synthetic aperture radar target classification

    NASA Astrophysics Data System (ADS)

    Amrani, Moussa; Jiang, Feng

    2017-10-01

    Feature extraction has always been a difficult problem in the classification performance of synthetic aperture radar automatic target recognition (SAR-ATR). It is very important to select discriminative features to train a classifier, which is a prerequisite. Inspired by the great success of convolutional neural network (CNN), we address the problem of SAR target classification by proposing a feature extraction method, which takes advantage of exploiting the extracted deep features from CNNs on SAR images to introduce more powerful discriminative features and robust representation ability for them. First, the pretrained VGG-S net is fine-tuned on moving and stationary target acquisition and recognition (MSTAR) public release database. Second, after a simple preprocessing is performed, the fine-tuned network is used as a fixed feature extractor to extract deep features from the processed SAR images. Third, the extracted deep features are fused by using a traditional concatenation and a discriminant correlation analysis algorithm. Finally, for target classification, K-nearest neighbors algorithm based on LogDet divergence-based metric learning triplet constraints is adopted as a baseline classifier. Experiments on MSTAR are conducted, and the classification accuracy results demonstrate that the proposed method outperforms the state-of-the-art methods.

  14. Taxonomic discrimination of higher plants by pyrolysis mass spectrometry.

    PubMed

    Kim, S W; Ban, S H; Chung, H J; Choi, D W; Choi, P S; Yoo, O J; Liu, J R

    2004-02-01

    Pyrolysis mass spectrometry (PyMS) is a rapid, simple, high-resolution analytical method based on thermal degradation of complex material in a vacuum and has been widely applied to the discrimination of closely related microbial strains. Leaf samples of six species and one variety of higher plants (Rosa multiflora, R. multiflora var. platyphylla, Sedum kamtschaticum, S. takesimense, S. sarmentosum, Hepatica insularis, and H. asiatica) were subjected to PyMS for spectral fingerprinting. Principal component analysis of PyMS data was not able to discriminate these plants in discrete clusters. However, canonical variate analysis of PyMS data separated these plants from one another. A hierarchical dendrogram based on canonical variate analysis was in agreement with the known taxonomy of the plants at the variety level. These results indicate that PyMS is able to discriminate higher plants based on taxonomic classification at the family, genus, species, and variety level.

  15. Discrimination of almonds (Prunus dulcis) geographical origin by minerals and fatty acids profiling.

    PubMed

    Amorello, Diana; Orecchio, Santino; Pace, Andrea; Barreca, Salvatore

    2016-09-01

    Twenty-one almond samples from three different geographical origins (Sicily, Spain and California) were investigated by determining mineral and fatty acid compositions. The data were used to discriminate almond origin chemometrically by linear discriminant analysis. With respect to previous PCA profiling studies, this work provides a simpler analytical protocol for the identification of almonds' geographical origin. Classification by using mineral content data only was correct for 77% of the samples, while, by using fatty acid profiles, the percentage of samples correctly classified reached 82%. The coupling of mineral contents and fatty acid profiles led to an increased efficiency of the classification, with 87% of samples correctly classified.

  16. A Critical Analysis of Anti-Discrimination Law and Microaggressions in Academia

    ERIC Educational Resources Information Center

    Lukes, Robin; Bangs, Joann

    2014-01-01

    This article provides a critical analysis of microaggressions and anti-discrimination law in academia. There are many challenges for faculty claiming discrimination under current civil rights laws. Examples of microaggressions that fall outside of anti-discrimination law will be provided. Traditional legal analysis of discrimination will not end…

  17. Classification of Malaysia aromatic rice using multivariate statistical analysis

    NASA Astrophysics Data System (ADS)

    Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.

    2015-05-01

    Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. The varieties are preferred by consumers because of criteria such as shape, colour, and distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance a specific climate and soil. Presently, aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: they require lengthy training, are prone to fatigue as the number of samples increases, and are inconsistent. GC-MS analysis techniques, on the other hand, require detailed procedures and lengthy analysis and are quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples' odour. The samples were taken from the rice varieties. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual inspection of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into distinct clusters. The LDA and KNN results, with low misclassification error, support these findings, and we conclude that the e-nose was successfully applied to the classification of the aromatic rice varieties.
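
    The Leave-One-Out evaluation of KNN described above can be sketched as follows. This is an illustration only: Euclidean distance, majority voting, k = 3 and the two-feature toy readings are assumptions, since the abstract does not give the sensor features or the KNN settings.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training points.

    train is a list of (features, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_error(data, k=3):
    """Hold out each sample in turn, train on the rest, and return the
    fraction of held-out samples that are misclassified."""
    errors = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, features, k) != label:
            errors += 1
    return errors / len(data)


# Hypothetical two-feature readings for two rice varieties.
data = [
    ((0.0, 0.0), "A"), ((0.0, 1.0), "A"), ((1.0, 0.0), "A"), ((1.0, 1.0), "A"),
    ((5.0, 5.0), "B"), ((5.0, 6.0), "B"), ((6.0, 5.0), "B"), ((6.0, 6.0), "B"),
]
error_rate = leave_one_out_error(data, k=3)
```

    Leave-one-out suits small sample sets like this one because every sample is used for testing exactly once while the training set stays almost complete.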

  18. Integrated Low-Rank-Based Discriminative Feature Learning for Recognition.

    PubMed

    Zhou, Pan; Lin, Zhouchen; Zhang, Chao

    2016-05-01

    Feature learning plays a central role in pattern recognition. In recent years, many representation-based feature learning methods have been proposed and have achieved great success in many applications. However, these methods perform feature learning and subsequent classification in two separate steps, which may not be optimal for recognition tasks. In this paper, we present a supervised low-rank-based approach for learning discriminative features. By integrating latent low-rank representation (LatLRR) with a ridge regression-based classifier, our approach combines feature learning with classification, so that the regulated classification error is minimized. In this way, the extracted features are more discriminative for the recognition tasks. Our approach benefits from a recent discovery on the closed-form solutions to noiseless LatLRR. When there is noise, a robust Principal Component Analysis (PCA)-based denoising step can be added as preprocessing. When the scale of a problem is large, we utilize a fast randomized algorithm to speed up the computation of robust PCA. Extensive experimental results demonstrate the effectiveness and robustness of our method.

  19. DIFFERENTIATION OF AURANTII FRUCTUS IMMATURUS AND FRUCTUS PONICIRI TRIFOLIATAE IMMATURUS BY FLOW-INJECTION WITH ULTRAVIOLET SPECTROSCOPIC DETECTION AND PROTON NUCLEAR MAGNETIC RESONANCE USING PARTIAL LEAST-SQUARES DISCRIMINANT ANALYSIS.

    PubMed

    Zhang, Mengliang; Zhao, Yang; Harrington, Peter de B; Chen, Pei

    2016-03-01

    Two simple fingerprinting methods, flow-injection coupled to ultraviolet spectroscopy and proton nuclear magnetic resonance, were used for discriminating between Aurantii fructus immaturus and Fructus poniciri trifoliatae immaturus. Both methods were combined with partial least-squares discriminant analysis. In the flow-injection method, four data representations were evaluated: total ultraviolet absorbance chromatograms, averaged ultraviolet spectra, absorbance at 193, 205, 225, and 283 nm, and absorbance at 225 and 283 nm. Prediction rates of 100% were achieved for all data representations by partial least-squares discriminant analysis using leave-one-sample-out cross-validation. The prediction rate for the proton nuclear magnetic resonance data by partial least-squares discriminant analysis with leave-one-sample-out cross-validation was also 100%. A new validation set of data was collected by flow-injection with ultraviolet spectroscopic detection two weeks later and predicted by partial least-squares discriminant analysis models constructed by the initial data representations with no parameter changes. The classification rates were 95% with the total ultraviolet absorbance chromatograms datasets and 100% with the other three datasets. Flow-injection with ultraviolet detection and proton nuclear magnetic resonance are simple, high throughput, and low-cost methods for discrimination studies.

  20. Multivariate analysis of volatile compounds detected by headspace solid-phase microextraction/gas chromatography: A tool for sensory classification of cork stoppers.

    PubMed

    Prat, Chantal; Besalú, Emili; Bañeras, Lluís; Anticó, Enriqueta

    2011-06-15

    The volatile fraction of aqueous cork macerates of tainted and non-tainted agglomerate cork stoppers was analysed by headspace solid-phase microextraction (HS-SPME)/gas chromatography. Twenty compounds, comprising terpenoids, aliphatic alcohols, lignin-related compounds and others, were selected and analysed in individual corks. Cork stoppers were previously classified in six different classes according to sensory descriptions including 2,4,6-trichloroanisole taint and other frequent, non-characteristic odours found in cork. A multivariate analysis of the chromatographic data of 20 selected chemical compounds using linear discriminant analysis models helped in the differentiation of the a priori made groups. The discriminant model selected five compounds as the best combination. Selected compounds appear in the model in the following order: 2,4,6-TCA, fenchyl alcohol, 1-octen-3-ol, benzyl alcohol and benzothiazole. Unfortunately, not all six a priori differentiated sensory classes were clearly discriminated in the model, probably indicating that no measurable differences exist in the chromatographic data for some categories. The predictive analyses of a refined model in which two sensory classes were fused together resulted in a good classification. Prediction rates of control (non-tainted), TCA, musty-earthy-vegetative, vegetative and chemical descriptions were 100%, 100%, 85%, 67.3% and 100%, respectively, when the modified model was used. The multivariate analysis of chromatographic data will help in the classification of stoppers and provide a perfect complement to sensory analyses. Copyright © 2010 Elsevier Ltd. All rights reserved.

  1. Q-mode versus R-mode principal component analysis for linear discriminant analysis (LDA)

    NASA Astrophysics Data System (ADS)

    Lee, Loong Chuen; Liong, Choong-Yeun; Jemain, Abdul Aziz

    2017-05-01

    Many studies apply Principal Component Analysis (PCA) as a preliminary visualization method, a variable construction method, or both. The focus of PCA can be on the samples (R-mode PCA) or variables (Q-mode PCA). Traditionally, R-mode PCA has been the usual approach to reduce high-dimensional data before applying Linear Discriminant Analysis (LDA) to solve classification problems. The output of PCA is composed of two new matrices, known as the loadings and scores matrices. Each matrix can then be used to produce a plot: the loadings plot aids identification of important variables, whereas the scores plot presents the spatial distribution of samples on new axes, also known as Principal Components (PCs). Fundamentally, the scores matrix is always the input for building the classification model. A recent paper uses Q-mode PCA, but the focus of the analysis was not on the variables but on the samples. As a result, the authors exchanged the roles of the loadings and scores plots: clustering of samples was studied using the loadings plot, whereas the scores plot was used to identify important manifest variables. Therefore, the aim of this study is to statistically validate the proposed practice. Evaluation is based on the external error obtained from LDA models according to the number of PCs. In addition, bootstrapping was conducted to evaluate the external error of each LDA model. Results show that LDA models built on PCs from R-mode PCA give logical performance and the matched external errors are unbiased, whereas those built on Q-mode PCA show the opposite. We therefore conclude that PCs produced from Q-mode PCA are not statistically stable and thus should not be applied to problems of classifying samples, only variables. We hope this paper will provide some insight into these disputed issues.

  2. Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems.

    PubMed

    Lê Cao, Kim-Anh; Boitard, Simon; Besse, Philippe

    2011-06-22

    Variable selection on high throughput biological data, such as gene expression or single nucleotide polymorphisms (SNPs), becomes inevitable to select relevant information and, therefore, to better characterize diseases or assess genetic structure. There are different ways to perform variable selection in large data sets. Statistical tests are commonly used to identify differentially expressed features for explanatory purposes, whereas Machine Learning wrapper approaches can be used for predictive purposes. In the case of multiple highly correlated variables, another option is to use multivariate exploratory approaches to give more insight into cell biology, biological pathways or complex traits. A simple extension of a sparse PLS exploratory approach is proposed to perform variable selection in a multiclass classification framework. sPLS-DA has a classification performance similar to other wrapper or sparse discriminant analysis approaches on public microarray and SNP data sets. More importantly, sPLS-DA is clearly competitive in terms of computational efficiency and superior in terms of interpretability of the results via valuable graphical outputs. sPLS-DA is available in the R package mixOmics, which is dedicated to the analysis of large biological data sets.

  3. Describing three-class task performance: three-class linear discriminant analysis and three-class ROC analysis

    NASA Astrophysics Data System (ADS)

    He, Xin; Frey, Eric C.

    2007-03-01

    Binary ROC analysis has solid decision-theoretic foundations and a close relationship to linear discriminant analysis (LDA). In particular, for the case of Gaussian equal covariance input data, the area under the ROC curve (AUC) value has a direct relationship to the Hotelling trace. Many attempts have been made to extend binary classification methods to multi-class. For example, Fukunaga extended binary LDA to obtain multi-class LDA, which uses the multi-class Hotelling trace as a figure-of-merit, and we have previously developed a three-class ROC analysis method. This work explores the relationship between conventional multi-class LDA and three-class ROC analysis. First, we developed a linear observer, the three-class Hotelling observer (3-HO). For Gaussian equal covariance data, the 3-HO provides equivalent performance to the three-class ideal observer and, under less strict conditions, maximizes the signal to noise ratio for classification of all pairs of the three classes simultaneously. The 3-HO templates are not the eigenvectors obtained from multi-class LDA. Second, we show that the three-class Hotelling trace, which is the figure-of-merit in the conventional three-class extension of LDA, has significant limitations. Third, we demonstrate that, under certain conditions, there is a linear relationship between the eigenvectors obtained from multi-class LDA and 3-HO templates. We conclude that the 3-HO based on decision theory has advantages both in its decision theoretic background and in the usefulness of its figure-of-merit. Additionally, there exists the possibility of interpreting the two linear features extracted by the conventional extension of LDA from a decision theoretic point of view.
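
    For orientation, the standard textbook forms of the quantities named in this abstract can be written out. The notation below is assumed for illustration and is not taken from the paper: $S_w$ and $S_b$ are the within-class and between-class scatter matrices, $\Delta\mu$ is the difference of the two class means, $\Sigma$ is the common covariance, and $\Phi$ is the standard normal cumulative distribution function.

```latex
% Multi-class Hotelling trace, the figure-of-merit of conventional multi-class LDA
J = \operatorname{tr}\!\left( S_w^{-1} S_b \right)

% Binary case with Gaussian data and equal covariance \Sigma
\mathrm{SNR}^2 = \Delta\mu^{\top} \Sigma^{-1} \Delta\mu,
\qquad
\mathrm{AUC} = \Phi\!\left( \frac{\mathrm{SNR}}{\sqrt{2}} \right)
```

    In the two-class case $S_b$ is proportional to $\Delta\mu\,\Delta\mu^{\top}$, so $J$ is proportional to $\mathrm{SNR}^2$, which is one way to see the direct relationship between the AUC and the Hotelling trace that the abstract refers to.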

  4. Laser-induced breakdown spectroscopy-based investigation and classification of pharmaceutical tablets using multivariate chemometric analysis

    PubMed Central

    Myakalwar, Ashwin Kumar; Sreedhar, S.; Barman, Ishan; Dingari, Narahara Chari; Rao, S. Venugopal; Kiran, P. Prem; Tewari, Surya P.; Kumar, G. Manoj

    2012-01-01

    We report the effectiveness of laser-induced breakdown spectroscopy (LIBS) in probing the content of pharmaceutical tablets and also investigate its feasibility for routine classification. This method is particularly beneficial in applications where its exquisite chemical specificity and suitability for remote and on site characterization significantly improves the speed and accuracy of quality control and assurance process. Our experiments reveal that in addition to the presence of carbon, hydrogen, nitrogen and oxygen, which can be primarily attributed to the active pharmaceutical ingredients, specific inorganic atoms were also present in all the tablets. Initial attempts at classification by a ratiometric approach using oxygen to nitrogen compositional values yielded an optimal value (at 746.83 nm) with the least relative standard deviation but nevertheless failed to provide an acceptable classification. To overcome this bottleneck in the detection process, two chemometric algorithms, i.e. principal component analysis (PCA) and soft independent modeling of class analogy (SIMCA), were implemented to exploit the multivariate nature of the LIBS data demonstrating that LIBS has the potential to differentiate and discriminate among pharmaceutical tablets. We report excellent prospective classification accuracy using supervised classification via the SIMCA algorithm, demonstrating its potential for future applications in process analytical technology, especially for fast on-line process control monitoring applications in the pharmaceutical industry. PMID:22099648

  5. Comparison of standard maximum likelihood classification and polytomous logistic regression used in remote sensing

    Treesearch

    John Hogland; Nedret Billor; Nathaniel Anderson

    2013-01-01

    Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...

  6. The UXO Classification Demonstration at the Former Camp Butner, NC

    DTIC Science & Technology

    2011-07-01

    Symposium and Workshop, Technical Session 2D: Classification Methods for Military Munitions Response. 1 December 2010. [49] Pasion, L. Personal...Communication. 15 June 2011. [50] Pasion, L. “Practical Strategies for UXO Discrimination: Camp Butner Analysis.” ESTCP Munitions Management In-Progress...Review. 9 February 2011. [51] Pasion, L., et al. “UXO Discrimination Using Full Coverage and Cued Interrogation Data Sets at Camp Butner, NC.” Partners

  7. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis

    PubMed Central

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Background: Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. Results: The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Conclusion: Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended. PMID:26793655
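
    The comparison indicators named in this abstract (sensitivity, specificity and the area under the ROC curve) can be computed from predicted scores as in the sketch below. The labels and scores are hypothetical and are not the study's data, and the 0.5 decision threshold is an assumption made only for the example.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = unwanted pregnancy) and model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc(y_true, scores):.3f}")
```

    Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarises ranking quality over all thresholds, which makes it a convenient single figure for comparing logistic, probit and discriminant models.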

  8. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis.

    PubMed

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.

  9. Combination of laser-induced breakdown spectroscopy and Raman spectroscopy for multivariate classification of bacteria

    NASA Astrophysics Data System (ADS)

    Prochazka, D.; Mazura, M.; Samek, O.; Rebrošová, K.; Pořízka, P.; Klus, J.; Prochazková, P.; Novotný, J.; Novotný, K.; Kaiser, J.

    2018-01-01

    In this work, we investigate the impact of data provided by complementary laser-based spectroscopic methods on multivariate classification accuracy. Discrimination and classification of five Staphylococcus bacterial strains and one strain of Escherichia coli is presented. The technique that we used for measurements is a combination of Raman spectroscopy and Laser-Induced Breakdown Spectroscopy (LIBS). Obtained spectroscopic data were then processed using Multivariate Data Analysis algorithms. Principal Components Analysis (PCA) was selected as the most suitable technique for visualization of bacterial strains data. To classify the bacterial strains, we used Neural Networks, namely a supervised version of Kohonen's self-organizing maps (SOM). We were processing results in three different ways - separately from LIBS measurements, from Raman measurements, and we also merged data from both mentioned methods. The three types of results were then compared. By applying the PCA to Raman spectroscopy data, we observed that two bacterial strains were fully distinguished from the rest of the data set. In the case of LIBS data, three bacterial strains were fully discriminated. Using a combination of data from both methods, we achieved the complete discrimination of all bacterial strains. All the data were classified with a high success rate using SOM algorithm. The most accurate classification was obtained using a combination of data from both techniques. The classification accuracy varied, depending on specific samples and techniques. As for LIBS, the classification accuracy ranged from 45% to 100%, as for Raman Spectroscopy from 50% to 100% and in case of merged data, all samples were classified correctly. Based on the results of the experiments presented in this work, we can assume that the combination of Raman spectroscopy and LIBS significantly enhances discrimination and classification accuracy of bacterial species and strains. The reason is the complementarity in

  10. Trace element analysis of rough diamond by LA-ICP-MS: a case of source discrimination?

    PubMed

    Dalpé, Claude; Hudon, Pierre; Ballantyne, David J; Williams, Darrell; Marcotte, Denis

    2010-11-01

    Current profiling of rough diamond source is performed using different physical and/or morphological techniques that require strong knowledge and experience in the field. More recently, chemical impurities have been used to discriminate diamond source and with the advance of laser ablation-inductively coupled plasma-mass spectrometry (LA-ICP-MS) empirical profiling of rough diamonds is possible to some extent. In this study, we present a LA-ICP-MS methodology that we developed for analyzing ultra-trace element impurities in rough diamond for origin determination ("profiling"). Diamonds from two sources were analyzed by LA-ICP-MS and were statistically classified by accepted methods. For the two diamond populations analyzed in this study, binomial logistic regression produced a better overall correct classification than linear discriminant analysis. The results suggest that an anticipated matrix match reference material would improve the robustness of our methodology for forensic applications. © 2010 American Academy of Forensic Sciences.

  11. Speech Music Discrimination Using Class-Specific Features

    DTIC Science & Technology

    2004-08-01

    Speech Music Discrimination Using Class-Specific Features Thomas Beierholm...between speech and music. Feature extraction is class-specific and can therefore be tailored to each class meaning that segment size, model orders...interest. Some of the applications of audio signal classification are speech/music classification [1], acoustical environmental classification [2][3

  12. Vessel Classification in Cosmo-Skymed SAR Data Using Hierarchical Feature Selection

    NASA Astrophysics Data System (ADS)

    Makedonas, A.; Theoharatos, C.; Tsagaris, V.; Anastasopoulos, V.; Costicoglou, S.

    2015-04-01

    SAR based ship detection and classification are important elements of maritime monitoring applications. Recently, high-resolution SAR data have opened new possibilities to researchers for achieving improved classification results. In this work, a hierarchical vessel classification procedure is presented based on a robust feature extraction and selection scheme that utilizes scale, shape and texture features in a hierarchical way. Initially, different types of feature extraction algorithms are implemented in order to form the utilized feature pool, able to represent the structure, material, orientation and other vessel type characteristics. A two-stage hierarchical feature selection algorithm is utilized next in order to be able to discriminate effectively civilian vessels into three distinct types, in COSMO-SkyMed SAR images: cargos, small ships and tankers. In our analysis, scale and shape features are utilized in order to discriminate smaller types of vessels present in the available SAR data, or shape specific vessels. Then, the most informative texture and intensity features are incorporated in order to be able to better distinguish the civilian types with high accuracy. A feature selection procedure that utilizes heuristic measures based on features' statistical characteristics, followed by an exhaustive research with feature sets formed by the most qualified features is carried out, in order to discriminate the most appropriate combination of features for the final classification. In our analysis, five COSMO-SkyMed SAR data with 2.2m x 2.2m resolution were used to analyse the detailed characteristics of these types of ships. A total of 111 ships with available AIS data were used in the classification process. The experimental results show that this method has good performance in ship classification, with an overall accuracy reaching 83%. Further investigation of additional features and proper feature selection is currently in progress.

  13. Classification of Bacillus and Brevibacillus species using rapid analysis of lipids by mass spectrometry.

    PubMed

    AlMasoud, Najla; Xu, Yun; Trivedi, Drupad K; Salivo, Simona; Abban, Tom; Rattray, Nicholas J W; Szula, Ewa; AlRabiah, Haitham; Sayqal, Ali; Goodacre, Royston

    2016-11-01

    Bacillus are aerobic spore-forming bacteria that are known to lead to specific diseases, such as anthrax and food poisoning. This study focuses on the characterization of these bacteria by the detection of lipids extracted from 33 well-characterized strains from the Bacillus and Brevibacillus genera, with the aim to discriminate between the different species. For the purpose of analysing the lipids extracted from these bacterial samples, two rapid physicochemical techniques were used: matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF-MS) and liquid chromatography in conjunction with mass spectrometry (LC-MS). The findings of this investigation confirmed that MALDI-TOF-MS could be used to identify different bacterial lipids and, in combination with appropriate chemometrics, allowed for the discrimination between these different bacterial species, which was supported by LC-MS. The average correct classification rates for the seven species of bacteria were 62.23 and 77.03 % based on MALDI-TOF-MS and LC-MS data, respectively. The Procrustes distance for the two datasets was 0.0699, indicating that the results from the two techniques were very similar. In addition, we also compared these bacterial lipid MALDI-TOF-MS profiles to protein profiles also collected by MALDI-TOF-MS on the same bacteria (Procrustes distance, 0.1006). The level of discrimination between lipids and proteins was equivalent, and this further indicated the potential of MALDI-TOF-MS analysis as a rapid, robust and reliable method for the classification of bacteria based on different bacterial chemical components. Graphical abstract MALDI-MS has been successfully developed for the characterization of bacteria at the subspecies level using lipids and benchmarked against HPLC.

  14. Classification image analysis: estimation and statistical inference for two-alternative forced-choice experiments

    NASA Technical Reports Server (NTRS)

    Abbey, Craig K.; Eckstein, Miguel P.

    2002-01-01

    We consider estimation and statistical hypothesis testing on classification images obtained from the two-alternative forced-choice experimental paradigm. We begin with a probabilistic model of task performance for simple forced-choice detection and discrimination tasks. Particular attention is paid to general linear filter models because these models lead to a direct interpretation of the classification image as an estimate of the filter weights. We then describe an estimation procedure for obtaining classification images from observer data. A number of statistical tests are presented for testing various hypotheses from classification images based on some more compact set of features derived from them. As an example of how the methods we describe can be used, we present a case study investigating detection of a Gaussian bump profile.

  15. Voice based gender classification using machine learning

    NASA Astrophysics Data System (ADS)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problems in speech analysis today. It involves tracing the gender from acoustic data, i.e., pitch, median, frequency, etc. Machine learning gives promising results for classification problems in all research domains. There are several performance metrics for evaluating the algorithms of an area. We present a comparative model for evaluating five different machine learning algorithms based on eight different metrics in gender classification from acoustic data. The aim is to identify gender with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM), on the basis of eight different metrics. The main parameter in evaluating any algorithm is its performance. The misclassification rate must be low in classification problems, which means that the accuracy rate must be high. The location and gender of a person have become crucial in economic markets in the form of AdSense. With this comparative model, we assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  16. Discrimination of Bacillus anthracis from closely related microorganisms by analysis of 16S and 23S rRNA with oligonucleotide microchips

    DOEpatents

    Bavykin, Sergei G.; Mirzabekova, legal representative, Natalia V.; Mirzabekov, deceased, Andrei D.

    2007-12-04

    The present invention relates to methods and compositions for using nucleotide sequence variations of 16S and 23S rRNA within the B. cereus group to discriminate a highly infectious bacterium B. anthracis from closely related microorganisms. Sequence variations in the 16S and 23S rRNA of the B. cereus subgroup including B. anthracis are utilized to construct an array that can detect these sequence variations through selective hybridizations and discriminate B. cereus group that includes B. anthracis. Discrimination of single base differences in rRNA was achieved with a microchip during analysis of B. cereus group isolates from both single and in mixed samples, as well as identification of polymorphic sites. Successful use of a microchip to determine the appropriate subgroup classification using eight reference microorganisms from the B. cereus group as a study set, was demonstrated.

  17. Medical image classification based on multi-scale non-negative sparse coding.

    PubMed

    Zhang, Ruijie; Shen, Jian; Wei, Fushan; Li, Xiong; Sangaiah, Arun Kumar

    2017-11-01

    With the rapid development of modern medical imaging technology, medical image classification has become more and more important in medical diagnosis and clinical practice. Conventional medical image classification algorithms usually neglect the semantic gap problem between low-level features and high-level image semantic, which will largely degrade the classification performance. To solve this problem, we propose a multi-scale non-negative sparse coding based medical image classification algorithm. Firstly, Medical images are decomposed into multiple scale layers, thus diverse visual details can be extracted from different scale layers. Secondly, for each scale layer, the non-negative sparse coding model with fisher discriminative analysis is constructed to obtain the discriminative sparse representation of medical images. Then, the obtained multi-scale non-negative sparse coding features are combined to form a multi-scale feature histogram as the final representation for a medical image. Finally, SVM classifier is combined to conduct medical image classification. The experimental results demonstrate that our proposed algorithm can effectively utilize multi-scale and contextual spatial information of medical images, reduce the semantic gap in a large degree and improve medical image classification performance. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Rapid discrimination of bergamot essential oil by paper spray mass spectrometry and chemometric analysis.

    PubMed

    Taverna, Domenico; Di Donna, Leonardo; Mazzotti, Fabio; Tagarelli, Antonio; Napoli, Anna; Furia, Emilia; Sindona, Giovanni

    2016-09-01

    A novel approach for the rapid discrimination of bergamot essential oil from other citrus fruits oils is presented. The method was developed using paper spray mass spectrometry (PS-MS) allowing for a rapid molecular profiling coupled with a statistic tool for a precise and reliable discrimination between the bergamot complex matrix and other similar matrices, commonly used for its reconstitution. Ambient mass spectrometry possesses the ability to record mass spectra of ordinary samples, in their native environment, without sample preparation or pre-separation by creating ions outside the instrument. The present study reports a PS-MS method for the determination of oxygen heterocyclic compounds such as furocoumarins, psoralens and flavonoids present in the non-volatile fraction of citrus fruits essential oils followed by chemometric analysis. The volatile fraction of Bergamot is one of the most known and fashionable natural products, which found applications in flavoring industry as ingredient in beverages and flavored foodstuff. The development of the presented method employed bergamot, sweet orange, orange, cedar, grapefruit and mandarin essential oils. PS-MS measurements were carried out in full scan mode for a total run time of 2 min. The capability of PS-MS profiling to act as marker for the classification of bergamot essential oils was evaluated by using multivariate statistical analysis. Two pattern recognition techniques, linear discriminant analysis and soft independent modeling of class analogy, were applied to MS data. The cross-validation procedure has shown excellent results in terms of the prediction ability because both models have correctly classified all samples for each category. Copyright © 2016 John Wiley & Sons, Ltd.

  19. Discriminating semiarid vegetation using airborne imaging spectrometer data - A preliminary assessment

    NASA Technical Reports Server (NTRS)

    Thomas, Randall W.; Ustin, Susan L.

    1987-01-01

    A preliminary assessment was made of Airborne Imaging Spectrometer (AIS) data for discriminating and characterizing vegetation in a semiarid environment. May and October AIS data sets were acquired over a large alluvial fan in eastern California, on which were found Great Basin desert shrub communities. Maximum likelihood classification of a principal components representation of the May AIS data enabled discrimination of subtle spatial detail in images relating to vegetation and soil characteristics. The spatial patterns in the May AIS classification were, however, too detailed for complete interpretation with existing ground data. A similar analysis of the October AIS data yielded poor results. Comparison of AIS results with a similar analysis of May Landsat Thematic Mapper data showed that the May AIS data contained approximately three to four times as much spectrally coherent information. When only two shortwave infrared TM bands were used, results were similar to those from AIS data acquired in October.

  20. Rapid differentiation of Ghana cocoa beans by FT-NIR spectroscopy coupled with multivariate classification

    NASA Astrophysics Data System (ADS)

    Teye, Ernest; Huang, Xingyi; Dai, Huang; Chen, Quansheng

    2013-10-01

    A quick, accurate and reliable technique for discriminating cocoa beans according to geographical origin is essential for quality control and traceability management. This study presents the application of near infrared (NIR) spectroscopy and multivariate classification for the differentiation of Ghana cocoa beans. A total of 194 cocoa bean samples from seven cocoa growing regions were used. Principal component analysis (PCA) was used to extract relevant information from the spectral data, and this gave visible cluster trends. The performance of four multivariate classification methods, linear discriminant analysis (LDA), K-nearest neighbors (KNN), back propagation artificial neural network (BPANN) and support vector machine (SVM), was compared. The performance of the models was optimized by cross-validation. The results revealed that the SVM model was superior to all the other mathematical methods, with a discrimination rate of 100% in both the training and prediction sets after preprocessing with mean centering (MC). BPANN had a discrimination rate of 99.23% for the training set and 96.88% for the prediction set. The LDA model had 96.15% and 90.63% for the training and prediction sets, respectively. The KNN model had 75.01% for the training set and 72.31% for the prediction set. The non-linear classification methods used were superior to the linear ones. Generally, the results revealed that NIR spectroscopy coupled with the SVM model could be used successfully to discriminate cocoa beans according to their geographical origins for effective quality assurance.

  1. Benign-malignant mass classification in mammogram using edge weighted local texture features

    NASA Astrophysics Data System (ADS)

    Rabidas, Rinku; Midya, Abhishek; Sadhu, Anup; Chakraborty, Jayasree

    2016-03-01

    This paper introduces the novel Discriminative Robust Local Binary Pattern (DRLBP) and Discriminative Robust Local Ternary Pattern (DRLTP) for the classification of mammographic masses as benign or malignant. Masses are a common but challenging sign of breast cancer in mammography, and their diagnosis is a difficult task. DRLBP and DRLTP overcome the drawbacks of Local Binary Pattern (LBP) and Local Ternary Pattern (LTP) by discriminating a brighter object against a dark background and vice versa, and they preserve edge information along with texture information. In this study, several edge-preserving texture features are therefore extracted from DRLBP and DRLTP. Finally, a Fisher Linear Discriminant Analysis method is applied to the discriminating features, selected by a stepwise logistic regression method, for the classification of benign and malignant masses. The performance characteristics of DRLBP and DRLTP features are evaluated using a ten-fold cross-validation technique with 58 masses from the mini-MIAS database, and the best result is observed with DRLBP, which has an area under the receiver operating characteristic curve of 0.982.
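
    A two-class Fisher linear discriminant of the kind named in this abstract can be sketched for two features as follows. This is a minimal illustration under stated assumptions: the feature values are hypothetical, the function names are invented for the example, equal priors are assumed for the threshold, and the stepwise feature selection and ten-fold cross-validation used in the study are omitted.

```python
def mean(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fisher_lda_2d(class0, class1):
    """Return (w, threshold) with w = S_w^{-1} (m1 - m0) for two-feature data."""
    m0, m1 = mean(class0), mean(class1)
    # Pooled within-class scatter matrix S_w (2 x 2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((class0, m0), (class1, m1)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [dot(inv[0], diff), dot(inv[1], diff)]
    # Threshold at the midpoint of the projected class means (equal priors assumed).
    threshold = 0.5 * (dot(w, m0) + dot(w, m1))
    return w, threshold


def classify(x, w, threshold):
    """Return 1 (second class) if the projection of x exceeds the threshold, else 0."""
    return 1 if dot(w, x) > threshold else 0


# Hypothetical texture features for benign (0) and malignant (1) masses.
benign = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.8, 2.1]]
malignant = [[3.0, 3.5], [3.4, 3.1], [2.8, 3.9], [3.3, 3.6]]

w, threshold = fisher_lda_2d(benign, malignant)
print(classify([1.1, 2.2], w, threshold), classify([3.2, 3.4], w, threshold))
```

    The projection direction maximises the separation of the class means relative to the within-class scatter, so a single threshold on the projected value is enough to separate the two classes in this toy example.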

  2. Commercial tree species discrimination using airborne AISA Eagle hyperspectral imagery and partial least squares discriminant analysis (PLS-DA) in KwaZulu-Natal, South Africa

    NASA Astrophysics Data System (ADS)

    Peerbhay, Kabir Yunus; Mutanga, Onisimo; Ismail, Riyad

    2013-05-01

    Discriminating commercial tree species using hyperspectral remote sensing techniques is critical in monitoring the spatial distributions and compositions of commercial forests. However, issues related to data dimensionality and multicollinearity limit the successful application of the technology. The aim of this study was to examine the utility of the partial least squares discriminant analysis (PLS-DA) technique in accurately classifying six exotic commercial forest species (Eucalyptus grandis, Eucalyptus nitens, Eucalyptus smithii, Pinus patula, Pinus elliotii and Acacia mearnsii) using airborne AISA Eagle hyperspectral imagery (393-900 nm). Additionally, the variable importance in the projection (VIP) method was used to identify subsets of bands that could successfully discriminate the forest species. Results indicated that the PLS-DA model that used all the AISA Eagle bands (n = 230) produced an overall accuracy of 80.61% and a kappa value of 0.77, with user's and producer's accuracies ranging from 50% to 100%. In comparison, incorporating the optimal subset of VIP selected wavebands (n = 78) in the PLS-DA model resulted in an improved overall accuracy of 88.78% and a kappa value of 0.87, with user's and producer's accuracies ranging from 70% to 100%. Bands located predominantly within the visible region of the electromagnetic spectrum (393-723 nm) showed the most capability in terms of discriminating between the six commercial forest species. Overall, the research has demonstrated the potential of using PLS-DA for reducing the dimensionality of hyperspectral datasets as well as determining the optimal subset of bands to produce the highest classification accuracies.
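    The overall accuracy, kappa value, and user's and producer's accuracies quoted above are all derived from a confusion matrix. The sketch below shows the standard definitions on a small invented 3-class matrix; the matrix values are hypothetical and are not the study's data, and the convention that rows hold reference classes and columns hold predicted classes is an assumption made here.

```python
# Accuracy measures from a confusion matrix (rows = reference, columns = predicted).
# The example matrix is invented for illustration only.

def overall_accuracy(cm):
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def cohen_kappa(cm):
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = overall_accuracy(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    return (observed - expected) / (1 - expected)

def producers_accuracy(cm):
    # Correctly classified samples divided by the reference total of each class.
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]

def users_accuracy(cm):
    # Correctly classified samples divided by the total predicted as each class.
    n = len(cm)
    return [cm[j][j] / sum(cm[i][j] for i in range(n)) for j in range(n)]

example_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
accuracy = overall_accuracy(example_cm)
kappa = cohen_kappa(example_cm)
```

    Kappa is lower than the overall accuracy because it discounts the agreement that the marginal totals alone would produce by chance.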

  3. Classification and disease prediction via mathematical programming

    NASA Astrophysics Data System (ADS)

    Lee, Eva K.; Wu, Tsung-Lin

    2007-11-01

    In this chapter, we present classification models based on mathematical programming approaches. We first provide an overview of various mathematical programming approaches, including linear programming, mixed integer programming, nonlinear programming and support vector machines. Next, we present our novel optimization-based classification models, which are general purpose and suitable for developing predictive rules for large heterogeneous biological and medical data sets. Our predictive model simultaneously incorporates (1) the ability to classify any number of distinct groups; (2) the ability to incorporate heterogeneous types of attributes as input; (3) a high-dimensional data transformation that eliminates noise and errors in biological data; (4) the ability to incorporate constraints to limit the rate of misclassification, and a reserved-judgment region that provides a safeguard against over-training (which tends to lead to high misclassification rates from the resulting predictive rule) and (5) successive multi-stage classification capability to handle data points placed in the reserved judgment region. To illustrate the power and flexibility of the classification model and solution engine, and its multigroup prediction capability, application of the predictive model to a broad class of biological and medical problems is described. Applications include: the differential diagnosis of the type of erythemato-squamous diseases; predicting presence/absence of heart disease; genomic analysis and prediction of aberrant CpG island methylation in human cancer; discriminant analysis of motility and morphology data in human lung carcinoma; prediction of ultrasonic cell disruption for drug delivery; identification of tumor shape and volume in treatment of sarcoma; multistage discriminant analysis of biomarkers for prediction of early atherosclerosis; fingerprinting of native and angiogenic microvascular networks for early diagnosis of diabetes, aging, macular

  4. Discrimination of wine from grape cultivated in Japan, imported wine, and others by multi-elemental analysis.

    PubMed

    Shimizu, Hideaki; Akamatsu, Fumikazu; Kamada, Aya; Koyama, Kazuya; Okuda, Masaki; Fukuda, Hisashi; Iwashita, Kazuhiro; Goto-Yamamoto, Nami

    2018-04-01

    Differences in mineral concentrations were examined among three types of wine in the Japanese market place: Japan wine, imported wine, and domestically produced wine mainly from foreign ingredients (DWF), where Japan wine has been recently defined by the National Tax Agency as domestically produced wine from grapes cultivated in Japan. The main objective of this study was to examine the possibility of controlling the authenticity of Japan wine. The concentrations of 18 minerals (Li, B, Na, Mg, Si, P, S, K, Ca, Mn, Co, Ni, Ga, Rb, Sr, Mo, Ba, and Pb) in 214 wine samples were determined by inductively coupled-plasma mass spectrometry (ICP-MS) and ICP-atomic emission spectrometry (ICP-AES). In general, Japan wine had a higher concentration of potassium and lower concentrations of eight elements (Li, B, Na, Si, S, Co, Sr, and Pb) as compared with the other two groups of wine. Linear discriminant analysis (LDA) models based on concentrations of the 18 minerals facilitated the identification of three wine groups: Japan wine, imported wine, and DWF with a 91.1% classification score and 87.9% prediction score. In addition, an LDA model for discrimination of wine from four domestic geographic origins (Yamanashi, Nagano, Hokkaido, and Yamagata Prefectures) using 18 elements gave a classification score of 93.1% and a prediction score of 76.4%. In summary, we have shown that an LDA model based on mineral concentrations is useful for distinguishing Japan wine from other wine groups, and can contribute to classification of the four main domestic wine-producing regions of Japan. Copyright © 2017 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  5. Improved classification accuracy of powdery mildew infection levels of wine grapes by spatial-spectral analysis of hyperspectral images.

    PubMed

    Knauer, Uwe; Matros, Andrea; Petrovic, Tijana; Zanker, Timothy; Scott, Eileen S; Seiffert, Udo

    2017-01-01

    Hyperspectral imaging is an emerging means of assessing plant vitality, stress parameters, nutrition status, and diseases. Extraction of target values from the high-dimensional datasets either relies on pixel-wise processing of the full spectral information, appropriate selection of individual bands, or calculation of spectral indices. Limitations of such approaches are reduced classification accuracy, reduced robustness due to spatial variation of the spectral information across the surface of the objects measured as well as a loss of information intrinsic to band selection and use of spectral indices. In this paper we present an improved spatial-spectral segmentation approach for the analysis of hyperspectral imaging data and its application for the prediction of powdery mildew infection levels (disease severity) of intact Chardonnay grape bunches shortly before veraison. Instead of calculating texture features (spatial features) for the huge number of spectral bands independently, dimensionality reduction by means of Linear Discriminant Analysis (LDA) was applied first to derive a few descriptive image bands. Subsequent classification was based on modified Random Forest classifiers and selective extraction of texture parameters from the integral image representation of the image bands generated. Dimensionality reduction, integral images, and the selective feature extraction led to improved classification accuracies of up to [Formula: see text] for detached berries used as a reference sample (training dataset). Our approach was validated by predicting infection levels for a sample of 30 intact bunches. Classification accuracy improved with the number of decision trees of the Random Forest classifier. These results corresponded with qPCR results. An accuracy of 0.87 was achieved in classification of healthy, infected, and severely diseased bunches. However, discrimination between visually healthy and infected bunches proved to be challenging for a few samples

  6. Morphometric classification of Spanish thoroughbred stallion sperm heads.

    PubMed

    Hidalgo, Manuel; Rodríguez, Inmaculada; Dorado, Jesús; Soler, Carles

    2008-01-30

    This work used semen samples collected from 12 stallions and assessed for sperm morphometry by the Sperm Class Analyzer (SCA) computer-assisted system. A discriminant analysis was performed on the morphometric data from that sperm to obtain a classification matrix for sperm head shape. Thereafter, we defined six types of sperm head shape. Classification of sperm head by this method obtained a globally correct assignment of 90.1%. Moreover, significant differences (p<0.05) were found between animals for all the sperm head morphometric parameters assessed.

  7. Automatic classification of spectral units in the Aristarchus plateau

    NASA Astrophysics Data System (ADS)

    Erard, S.; Le Mouelic, S.; Langevin, Y.

    1999-09-01

    A reduction scheme has been recently proposed for the NIR images of Clementine (Le Mouelic et al, JGR 1999). This reduction has been used to build an integrated UVvis-NIR image cube of the Aristarchus region, from which compositional and maturity variations can be studied (Pinet et al, LPSC 1999). We will present an analysis of this image cube, providing a classification in spectral types and spectral units. The image cube is processed with Gmode analysis using three different data sets. Normalized spectra provide a classification based mainly on spectral slope variations (i.e., maturity and volcanic glasses). This analysis discriminates between craters plus ejecta, mare basalts, and DMD. Olivine-rich areas and Aristarchus central peak are also recognized. Continuum-removed spectra provide a classification more related to compositional variations, which correctly identifies olivine- and pyroxene-rich areas (in Aristarchus, Krieger, Schiaparelli...). A third analysis uses spectral parameters related to maturity and Fe composition (reflectance, 1 µm band depth, and spectral slope) rather than intensities. It provides the most spatially consistent picture, but fails in detecting Vallis Schroeteri and DMDs. A supplementary unit, younger and rich in pyroxene, is found on Aristarchus south rim. In conclusion, Gmode analysis can discriminate between different spectral types already identified with more classic methods (PCA, linear mixing...). No previous assumption is made on the data structure, such as endmembers number and nature, or linear relationship between input variables. The variability of the spectral types is intrinsically accounted for, so that the level of analysis is always restricted to meaningful limits. A complete classification should integrate several analyses based on different sets of parameters. Gmode is therefore a powerful light tool to perform first-look analysis of spectral imaging data. This research has been partly funded by the French

  8. Classification and pose estimation of objects using nonlinear features

    NASA Astrophysics Data System (ADS)

    Talukder, Ashit; Casasent, David P.

    1998-03-01

    A new nonlinear feature extraction method called the maximum representation and discrimination feature (MRDF) method is presented for extraction of features from input image data. It implements transformations similar to the Sigma-Pi neural network. However, the weights of the MRDF are obtained in closed form, and offer advantages compared to nonlinear neural network implementations. The features extracted are useful for both object discrimination (classification) and object representation (pose estimation). We show its use in estimating the class and pose of images of real objects and rendered solid CAD models of machine parts from single views using a feature-space trajectory (FST) neural network classifier. We show more accurate classification and pose estimation results than are achieved by standard principal component analysis (PCA) and Fukunaga-Koontz (FK) feature extraction methods.

  9. Comparing ungulate dietary proxies using discriminant function analysis.

    PubMed

    Fraser, Danielle; Theodor, Jessica M

    2011-12-01

    A variety of tooth-wear and morphological dietary proxies have been proposed for ungulates. In turn, they have been applied to fossil specimens with the purpose of reconstructing the diets of extinct taxa. Although these dietary proxies have been used in isolation and in combination, a consistent set of statistical analyses has never been applied to all of the available datasets. The purpose of this study is to determine how well the most commonly used dietary proxies classify ungulates as browsers, grazers, and mixed feeders individually and in combination. Discriminant function analysis is applied to individual dietary proxies (hypsodonty, mesowear, microwear, and several cranial dietary proxies) and to combinations thereof to compare rates of successful dietary classification. In general, the tooth-wear dietary proxies (mesowear and microwear) perform better than morphological dietary proxies, though none are strong proxies in isolation. The success rates of the cranial dietary proxies are not increased substantially when ruminants and bovids are analyzed separately, and significance among the three dietary guilds is reduced when controlling for phylogenetic relatedness. The combination of hypsodonty, mesowear, and microwear is found to have a high rate of successful dietary classification, but a combination of all commonly used proxies increases the success rate to 100%. In most cases, mixed feeders bear the greatest resemblance to browsers suggesting that a morphology intermediate to browsers and grazers may represent a fitness valley resulting from the inability to exploit both browse and graze efficiently. These results are important for future paleoecological studies and should be used as a guide for determining which dietary proxies are appropriate to the research question. Copyright © 2011 Wiley-Liss, Inc.

  10. Phylogenetic comparative methods complement discriminant function analysis in ecomorphology.

    PubMed

    Barr, W Andrew; Scott, Robert S

    2014-04-01

    In ecomorphology, Discriminant Function Analysis (DFA) has been used as evidence for the presence of functional links between morphometric variables and ecological categories. Here we conduct simulations of characters containing phylogenetic signal to explore the performance of DFA under a variety of conditions. Characters were simulated using a phylogeny of extant antelope species from known habitats. Characters were modeled with no biomechanical relationship to the habitat category; the only sources of variation were body mass, phylogenetic signal, or random "noise." DFA on the discriminability of habitat categories was performed using subsets of the simulated characters, and Phylogenetic Generalized Least Squares (PGLS) was performed for each character. Analyses were repeated with randomized habitat assignments. When simulated characters lacked phylogenetic signal and/or habitat assignments were random, <5.6% of DFAs and <8.26% of PGLS analyses were significant. When characters contained phylogenetic signal and actual habitats were used, 33.27 to 45.07% of DFAs and <13.09% of PGLS analyses were significant. False Discovery Rate (FDR) corrections for multiple PGLS analyses reduced the rate of significance to <4.64%. In all cases using actual habitats and characters with phylogenetic signal, correct classification rates of DFAs exceeded random chance. In simulations involving phylogenetic signal in both predictor variables and predicted categories, PGLS with FDR was rarely significant, while DFA often was. In short, DFA offered no indication that differences between categories might be explained by phylogenetic signal, while PGLS did. As such, PGLS provides a valuable tool for testing the functional hypotheses at the heart of ecomorphology. Copyright © 2013 Wiley Periodicals, Inc.
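    The abstract above reports that False Discovery Rate (FDR) corrections lowered the share of significant PGLS results, but it does not say which FDR procedure was used. The sketch below assumes the common Benjamini-Hochberg step-up procedure; the p-values and the function name are invented for illustration and do not come from the study.

```python
# Benjamini-Hochberg step-up procedure (an assumed choice of FDR method).
# The p-values below are hypothetical.

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis whose rank is at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected

p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
uncorrected = [p < 0.05 for p in p_values]
corrected = benjamini_hochberg(p_values, alpha=0.05)
```

    With these made-up values, five tests fall below 0.05 before correction and two remain significant afterwards, which mirrors the direction of the effect described in the abstract.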

  11. Forensic Discrimination of Latent Fingerprints Using Laser-Induced Breakdown Spectroscopy (LIBS) and Chemometric Approaches.

    PubMed

    Yang, Jun-Ho; Yoh, Jack J

    2018-01-01

    A novel technique is reported for separating overlapping latent fingerprints using chemometric approaches that combine laser-induced breakdown spectroscopy (LIBS) and multivariate analysis. The LIBS technique provides the capability of real time analysis and high frequency scanning as well as the data regarding the chemical composition of overlapping latent fingerprints. These spectra offer valuable information for the classification and reconstruction of overlapping latent fingerprints by implementing appropriate statistical multivariate analysis. The current study employs principal component analysis and partial least square methods for the classification of latent fingerprints from the LIBS spectra. This technique was successfully demonstrated through a classification study of four distinct latent fingerprints using classification methods such as soft independent modeling of class analogy (SIMCA) and partial least squares discriminant analysis (PLS-DA). The novel method yielded an accuracy of more than 85% and was proven to be sufficiently robust. Furthermore, through laser scanning analysis at a spatial interval of 125 µm, the overlapping fingerprints were reconstructed as separate two-dimensional forms.

  12. Bulk Magnetization Effects in EMI-Based Classification and Discrimination

    DTIC Science & Technology

    2012-04-01

    response adds to classification performance and (2) develop a comprehensive understanding of the engineering challenges of primary field cancellation that can support a

  13. AVHRR channel selection for land cover classification

    USGS Publications Warehouse

    Maxwell, S.K.; Hoffer, R.M.; Chapman, P.L.

    2002-01-01

    Mapping land cover of large regions often requires processing of satellite images collected from several time periods at many spectral wavelength channels. However, manipulating and processing large amounts of image data increases the complexity and time, and hence the cost, that it takes to produce a land cover map. Very few studies have evaluated the importance of individual Advanced Very High Resolution Radiometer (AVHRR) channels for discriminating cover types, especially the thermal channels (channels 3, 4 and 5). Studies rarely perform a multi-year analysis to determine the impact of inter-annual variability on the classification results. We evaluated 5 years of AVHRR data using combinations of the original AVHRR spectral channels (1-5) to determine which channels are most important for cover type discrimination, yet stabilize inter-annual variability. Particular attention was placed on the channels in the thermal portion of the spectrum. Fourteen cover types over the entire state of Colorado were evaluated using a supervised classification approach on all two-, three-, four- and five-channel combinations for seven AVHRR biweekly composite datasets covering the entire growing season for each of 5 years. Results show that all three of the major portions of the electromagnetic spectrum represented by the AVHRR sensor are required to discriminate cover types effectively and stabilize inter-annual variability. Of the two-channel combinations, channels 1 (red visible) and 2 (near-infrared) had, by far, the highest average overall accuracy (72.2%), yet the inter-annual classification accuracies were highly variable. Including a thermal channel (channel 4) significantly increased the average overall classification accuracy by 5.5% and stabilized interannual variability. Each of the thermal channels gave similar classification accuracies; however, because of the problems in consistently interpreting channel 3 data, either channel 4 or 5 was found to be a more
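    The study design above evaluates every two-, three-, four- and five-channel combination of the five AVHRR channels. A minimal sketch of enumerating those subsets is given below; the variable names are hypothetical, and the grouping of channels 3, 4 and 5 as thermal follows the abstract.

```python
# Enumerate all channel subsets of size 2 to 5 from AVHRR channels 1-5.
from itertools import combinations

channels = [1, 2, 3, 4, 5]
thermal = {3, 4, 5}  # thermal channels, as named in the abstract

subsets = [combo for size in range(2, 6) for combo in combinations(channels, size)]

# Count how many subsets exist for each size.
counts_by_size = {}
for combo in subsets:
    counts_by_size[len(combo)] = counts_by_size.get(len(combo), 0) + 1

# Subsets that contain no thermal channel at all.
without_thermal = [combo for combo in subsets if not thermal.intersection(combo)]
```

    Only the red and near-infrared pair (channels 1 and 2) avoids the thermal channels entirely, so every other candidate subset includes at least one of channels 3, 4 or 5.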

  14. Discriminant Analysis of Student Loan Applications

    ERIC Educational Resources Information Center

    Dyl, Edward A.; McGann, Anthony F.

    1977-01-01

    The use of discriminant analysis in identifying potentially "good" versus potentially "bad" student loans is explained. The technique is applied to a sample of 200 student loan applications at the University of Wyoming. (LBH)

  15. Retinal vasculature classification using novel multifractal features

    NASA Astrophysics Data System (ADS)

    Ding, Y.; Ward, W. O. C.; Duan, Jinming; Auer, D. P.; Gowland, Penny; Bai, L.

    2015-11-01

    Retinal blood vessels have been implicated in a large number of diseases including diabetic retinopathy and cardiovascular diseases, which cause damages to retinal blood vessels. The availability of retinal vessel imaging provides an excellent opportunity for monitoring and diagnosis of retinal diseases, and automatic analysis of retinal vessels will help with the processes. However, state of the art vascular analysis methods such as counting the number of branches or measuring the curvature and diameter of individual vessels are unsuitable for the microvasculature. There has been published research using fractal analysis to calculate fractal dimensions of retinal blood vessels, but so far there has been no systematic research extracting discriminant features from retinal vessels for classifications. This paper introduces new methods for feature extraction from multifractal spectra of retinal vessels for classification. Two publicly available retinal vascular image databases are used for the experiments, and the proposed methods have produced accuracies of 85.5% and 77% for classification of healthy and diabetic retinal vasculatures. Experiments show that classification with multiple fractal features produces better rates compared with methods using a single fractal dimension value. In addition to this, experiments also show that classification accuracy can be affected by the accuracy of vessel segmentation algorithms.

  16. Discrimination of Gastrodia elata from Different Geographical Origin for Quality Evaluation Using Newly-Build Near Infrared Spectrum Coupled with Multivariate Analysis.

    PubMed

    Zuo, Yamin; Deng, Xuehua; Wu, Qing

    2018-05-04

    Discrimination of the geographical origin of Gastrodia elata (G. elata) is of great importance to pharmaceutical companies and consumers in China. This paper focuses on the feasibility of near infrared spectroscopy (NIRS) combined with multivariate analysis as a rapid and non-destructive method for this purpose. Firstly, 16 batches of G. elata samples from four main-cultivation regions in China were quantified by a traditional HPLC method. It showed that samples from different origins could not be efficiently differentiated by the contents of four phenolic compounds in this study. Secondly, the raw near infrared (NIR) spectra of those samples were acquired and two different pattern recognition techniques were used to classify the geographical origins. The results showed that, with spectral transformation optimized, discriminant analysis (DA) provided 97% and 99% correct classification for the calibration and validation sets, respectively, when discriminating the four main-cultivation regions, and 98% and 99% correct classification for the calibration and validation sets, respectively, for samples from eight different cities; in all cases it performed better than the principal component analysis (PCA) method. Thirdly, as phenolic compounds content (PCC) is highly related to the quality of G. elata, synergy interval partial least squares (Si-PLS) was applied to build the PCC prediction model. The coefficient of determination for prediction (Rp²) of the Si-PLS model was 0.9209, and the root mean square error for prediction (RMSEP) was 0.338. The two regions (4800-5200 cm⁻¹ and 5600-6000 cm⁻¹) selected by Si-PLS corresponded to the absorptions of the aromatic ring in the basic phenolic structure. It can be concluded that NIR spectroscopy combined with PCA, DA and Si-PLS would be a potential tool to provide a reference for the quality control of G. elata.

  17. Discriminant analysis of multiple cortical changes in mild cognitive impairment

    NASA Astrophysics Data System (ADS)

    Wu, Congling; Guo, Shengwen; Lai, Chunren; Wu, Yupeng; Zhao, Di; Jiang, Xingjun

    2017-02-01

    This study aimed to reveal the differences in brain structures and morphological changes between mild cognitive impairment (MCI) and normal control (NC) groups, and to analyze and predict the risk of MCI conversion. First, the baseline and 2-year longitudinal follow-up magnetic resonance (MR) images of 73 NC, 46 patients with stable MCI (sMCI) and 40 patients with converted MCI (cMCI) were selected. Second, FreeSurfer was used to extract the cortical features, including the cortical thickness, surface area, gray matter volume and mean curvature. Third, the support vector machine-recursive feature elimination method (SVM-RFE) was adopted to determine salient features for effective discrimination. Finally, the distribution and importance of essential brain regions were described. The experimental results showed that the cortical thickness and gray matter volume exhibited prominent capability in discrimination, whereas surface area and mean curvature performed relatively weakly. Furthermore, the combination of different morphological features, especially the baseline combined with the longitudinal changes, can be used to improve the performance of classification. In addition, brain regions with high weights were predominantly located in the temporal lobe and the frontal lobe, which are related to emotional control and memory functions. The results suggest that there were significantly different patterns in the brain structure and changes between the compared groups, which could not only be effectively applied for classification, but also be used to evaluate and predict the conversion of patients with MCI.

  18. LANDSAT applications to wetlands classification in the upper Mississippi River Valley. Ph.D. Thesis. Final Report

    NASA Technical Reports Server (NTRS)

    Lillesand, T. M.; Werth, L. F. (Principal Investigator)

    1980-01-01

    A 25% improvement in average classification accuracy was realized by processing double-date vs. single-date data. Under the spectrally and spatially complex site conditions characterizing the geographical area used, further improvement in wetland classification accuracy is apparently precluded by the spectral and spatial resolution restrictions of the LANDSAT MSS. Full scene analysis of scanning densitometer data extracted from scale infrared photography failed to permit discrimination of many wetland and nonwetland cover types. When classification of photographic data was limited to wetland areas only, much more detailed and accurate classification could be made. The integration of conventional image interpretation (to simply delineate wetland boundaries) and machine assisted classification (to discriminate among cover types present within the wetland areas) appears to warrant further research to study the feasibility and cost of extending this methodology over a large area using LANDSAT and/or small scale photography.

  19. Classification of M1/M2-polarized human macrophages by label-free hyperspectral reflectance confocal microscopy and multivariate analysis.

    PubMed

    Bertani, Francesca R; Mozetic, Pamela; Fioramonti, Marco; Iuliani, Michele; Ribelli, Giulia; Pantano, Francesco; Santini, Daniele; Tonini, Giuseppe; Trombetta, Marcella; Businaro, Luca; Selci, Stefano; Rainer, Alberto

    2017-08-21

    The possibility of detecting and classifying living cells in a label-free and non-invasive manner holds significant theranostic potential. In this work, Hyperspectral Imaging (HSI) has been successfully applied to the analysis of macrophagic polarization, given its central role in several pathological settings, including the regulation of tumour microenvironment. Human monocyte derived macrophages have been investigated using hyperspectral reflectance confocal microscopy, and hyperspectral datasets have been analysed in terms of M1 vs. M2 polarization by Principal Components Analysis (PCA). Following PCA, Linear Discriminant Analysis has been implemented for semi-automatic classification of macrophagic polarization from HSI data. Our results confirm the possibility to perform single-cell-level in vitro classification of M1 vs. M2 macrophages in a non-invasive and label-free manner with a high accuracy (above 98% for cells deriving from the same donor), supporting the idea of applying the technique to the study of complex interacting cellular systems, such as in the case of tumour-immunity in vitro models.

  20. Conversion Discriminative Analysis on Mild Cognitive Impairment Using Multiple Cortical Features from MR Images.

    PubMed

    Guo, Shengwen; Lai, Chunren; Wu, Congling; Cen, Guiyin

    2017-01-01

    Neuroimaging measurements derived from magnetic resonance imaging provide important information required for detecting changes related to the progression of mild cognitive impairment (MCI). Cortical features and changes play a crucial role in revealing unique anatomical patterns of brain regions, and further differentiate MCI patients from normal states. Four cortical features, namely, gray matter volume, cortical thickness, surface area, and mean curvature, were explored for discriminative analysis among three groups including the stable MCI (sMCI), the converted MCI (cMCI), and the normal control (NC) groups. In this study, 158 subjects (72 NC, 46 sMCI, and 40 cMCI) were selected from the Alzheimer's Disease Neuroimaging Initiative. A sparse-constrained regression model based on the l2-1-norm was introduced to reduce the feature dimensionality and retrieve essential features for the discrimination of the three groups by using a support vector machine (SVM). An optimized strategy of feature addition based on the weight of each feature was adopted for the SVM classifier in order to achieve the best classification performance. The baseline cortical features combined with the longitudinal measurements for 2 years of follow-up data yielded prominent classification results. In particular, the cortical thickness produced a classification with 98.84% accuracy, 97.5% sensitivity, and 100% specificity for the sMCI-cMCI comparison; 92.37% accuracy, 84.78% sensitivity, and 97.22% specificity for the cMCI-NC comparison; and 93.75% accuracy, 92.5% sensitivity, and 94.44% specificity for the sMCI-NC comparison. The best performances obtained by the SVM classifier using the essential features were 5-40% more than those using all of the retained features. The feasibility of the cortical features for the recognition of anatomical patterns was certified; thus, the proposed method has the potential to improve the clinical diagnosis of sub-types of MCI and predict the risk of its
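    The accuracy, sensitivity and specificity figures quoted above follow from the counts in a two-class confusion matrix. The sketch below shows the standard definitions. The counts passed in are an inference from the reported group sizes (40 cMCI, 46 sMCI), with cMCI treated as the positive class; the abstract does not state the underlying counts or the positive class, so both are assumptions made for illustration.

```python
# Accuracy, sensitivity and specificity from binary confusion counts.
# tp/fn/tn/fp values below are inferred, not reported in the abstract.

def binary_metrics(tp, fn, tn, fp):
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity

# Assumed: 39 of 40 cMCI subjects and 46 of 46 sMCI subjects classified correctly.
accuracy, sensitivity, specificity = binary_metrics(tp=39, fn=1, tn=46, fp=0)
accuracy_percent = round(accuracy * 100, 2)
```

    Under these assumed counts the three values round to 98.84%, 97.5% and 100%, matching the figures reported for the sMCI-cMCI comparison.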

  1. Discrimination between smiling faces: Human observers vs. automated face analysis.

    PubMed

    Del Líbano, Mario; Calvo, Manuel G; Fernández-Martín, Andrés; Recio, Guillermo

    2018-05-11

    This study investigated (a) how prototypical happy faces (with happy eyes and a smile) can be discriminated from blended expressions with a smile but non-happy eyes, depending on type and intensity of the eye expression; and (b) how smile discrimination differs for human perceivers versus automated face analysis, depending on affective valence and morphological facial features. Human observers categorized faces as happy or non-happy, or rated their valence. Automated analysis (FACET software) computed seven expressions (including joy/happiness) and 20 facial action units (AUs). Physical properties (low-level image statistics and visual saliency) of the face stimuli were controlled. Results revealed, first, that some blended expressions (especially, with angry eyes) had lower discrimination thresholds (i.e., they were identified as "non-happy" at lower non-happy eye intensities) than others (especially, with neutral eyes). Second, discrimination sensitivity was better for human perceivers than for automated FACET analysis. As an additional finding, affective valence predicted human discrimination performance, whereas morphological AUs predicted FACET discrimination. FACET can be a valid tool for categorizing prototypical expressions, but is currently more limited than human observers for discrimination of blended expressions. Configural processing facilitates detection of in/congruence(s) across regions, and thus detection of non-genuine smiling faces (due to non-happy eyes). Copyright © 2018 Elsevier B.V. All rights reserved.

  2. Authentication of whisky due to its botanical origin and way of production by instrumental analysis and multivariate classification methods

    NASA Astrophysics Data System (ADS)

    Wiśniewska, Paulina; Boqué, Ricard; Borràs, Eva; Busto, Olga; Wardencki, Waldemar; Namieśnik, Jacek; Dymerski, Tomasz

    2017-02-01

    Headspace mass-spectrometry (HS-MS), mid infrared (MIR) and UV-vis spectroscopy were used to authenticate whisky samples from different origins and ways of production (Irish, Spanish, Bourbon, Tennessee Whisky and Scotch). The collected spectra were processed with partial least-squares discriminant analysis (PLS-DA) to build the classification models. In all cases the five groups of whiskies were distinguished, but the best results were obtained by HS-MS, which indicates that the biggest differences between different types of whisky are due to their aroma. Differences were also found inside groups, showing that not only raw material is important to discriminate samples but also the way of their production. The methodology is quick, easy and does not require sample preparation.

  3. Rapid discrimination of the causal agents of urinary tract infection using ToF-SIMS with chemometric cluster analysis

    NASA Astrophysics Data System (ADS)

    Fletcher, John S.; Henderson, Alexander; Jarvis, Roger M.; Lockyer, Nicholas P.; Vickerman, John C.; Goodacre, Royston

    2006-07-01

    Advances in time of flight secondary ion mass spectrometry (ToF-SIMS) have enabled this technique to become a powerful tool for the analysis of biological samples. Such samples are often very complex and as a result full interpretation of the acquired data can be extremely difficult. To simplify the interpretation of these information rich data, the use of chemometric techniques is becoming widespread in the ToF-SIMS community. Here we discuss the application of principal components-discriminant function analysis (PC-DFA) to the separation and classification of a number of bacterial samples that are known to be major causal agents of urinary tract infection. A large data set has been generated using three biological replicates of each isolate and three machine replicates were acquired from each biological replicate. Ordination plots generated using the PC-DFA are presented demonstrating strain level discrimination of the bacteria. The results are discussed in terms of biological differences between certain species and with reference to FT-IR, Raman spectroscopy and pyrolysis mass spectrometric studies of similar samples.

  4. Quantization of liver tissue in dual kVp computed tomography using linear discriminant analysis

    NASA Astrophysics Data System (ADS)

    Tkaczyk, J. Eric; Langan, David; Wu, Xiaoye; Xu, Daniel; Benson, Thomas; Pack, Jed D.; Schmitz, Andrea; Hara, Amy; Palicek, William; Licato, Paul; Leverentz, Jaynne

    2009-02-01

    Linear discriminant analysis (LDA) is applied to dual kVp CT and used for tissue characterization. The potential to quantitatively model both malignant and benign, hypo-intense liver lesions is evaluated by analysis of portal-phase, intravenous CT scan data obtained on human patients. Masses with an a priori classification are mapped to a distribution of points in basis material space. The degree of localization of tissue types in the material basis space is related to both quantum noise and real compositional differences. The density maps are analyzed with LDA and studied with system simulations to differentiate these factors. The discriminant analysis is formulated so as to incorporate the known statistical properties of the data. Effective kVp separation and mAs relate to precision of tissue localization. Bias in the material position is related to the degree of X-ray scatter and partial-volume effect. Experimental data and simulations demonstrate that for single energy (HU) imaging or image-based decomposition, pixel values of water-like tissues depend on proximity to other iodine-filled bodies. Beam-hardening errors cause a shift in image value on the scale of the difference sought between cancerous and cystic lesions. In contrast, projection-based decomposition or its equivalent when implemented on a carefully calibrated system can provide accurate data. On such a system, LDA may provide novel quantitative capabilities for tissue characterization in dual energy CT.

  5. Recursive Partitioning Analysis for New Classification of Patients With Esophageal Cancer Treated by Chemoradiotherapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nomura, Motoo, E-mail: excell@hkg.odn.ne.jp; Department of Clinical Oncology, Aichi Cancer Center Hospital, Nagoya; Department of Radiation Oncology, Aichi Cancer Center Hospital, Nagoya

    2012-11-01

    Background: The 7th edition of the American Joint Committee on Cancer staging system does not include lymph node size in the guidelines for staging patients with esophageal cancer. The objectives of this study were to determine the prognostic impact of the maximum metastatic lymph node diameter (ND) on survival and to develop and validate a new staging system for patients with esophageal squamous cell cancer who were treated with definitive chemoradiotherapy (CRT). Methods: Information on 402 patients with esophageal cancer undergoing CRT at two institutions was reviewed. Univariate and multivariate analyses of data from one institution were used to assess the impact of clinical factors on survival, and recursive partitioning analysis was performed to develop the new staging classification. To assess its clinical utility, the new classification was validated using data from the second institution. Results: By multivariate analysis, gender, T, N, and ND stages were independently and significantly associated with survival (p < 0.05). The resulting new staging classification was based on the T and ND. The four new stages led to good separation of survival curves in both the developmental and validation datasets (p < 0.05). Conclusions: Our results showed that lymph node size is a strong independent prognostic factor and that the new staging system, which incorporated lymph node size, provided good prognostic power, and discriminated effectively for patients with esophageal cancer undergoing CRT.

  6. Sensory classification of table olives using an electronic tongue: Analysis of aqueous pastes and brines.

    PubMed

    Marx, Ítala; Rodrigues, Nuno; Dias, Luís G; Veloso, Ana C A; Pereira, José A; Drunkler, Deisy A; Peres, António M

    2017-01-01

    Table olives are highly appreciated and consumed worldwide. Several aspects are used for trade category classification, the sensory assessment of negative defects present in the olives and brines being one of the most important. The trade category quality classification must follow the International Olive Council directives, requiring the organoleptic assessment of defects by a trained sensory panel. However, the training process is a hard, complex and sometimes subjective task, and the low number of samples that can be evaluated per day is a major drawback considering the real needs of the olive industry. In this context, the development of electronic tongues as taste sensors for defects' sensory evaluation is of utmost relevance. So, an electronic tongue was used for table olives classification according to the presence and intensity of negative defects. Linear discrimination models were established based on sub-sets of sensor signals selected by a simulated annealing algorithm. The predictive potential of the novel approach was first demonstrated for standard solutions of chemical compounds that mimic butyric, putrid and zapateria defects (≥93% for cross-validation procedures). Then its applicability was verified using reference table olives/brine solutions samples identified with a single intense negative attribute, namely butyric, musty, putrid, zapateria or winey-vinegary defects (≥93% for cross-validation procedures). Finally, the E-tongue coupled with the same chemometric approach was applied to classify table olive samples according to the trade commercial categories (extra, 1st choice, 2nd choice and unsuitable for consumption) and an additional quality category (extra free of defects), established based on sensory analysis data. Despite the heterogeneity of the samples studied and number of different sensory defects perceived, the predictive linear discriminant model established showed sensitivities greater than 86%. So, the overall performance

  7. Discrimination and supervised classification of volcanic flows of the Puna-Altiplano, Central Andes Mountains using Landsat TM data

    NASA Technical Reports Server (NTRS)

    Mcbride, J. H.; Fielding, E. J.; Isacks, B. L.

    1987-01-01

    Landsat Thematic Mapper (TM) images of portions of the Central Andean Puna-Altiplano volcanic belt have been tested for the feasibility of discriminating individual volcanic flows using supervised classifications. This technique distinguishes volcanic rock classes as well as individual phases (i.e., relative age groups) within each class. The spectral signature of a volcanic rock class appears to depend on original texture and composition and on the degree of erosion, weathering, and chemical alteration. Basalts and basaltic andesite stand out as a clearly distinguishable class. The age-dependent degree of weathering of these generally dark volcanic rocks can be correlated with reflectance: older rocks have a higher reflectance. On the basis of this relationship, basaltic lava flows can be separated into several subclasses. These individual subclasses would correspond to mappable geologic units on the ground at a reconnaissance scale. The supervised classification maps are therefore useful for establishing a general stratigraphic framework for later detailed surface mapping of volcanic sequences.

  8. Analysis and classification of commercial ham slice images using directional fractal dimension features.

    PubMed

    Mendoza, Fernando; Valous, Nektarios A; Allen, Paul; Kenny, Tony A; Ward, Paddy; Sun, Da-Wen

    2009-02-01

    This paper presents a novel and non-destructive approach to the appearance characterization and classification of commercial pork, turkey and chicken ham slices. Ham slice images were modelled using directional fractal (DF(0°; 45°; 90°; 135°)) dimensions and a minimum distance classifier was adopted to perform the classification task. Also, the role of different colour spaces and the resolution level of the images on DF analysis were investigated. This approach was applied to 480 wafer thin ham slices from four types of hams (120 slices per type): i.e., pork (cooked and smoked), turkey (smoked) and chicken (roasted). DF features were extracted from digitalized intensity images in greyscale, and R, G, B, L*, a*, b*, H, S, and V colour components for three image resolution levels (100%, 50%, and 25%). Simulation results show that in spite of the complexity and high variability in colour and texture appearance, the modelling of ham slice images with DF dimensions allows the capture of differentiating textural features between the four commercial ham types. Independent DF features entail better discrimination than that using the average of four directions. However, DF dimensions reveal a high sensitivity to colour channel, orientation and image resolution for the fractal analysis. The classification accuracy using six DF dimension features (a*(90°), a*(135°), H(0°), H(45°), S(0°), H(90°)) was 93.9% for training data and 82.2% for testing data.
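
    A minimum distance classifier assigns a sample to the class whose mean feature vector lies nearest to it. The following is a minimal sketch, assuming Euclidean distance (the abstract does not name the metric). The two-feature vectors and the helper names `class_means` and `classify` are made up for illustration and are not taken from the paper.

```python
import math


def class_means(samples):
    """Map each class label to the mean of its training feature vectors."""
    means = {}
    for label, vectors in samples.items():
        n = len(vectors)
        means[label] = [sum(column) / n for column in zip(*vectors)]
    return means


def classify(x, means):
    """Return the label whose class mean is closest to x (Euclidean distance)."""
    return min(means, key=lambda label: math.dist(x, means[label]))


# Illustrative values only: two hypothetical fractal-dimension features per slice.
training = {
    "pork": [[2.10, 2.30], [2.20, 2.40]],
    "turkey": [[2.60, 2.70], [2.70, 2.90]],
}
means = class_means(training)
predicted = classify([2.25, 2.35], means)
print(predicted)
```

    The classifier stores only one mean vector per class, so it is cheap to train, but it ignores differences in class spread.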

  9. EEG source space analysis of the supervised factor analytic approach for the classification of multi-directional arm movement

    NASA Astrophysics Data System (ADS)

    Shenoy Handiru, Vikram; Vinod, A. P.; Guan, Cuntai

    2017-08-01

    Objective. In electroencephalography (EEG)-based brain-computer interface (BCI) systems for motor control tasks the conventional practice is to decode motor intentions by using scalp EEG. However, scalp EEG only reveals certain limited information about the complex tasks of movement with a higher degree of freedom. Therefore, our objective is to investigate the effectiveness of source-space EEG in extracting relevant features that discriminate arm movement in multiple directions. Approach. We have proposed a novel feature extraction algorithm based on supervised factor analysis that models the data from source-space EEG. To this end, we computed the features from the source dipoles confined to Brodmann areas of interest (BA4a, BA4p and BA6). Further, we embedded class-wise labels of multi-direction (multi-class) source-space EEG to an unsupervised factor analysis to make it into a supervised learning method. Main Results. Our approach provided an average decoding accuracy of 71% for the classification of hand movement in four orthogonal directions, that is significantly higher (>10%) than the classification accuracy obtained using state-of-the-art spatial pattern features in sensor space. Also, the group analysis on the spectral characteristics of source-space EEG indicates that the slow cortical potentials from a set of cortical source dipoles reveal discriminative information regarding the movement parameter, direction. Significance. This study presents evidence that low-frequency components in the source space play an important role in movement kinematics, and thus it may lead to new strategies for BCI-based neurorehabilitation.

  10. Facial Affect Recognition Using Regularized Discriminant Analysis-Based Algorithms

    NASA Astrophysics Data System (ADS)

    Lee, Chien-Cheng; Huang, Shin-Sheng; Shih, Cheng-Yuan

    2010-12-01

    This paper presents a novel and effective method for facial expression recognition including happiness, disgust, fear, anger, sadness, surprise, and neutral state. The proposed method utilizes a regularized discriminant analysis-based boosting algorithm (RDAB) with effective Gabor features to recognize the facial expressions. Entropy criterion is applied to select the effective Gabor feature which is a subset of informative and nonredundant Gabor features. The proposed RDAB algorithm uses RDA as a learner in the boosting algorithm. The RDA combines strengths of linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA). It solves the small sample size and ill-posed problems suffered from QDA and LDA through a regularization technique. Additionally, this study uses the particle swarm optimization (PSO) algorithm to estimate optimal parameters in RDA. Experiment results demonstrate that our approach can accurately and robustly recognize facial expressions.
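
    One common way to write the regularization that lets RDA interpolate between QDA and LDA is Friedman's two-parameter covariance estimate, shown below. This is an assumption about the formulation; the abstract does not give the exact estimator used in RDAB. Here $S_k$ and $N_k$ are the scatter matrix and sample count of class $k$, $S$ and $N$ are the pooled scatter matrix and total sample count, and $p$ is the feature dimension.

```latex
\[
\hat{\Sigma}_k(\lambda) =
  \frac{(1-\lambda)\, S_k + \lambda\, S}{(1-\lambda)\, N_k + \lambda\, N},
\qquad
\hat{\Sigma}_k(\lambda,\gamma) =
  (1-\gamma)\, \hat{\Sigma}_k(\lambda)
  + \frac{\gamma}{p}\, \operatorname{tr}\!\bigl[\hat{\Sigma}_k(\lambda)\bigr]\, I,
\qquad 0 \le \lambda, \gamma \le 1 .
\]
```

    With $\gamma = 0$, setting $\lambda = 0$ gives the class-specific covariances of QDA and $\lambda = 1$ gives the pooled covariance of LDA. Increasing $\gamma$ shrinks the estimate toward a multiple of the identity, which keeps it invertible when samples are scarce.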

  11. Olive oil sensory defects classification with data fusion of instrumental techniques and multivariate analysis (PLS-DA).

    PubMed

    Borràs, Eva; Ferré, Joan; Boqué, Ricard; Mestres, Montserrat; Aceña, Laura; Calvo, Angels; Busto, Olga

    2016-07-15

    Three instrumental techniques, headspace-mass spectrometry (HS-MS), mid-infrared spectroscopy (MIR) and UV-visible spectrophotometry (UV-vis), have been combined to classify virgin olive oil samples based on the presence or absence of sensory defects. The reference sensory values were provided by an official taste panel. Different data fusion strategies were studied to improve the discrimination capability compared to using each instrumental technique individually. A general model was applied to discriminate high-quality non-defective olive oils (extra-virgin) and the lowest-quality olive oils considered non-edible (lampante). A specific identification of key off-flavours, such as musty, winey, fusty and rancid, was also studied. The data fusion of the three techniques improved the classification results in most of the cases. Low-level data fusion was the best strategy to discriminate musty, winey and fusty defects, using HS-MS, MIR and UV-vis, and the rancid defect using only HS-MS and MIR. The mid-level data fusion approach using partial least squares-discriminant analysis (PLS-DA) scores was found to be the best strategy for defective vs non-defective and edible vs non-edible oil discrimination. However, the data fusion did not sufficiently improve the results obtained by a single technique (HS-MS) to classify non-defective classes. These results indicate that instrumental data fusion can be useful for the identification of sensory defects in virgin olive oils. Copyright © 2016 Elsevier Ltd. All rights reserved.

  12. A nonlinear discriminant algorithm for feature extraction and data classification.

    PubMed

    Santa Cruz, C; Dorronsoro, J R

    1998-01-01

    This paper presents a nonlinear supervised feature extraction algorithm that combines Fisher's criterion function with a preliminary perceptron-like nonlinear projection of vectors in pattern space. Its main motivation is to combine the approximation properties of multilayer perceptrons (MLP's) with the target free nature of Fisher's classical discriminant analysis. In fact, although MLP's provide good classifiers for many problems, there may be some situations, such as unequal class sizes with a high degree of pattern mixing among them, that may make difficult the construction of good MLP classifiers. In these instances, the features extracted by our procedure could be more effective. After the description of its construction and the analysis of its complexity, we will illustrate its use over a synthetic problem with the above characteristics.

  13. Assay based on electrical impedance spectroscopy to discriminate between normal and cancerous mammalian cells

    NASA Astrophysics Data System (ADS)

    Giana, Fabián Eduardo; Bonetto, Fabián José; Bellotti, Mariela Inés

    2018-03-01

    In this work we present an assay to discriminate between normal and cancerous cells. The method is based on the measurement of electrical impedance spectra of in vitro cell cultures. We developed a protocol consisting of four consecutive measurement phases, each of them designed to obtain different information about the cell cultures. Through the analysis of the measured data, 26 characteristic features were obtained for both cell types. From the complete set of features, we selected the most relevant in terms of their discriminant capacity by means of conventional statistical tests. A linear discriminant analysis was then carried out on the selected features, allowing the classification of the samples as normal or cancerous with 4.5% of false positives and no false negatives.

  14. Assay based on electrical impedance spectroscopy to discriminate between normal and cancerous mammalian cells.

    PubMed

    Giana, Fabián Eduardo; Bonetto, Fabián José; Bellotti, Mariela Inés

    2018-03-01

    In this work we present an assay to discriminate between normal and cancerous cells. The method is based on the measurement of electrical impedance spectra of in vitro cell cultures. We developed a protocol consisting of four consecutive measurement phases, each of them designed to obtain different information about the cell cultures. Through the analysis of the measured data, 26 characteristic features were obtained for both cell types. From the complete set of features, we selected the most relevant in terms of their discriminant capacity by means of conventional statistical tests. A linear discriminant analysis was then carried out on the selected features, allowing the classification of the samples as normal or cancerous with 4.5% of false positives and no false negatives.

  15. Discrimination-Aware Classifiers for Student Performance Prediction

    ERIC Educational Resources Information Center

    Luo, Ling; Koprinska, Irena; Liu, Wei

    2015-01-01

    In this paper we consider discrimination-aware classification of educational data. Mining and using rules that distinguish groups of students based on sensitive attributes such as gender and nationality may lead to discrimination. It is desirable to keep the sensitive attributes during the training of a classifier to avoid information loss but…

  16. A two-stage linear discriminant analysis via QR-decomposition.

    PubMed

    Ye, Jieping; Li, Qi

    2005-06-01

    Linear Discriminant Analysis (LDA) is a well-known method for feature extraction and dimension reduction. It has been used widely in many applications involving high-dimensional data, such as image and text classification. An intrinsic limitation of classical LDA is the so-called singularity problems; that is, it fails when all scatter matrices are singular. Many LDA extensions were proposed in the past to overcome the singularity problems. Among these extensions, PCA+LDA, a two-stage method, received relatively more attention. In PCA+LDA, the LDA stage is preceded by an intermediate dimension reduction stage using Principal Component Analysis (PCA). Most previous LDA extensions are computationally expensive, and not scalable, due to the use of Singular Value Decomposition or Generalized Singular Value Decomposition. In this paper, we propose a two-stage LDA method, namely LDA/QR, which aims to overcome the singularity problems of classical LDA, while achieving efficiency and scalability simultaneously. The key difference between LDA/QR and PCA+LDA lies in the first stage, where LDA/QR applies QR decomposition to a small matrix involving the class centroids, while PCA+LDA applies PCA to the total scatter matrix involving all training data points. We further justify the proposed algorithm by showing the relationship among LDA/QR and previous LDA methods. Extensive experiments on face images and text documents are presented to show the effectiveness of the proposed algorithm.

  17. Acromegaly determination using discriminant analysis of the three-dimensional facial classification in Taiwanese.

    PubMed

    Wang, Ming-Hsu; Lin, Jen-Der; Chang, Chen-Nen; Chiou, Wen-Ko

    2017-08-01

    The aim of this study was to assess the size, angles and positional characteristics of facial anthropometry between "acromegalic" patients and control subjects. We also identify possible facial soft tissue measurements for generating discriminant functions toward acromegaly determination in males and females for acromegaly early self-awareness. This is a cross-sectional study. Subjects participating in this study included 70 patients diagnosed with acromegaly (35 females and 35 males) and 140 gender-matched control individuals. Three-dimensional facial images were collected via a camera system. Thirteen landmarks were selected. Eleven measurements from the three categories were selected and applied, including five frontal widths, three lateral depths and three lateral angular measurements. Descriptive analyses were conducted using means and standard deviations for each measurement. Univariate and multivariate discriminant function analyses were applied in order to calculate the accuracy of acromegaly detection. Patients with acromegaly exhibit soft-tissue facial enlargement and hypertrophy. Frontal widths as well as lateral depth and angle of facial changes were evident. The average accuracies of all functions for female patient detection ranged from 80.0-91.40%. The average accuracies of all functions for male patient detection were from 81.0-94.30%. The greatest anomaly observed was evidenced in the lateral angles, with greater enlargement of "nasofrontal" angles for females and greater "mentolabial" angles for males. Additionally, shapes of the lateral angles showed changes. The majority of the facial measurements proved dynamic for acromegaly patients; however, it is problematic to detect the disease with progressive body anthropometric changes. The discriminant functions of detection developed in this study could help patients, their families, medical practitioners and others to identify and track progressive facial change patterns before the possible patients

  18. A general soft label based linear discriminant analysis for semi-supervised dimensionality reduction.

    PubMed

    Zhao, Mingbo; Zhang, Zhao; Chow, Tommy W S; Li, Bing

    2014-07-01

    Dealing with high-dimensional data has always been a major problem in research of pattern recognition and machine learning, and Linear Discriminant Analysis (LDA) is one of the most popular methods for dimension reduction. However, it only uses labeled samples while neglecting unlabeled samples, which are abundant and can be easily obtained in the real world. In this paper, we propose a new dimension reduction method, called "SL-LDA", by using unlabeled samples to enhance the performance of LDA. The new method first propagates label information from the labeled set to the unlabeled set via a label propagation process, where the predicted labels of unlabeled samples, called "soft labels", can be obtained. It then incorporates the soft labels into the construction of scatter matrixes to find a transformed matrix for dimension reduction. In this way, the proposed method can preserve more discriminative information, which is preferable when solving the classification problem. We further propose an efficient approach for solving SL-LDA under a least squares framework, and a flexible method of SL-LDA (FSL-LDA) to better cope with datasets sampled from a nonlinear manifold. Extensive simulations are carried out on several datasets, and the results show the effectiveness of the proposed method. Copyright © 2014 Elsevier Ltd. All rights reserved.

  19. General Approach for Rock Classification Based on Digital Image Analysis of Electrical Borehole Wall Images

    NASA Astrophysics Data System (ADS)

    Linek, M.; Jungmann, M.; Berlage, T.; Clauser, C.

    2005-12-01

    Within the Ocean Drilling Program (ODP), image logging tools such as the Formation MicroScanner (FMS) or the Resistivity-At-Bit (RAB) tools have been routinely deployed. Both logging methods are based on resistivity measurements at the borehole wall and therefore are sensitive to conductivity contrasts, which are mapped in color scale images. These images are commonly used to study the structure of the sedimentary rocks and the oceanic crust (petrologic fabric, fractures, veins, etc.). So far, mapping of lithology from electrical images is purely based on visual inspection and subjective interpretation. We apply digital image analysis on electrical borehole wall images in order to develop a method which augments objective rock identification. We focus on supervised textural pattern recognition, which studies the spatial gray level distribution with respect to certain rock types. FMS image intervals of rock classes known from core data are taken in order to train textural characteristics for each class. A so-called gray level co-occurrence matrix is computed by counting the occurrence of a pair of gray levels that are a certain distance apart. Once the matrix for an image interval is computed, we calculate the image contrast, homogeneity, energy, and entropy. We assign characteristic textural features to different rock types by reducing the image information into a small set of descriptive features. Once a discriminating set of texture features for each rock type is found, we are able to discriminate the entire FMS images regarding the trained rock type classification. A rock classification based on texture features enables quantitative lithology mapping and is characterized by a high repeatability, in contrast to a purely visual subjective image interpretation. We show examples for the rock classification between breccias, pillows, massive units, and horizontally bedded tuffs based on ODP image data.
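
    The co-occurrence step described above can be sketched in a few lines. This is a minimal illustration, assuming a non-symmetric matrix for a single pixel offset and the usual Haralick-style definitions of the four features; the abstract does not give the authors' exact formulas or offsets. The 4x4 image with four gray levels is a toy input, not borehole data.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))  # assumes at least one valid pixel pair
    return [[n / total for n in row] for row in counts]


def texture_features(p):
    """Contrast, homogeneity, energy and entropy of a normalized GLCM."""
    contrast = homogeneity = energy = entropy = 0.0
    for i, row in enumerate(p):
        for j, pij in enumerate(row):
            contrast += pij * (i - j) ** 2
            homogeneity += pij / (1 + (i - j) ** 2)
            energy += pij ** 2
            if pij > 0:
                entropy -= pij * math.log2(pij)
    return {"contrast": contrast, "homogeneity": homogeneity,
            "energy": energy, "entropy": entropy}


# Toy image with gray levels 0..3 (illustrative only).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)          # horizontal neighbour, one pixel apart
features = texture_features(p)
print(features)
```

    In a classification setting, one such feature vector would be computed per image interval and passed to the classifier of choice.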

  20. Skin injury model classification based on shape vector analysis

    PubMed Central

    2012-01-01

    Background: Skin injuries can be crucial in judicial decision making. Forensic experts base their classification on subjective opinions. This study investigates whether known classes of simulated skin injuries are correctly classified statistically based on 3D surface models and derived numerical shape descriptors. Methods: Skin injury surface characteristics are simulated with plasticine. Six injury classes – abrasions, incised wounds, gunshot entry wounds, smooth and textured strangulation marks as well as patterned injuries – with 18 instances each are used for a k-fold cross validation with six partitions. Deformed plasticine models are captured with a 3D surface scanner. Mean curvature is estimated for each polygon surface vertex. Subsequently, distance distributions and derived aspect ratios, convex hulls, concentric spheres, hyperbolic points and Fourier transforms are used to generate 1284-dimensional shape vectors. Subsequent descriptor reduction maximizing SNR (signal-to-noise ratio) results in an average of 41 descriptors (varying across k-folds). With non-normal multivariate distribution of heteroskedastic data, requirements for LDA (linear discriminant analysis) are not met. Thus, shrinkage parameters of RDA (regularized discriminant analysis) are optimized, yielding a best performance with λ = 0.99 and γ = 0.001. Results: Receiver Operating Characteristic of a descriptive RDA yields an ideal Area Under the Curve of 1.0 for all six categories. Predictive RDA results in an average CRR (correct recognition rate) of 97.22% under a 6-partition k-fold. Adding uniform noise within the range of one standard deviation degrades the average CRR to 71.3%. Conclusions: Digitized 3D surface shape data can be used to automatically classify idealized shape models of simulated skin injuries. Deriving some well established descriptors such as histograms, saddle shape of hyperbolic points or convex hulls with subsequent reduction of dimensionality while maximizing SNR
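
    The cross-validation layout described in the Methods (six classes, 18 instances each, six partitions) can be sketched as follows. This assumes stratified partitions, so that every fold holds three instances of each class; the abstract does not say whether the folds were stratified, and the function name `stratified_folds` is hypothetical.

```python
import random


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with equal class counts per fold."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)   # deal indices round-robin into folds
    return folds


# Six injury classes (coded 0..5) with 18 instances each: 108 samples.
labels = [c for c in range(6) for _ in range(18)]
folds = stratified_folds(labels, k=6)

split_sizes = []
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    # A classifier would be fitted on train_idx and scored on test_idx here.
    split_sizes.append((len(train_idx), len(test_idx)))
print(split_sizes)
```

    Each round trains on 90 models and tests on 18. Descriptor reduction would have to be repeated inside every round, which is consistent with the descriptor count varying across folds.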

  1. Assessing the Effectiveness of Statistical Classification Techniques in Predicting Future Employment of Participants in the Temporary Assistance for Needy Families Program

    ERIC Educational Resources Information Center

    Montoya, Isaac D.

    2008-01-01

    Three classification techniques (Chi-square Automatic Interaction Detection [CHAID], Classification and Regression Tree [CART], and discriminant analysis) were tested to determine their accuracy in predicting Temporary Assistance for Needy Families program recipients' future employment. Technique evaluation was based on proportion of correctly…

  2. (13)C NMR pattern recognition techniques for the classification of Atlantic salmon (Salmo salar L.) according to their wild, farmed, and geographical origin.

    PubMed

    Aursand, Marit; Standal, Inger B; Praël, Angelika; McEvoy, Lesley; Irvine, Joe; Axelson, David E

    2009-05-13

    (13)C nuclear magnetic resonance (NMR) in combination with multivariate data analysis was used to (1) discriminate between farmed and wild Atlantic salmon (Salmo salar L.), (2) discriminate between different geographical origins, and (3) verify the origin of market samples. Muscle lipids from 195 Atlantic salmon of known origin (wild and farmed salmon from Norway, Scotland, Canada, Iceland, Ireland, the Faroes, and Tasmania) in addition to market samples were analyzed by (13)C NMR spectroscopy and multivariate analysis. Both probabilistic neural networks (PNN) and support vector machines (SVM) provided excellent discrimination (98.5 and 100.0%, respectively) between wild and farmed salmon. Discrimination with respect to geographical origin was somewhat more difficult, with correct classification rates ranging from 82.2 to 99.3% by PNN and SVM, respectively. In the analysis of market samples, five fish labeled and purchased as wild salmon were classified as farmed salmon (indicating mislabeling), and there were also some discrepancies between the classification and the product declaration with regard to geographical origin.

  3. Sensitivity and specificity of 3-D texture analysis of lung parenchyma is better than 2-D for discrimination of lung pathology in stage 0 COPD

    NASA Astrophysics Data System (ADS)

    Xu, Ye; Sonka, Milan; McLennan, Geoffrey; Guo, Junfeng; Hoffman, Eric

    2005-04-01

    Lung parenchyma evaluation via multidetector-row CT (MDCT) has significantly altered clinical practice in the early detection of lung disease. Our goal is to enhance our texture-based tissue classification ability to differentiate early pathologic processes by extending our 2-D Adaptive Multiple Feature Method (AMFM) to 3-D AMFM. We performed MDCT on 34 human volunteers in five categories: emphysema in severe Chronic Obstructive Pulmonary Disease (COPD) (EC), emphysema in mild COPD (MC), normal appearing lung in COPD (NC), non-smokers with normal lung function (NN), and smokers with normal function (NS). We volumetrically excluded the airway and vessel regions, calculated 24 volumetric texture features for each Volume of Interest (VOI), and used Bayesian rules for discrimination. Leave-one-out and half-half methods were used for testing. Sensitivity, specificity and accuracy were calculated. The accuracy of the leave-one-out method for the four-class classification in the form of 3-D/2-D is: EC: 84.9%/70.7%; MC: 89.8%/82.7%; NC: 87.5.0%/49.6%; NN: 100.0%/60.0%. The accuracy of the leave-one-out method for the two-class classification in the form of 3-D/2-D is: NN: 99.3%/71.6%; NS: 99.7%/74.5%. We conclude that 3-D AMFM analysis of the lung parenchyma improves discrimination compared to 2-D analysis of the same images.
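
    The testing procedure named in this abstract (leave-one-out, followed by sensitivity, specificity and accuracy) can be sketched as below. This is only an illustration under stated assumptions: the nearest-centroid rule is a stand-in for the Bayesian rules the authors used, and the function names and the toy two-feature data are hypothetical, not taken from the study.

```python
from collections import Counter


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


def nearest_centroid(train, sample):
    """Stand-in classifier (assumption): label of the closest class mean."""
    sums, counts = {}, Counter()
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1

    def squared_distance(label):
        return sum((s / counts[label] - v) ** 2
                   for s, v in zip(sums[label], sample))

    return min(sums, key=squared_distance)


def leave_one_out(data, classify=nearest_centroid):
    """Hold out each sample once, train on the rest, return (truth, predicted)."""
    results = []
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        results.append((label, classify(train, features)))
    return results


# Hypothetical toy data: two texture features per volume of interest.
toy = [((0.0, 0.1), "NN"), ((0.2, 0.0), "NN"), ((0.1, 0.2), "NN"),
       ((1.0, 0.9), "EC"), ((0.9, 1.1), "EC"), ((1.1, 1.0), "EC")]

pairs = leave_one_out(toy)
tp = sum(t == "EC" and p == "EC" for t, p in pairs)
fn = sum(t == "EC" and p != "EC" for t, p in pairs)
tn = sum(t != "EC" and p != "EC" for t, p in pairs)
fp = sum(t != "EC" and p == "EC" for t, p in pairs)
print(binary_metrics(tp, fp, tn, fn))
```

    Treating one class as "positive" and all others as "negative", as done here for EC, is one common way to report per-class sensitivity and specificity in a multi-class problem.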

  4. [Discrimination of varieties of brake fluid using visual-near infrared spectra].

    PubMed

    Jiang, Lu-lu; Tan, Li-hong; Qiu, Zheng-jun; Lu, Jiang-feng; He, Yong

    2008-06-01

    A new method was developed to rapidly discriminate brands of brake fluid by means of visible-near infrared spectroscopy. Five different brands of brake fluid were analyzed using a handheld near infrared spectrograph manufactured by ASD Company, and 60 samples were obtained from each brand. The sample data were pretreated using average smoothing and the standard normal variate method, and then analyzed using principal component analysis (PCA). A 2-dimensional plot was drawn based on the first and second principal components, and the plot indicated that the different brake fluids cluster distinctly. The first 6 principal components were taken as input variables, and the brand of brake fluid as the output variable, to build the discriminant model by stepwise discriminant analysis. Two hundred twenty-five randomly selected samples were used to create the model, and the remaining 75 samples to verify it. The result showed that the distinguishing rate was 94.67%, indicating that the proposed method performs well in classification and discrimination. It provides a new way to rapidly discriminate different brands of brake fluid.

  5. Discrimination of Rhizoma Gastrodiae (Tianma) using 3D synchronous fluorescence spectroscopy coupled with principal component analysis

    NASA Astrophysics Data System (ADS)

    Fan, Qimeng; Chen, Chaoyin; Huang, Zaiqiang; Zhang, Chunmei; Liang, Pengjuan; Zhao, Shenglan

    2015-02-01

    Rhizoma Gastrodiae (Tianma) of different variants and different geographical origins differs substantially in quality and physiological efficacy. This paper focused on the classification and identification of Tianma of six types (two variants from three different geographical origins) using three dimensional synchronous fluorescence spectroscopy (3D-SFS) coupled with principal component analysis (PCA). 3D-SF spectra of aqueous extracts, which were obtained from Tianma of the six types, were measured by a LS-50B luminescence spectrofluorometer. The experimental results showed that the characteristic fluorescent spectral regions of the 3D-SF spectra were similar, while the intensities of the characteristic regions differed significantly. By coupling these differences in peak intensities with PCA, Tianma of the six types could be discriminated successfully. In conclusion, 3D-SFS coupled with PCA, which is effective, specific, rapid and non-polluting, has an edge for discrimination of similar Chinese herbal medicines. The proposed methodology is a useful tool to classify and identify Tianma of different variants and different geographical origins.

  6. Feasibility of laser-induced breakdown spectroscopy (LIBS) for classification of sea salts.

    PubMed

    Tan, Man Minh; Cui, Sheng; Yoo, Jonghyun; Han, Song-Hee; Ham, Kyung-Sik; Nam, Sang-Ho; Lee, Yonghoon

    2012-03-01

    We have investigated the feasibility of laser-induced breakdown spectroscopy (LIBS) as a fast, reliable classification tool for sea salts. For 11 kinds of sea salts, potassium (K), magnesium (Mg), calcium (Ca), and aluminum (Al) concentrations were measured by inductively coupled plasma-atomic emission spectroscopy (ICP-AES), and the LIBS spectra were recorded in the narrow wavelength region between 760 and 800 nm where K (I), Mg (I), Ca (II), Al (I), and cyanide (CN) band emissions are observed. The ICP-AES measurements revealed that the K, Mg, Ca, and Al concentrations varied significantly with the provenance of each salt. The relative intensities of the K (I), Mg (I), Ca (II), and Al (I) peaks observed in the LIBS spectra are consistent with the results using ICP-AES. The principal component analysis of the LIBS spectra provided the score plot with quite a high degree of clustering. This indicates that classification of sea salts by chemometric analysis of LIBS spectra is very promising. Classification models were developed by partial least squares discriminant analysis (PLS-DA) and evaluated. In addition, the Al (I) peaks enabled us to discriminate between different production methods of the salts. © 2012 Society for Applied Spectroscopy

  7. An Extended Spectral-Spatial Classification Approach for Hyperspectral Data

    NASA Astrophysics Data System (ADS)

    Akbari, D.

    2017-11-01

    In this paper an extended classification approach for hyperspectral imagery based on both spectral and spatial information is proposed. The spatial information is obtained by an enhanced marker-based minimum spanning forest (MSF) algorithm. Three different methods of dimension reduction are first used to obtain the subspace of hyperspectral data: (1) unsupervised feature extraction methods including principal component analysis (PCA), independent component analysis (ICA), and minimum noise fraction (MNF); (2) supervised feature extraction including decision boundary feature extraction (DBFE), discriminant analysis feature extraction (DAFE), and nonparametric weighted feature extraction (NWFE); (3) genetic algorithm (GA). The spectral features obtained are then fed into the enhanced marker-based MSF classification algorithm. In the enhanced MSF algorithm, the markers are extracted from the classification maps obtained by both SVM and the watershed segmentation algorithm. To evaluate the proposed approach, the Pavia University hyperspectral data is tested. Experimental results show that the proposed approach using GA achieves an overall accuracy approximately 8% higher than the original MSF-based algorithm.

  8. Authentication of whisky due to its botanical origin and way of production by instrumental analysis and multivariate classification methods.

    PubMed

    Wiśniewska, Paulina; Boqué, Ricard; Borràs, Eva; Busto, Olga; Wardencki, Waldemar; Namieśnik, Jacek; Dymerski, Tomasz

    2017-02-15

    Headspace mass-spectrometry (HS-MS), mid infrared (MIR) and UV-vis spectroscopy were used to authenticate whisky samples from different origins and ways of production (Irish, Spanish, Bourbon, Tennessee Whisky and Scotch). The collected spectra were processed with partial least-squares discriminant analysis (PLS-DA) to build the classification models. In all cases the five groups of whiskies were distinguished, but the best results were obtained by HS-MS, which indicates that the biggest differences between different types of whisky are due to their aroma. Differences were also found inside groups, showing that not only raw material is important to discriminate samples but also the way of their production. The methodology is quick, easy and does not require sample preparation. Copyright © 2016 Elsevier B.V. All rights reserved.

  9. Comparison between two race/skin color classifications in relation to health-related outcomes in Brazil.

    PubMed

    Travassos, Claudia; Laguardia, Josué; Marques, Priscilla M; Mota, Jurema C; Szwarcwald, Celia L

    2011-08-25

    This paper aims to compare the classification of race/skin color based on the discrete categories used by the Demographic Census of the Brazilian Institute of Geography and Statistics (IBGE) and a skin color scale with values ranging from 1 (lighter skin) to 10 (darker skin), examining whether choosing one alternative or the other can influence measures of self-evaluation of health status, health care service utilization and discrimination in the health services. This is a cross-sectional study based on data from the World Health Survey carried out in Brazil in 2003 with a sample of 5000 individuals older than 18 years. Similarities between the two classifications were evaluated by means of correspondence analysis. The effect of the two classifications on health outcomes was tested through logistic regression models for each sex, using age, educational level and ownership of consumer goods as covariables. Both measures of race/skin color represent the same race/skin color construct. The results show a tendency among Brazilians to classify their skin color in shades closer to the center of the color gradient. Women tend to classify their race/skin color as a little lighter than men in the skin color scale, an effect not observed when IBGE categories are used. With regard to health and health care utilization, race/skin color was not relevant in explaining any of them, regardless of the race/skin color classification. Lack of money and social class were the most prevalent reasons for discrimination in healthcare reported in the survey, suggesting that in Brazil the discussion about discrimination in the health care must not be restricted to racial discrimination and should also consider class-based discrimination. The study shows that the differences of the two classifications of race/skin color are small. However, the interval scale measure appeared to increase the freedom of choice of the respondent.

  10. A Study on the Learning Processes in Discrimination Shift Learning of Children with Mental Retardation: From the Point of Developmental View of "Logical Manipulation by Classification."

    ERIC Educational Resources Information Center

    Kanno, Atsushi

    1989-01-01

    The study was designed to investigate the learning processes in discrimination shift learning, in terms of developmental views of "logical manipulation by classification." Tasks comparing sizes of intradimensional value-classes and comparing sizes of interdimensional value-classes were devised in order to measure subjects' levels of…

  11. Large-scale optimization-based classification models in medicine and biology.

    PubMed

    Lee, Eva K

    2007-06-01

    We present novel optimization-based classification models that are general purpose and suitable for developing predictive rules for large heterogeneous biological and medical data sets. Our predictive model simultaneously incorporates (1) the ability to classify any number of distinct groups; (2) the ability to incorporate heterogeneous types of attributes as input; (3) a high-dimensional data transformation that eliminates noise and errors in biological data; (4) the ability to incorporate constraints to limit the rate of misclassification, and a reserved-judgment region that provides a safeguard against over-training (which tends to lead to high misclassification rates from the resulting predictive rule); and (5) successive multi-stage classification capability to handle data points placed in the reserved-judgment region. To illustrate the power and flexibility of the classification model and solution engine, and its multi-group prediction capability, application of the predictive model to a broad class of biological and medical problems is described. Applications include: the differential diagnosis of the type of erythemato-squamous diseases; predicting presence/absence of heart disease; genomic analysis and prediction of aberrant CpG island methylation in human cancer; discriminant analysis of motility and morphology data in human lung carcinoma; prediction of ultrasonic cell disruption for drug delivery; identification of tumor shape and volume in treatment of sarcoma; discriminant analysis of biomarkers for prediction of early atherosclerosis; fingerprinting of native and angiogenic microvascular networks for early diagnosis of diabetes, aging, macular degeneracy and tumor metastasis; prediction of protein localization sites; and pattern recognition of satellite images in classification of soil types. In all these applications, the predictive model yields correct classification rates ranging from 80 to 100%. This provides motivation for pursuing its use as a…

  12. Discriminant analysis of functional optical topography for schizophrenia diagnosis

    NASA Astrophysics Data System (ADS)

    Chuang, Ching-Cheng; Nakagome, Kazuyuki; Pu, Shenghong; Lan, Tsuo-Hung; Lee, Chia-Yen; Sun, Chia-Wei

    2014-01-01

    Abnormal prefrontal function plays a central role in the cognition deficits of schizophrenic patients; however, the character of the relationship between discriminant analysis and prefrontal activation remains undetermined. Recently, evidence of low prefrontal cortex (PFC) activation in individuals with schizophrenia has also been found during verbal fluency tests (VFT) and other cognitive tests with several neuroimaging methods. The purpose of this study is to assess the hemodynamic changes of the PFC and discriminant analysis between schizophrenia patients and healthy controls during VFT task by utilizing functional optical topography. A total of 99 subjects including 53 schizophrenic patients and 46 age- and gender-matched healthy controls were studied. The results showed that the healthy group had larger activation in the right and left PFC than in the middle PFC. Besides, the schizophrenic group showed weaker task performance and lower activation in the whole PFC than the healthy group. The result of the discriminant analysis showed a significant difference with P value <0.001 in six channels (CH 23, 29, 31, 40, 42, 52) between the schizophrenic and healthy groups. Finally, 68.69% and 71.72% of subjects are correctly classified as being schizophrenic or healthy with all 52 channels and six significantly different channels, respectively. Our findings suggest that the left PFC can be a feature region for discriminant analysis of schizophrenic diagnosis.

  13. Protein subcellular location pattern classification in cellular images using latent discriminative models.

    PubMed

    Li, Jieyue; Xiong, Liang; Schneider, Jeff; Murphy, Robert F

    2012-06-15

    Knowledge of the subcellular location of a protein is crucial for understanding its functions. The subcellular pattern of a protein is typically represented as the set of cellular components in which it is located, and an important task is to determine this set from microscope images. In this article, we address this classification problem using confocal immunofluorescence images from the Human Protein Atlas (HPA) project. The HPA contains images of cells stained for many proteins; each is also stained for three reference components, but there are many other components that are invisible. Given one such cell, the task is to classify the pattern type of the stained protein. We first randomly select local image regions within the cells, and then extract various carefully designed features from these regions. This region-based approach enables us to explicitly study the relationship between proteins and different cell components, as well as the interactions between these components. To achieve these two goals, we propose two discriminative models that extend logistic regression with structured latent variables. The first model allows the same protein pattern class to be expressed differently according to the underlying components in different regions. The second model further captures the spatial dependencies between the components within the same cell so that we can better infer these components. To learn these models, we propose a fast approximate algorithm for inference, and then use gradient-based methods to maximize the data likelihood. In the experiments, we show that the proposed models help improve the classification accuracies on synthetic data and real cellular images. The best overall accuracy we report in this article for classifying 942 proteins into 13 classes of patterns is about 84.6%, which to our knowledge is the best so far. In addition, the dependencies learned are consistent with prior knowledge of cell organization. 
    Software: http://murphylab.web.cmu.edu/software/.

  14. Near infrared spectroscopy is suitable for the classification of hazelnuts according to Protected Designation of Origin.

    PubMed

    Moscetti, Roberto; Radicetti, Emanuele; Monarca, Danilo; Cecchini, Massimo; Massantini, Riccardo

    2015-10-01

    This study investigates the possibility of using near infrared spectroscopy for the authentication of the 'Nocciola Romana' hazelnut (Corylus avellana L. cvs Tonda Gentile Romana and Nocchione) as a Protected Designation of Origin (PDO) hazelnut from central Italy. Algorithms for the selection of the optimal pretreatments were tested in combination with the following discriminant routines: k-nearest neighbour, soft independent modelling of class analogy, partial least squares discriminant analysis and support vector machine discriminant analysis. The best results were obtained using a support vector machine discriminant analysis routine. Thus, classification performance rates with specificities, sensitivities and accuracies as high as 96.0%, 95.0% and 95.5%, respectively, were achieved. Various pretreatments, such as standard normal variate, mean centring and a Savitzky-Golay filter with seven smoothing points, were used. The optimal wavelengths for classification were mainly correlated with lipids, although some contribution from minor constituents, such as proteins and carbohydrates, was also observed. Near infrared spectroscopy could classify hazelnut according to the PDO 'Nocciola Romana' designation. Thus, the experimentation lays the foundations for a rapid, online, authentication system for hazelnut. However, model robustness should be improved taking into account agro-pedo-climatic growing conditions. © 2014 Society of Chemical Industry.
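
    The spectral pretreatments listed in this abstract (standard normal variate, mean centring, and a seven-point Savitzky-Golay filter) can be sketched in plain Python as follows. The abstract does not state the polynomial order of the filter, so the quadratic/cubic smoothing coefficients used here are an assumption, as are the function names and the choice to leave the three points at each end unsmoothed.

```python
from statistics import mean, stdev

# 7-point Savitzky-Golay smoothing weights for a quadratic/cubic fit
# (assumed order); the weights sum to 21, the normalising constant.
SG7_WEIGHTS = [-2, 3, 6, 7, 6, 3, -2]
SG7_NORM = 21


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own
    mean and sample standard deviation."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(x - m) / s for x in spectrum]


def savgol7(spectrum):
    """Seven-point Savitzky-Golay smoothing; edge points are left as-is."""
    half = len(SG7_WEIGHTS) // 2
    out = list(spectrum)
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        out[i] = sum(w * x for w, x in zip(SG7_WEIGHTS, window)) / SG7_NORM
    return out


def mean_centre(spectra):
    """Subtract the per-wavelength mean across all spectra."""
    column_means = [mean(column) for column in zip(*spectra)]
    return [[x - m for x, m in zip(row, column_means)] for row in spectra]


# Hypothetical absorbance values, for illustration only.
raw = [0.10, 0.12, 0.15, 0.21, 0.30, 0.28, 0.22, 0.18, 0.14, 0.11]
pretreated = snv(savgol7(raw))
print([round(x, 3) for x in pretreated])
```

    SNV works on each spectrum separately, whereas mean centring works across the whole sample set, so the order in which the two are applied changes the result.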

  15. An Initial Analysis of LANDSAT-4 Thematic Mapper Data for the Discrimination of Agricultural, Forested Wetlands, and Urban Land Cover. [Poinsett County, Arkansas; and Reelfoot Lake and Union City, Tennessee]

    NASA Technical Reports Server (NTRS)

    Quattrochi, D. A.

    1985-01-01

    The capabilities of TM data for discriminating land covers within three particular cultural and ecological realms were assessed. The agricultural investigation in Poinsett County, Arkansas illustrates that TM data can successfully be used to discriminate a variety of crop cover types within the study area. The single-date TM classification produced results that were significantly better than those developed from multitemporal MSS data. For the Reelfoot Lake area of Tennessee, TM data processed using unsupervised signature development techniques produced a detailed classification of forested wetlands with excellent accuracy. Even in a small city of approximately 15,000 people (Union City, Tennessee), TM data can successfully be used to spectrally distinguish specific urban classes. Furthermore, the principal components analysis evaluation of the data shows that through photointerpretation, it is possible to distinguish individual buildings and roof responses with the TM.

  16. Discriminative analysis of non-linear brain connectivity for leukoaraiosis with resting-state fMRI

    NASA Astrophysics Data System (ADS)

    Lai, Youzhi; Xu, Lele; Yao, Li; Wu, Xia

    2015-03-01

    Leukoaraiosis (LA) describes diffuse white matter abnormalities on CT or MR brain scans, often seen in the normal elderly and in association with vascular risk factors such as hypertension, or in the context of cognitive impairment. The mechanism of cognitive dysfunction is still unclear. Recent clinical studies have revealed that the severity of LA does not correspond to the cognitive level, and functional connectivity analysis is an appropriate method to detect the relation between LA and cognitive decline. However, existing functional connectivity analyses of LA have been mostly limited to linear associations. In this investigation, a novel measure utilizing the extended maximal information coefficient (eMIC) was applied to construct non-linear functional connectivity in 44 LA subjects (9 dementia, 25 mild cognitive impairment (MCI) and 10 cognitively normal (CN)). The strength of non-linear functional connections for the first 1% of discriminative power increased in MCI compared with CN and dementia, which was opposite to its linear counterpart. Further functional network analysis revealed that the changes in non-linear and linear connectivity have similar but not identical spatial distributions in the human brain. In the multivariate pattern analysis with multiple classifiers, the non-linear functional connectivity mostly identified dementia, MCI and CN from LA with a relatively higher accuracy rate than the linear measure. Our findings revealed that the non-linear functional connectivity provided useful discriminative power in classification of LA, and the spatially distributed changes between the non-linear and linear measures may indicate the underlying mechanism of cognitive dysfunction in LA.

  17. Improved EEG Event Classification Using Differential Energy.

    PubMed

    Harati, A; Golmohammadi, M; Lopez, S; Obeid, I; Picone, J

    2015-12-01

    Feature extraction for automatic classification of EEG signals typically relies on time frequency representations of the signal. Techniques such as cepstral-based filter banks or wavelets are popular analysis techniques in many signal processing applications including EEG classification. In this paper, we present a comparison of a variety of approaches to estimating and postprocessing features. To further aid in discrimination of periodic signals from aperiodic signals, we add a differential energy term. We evaluate our approaches on the TUH EEG Corpus, which is the largest publicly available EEG corpus and an exceedingly challenging task due to the clinical nature of the data. We demonstrate that a variant of a standard filter bank-based approach, coupled with first and second derivatives, provides a substantial reduction in the overall error rate. The combination of differential energy and derivatives produces a 24 % absolute reduction in the error rate and improves our ability to discriminate between signal events and background noise. This relatively simple approach proves to be comparable to other popular feature extraction approaches such as wavelets, but is much more computationally efficient.
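
    The abstract does not define the differential energy term, so the sketch below rests on an assumed reading: differential energy is taken as the spread (maximum minus minimum) of frame log-energy inside a sliding window, and the first and second derivatives are approximated by central differences. All names, the frame length and the window size are hypothetical choices for illustration, not the authors' settings.

```python
import math


def frame_log_energy(signal, frame_len):
    """Log of the mean squared amplitude for consecutive, non-overlapping frames."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(math.log(power + 1e-12))  # small floor avoids log(0)
    return energies


def differential_energy(energies, half_window=1):
    """Assumed definition: max minus min of frame energy in a sliding window."""
    out = []
    for i in range(len(energies)):
        window = energies[max(0, i - half_window):i + half_window + 1]
        out.append(max(window) - min(window))
    return out


def delta(features):
    """Central-difference derivative; the end points reuse their own value."""
    n = len(features)
    return [(features[min(i + 1, n - 1)] - features[max(i - 1, 0)]) / 2
            for i in range(n)]


# Hypothetical signal: a quiet stretch followed by a louder one.
signal = [0.1] * 20 + [1.0] * 20
energy = frame_log_energy(signal, frame_len=10)
features = {
    "energy": energy,
    "differential_energy": differential_energy(energy),
    "delta": delta(energy),
    "delta_delta": delta(delta(energy)),
}
print(features["differential_energy"])
```

    Under this reading the term stays near zero where the energy is steady and rises where it changes within the window, which is why it could help separate sustained activity from transient events.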

  18. Multivariate classification of edible salts: Simultaneous Laser-Induced Breakdown Spectroscopy and Laser-Ablation Inductively Coupled Plasma Mass Spectrometry Analysis

    NASA Astrophysics Data System (ADS)

    Lee, Yonghoon; Nam, Sang-Ho; Ham, Kyung-Sik; Gonzalez, Jhanis; Oropeza, Dayana; Quarles, Derrick; Yoo, Jonghyun; Russo, Richard E.

    2016-04-01

    Laser-Induced Breakdown Spectroscopy (LIBS) and Laser-Ablation Inductively Coupled Plasma Mass Spectrometry (LA-ICP-MS), both based on laser ablation sampling, can be employed simultaneously to obtain different chemical fingerprints from a sample. We demonstrated that this analysis approach can provide complementary information for improved classification of edible salts. LIBS could detect several of the minor metallic elements along with Na and Cl, while LA-ICP-MS spectra were used to measure non-metallic and trace heavy metal elements. Principal component analysis using LIBS and LA-ICP-MS spectra showed that their major spectral variations classified the sample salts in different ways. Three classification models were developed by using partial least squares-discriminant analysis based on the LIBS, LA-ICP-MS, and their fused data. From the cross-validation performances and confusion matrices of these models, the minor metallic elements (Mg, Ca, and K) detected by LIBS and the non-metallic (I) and trace heavy metal (Ba, W, and Pb) elements detected by LA-ICP-MS provided complementary chemical information to distinguish particular salt samples.

  19. An electronic nose for reliable measurement and correct classification of beverages.

    PubMed

    Mamat, Mazlina; Samad, Salina Abdul; Hannan, Mahammad A

    2011-01-01

    This paper reports the design of an electronic nose (E-nose) prototype for reliable measurement and correct classification of beverages. The prototype was developed and fabricated in the laboratory using commercially available metal oxide gas sensors and a temperature sensor. The repeatability, reproducibility and discriminative ability of the developed E-nose prototype were tested on odors emanating from different beverages such as blackcurrant juice, mango juice and orange juice. Repeated measurements of three beverages showed very high correlation (r > 0.97) between the same beverages to verify the repeatability. The prototype also produced highly correlated patterns (r > 0.97) in the measurement of beverages using different sensor batches to verify its reproducibility. The E-nose prototype also possessed good discriminative ability whereby it was able to produce different patterns for different beverages, different milk heat treatments (ultra high temperature, pasteurization) and fresh and spoiled milks. The discriminative ability of the E-nose was evaluated using Principal Component Analysis and a Multi Layer Perceptron Neural Network, with both methods showing good classification results.

  20. An Electronic Nose for Reliable Measurement and Correct Classification of Beverages

    PubMed Central

    Mamat, Mazlina; Samad, Salina Abdul; Hannan, Mahammad A.

    2011-01-01

    This paper reports the design of an electronic nose (E-nose) prototype for reliable measurement and correct classification of beverages. The prototype was developed and fabricated in the laboratory using commercially available metal oxide gas sensors and a temperature sensor. The repeatability, reproducibility and discriminative ability of the developed E-nose prototype were tested on odors emanating from different beverages such as blackcurrant juice, mango juice and orange juice. Repeated measurements of three beverages showed very high correlation (r > 0.97) between the same beverages to verify the repeatability. The prototype also produced highly correlated patterns (r > 0.97) in the measurement of beverages using different sensor batches to verify its reproducibility. The E-nose prototype also possessed good discriminative ability whereby it was able to produce different patterns for different beverages, different milk heat treatments (ultra high temperature, pasteurization) and fresh and spoiled milks. The discriminative ability of the E-nose was evaluated using Principal Component Analysis and a Multi Layer Perceptron Neural Network, with both methods showing good classification results. PMID:22163964

  1. Observation versus classification in supervised category learning.

    PubMed

    Levering, Kimery R; Kurtz, Kenneth J

    2015-02-01

    The traditional supervised classification paradigm encourages learners to acquire only the knowledge needed to predict category membership (a discriminative approach). An alternative that aligns with important aspects of real-world concept formation is learning with a broader focus to acquire knowledge of the internal structure of each category (a generative approach). Our work addresses the impact of a particular component of the traditional classification task: the guess-and-correct cycle. We compare classification learning to a supervised observational learning task in which learners are shown labeled examples but make no classification response. The goals of this work sit at two levels: (1) testing for differences in the nature of the category representations that arise from two basic learning modes; and (2) evaluating the generative/discriminative continuum as a theoretical tool for understanding learning modes and their outcomes. Specifically, we view the guess-and-correct cycle as consistent with a more discriminative approach and therefore expected it to lead to narrower category knowledge. Across two experiments, the observational mode led to greater sensitivity to distributional properties of features and correlations between features. We conclude that a relatively subtle procedural difference in supervised category learning substantially impacts what learners come to know about the categories. The results demonstrate the value of the generative/discriminative continuum as a tool for advancing the psychology of category learning and also provide a valuable constraint for formal models and associated theories.

  2. Discrimination of Oil Slicks and Lookalikes in Polarimetric SAR Images Using CNN.

    PubMed

    Guo, Hao; Wu, Danni; An, Jubai

    2017-08-09

    Oil slicks and lookalikes (e.g., plant oil and oil emulsion) all appear as dark areas in polarimetric Synthetic Aperture Radar (SAR) images and are highly heterogeneous, so it is very difficult to use a single feature that can allow classification of dark objects in polarimetric SAR images as oil slicks or lookalikes. We established multi-feature fusion to support the discrimination of oil slicks and lookalikes. In the paper, simple discrimination analysis is used to rationalize a preferred features subset. The features analyzed include entropy, alpha, and Single-bounce Eigenvalue Relative Difference (SERD) in the C-band polarimetric mode. We also propose a novel SAR image discrimination method for oil slicks and lookalikes based on Convolutional Neural Network (CNN). The regions of interest are selected as the training and testing samples for CNN on the three kinds of polarimetric feature images. The proposed method is applied to a training data set of 5400 samples, including 1800 crude oil, 1800 plant oil, and 1800 oil emulsion samples. In the end, the effectiveness of the method is demonstrated through the analysis of some experimental results. The classification accuracy obtained using 900 samples of test data is 91.33%. It is here observed that the proposed method not only can accurately identify the dark spots on SAR images but also verify the ability of the proposed algorithm to classify unstructured features.

  3. Discrimination of Oil Slicks and Lookalikes in Polarimetric SAR Images Using CNN

    PubMed Central

    An, Jubai

    2017-01-01

    Oil slicks and lookalikes (e.g., plant oil and oil emulsion) all appear as dark areas in polarimetric Synthetic Aperture Radar (SAR) images and are highly heterogeneous, so it is very difficult to use a single feature that can allow classification of dark objects in polarimetric SAR images as oil slicks or lookalikes. We established multi-feature fusion to support the discrimination of oil slicks and lookalikes. In the paper, simple discrimination analysis is used to rationalize a preferred features subset. The features analyzed include entropy, alpha, and Single-bounce Eigenvalue Relative Difference (SERD) in the C-band polarimetric mode. We also propose a novel SAR image discrimination method for oil slicks and lookalikes based on Convolutional Neural Network (CNN). The regions of interest are selected as the training and testing samples for CNN on the three kinds of polarimetric feature images. The proposed method is applied to a training data set of 5400 samples, including 1800 crude oil, 1800 plant oil, and 1800 oil emulsion samples. In the end, the effectiveness of the method is demonstrated through the analysis of some experimental results. The classification accuracy obtained using 900 samples of test data is 91.33%. It is here observed that the proposed method not only can accurately identify the dark spots on SAR images but also verify the ability of the proposed algorithm to classify unstructured features. PMID:28792477

  4. Nonlinear features for classification and pose estimation of machined parts from single views

    NASA Astrophysics Data System (ADS)

    Talukder, Ashit; Casasent, David P.

    1998-10-01

    A new nonlinear feature extraction method is presented for classification and pose estimation of objects from single views. The feature extraction method is called the maximum representation and discrimination feature (MRDF) method. The nonlinear MRDF transformations to use are obtained in closed form, and offer significant advantages compared to nonlinear neural network implementations. The features extracted are useful for both object discrimination (classification) and object representation (pose estimation). We consider MRDFs on image data, provide a new 2-stage nonlinear MRDF solution, and show it specializes to well-known linear and nonlinear image processing transforms under certain conditions. We show the use of MRDF in estimating the class and pose of images of rendered solid CAD models of machine parts from single views using a feature-space trajectory neural network classifier. We show new results with better classification and pose estimation accuracy than are achieved by standard principal component analysis and Fukunaga-Koontz feature extraction methods.

  5. Racial discrimination and cortisol output: A meta-analysis.

    PubMed

    Korous, Kevin M; Causadias, José M; Casper, Deborah M

    2017-11-01

    Although the relation between stress and physiology is well documented, attempts at understanding the link between racial discrimination and cortisol output, specifically, have produced mixed results, likely due to study characteristics such as racial/ethnic composition of the samples (e.g., African American, Latino), measures of discrimination, and research design (e.g., cross-sectional, experimental). To estimate the overall association between racial discrimination and cortisol output among racial/ethnic minority individuals and to determine if the association between racial discrimination and cortisol output is moderated by age, race/ethnicity, type of discrimination measure, sex, and research design. Using a random effects model, the overall effect size based on k = 16 studies (19% unpublished) and N = 1506 participants was r¯ = 0.040, 95% CI = -0.038 to 0.117. Studies were conducted predominantly in the U.S. (81%). Notably, experimental studies (r¯ = 0.267) exhibited larger effect sizes compared to non-experimental studies (r¯ = -0.007). Age, race/ethnicity, type of discrimination measure, and sex did not moderate the effect sizes. This meta-analysis provides evidence that the measurement of the association between racial discrimination and cortisol is complex, and it offers valuable insight regarding methods and designs that can inform future research on this topic. Limitations and future directions are discussed. Copyright © 2017 Elsevier Ltd. All rights reserved.
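
    The pooling step described in this abstract can be sketched as follows. This is a minimal illustration, assuming correlations are combined on the Fisher z scale with the DerSimonian-Laird estimator; the abstract does not state which random-effects estimator the authors used. The function name `pool_correlations` and the input values are hypothetical, not taken from the study.

```python
import math

def pool_correlations(rs, ns):
    """Pool study correlations with a DerSimonian-Laird random-effects model.

    Assumption: Fisher z scale, sampling variance 1 / (n - 3) per study.
    """
    zs = [math.atanh(r) for r in rs]
    vs = [1.0 / (n - 3) for n in ns]
    w = [1.0 / v for v in vs]
    # Fixed-effect mean, needed for Cochran's Q.
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (len(rs) - 1)) / c)
    # Random-effects weights add tau2 to each study variance.
    w_re = [1.0 / (v + tau2) for v in vs]
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    ci = (math.tanh(z_re - 1.96 * se), math.tanh(z_re + 1.96 * se))
    return math.tanh(z_re), ci, tau2

# Hypothetical study-level correlations and sample sizes.
rs = [0.25, -0.05, 0.10, 0.02]
ns = [40, 120, 85, 200]
r_bar, ci, tau2 = pool_correlations(rs, ns)
```

    A moderator comparison such as experimental versus non-experimental designs could be approximated by calling the same function separately on each subgroup of studies.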

  6. Partial Least Squares for Discrimination in fMRI Data

    PubMed Central

    Andersen, Anders H.; Rayens, William S.; Liu, Yushu; Smith, Charles D.

    2011-01-01

    Multivariate methods for discrimination were used in the comparison of brain activation patterns between groups of cognitively normal women who are at either high or low Alzheimer's disease risk based on family history and apolipoprotein-E4 status. Linear discriminant analysis (LDA) was preceded by dimension reduction using either principal component analysis (PCA), partial least squares (PLS), or a new oriented partial least squares (OrPLS) method. The aim was to identify a spatial pattern of functionally connected brain regions that was differentially expressed by the risk groups and yielded optimal classification accuracy. Multivariate dimension reduction is required prior to LDA when the data contains more feature variables than there are observations on individual subjects. Whereas PCA has been commonly used to identify covariance patterns in neuroimaging data, this approach only identifies gross variability and is not capable of distinguishing among-groups from within-groups variability. PLS and OrPLS provide a more focused dimension reduction by incorporating information on class structure and therefore lead to more parsimonious models for discrimination. Performance was evaluated in terms of the cross-validated misclassification rates. The results support the potential of using fMRI as an imaging biomarker or diagnostic tool to discriminate individuals with disease or high risk. PMID:22227352

  7. Discriminant functions for sex estimation of modern Japanese skulls.

    PubMed

    Ogawa, Yoshinori; Imaizumi, Kazuhiko; Miyasaka, Sachio; Yoshino, Mineo

    2013-05-01

    The purpose of this study is to generate a set of discriminant functions in order to estimate the sex of modern Japanese skulls. To conduct the analysis, the anthropological measurement data of 113 individuals (73 males and 40 females) were collected from recent forensic anthropological test records at the National Research Institute of Police Science, Japan. Birth years of the individuals ranged from 1926 to 1979, and age at death was over 19 years for all individuals. A total of 10 anthropological measurements were used in the discriminant function analysis: maximum cranial length, cranial base length, maximum cranial breadth, maximum frontal breadth, basion-bregmatic height, upper facial breadth, bizygomatic breadth, bicondylar breadth, bigonial breadth, and ramal height. As a result, nine discriminant functions were established. The classification accuracy ranged from 79.0 to 89.9% when the measurements of the 113 individuals were substituted into the established functions, from 77.8 to 88.1% when a leave-one-out cross-validation procedure was applied to the data, and from 86.7 to 93.0% when the measurements of 50 new individuals (25 males and 25 females), unrelated to the establishment of the discriminant functions, were used. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
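
    The difference between substituting the original measurements into a function and leave-one-out cross-validation can be sketched for a single measurement. This is an illustrative simplification, assuming equal priors and equal variances so that the discriminant reduces to the midpoint between the class means; the measurement values and all function names are hypothetical and do not reproduce the study's nine functions.

```python
from statistics import mean

def fit_cutoff(xs, labels):
    """Midpoint between class means: a one-variable, two-class linear
    discriminant under equal priors and equal variances (assumed)."""
    m_mean = mean(x for x, y in zip(xs, labels) if y == "M")
    f_mean = mean(x for x, y in zip(xs, labels) if y == "F")
    return (m_mean + f_mean) / 2.0, m_mean > f_mean

def predict(x, cutoff, male_is_larger):
    """Assign the class whose mean lies on the same side of the cutoff."""
    return "M" if (x > cutoff) == male_is_larger else "F"

def resubstitution_accuracy(xs, labels):
    """Fit on all cases, then classify the same cases."""
    cutoff, larger = fit_cutoff(xs, labels)
    hits = sum(predict(x, cutoff, larger) == y for x, y in zip(xs, labels))
    return hits / len(xs)

def loo_accuracy(xs, labels):
    """Refit without case i, then classify case i; repeat for every case."""
    hits = 0
    for i in range(len(xs)):
        cutoff, larger = fit_cutoff(xs[:i] + xs[i + 1:],
                                    labels[:i] + labels[i + 1:])
        hits += predict(xs[i], cutoff, larger) == labels[i]
    return hits / len(xs)

# Hypothetical cranial measurement (mm) for five males and five females.
xs = [182, 178, 185, 176, 180, 171, 168, 174, 179, 170]
labels = ["M"] * 5 + ["F"] * 5
resub = resubstitution_accuracy(xs, labels)
loo = loo_accuracy(xs, labels)
```

    Because each held-out case plays no part in fitting its own cutoff, the leave-one-out figure is usually the less optimistic of the two, which matches the ordering of the ranges reported above.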

  8. Fuzziness-based active learning framework to enhance hyperspectral image classification performance for discriminative and generative classifiers

    PubMed Central

    2018-01-01

    Hyperspectral image classification with a limited number of training samples without loss of accuracy is desirable, as collecting such data is often expensive and time-consuming. However, classifiers trained with limited samples usually end up with a large generalization error. To overcome this problem, we propose a fuzziness-based active learning framework (FALF), in which we implement the idea of selecting optimal training samples to enhance generalization performance for two different kinds of classifiers, discriminative and generative (e.g. SVM and KNN). The optimal samples are selected by first estimating the boundary of each class and then calculating the fuzziness-based distance between each sample and the estimated class boundaries. Those samples that are at smaller distances from the boundaries and have higher fuzziness are chosen as target candidates for the training set. Through detailed experimentation on three publicly available datasets, we showed that when trained with the proposed sample selection framework, both classifiers achieved higher classification accuracy and lower processing time with a small amount of training data than when the training samples were selected randomly. Our experiments demonstrate the effectiveness of our proposed method, which compares favorably with the state-of-the-art methods. PMID:29304512

  9. ASTM clustering for improving coal analysis by near-infrared spectroscopy.

    PubMed

    Andrés, J M; Bona, M T

    2006-11-15

    Multivariate analysis techniques have been applied to near-infrared (NIR) spectra of coals to investigate the relationship between nine coal properties (moisture (%), ash (%), volatile matter (%), fixed carbon (%), heating value (kcal/kg), carbon (%), hydrogen (%), nitrogen (%) and sulphur (%)) and the corresponding predictor variables. In this work, a whole set of coal samples was grouped into six more homogeneous clusters following the ASTM reference method for classification prior to the application of calibration methods to each coal set. The results obtained showed a considerable improvement in the error of determination compared with the calibration for the whole sample set. For some groups, the established calibrations approached the quality required by the ASTM/ISO norms for laboratory analysis. To predict property values for a new coal sample, that sample must first be assigned to its respective group. Thus, the discrimination and classification ability of coal samples by Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS) in the NIR range was also studied by applying Soft Independent Modelling of Class Analogy (SIMCA) and Linear Discriminant Analysis (LDA) techniques. Modelling of the groups by SIMCA led to overlapping models that cannot discriminate for unique classification. On the other hand, the application of Linear Discriminant Analysis improved the classification of the samples but not enough to be satisfactory for every group considered.

  10. Multispectral imaging burn wound tissue classification system: a comparison of test accuracies between several common machine learning algorithms

    NASA Astrophysics Data System (ADS)

    Squiers, John J.; Li, Weizhi; King, Darlene R.; Mo, Weirong; Zhang, Xu; Lu, Yang; Sellke, Eric W.; Fan, Wensheng; DiMaio, J. Michael; Thatcher, Jeffrey E.

    2016-03-01

    The clinical judgment of expert burn surgeons is currently the standard on which diagnostic and therapeutic decision-making regarding burn injuries is based. Multispectral imaging (MSI) has the potential to increase the accuracy of burn depth assessment and the intraoperative identification of viable wound bed during surgical debridement of burn injuries. A highly accurate classification model must be developed using machine-learning techniques in order to translate MSI data into clinically relevant information. An animal burn model was developed to build an MSI training database and to study the burn tissue classification ability of several models trained via common machine-learning algorithms. The algorithms tested, from least to most complex, were: K-nearest neighbors (KNN), decision tree (DT), linear discriminant analysis (LDA), weighted linear discriminant analysis (W-LDA), quadratic discriminant analysis (QDA), ensemble linear discriminant analysis (EN-LDA), ensemble K-nearest neighbors (EN-KNN), and ensemble decision tree (EN-DT). After the ground-truth database of six tissue types (healthy skin, wound bed, blood, hyperemia, partial injury, full injury) was generated by histopathological analysis, we used 10-fold cross validation to compare the algorithms' performances based on their accuracies in classifying data against the ground truth, and each algorithm was tested 100 times. The mean test accuracies of the algorithms were KNN 68.3%, DT 61.5%, LDA 70.5%, W-LDA 68.1%, QDA 68.9%, EN-LDA 56.8%, EN-KNN 49.7%, and EN-DT 36.5%. LDA had the highest test accuracy, reflecting the bias-variance tradeoff over the range of complexities inherent to the algorithms tested. Several algorithms were able to match the current standard in burn tissue classification, the clinical judgment of expert burn surgeons. These results will guide further development of an MSI burn tissue classification system. Given that there are few surgeons and facilities specializing in burn care

  11. Discrimination of rectal cancer through human serum using surface-enhanced Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Li, Xiaozhou; Yang, Tianyue; Li, Siqi; Zhang, Su; Jin, Lili

    2015-05-01

    In this paper, surface-enhanced Raman spectroscopy (SERS) was used to detect the changes in blood serum components that accompany rectal cancer. The differences in serum SERS data between rectal cancer patients and healthy controls were examined. Postoperative rectal cancer patients also participated in the comparison to monitor the effects of cancer treatments. The results show that there are significant variations at certain wavenumbers which indicates alteration of corresponding biological substances. Principal component analysis (PCA) and parameters of intensity ratios were used on the original SERS spectra for the extraction of featured variables. These featured variables then underwent linear discriminant analysis (LDA) and classification and regression tree (CART) for the discrimination analysis. Accuracies of 93.5 and 92.4 % were obtained for PCA-LDA and parameter-CART, respectively.

  12. A Retrospective Analysis of Pressure Ulcer Incidence and Modified Braden Scale Score Risk Classifications.

    PubMed

    Chen, Hong-Lin; Cao, Ying-Juan; Wang, Jing; Huai, Bao-Sha

    2015-09-01

    The Braden Scale is the most widely used pressure ulcer risk assessment in the world, but the currently used 5 risk classification groups do not accurately discriminate among their risk categories. To optimize risk classification based on Braden Scale scores, a retrospective analysis of all consecutively admitted patients in an acute care facility who were at risk for pressure ulcer development was performed between January 2013 and December 2013. Predicted pressure ulcer incidence first was calculated by logistic regression model based on original Braden score. Risk classification then was modified based on the predicted pressure ulcer incidence and compared between different risk categories in the modified (3-group) classification and the traditional (5-group) classification using chi-square test. Two thousand, six hundred, twenty-five (2,625) patients (mean age 59.8 ± 16.5, range 1 month to 98 years, 1,601 of whom were men) were included in the study; 81 patients (3.1%) developed a pressure ulcer. The predicted pressure ulcer incidence ranged from 0.1% to 49.7%. When the predicted pressure ulcer incidence was greater than 10.0% (high risk), the corresponding Braden scores were less than 11; when the predicted incidence ranged from 1.0% to 10.0% (moderate risk), the corresponding Braden scores ranged from 12 to 16; and when the predicted incidence was less than 1.0% (mild risk), the corresponding Braden scores were greater than 17. In the modified classification, observed pressure ulcer incidence was significantly different between each of the 3 risk categories (P less than 0.05). However, in the traditional classification, the observed incidence was not significantly different between the high-risk category and moderate-risk category (P greater than 0.05) and between the mild-risk category and no-risk category (P greater than 0.05). If future studies confirm the validity of these findings, pressure ulcer prevention protocols of care based on Braden Scale scores can

  13. Sex determination from the talus in a contemporary Greek population using discriminant function analysis.

    PubMed

    Peckmann, Tanya R; Orr, Kayla; Meek, Susan; Manolis, Sotiris K

    2015-07-01

    The determination of sex is an important part of building the biological profile for unknown human remains. Many of the bones traditionally used for the determination of sex are often found fragmented or incomplete in forensic and archaeological cases. The goal of the present research was to derive discriminant function equations from the talus, a preservationally favoured bone, for sexing skeletons from a contemporary Greek population. Nine parameters were measured on 182 individuals (96 males and 86 females) from the University of Athens Human Skeletal Reference Collection. The individuals ranged in age from 20 to 99 years old. The statistical analyses showed that all measured parameters were sexually dimorphic. Discriminant function score equations were generated for use in sex determination. The average accuracy of sex classification ranged from 65.2% to 93.4% for the univariate analysis, 90%-96.5% for the direct method and 86.7% for the stepwise method. Comparisons to other populations were made. Overall, the cross-validated accuracies ranged from 65.5% to 83.2% and males were most often correctly identified. The talus was shown to be useful for sex determination in the modern Greek population. Copyright © 2015 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.

  14. Discrimination surfaces with application to region-specific brain asymmetry analysis.

    PubMed

    Martos, Gabriel; de Carvalho, Miguel

    2018-05-20

    Discrimination surfaces are here introduced as a diagnostic tool for localizing brain regions where discrimination between diseased and nondiseased participants is higher. To estimate discrimination surfaces, we introduce a Mann-Whitney type of statistic for random fields and present large-sample results characterizing its asymptotic behavior. Simulation results demonstrate that our estimator accurately recovers the true surface and corresponding interval of maximal discrimination. The empirical analysis suggests that in the anterior region of the brain, schizophrenic patients tend to present lower local asymmetry scores in comparison with participants in the control group. Copyright © 2018 John Wiley & Sons, Ltd.
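
    The classical two-sample statistic that this abstract generalizes can be sketched as below. This is only the ordinary Mann-Whitney estimate applied independently at each location; it is not the authors' random-field estimator, and the function names, region count, and asymmetry scores are hypothetical.

```python
def mann_whitney_auc(controls, cases):
    """Estimate P(control < case) + 0.5 * P(tie) from two samples."""
    wins = 0.0
    for x in controls:
        for y in cases:
            if x < y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(controls) * len(cases))

def discrimination_by_location(control_maps, case_maps):
    """Apply the statistic separately at each location index.

    Each map is a list of scores, one per location (assumed layout).
    """
    n_locations = len(control_maps[0])
    return [
        mann_whitney_auc([m[j] for m in control_maps],
                         [m[j] for m in case_maps])
        for j in range(n_locations)
    ]

# Hypothetical asymmetry scores at two locations for three subjects per group.
control_maps = [[0.2, 0.5], [0.4, 0.5], [0.5, 0.6]]
case_maps = [[0.3, 0.5], [0.6, 0.4], [0.7, 0.6]]
surface = discrimination_by_location(control_maps, case_maps)
```

    Values near 0.5 indicate locations where the two groups are hard to tell apart, while values near 0 or 1 indicate strong separation in one direction or the other.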

  15. Differentiating sex and species of Western Grebes (Aechmophorus occidentalis) and Clark's Grebes (Aechmophorus clarkii) and their eggs using external morphometrics and discriminant function analysis

    USGS Publications Warehouse

    Hartman, C. Alex; Ackerman, Joshua T.; Eagles-Smith, Collin A.; Herzog, Mark

    2016-01-01

    In birds where males and females are similar in size and plumage, sex determination by alternative means is necessary. Discriminant function analysis based on external morphometrics was used to distinguish males from females in two closely related species: Western Grebe (Aechmophorus occidentalis) and Clark's Grebe (A. clarkii). Additionally, discriminant function analysis was used to evaluate morphometric divergence between Western and Clark's grebe adults and eggs. Aechmophorus grebe adults (n = 576) and eggs (n = 130) were sampled across 29 lakes and reservoirs throughout California, USA, and adult sex was determined using molecular analysis. Both Western and Clark's grebes exhibited considerable sexual size dimorphism. Males averaged 6–26% larger than females among seven morphological measurements, with the greatest sexual size dimorphism occurring for bill morphometrics. Discriminant functions based on bill length, bill depth, and short tarsus length correctly assigned sex to 98% of Western Grebes, and a function based on bill length and bill depth correctly assigned sex to 99% of Clark's Grebes. Further, a simplified discriminant function based only on bill depth correctly assigned sex to 96% of Western Grebes and 98% of Clark's Grebes. In contrast, external morphometrics were not suitable for differentiating between Western and Clark's grebe adults or their eggs, with correct classification rates of discriminant functions of only 60%, 63%, and 61% for adult males, adult females, and eggs, respectively. Our results indicate little divergence in external morphology between species of Aechmophorus grebes, and instead separation is much greater between males and females.

  16. Diagnosis and Classification of 17 Diseases from 1404 Subjects via Pattern Analysis of Exhaled Molecules

    PubMed Central

    2016-01-01

    We report on an artificially intelligent nanoarray based on molecularly modified gold nanoparticles and a random network of single-walled carbon nanotubes for noninvasive diagnosis and classification of a number of diseases from exhaled breath. The performance of this artificially intelligent nanoarray was clinically assessed on breath samples collected from 1404 subjects having one of 17 different disease conditions included in the study or having no evidence of any disease (healthy controls). Blind experiments showed that 86% accuracy could be achieved with the artificially intelligent nanoarray, allowing both detection and discrimination between the different disease conditions examined. Analysis of the artificially intelligent nanoarray also showed that each disease has its own unique breathprint, and that the presence of one disease would not screen out others. Cluster analysis showed a reasonable classification power of diseases from the same categories. The effect of confounding clinical and environmental factors on the performance of the nanoarray did not significantly alter the obtained results. The diagnosis and classification power of the nanoarray was also validated by an independent analytical technique, i.e., gas chromatography linked with mass spectrometry. This analysis found that 13 exhaled chemical species, called volatile organic compounds, are associated with certain diseases, and the composition of this assembly of volatile organic compounds differs from one disease to another. Overall, these findings could contribute to one of the most important criteria for successful health intervention in the modern era, viz. easy-to-use, inexpensive (affordable), and miniaturized tools that could also be used for personalized screening, diagnosis, and follow-up of a number of diseases, which can clearly be extended by further development. PMID:28000444

  17. Discrimination of Clover and Citrus Honeys from Egypt According to Floral Type Using Easily Assessable Physicochemical Parameters and Discriminant Analysis: An External Validation of the Chemometric Approach.

    PubMed

    Karabagias, Ioannis K; Karabournioti, Sofia

    2018-05-03

    Twenty-two honey samples, namely clover and citrus honeys, were collected from the greater Cairo area during the harvesting year 2014–2015. The main purpose of the present study was to characterize the aforementioned honey types and to investigate whether the use of easily assessable physicochemical parameters, including color attributes in combination with chemometrics, could differentiate honey floral origin. Parameters taken into account were: pH, electrical conductivity, ash, free acidity, lactonic acidity, total acidity, moisture content, total sugars (degrees Brix-°Bx), total dissolved solids and their ratio to total acidity, salinity, CIELAB color parameters, along with browning index values. Results showed that all honey samples analyzed met the European quality standards set for honey and had variations in the aforementioned physicochemical parameters depending on floral origin. Application of linear discriminant analysis showed that eight physicochemical parameters, including color, could classify Egyptian honeys according to floral origin (p < 0.05). Correct classification rate was 95.5% using the original method and 90.9% using the cross validation method. The discriminatory ability of the developed model was further validated using unknown honey samples. The overall correct classification rate was not affected. Specific physicochemical parameter analysis in combination with chemometrics has the potential to enhance the differences in floral honeys produced in a given geographical zone.

  18. Discrimination of Clover and Citrus Honeys from Egypt According to Floral Type Using Easily Assessable Physicochemical Parameters and Discriminant Analysis: An External Validation of the Chemometric Approach

    PubMed Central

    Karabournioti, Sofia

    2018-01-01

    Twenty-two honey samples, namely clover and citrus honeys, were collected from the greater Cairo area during the harvesting year 2014–2015. The main purpose of the present study was to characterize the aforementioned honey types and to investigate whether the use of easily assessable physicochemical parameters, including color attributes in combination with chemometrics, could differentiate honey floral origin. Parameters taken into account were: pH, electrical conductivity, ash, free acidity, lactonic acidity, total acidity, moisture content, total sugars (degrees Brix-°Bx), total dissolved solids and their ratio to total acidity, salinity, CIELAB color parameters, along with browning index values. Results showed that all honey samples analyzed met the European quality standards set for honey and had variations in the aforementioned physicochemical parameters depending on floral origin. Application of linear discriminant analysis showed that eight physicochemical parameters, including color, could classify Egyptian honeys according to floral origin (p < 0.05). Correct classification rate was 95.5% using the original method and 90.9% using the cross validation method. The discriminatory ability of the developed model was further validated using unknown honey samples. The overall correct classification rate was not affected. Specific physicochemical parameter analysis in combination with chemometrics has the potential to enhance the differences in floral honeys produced in a given geographical zone. PMID:29751543

  19. CARSVM: a class association rule-based classification framework and its application to gene expression data.

    PubMed

    Kianmehr, Keivan; Alhajj, Reda

    2008-09-01

    In this study, we aim at building a classification framework, namely the CARSVM model, which integrates association rule mining and support vector machine (SVM). The goal is to benefit from the advantages of both, namely the discriminative knowledge represented by class association rules and the classification power of the SVM algorithm, to construct an efficient and accurate classifier model that improves the interpretability problem of SVM as a traditional machine learning technique and overcomes the efficiency issues of associative classification algorithms. In our proposed framework, instead of using the original training set, a set of rule-based feature vectors, which are generated based on the discriminative ability of class association rules over the training samples, is presented to the learning component of the SVM algorithm. We show that rule-based feature vectors present a high-quality source of discrimination knowledge that can substantially impact the prediction power of SVM and associative classification techniques. They also make the model easier for users to understand and interpret. We have used four datasets from the UCI ML repository to evaluate the performance of the developed system in comparison with five well-known existing classification methods. Because of the importance and popularity of gene expression analysis as a real-world application of the classification model, we present an extension of CARSVM combined with feature selection to be applied to gene expression data. Then, we describe how this combination will provide biologists with an efficient and understandable classifier model. The reported test results and their biological interpretation demonstrate the applicability, efficiency and effectiveness of the proposed model. From the results, it can be concluded that a considerable increase in classification accuracy can be obtained when the rule-based feature vectors are integrated in the learning process of the SVM

  20. Improved discrimination between monocotyledonous and dicotyledonous plants for weed control based on the blue-green region of ultraviolet-induced fluorescence spectra.

    PubMed

    Panneton, Bernard; Guillaume, Serge; Roger, Jean-Michel; Samson, Guy

    2010-01-01

    Precision weeding by spot spraying in real time requires sensors to discriminate between weeds and crop without contact. Among the optical based solutions, the ultraviolet (UV) induced fluorescence of the plants appears as a promising alternative. In a first paper, the feasibility of discriminating between corn hybrids, monocotyledonous, and dicotyledonous weeds was demonstrated on the basis of the complete spectra. Some considerations about the different sources of fluorescence oriented the focus to the blue-green fluorescence (BGF) part, ignoring the chlorophyll fluorescence that is inherently more variable in time. This paper investigates the potential of performing weed/crop discrimination on the basis of several large spectral bands in the BGF area. A partial least squares discriminant analysis (PLS-DA) was performed on a set of 1908 spectra of corn and weed plants over 3 years and various growing conditions. The discrimination between monocotyledonous and dicotyledonous plants based on the blue-green fluorescence yielded robust models (classification error between 1.3 and 4.6% for between-year validation). On the basis of the analysis of the PLS-DA model, two large bands were chosen in the blue-green fluorescence zone (400-425 nm and 425-490 nm). A linear discriminant analysis based on the signal from these two bands also provided very robust inter-year results (classification error from 1.5% to 5.2%). The same selection process was applied to discriminate between monocotyledonous weeds and maize but yielded no robust models (up to 50% inter-year error). Further work will be required to solve this problem and provide a complete UV fluorescence based sensor for weed-maize discrimination.
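
    The two-band linear discriminant step mentioned in this abstract can be sketched as a two-class Fisher discriminant on two features. This is a minimal illustration under stated assumptions: the two features stand in for the integrated signals of the 400-425 nm and 425-490 nm bands, the intensity values are invented for illustration, and the function names are hypothetical rather than the authors' implementation.

```python
def fit_lda_2d(class_a, class_b):
    """Fisher's linear discriminant for two classes and two features."""
    def mean2(pts):
        n = len(pts)
        return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

    ma, mb = mean2(class_a), mean2(class_b)
    # Pooled within-class scatter matrix [[sxx, sxy], [sxy, syy]].
    # It is left unscaled because scaling does not change the direction.
    sxx = sxy = syy = 0.0
    for pts, m in ((class_a, ma), (class_b, mb)):
        for x, y in pts:
            dx, dy = x - m[0], y - m[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    det = sxx * syy - sxy * sxy
    # Direction w = S^-1 (ma - mb), using the closed-form 2x2 inverse.
    d0, d1 = ma[0] - mb[0], ma[1] - mb[1]
    w = ((syy * d0 - sxy * d1) / det, (-sxy * d0 + sxx * d1) / det)
    # Threshold at the projection of the midpoint between the class means.
    mid = (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1])) / 2.0
    return w, mid

def classify(point, w, mid):
    """Return 'a' if the projection falls on class a's side of the threshold."""
    return "a" if w[0] * point[0] + w[1] * point[1] > mid else "b"

# Hypothetical (band 1, band 2) intensities for two plant groups.
monocots = [(1.0, 2.0), (1.2, 2.3), (0.9, 1.8), (1.1, 2.4)]
dicots = [(2.0, 1.0), (2.2, 1.3), (1.9, 0.8), (2.3, 1.1)]
w, mid = fit_lda_2d(monocots, dicots)
predictions = [classify(p, w, mid) for p in monocots + dicots]
```

    A between-year validation in the spirit of the abstract would fit `w` and `mid` on spectra from one season and call `classify` on spectra from another.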

  1. Comparison of Radio Frequency Distinct Native Attribute and Matched Filtering Techniques for Device Discrimination and Operation Identification

    DTIC Science & Technology

    identification. URE from ten MSP430F5529 16-bit microcontrollers were analyzed using: 1) RF distinct native attributes (RF-DNA) fingerprints paired with multiple...discriminant analysis/maximum likelihood (MDA/ML) classification, 2) RF-DNA fingerprints paired with generalized relevance learning vector quantized

  2. Classification of Stellar Spectra with Fuzzy Minimum Within-Class Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Zhong-bao, Liu; Wen-ai, Song; Jing, Zhang; Wen-juan, Zhao

    2017-06-01

    Classification is one of the important tasks in astronomy, especially in spectra analysis. Support Vector Machine (SVM) is a typical classification method, which is widely used in spectra classification. Although it performs well in practice, its classification accuracy cannot be greatly improved because of two limitations. One is that it does not take the distribution of the classes into consideration. The other is that it is sensitive to noise. In order to solve the above problems, inspired by the maximization of the Fisher's Discriminant Analysis (FDA) and the SVM separability constraints, fuzzy minimum within-class support vector machine (FMWSVM) is proposed in this paper. In FMWSVM, the distribution of the classes is reflected by the within-class scatter in FDA and the fuzzy membership function is introduced to decrease the influence of the noise. The comparative experiments with SVM on the SDSS datasets verify the effectiveness of the proposed classifier FMWSVM.

  3. Declining Bias and Gender Wage Discrimination? A Meta-Regression Analysis

    ERIC Educational Resources Information Center

    Jarrell, Stephen B.; Stanley, T. D.

    2004-01-01

    The meta-regression analysis reveals a strong tendency for discrimination estimates to fall, although wage discrimination against women still exists. The biasing effects of researchers' gender and of not correcting for selection bias have weakened, and changes in the labor market have made them less important.

  4. Chemical discrimination of lubricant marketing types using direct analysis in real time time-of-flight mass spectrometry.

    PubMed

    Maric, Mark; Harvey, Lauren; Tomcsak, Maren; Solano, Angelique; Bridge, Candice

    2017-06-30

    In comparison to other violent crimes, sexual assaults suffer from very low prosecution and conviction rates, especially in the absence of DNA evidence. As a result, the forensic community needs to utilize other forms of trace contact evidence, like lubricant evidence, in order to provide a link between the victim and the assailant. In this study, 90 personal bottled and condom lubricants from the three main marketing types, silicone-based, water-based and condoms, were characterized by direct analysis in real time time-of-flight mass spectrometry (DART-TOFMS). The instrumental data was analyzed by multivariate statistics including hierarchical cluster analysis, principal component analysis, and linear discriminant analysis. By interpreting the mass spectral data with multivariate statistics, 12 discrete groupings were identified, indicating inherent chemical diversity not only between but within the three main marketing groups. A number of unique chemical markers, both major and minor, were identified, other than the three main chemical components (i.e. PEG, PDMS and nonoxynol-9) currently used for lubricant classification. The data was validated by a stratified 20% withheld cross-validation which demonstrated that there was minimal overlap between the groupings. Based on the groupings identified and unique features of each group, a highly discriminating statistical model was then developed that aims to provide the foundation for the development of a forensic lubricant database that may eventually be applied to casework. Copyright © 2017 John Wiley & Sons, Ltd.

  5. Multidimensional classification of magma types for altered igneous rocks and application to their tectonomagmatic discrimination and igneous provenance of siliciclastic sediments

    NASA Astrophysics Data System (ADS)

    Verma, Surendra P.; Rivera-Gómez, M. Abdelaly; Díaz-González, Lorena; Pandarinath, Kailasa; Amezcua-Valdez, Alejandra; Rosales-Rivera, Mauricio; Verma, Sanjeet K.; Quiroz-Ruiz, Alfredo; Armstrong-Altrin, John S.

    2017-05-01

    A new multidimensional scheme consistent with the International Union of Geological Sciences (IUGS) is proposed for the classification of igneous rocks in terms of four magma types: ultrabasic, basic, intermediate, and acid. Our procedure is based on an extensive database of major element composition of a total of 33,868 relatively fresh rock samples having a multinormal distribution (initial database with 37,215 samples). Multinormally distributed database in terms of log-ratios of samples was ascertained by a new computer program DOMuDaF, in which the discordancy test was applied at the 99.9% confidence level. Isometric log-ratio (ilr) transformation was used to provide overall percent correct classification of 88.7%, 75.8%, 88.0%, and 80.9% for ultrabasic, basic, intermediate, and acid rocks, respectively. Given the known mathematical and uncertainty propagation properties, this transformation could be adopted for routine applications. The incorrect classification was mainly for the "neighbour" magma types, e.g., basic for ultrabasic and vice versa. Some of these misclassifications do not have any effect on multidimensional tectonic discrimination. For an efficient application of this multidimensional scheme, a new computer program MagClaMSys_ilr (MagClaMSys-Magma Classification Major-element based System) was written, which is available for on-line processing on http://tlaloc.ier.unam.mx/index.html. This classification scheme was tested from newly compiled data for relatively fresh Neogene igneous rocks and was found to be consistent with the conventional IUGS procedure. The new scheme was successfully applied to inter-laboratory data for three geochemical reference materials (basalts JB-1 and JB-1a, and andesite JA-3) from Japan and showed that the inferred magma types are consistent with the rock name (basic for basalts JB-1 and JB-1a and intermediate for andesite JA-3). The scheme was also successfully applied to five case studies of older Archaean to

  6. Recurrence quantification analysis and support vector machines for golf handicap and low back pain EMG classification.

    PubMed

    Silva, Luís; Vaz, João Rocha; Castro, Maria António; Serranho, Pedro; Cabri, Jan; Pezarat-Correia, Pedro

    2015-08-01

    The quantification of non-linear characteristics of electromyography (EMG) must contain information allowing to discriminate neuromuscular strategies during dynamic skills. There is a lack of studies about muscle coordination under motor constraints during dynamic contractions. In golf, both handicap (Hc) and low back pain (LBP) are the main factors associated with the occurrence of injuries. The aim of this study was to analyze the accuracy of support vector machines (SVM) on EMG-based classification to discriminate Hc (low and high handicap) and LBP (with and without LBP) in the main phases of golf swing. For this purpose recurrence quantification analysis (RQA) features of the trunk and the lower limb muscles were used to feed a SVM classifier. Recurrence rate (RR) and the ratio between determinism (DET) and RR showed a high discriminant power. The Hc accuracy for the swing, backswing, and downswing were 94.4±2.7%, 97.1±2.3%, and 95.3±2.6%, respectively. For LBP, the accuracy was 96.9±3.8% for the swing, and 99.7±0.4% in the backswing. External oblique (EO), biceps femoris (BF), semitendinosus (ST) and rectus femoris (RF) showed high accuracy depending on the laterality within the phase. RQA features and SVM showed a high muscle discriminant capacity within swing phases by Hc and by LBP. Low back pain golfers showed different neuromuscular coordination strategies when compared with asymptomatic golfers. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. [Research on Rapid Discrimination of Edible Oil by ATR Infrared Spectroscopy].

    PubMed

    Ma, Xiao; Yuan, Hong-fu; Song, Chun-feng; Hu, Ai-qin; Li, Xiao-yu; Zhao, Zhong; Li, Xiu-qin; Guo, Zhen; Zhu, Zhi-qiang

    2015-07-01

    A rapid method for discriminating edible oils, the KL-BP model, is proposed based on attenuated total reflectance (ATR) infrared spectroscopy. The model uses KL to extract classification features from the source data while reducing the data dimension. A neural network model is then built using the reduced data as its input. Eighty-four edible oil samples, including sesame oil, corn oil, canola oil, blend oil, sunflower oil, peanut oil, olive oil, soybean oil and tea seed oil, were collected and their infrared spectra were measured using an ATR FT-IR spectrometer. To compare method performance, a principal component analysis (PCA) direct-classification model, a KL direct-classification model, a PLS-DA model, a PCA-BP model and the KL-BP model were constructed. The results show that the recognition rates of PCA, PCA-BP, KL, PLS-DA and KL-BP for discriminating the 9 kinds of edible oils are 59.1%, 68.2%, 77.3%, 77.3% and 90.9%, respectively. KL extracts the eigenvectors that maximize the ratio of the between-class distance to the within-class distance, so it captures much more classification information than PCA. The BP neural network effectively enhances classification ability and accuracy. By taking full advantage of KL's ability to retain more category information during dimension reduction and of the self-learning, adaptive and nonlinear properties of the BP neural network, the KL-BP method achieves the best classification ability and recognition accuracy, and is of great practical value for rapidly recognizing edible oils.

  8. The ITE Land classification: Providing an environmental stratification of Great Britain.

    PubMed

    Bunce, R G; Barr, C J; Gillespie, M K; Howard, D C

    1996-01-01

    The surface of Great Britain (GB) varies continuously in land cover from one area to another. The objective of any environmentally based land classification is to produce classes that match the patterns that are present by helping to define clear boundaries. The more appropriate the analysis and data used, the better the classes will fit the natural patterns. The observation of inter-correlations between ecological factors is the basis for interpreting ecological patterns in the field, and the Institute of Terrestrial Ecology (ITE) Land Classification formalises such subjective ideas. The data inevitably comprise a large number of factors in order to describe the environment adequately. Single factors, such as altitude, would only be useful on a national basis if they were the only dominant causative agent of ecological variation. The ITE Land Classification has defined 32 environmental categories called 'land classes', initially based on a sample of 1-km squares in Great Britain but subsequently extended to all 240 000 1-km squares. The original classification was produced using multivariate analysis of 75 environmental variables. The extension to all squares in GB was performed using a combination of logistic discrimination and discriminant functions. The classes have provided a stratification for successive ecological surveys, the results of which have characterised the classes in terms of botanical, zoological and landscape features. The classification has also been applied to integrate diverse datasets including satellite imagery, soils and socio-economic information. A variety of models have used the structure of the classification, for example to show potential land use change under different economic conditions. The principal data sets relevant for planning purposes have been incorporated into a user-friendly computer package, called the 'Countryside Information System'.

  9. A comprehensive simulation study on classification of RNA-Seq data.

    PubMed

    Zararsız, Gökmen; Goksuluk, Dincer; Korkmaz, Selcuk; Eldem, Vahap; Zararsiz, Gozde Erturk; Duru, Izzet Parug; Ozturk, Ahmet

    2017-01-01

    RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging powerful method for diagnosis, disease classification and monitoring at molecular level, as well as providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (e.g., microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data since they violate both data structure and distributional assumptions. However, it is possible to apply these algorithms with appropriate modifications to RNA-Seq data. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis (PLDA) and negative binomial linear discriminant analysis (NBLDA). Another way is to bring the data closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method on model performances. A comprehensive simulation study is conducted and the results are compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size and differential-expression rate, and decreasing the dispersion parameter and number of groups, lead to an increase in classification accuracy. As in differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion. We conclude that, as a count-based classifier, the power
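
    The count-based idea described above can be illustrated with a minimal sketch. It assumes each gene count in a class follows an independent Poisson distribution with a class-specific mean, and it omits the library-size normalization and shrinkage that a full Poisson linear discriminant analysis would normally include. The function names, the pseudo-count, and the toy data are hypothetical and not taken from the study.

```python
import math
from collections import defaultdict

def fit_poisson_classifier(counts, labels, pseudo=1.0):
    """Estimate per-class, per-gene Poisson means and class log-priors."""
    by_class = defaultdict(list)
    for row, label in zip(counts, labels):
        by_class[label].append(row)
    n_total = len(counts)
    model = {}
    for label, rows in by_class.items():
        n_genes = len(rows[0])
        # Mean count per gene; the pseudo-count keeps every mean above zero.
        means = [(sum(r[j] for r in rows) + pseudo) / (len(rows) + pseudo)
                 for j in range(n_genes)]
        model[label] = (math.log(len(rows) / n_total), means)
    return model

def predict(model, x):
    """Return the class with the largest Poisson log-posterior (up to a constant)."""
    def score(item):
        log_prior, means = item[1]
        # log P(x | class) = sum_j [x_j log(lambda_j) - lambda_j - log(x_j!)];
        # the log(x_j!) term is identical for every class, so it is dropped.
        return log_prior + sum(xj * math.log(m) - m for xj, m in zip(x, means))
    return max(model.items(), key=score)[0]

# Hypothetical toy counts: 3 genes, two classes.
train = [[30, 2, 5], [28, 1, 7], [3, 25, 6], [2, 31, 4]]
labels = ["A", "A", "B", "B"]
model = fit_poisson_classifier(train, labels)
```

    Calling `predict(model, [25, 3, 5])` assigns the sample to the class whose mean profile best explains the counts. Overdispersed data would call for a negative binomial likelihood instead, as the abstract notes.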

  10. Comparison of different classification algorithms for underwater target discrimination.

    PubMed

    Li, Donghui; Azimi-Sadjadi, Mahmood R; Robinson, Marc

    2004-01-01

    Classification of underwater targets from the acoustic backscattered signals is considered here. Several different classification algorithms are tested and benchmarked not only for their performance but also to gain insight into the properties of the feature space. Results on a wideband 80-kHz acoustic backscattered data set collected for six different objects are presented in terms of the receiver operating characteristic (ROC) and robustness of the classifiers with respect to reverberation.

  11. Classification of product inspection items using nonlinear features

    NASA Astrophysics Data System (ADS)

    Talukder, Ashit; Casasent, David P.; Lee, H.-W.

    1998-03-01

    Automated processing and classification of real-time x-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a system for automated non-invasive detection of defective product items on a conveyor belt. This approach involves two main steps: preprocessing and classification. Preprocessing locates individual items and segments ones that touch using a modified watershed algorithm. The second stage involves extraction of features that allow discrimination between damaged and clean items (pistachio nuts). This feature extraction and classification stage is the new aspect of this paper. We use a new nonlinear feature extraction scheme called the maximum representation and discriminating feature (MRDF) extraction method to compute nonlinear features that are used as inputs to a classifier. The MRDF is shown to provide better classification and a better ROC (receiver operating characteristic) curve than other methods.

  12. A Machine-Learning Algorithm Toward Color Analysis for Chronic Liver Disease Classification, Employing Ultrasound Shear Wave Elastography.

    PubMed

    Gatos, Ilias; Tsantis, Stavros; Spiliopoulos, Stavros; Karnabatidis, Dimitris; Theotokas, Ioannis; Zoumpoulis, Pavlos; Loupas, Thanasis; Hazle, John D; Kagadis, George C

    2017-09-01

    The purpose of the present study was to employ a computer-aided diagnosis system that classifies chronic liver disease (CLD) using ultrasound shear wave elastography (SWE) imaging, with a stiffness value-clustering and machine-learning algorithm. A clinical data set of 126 patients (56 healthy controls, 70 with CLD) was analyzed. First, an RGB-to-stiffness inverse mapping technique was employed. A five-cluster segmentation was then performed associating corresponding different-color regions with certain stiffness value ranges acquired from the SWE manufacturer-provided color bar. Subsequently, 35 features (7 for each cluster), indicative of physical characteristics existing within the SWE image, were extracted. A stepwise regression analysis toward feature reduction was used to derive a reduced feature subset that was fed into the support vector machine classification algorithm to classify CLD from healthy cases. The highest accuracy in classification of healthy to CLD subject discrimination from the support vector machine model was 87.3% with sensitivity and specificity values of 93.5% and 81.2%, respectively. Receiver operating characteristic curve analysis gave an area under the curve value of 0.87 (confidence interval: 0.77-0.92). A machine-learning algorithm that quantifies color information in terms of stiffness values from SWE images and discriminates CLD from healthy cases is introduced. New objective parameters and criteria for CLD diagnosis employing SWE images provided by the present study can be considered an important step toward color-based interpretation, and could assist radiologists' diagnostic performance on a daily basis after being installed in a PC and employed retrospectively, immediately after the examination. Copyright © 2017 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
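
    The three reported figures of merit are related through the confusion matrix. The sketch below shows how they are computed. The counts are hypothetical, chosen only to match the stated group sizes (70 CLD, 56 healthy), and are not the study's actual confusion matrix.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # diseased cases correctly detected
    specificity = tn / (tn + fp)   # healthy cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts: 70 CLD patients (tp + fn), 56 healthy controls (tn + fp).
sens, spec, acc = binary_metrics(tp=65, fn=5, tn=45, fp=11)
```

    Accuracy is a weighted average of sensitivity and specificity, with weights given by the class proportions, so it always lies between the two.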

  13. Chromatographic profiles of Phyllanthus aqueous extracts samples: a proposition of classification using chemometric models.

    PubMed

    Martins, Lucia Regina Rocha; Pereira-Filho, Edenir Rodrigues; Cass, Quezia Bezerra

    2011-04-01

    Taking into consideration the global analysis of complex samples, proposed by the metabolomic approach, the chromatographic fingerprint encompasses an attractive chemical characterization of herbal medicines. Thus, it can be used as a tool in quality control analysis of phytomedicines. The generated multivariate data are better evaluated by chemometric analyses, and they can be modeled by classification methods. "Stone breaker" is a popular Brazilian plant of the Phyllanthus genus, used worldwide to treat renal calculus, hepatitis, and many other diseases. In this study, gradient elution under reversed-phase conditions with detection in the ultraviolet region was used to obtain chemical profiles (fingerprints) of botanically identified samples of six Phyllanthus species. The obtained chromatograms, at 275 nm, were organized in data matrices, and the time shifts of peaks were adjusted using the Correlation Optimized Warping algorithm. Principal Component Analyses were performed to evaluate similarities among cultivated and uncultivated samples and the discrimination among the species and, after that, the samples were used to compose three classification models using Soft Independent Modeling of Class Analogy, K-Nearest Neighbor, and Partial Least Squares for Discriminant Analysis. The ability of the classification models was discussed after their successful application for authenticity evaluation of 25 commercial samples of "stone breaker."

  14. Autonomic specificity of basic emotions: evidence from pattern classification and cluster analysis.

    PubMed

    Stephens, Chad L; Christie, Israel C; Friedman, Bruce H

    2010-07-01

    Autonomic nervous system (ANS) specificity of emotion remains controversial in contemporary emotion research, and has received mixed support over decades of investigation. This study was designed to replicate and extend psychophysiological research, which has used multivariate pattern classification analysis (PCA) in support of ANS specificity. Forty-nine undergraduates (27 women) listened to emotion-inducing music and viewed affective films while a montage of ANS variables, including heart rate variability indices, peripheral vascular activity, systolic time intervals, and electrodermal activity, were recorded. Evidence for ANS discrimination of emotion was found via PCA with 44.6% of overall observations correctly classified into the predicted emotion conditions, using ANS variables (z=16.05, p<.001). Cluster analysis of these data indicated a lack of distinct clusters, which suggests that ANS responses to the stimuli were nomothetic and stimulus-specific rather than idiosyncratic and individual-specific. Collectively these results further confirm and extend support for the notion that basic emotions have distinct ANS signatures. Copyright © 2010 Elsevier B.V. All rights reserved.

  15. Effect of radiance-to-reflectance transformation and atmosphere removal on maximum likelihood classification accuracy of high-dimensional remote sensing data

    NASA Technical Reports Server (NTRS)

    Hoffbeck, Joseph P.; Landgrebe, David A.

    1994-01-01

    Many analysis algorithms for high-dimensional remote sensing data require that the remotely sensed radiance spectra be transformed to approximate reflectance to allow comparison with a library of laboratory reflectance spectra. In maximum likelihood classification, however, the remotely sensed spectra are compared to training samples, thus a transformation to reflectance may or may not be helpful. The effect of several radiance-to-reflectance transformations on maximum likelihood classification accuracy is investigated in this paper. We show that the empirical line approach, LOWTRAN7, flat-field correction, single spectrum method, and internal average reflectance are all non-singular affine transformations, and that non-singular affine transformations have no effect on discriminant analysis feature extraction and maximum likelihood classification accuracy. (An affine transformation is a linear transformation with an optional offset.) Since the Atmosphere Removal Program (ATREM) and the log residue method are not affine transformations, experiments with Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data were conducted to determine the effect of these transformations on maximum likelihood classification accuracy. The average classification accuracy of the data transformed by ATREM and the log residue method was slightly less than the accuracy of the original radiance data. Since the radiance-to-reflectance transformations allow direct comparison of remotely sensed spectra with laboratory reflectance spectra, they can be quite useful in labeling the training samples required by maximum likelihood classification, but these transformations have only a slight effect or no effect at all on discriminant analysis and maximum likelihood classification accuracy.
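
    The invariance claim can be checked directly for Gaussian maximum likelihood classification. The derivation below is a sketch that assumes each class is modeled by a sample mean and sample covariance. The symbols A, b, μ_k, Σ_k and π_k are introduced here for illustration and do not appear in the abstract.

```latex
\begin{align*}
y &= A x + b, \quad \det A \neq 0
  \;\Longrightarrow\; \mu_k' = A\mu_k + b, \quad \Sigma_k' = A \Sigma_k A^{\top}, \\
g_k(x) &= \log \pi_k - \tfrac{1}{2}\log\lvert\Sigma_k\rvert
  - \tfrac{1}{2}(x-\mu_k)^{\top}\Sigma_k^{-1}(x-\mu_k), \\
g_k'(y) &= \log \pi_k - \tfrac{1}{2}\log\lvert A\Sigma_k A^{\top}\rvert
  - \tfrac{1}{2}(x-\mu_k)^{\top}A^{\top}\,(A\Sigma_k A^{\top})^{-1}A\,(x-\mu_k) \\
  &= g_k(x) - \log\lvert\det A\rvert .
\end{align*}
```

    The offset log|det A| is the same for every class k, so the class with the largest discriminant is unchanged. This is why a non-singular affine transformation leaves the classification of every sample, and hence the accuracy, the same.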

  16. Sub-pattern based multi-manifold discriminant analysis for face recognition

    NASA Astrophysics Data System (ADS)

    Dai, Jiangyan; Guo, Changlu; Zhou, Wei; Shi, Yanjiao; Cong, Lin; Yi, Yugen

    2018-04-01

    In this paper, we present a Sub-pattern based Multi-manifold Discriminant Analysis (SpMMDA) algorithm for face recognition. Unlike the existing Multi-manifold Discriminant Analysis (MMDA) approach, which is based on holistic information of the face image for recognition, SpMMDA operates on sub-images partitioned from the original face image and then extracts the discriminative local features from the sub-images separately. Moreover, the structure information of different sub-images from the same face image is considered in the proposed method with the aim of further improving the recognition performance. Extensive experiments on three standard face databases (Extended YaleB, CMU PIE and AR) demonstrate that the proposed method is effective and outperforms some other sub-pattern based face recognition methods.

  17. Semi-supervised vibration-based classification and condition monitoring of compressors

    NASA Astrophysics Data System (ADS)

    Potočnik, Primož; Govekar, Edvard

    2017-09-01

    Semi-supervised vibration-based classification and condition monitoring of the reciprocating compressors installed in refrigeration appliances is proposed in this paper. The method addresses the problem of industrial condition monitoring where prior class definitions are often not available or difficult to obtain from local experts. The proposed method combines feature extraction, principal component analysis, and statistical analysis for the extraction of initial class representatives, and compares the capability of various classification methods, including discriminant analysis (DA), neural networks (NN), support vector machines (SVM), and extreme learning machines (ELM). The use of the method is demonstrated on a case study which was based on industrially acquired vibration measurements of reciprocating compressors during the production of refrigeration appliances. The paper presents a comparative qualitative analysis of the applied classifiers, confirming the good performance of several nonlinear classifiers. If the model parameters are properly selected, then very good classification performance can be obtained from NN trained by Bayesian regularization, SVM and ELM classifiers. The method can be effectively applied for the industrial condition monitoring of compressors.

  18. Supervised DNA Barcodes species classification: analysis, comparisons and results

    PubMed Central

    2014-01-01

    Background: Specific fragments, coming from short portions of DNA (e.g., mitochondrial, nuclear, and plastid sequences), have been defined as DNA Barcode and can be used as markers for organisms of the main life kingdoms. Species classification with DNA Barcode sequences has been proven effective on different organisms. Indeed, specific gene regions have been identified as Barcode: COI in animals, rbcL and matK in plants, and ITS in fungi. The classification problem assigns an unknown specimen to a known species by analyzing its Barcode. This task has to be supported with reliable methods and algorithms. Methods: In this work the efficacy of supervised machine learning methods to classify species with DNA Barcode sequences is shown. The Weka software suite, which includes a collection of supervised classification methods, is adopted to address the task of DNA Barcode analysis. Classifier families are tested on synthetic and empirical datasets belonging to the animal, fungus, and plant kingdoms. In particular, the function-based method Support Vector Machines (SVM), the rule-based RIPPER, the decision tree C4.5, and the Naïve Bayes method are considered. Additionally, the classification results are compared with respect to ad-hoc and well-established DNA Barcode classification methods. Results: Software that converts the DNA Barcode FASTA sequences to the Weka format is released, to adapt different input formats and to allow the execution of the classification procedure. The analysis of results on synthetic and real datasets shows that SVM and Naïve Bayes outperform on average the other considered classifiers, although they do not provide a human interpretable classification model. Rule-based methods have slightly inferior classification performances, but deliver the species specific positions and nucleotide assignments. On synthetic data the supervised machine learning methods obtain superior classification performances with respect to the traditional DNA Barcode

  19. One input-class and two input-class classifications for differentiating olive oil from other edible vegetable oils by use of the normal-phase liquid chromatography fingerprint of the methyl-transesterified fraction.

    PubMed

    Jiménez-Carvelo, Ana M; Pérez-Castaño, Estefanía; González-Casado, Antonio; Cuadros-Rodríguez, Luis

    2017-04-15

    A new method for differentiation of olive oil (independently of the quality category) from other vegetable oils (canola, safflower, corn, peanut, seeds, grapeseed, palm, linseed, sesame and soybean) has been developed. The analytical procedure for chromatographic fingerprinting of the methyl-transesterified fraction of each vegetable oil, using normal-phase liquid chromatography, is described, and the chemometric strategies applied are discussed. Some chemometric methods, such as k-nearest neighbours (kNN), partial least squares-discriminant analysis (PLS-DA), support vector machine classification analysis (SVM-C), and soft independent modelling of class analogies (SIMCA), were applied to build classification models. Performance of the classification was evaluated and ranked using several classification quality metrics. Discriminant analysis based on the use of one input-class (plus a dummy class) was applied for the first time in this study. Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. Mining for class-specific motifs in protein sequence classification

    PubMed Central

    2013-01-01

    Background: In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features indeed contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results: We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function initially harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution has resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weight to the n-grams that appear in fewer classes and vice versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences. Our method fared well compared to an existing motif finding method known as
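
    The harvesting and dampening steps described above can be sketched as follows. The abstract does not give the exact dampening formula, so an inverse-class-frequency weight is assumed here purely for illustration. The BLOSUM62-based merging of similar n-grams and the selection threshold are omitted, and all names and sequences are hypothetical.

```python
import math
from collections import Counter

def harvest_ngrams(seq, n_min=4, n_max=8):
    """Count every contiguous n-gram of length n_min..n_max in one sequence."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts

def score_ngrams(sequences_by_class):
    """Score n-grams per class, damping those that occur in many classes."""
    per_class = {}
    for label, seqs in sequences_by_class.items():
        total = Counter()
        for s in seqs:
            total.update(harvest_ngrams(s))
        per_class[label] = total
    n_classes = len(per_class)
    # Number of classes in which each n-gram appears at least once.
    class_freq = Counter(g for c in per_class.values() for g in c)
    scores = {}
    for label, counts in per_class.items():
        # Assumed dampening: log(classes / classes containing the n-gram),
        # which is zero when the n-gram occurs in every class.
        scores[label] = {g: k * math.log(n_classes / class_freq[g])
                         for g, k in counts.items()}
    return scores

# Hypothetical toy peptide sequences for two classes.
data = {"kinase": ["MKVLAAGKVLA", "GKVLAAT"],
        "protease": ["MSTPQRSTPQ", "AAGKSTPQ"]}
scores = score_ngrams(data)
```

    In this toy example the n-gram `AAGK` occurs in both classes and therefore scores zero, while `KVLA` occurs only in the first class and keeps a positive score proportional to its count.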

  1. Linear discriminant analysis with misallocation in training samples

    NASA Technical Reports Server (NTRS)

    Chhikara, R. (Principal Investigator); Mckeon, J.

    1982-01-01

    Linear discriminant analysis for a two-class case is studied in the presence of misallocation in training samples. A general approach to modeling of misallocation is formulated, and the mean vectors and covariance matrices of the mixture distributions are derived. The asymptotic distribution of the discriminant boundary is obtained and the asymptotic first two moments of the two types of error rate are given. Certain numerical results for the error rates are presented by considering the random and two non-random misallocation models. It is shown that when the allocation procedure for training samples is objectively formulated, the effect of misallocation on the error rates of the Bayes linear discriminant rule can almost be eliminated. If, however, this is not possible, the use of the Fisher rule may be preferred over the Bayes rule.

  2. Classification of CT examinations for COPD visual severity analysis

    NASA Astrophysics Data System (ADS)

    Tan, Jun; Zheng, Bin; Wang, Xingwei; Pu, Jiantao; Gur, David; Sciurba, Frank C.; Leader, J. Ken

    2012-03-01

    In this study we present a computational method for classifying CT examinations by visually assessed emphysema severity. The visual severity categories ranged from 0 to 5 and were rated by an experienced radiologist. The six categories were none, trace, mild, moderate, severe and very severe. Lung segmentation was performed for every input image and all image features were extracted from the segmented lung only. We adopted a two-level feature representation method for the classification. Five gray level distribution statistics, six gray level co-occurrence matrix (GLCM) features, and eleven gray level run-length (GLRL) features were computed for the segmented lung depicted in each CT image. Then we used wavelet decomposition to obtain the low- and high-frequency components of the input image, and again extracted from the lung region six GLCM features and eleven GLRL features. Therefore our feature vector length is 56. The CT examinations were classified using the support vector machine (SVM), k-nearest neighbors (KNN), and the traditional threshold (density mask) approach. The SVM classifier had the highest classification performance of all the methods with an overall sensitivity of 54.4% and a 69.6% sensitivity to discriminate "no" and "trace" visually assessed emphysema. We believe this work may lead to an automated, objective method to categorically classify emphysema severity on CT exam.
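
    The stated feature vector length of 56 can be reproduced with a short tally. This sketch assumes that the wavelet step yields exactly two components (one low-frequency and one high-frequency) and that each contributes its own six GLCM and eleven GLRL features. The variable names are illustrative.

```python
# First-level feature groups named in the abstract.
FIRST_LEVEL = {"gray_level_stats": 5, "glcm": 6, "glrl": 11}

# Assumption: one low- and one high-frequency wavelet component,
# each contributing its own GLCM and GLRL features.
WAVELET_COMPONENTS = 2
PER_COMPONENT = {"glcm": 6, "glrl": 11}

first_level_total = sum(FIRST_LEVEL.values())                          # 22
second_level_total = WAVELET_COMPONENTS * sum(PER_COMPONENT.values())  # 34
feature_vector_length = first_level_total + second_level_total
print(feature_vector_length)  # 56
```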

  3. REGIONAL-SCALE WIND FIELD CLASSIFICATION EMPLOYING CLUSTER ANALYSIS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Glascoe, L G; Glaser, R E; Chin, H S

    2004-06-17

    The classification of time-varying multivariate regional-scale wind fields at a specific location can assist event planning as well as consequence and risk analysis. Further, wind field classification involves data transformation and inference techniques that effectively characterize stochastic wind field variation. Such a classification scheme is potentially useful for addressing overall atmospheric transport uncertainty and meteorological parameter sensitivity issues. Different methods to classify wind fields over a location include the principal component analysis of wind data (e.g., Hardy and Walton, 1978) and the use of cluster analysis for wind data (e.g., Green et al., 1992; Kaufmann and Weber, 1996). The goal of this study is to use a clustering method to classify the winds of a gridded data set, i.e., from meteorological simulations generated by a forecast model.

  4. Fossil Signatures Using Elemental Abundance Distributions and Bayesian Probabilistic Classification

    NASA Technical Reports Server (NTRS)

    Hoover, Richard B.; Storrie-Lombardi, Michael C.

    2004-01-01

    Elemental abundances (C6, N7, O8, Na11, Mg12, Al13, P15, S16, Cl17, K19, Ca20, Ti22, Mn25, Fe26, and Ni28) were obtained for a set of terrestrial fossils and the rock matrix surrounding them. Principal Component Analysis extracted five factors accounting for 92.5% of the data variance, i.e. information content, of the elemental abundance data. Hierarchical Cluster Analysis provided unsupervised sample classification distinguishing fossil from matrix samples on the basis of either raw abundances or PCA input that agreed strongly with visual classification. A stochastic, non-linear Artificial Neural Network produced a Bayesian probability of correct sample classification. The results provide a quantitative probabilistic methodology for discriminating terrestrial fossils from the surrounding rock matrix using chemical information. To demonstrate the applicability of these techniques to the assessment of meteoritic samples or in situ extraterrestrial exploration, we present preliminary data on samples of the Orgueil meteorite. In both systems an elemental signature produces target classification decisions remarkably consistent with morphological classification by a human expert using only structural (visual) information. We discuss the possibility of implementing a complexity analysis metric capable of automating certain image analysis and pattern recognition abilities of the human eye using low magnification optical microscopy images and discuss the extension of this technique across multiple scales.

  5. Fourier transform infrared spectroscopy combined with chemometrics for discrimination of Curcuma longa, Curcuma xanthorrhiza and Zingiber cassumunar

    NASA Astrophysics Data System (ADS)

    Rohaeti, Eti; Rafi, Mohamad; Syafitri, Utami Dyah; Heryanto, Rudi

    2015-02-01

    Turmeric (Curcuma longa), java turmeric (Curcuma xanthorrhiza) and cassumunar ginger (Zingiber cassumunar) are widely used in traditional Indonesian medicines (jamu). They have similar color for their rhizome and possess some similar uses, so it is possible to substitute one for the other. The identification and discrimination of these closely-related plants is a crucial task to ensure the quality of the raw materials. Therefore, an analytical method which is rapid, simple and accurate for discriminating these species using Fourier transform infrared spectroscopy (FTIR) combined with some chemometrics methods was developed. FTIR spectra were acquired in the mid-IR region (4000-400 cm-1). Standard normal variate, first and second order derivative spectra were compared for the spectral data. Principal component analysis (PCA) and canonical variate analysis (CVA) were used for the classification of the three species. Samples could be discriminated by visual analysis of the FTIR spectra by using their marker bands. Discrimination of the three species was also possible through the combination of the pre-processed FTIR spectra with PCA and CVA, in which CVA gave clearer discrimination. Subsequently, the developed method could be used for the identification and discrimination of the three closely-related plant species.

  6. Inter-class sparsity based discriminative least square regression.

    PubMed

    Wen, Jie; Xu, Yong; Li, Zuoyong; Ma, Zhongli; Xu, Yuanrong

    2018-06-01

    Least square regression is a very popular supervised classification method. However, two main issues greatly limit its performance. The first one is that it only focuses on fitting the input features to the corresponding output labels while ignoring the correlations among samples. The second one is that the used label matrix, i.e., zero-one label matrix is inappropriate for classification. To solve these problems and improve the performance, this paper presents a novel method, i.e., inter-class sparsity based discriminative least square regression (ICS_DLSR), for multi-class classification. Different from other methods, the proposed method pursues that the transformed samples have a common sparsity structure in each class. For this goal, an inter-class sparsity constraint is introduced to the least square regression model such that the margins of samples from the same class can be greatly reduced while those of samples from different classes can be enlarged. In addition, an error term with row-sparsity constraint is introduced to relax the strict zero-one label matrix, which allows the method to be more flexible in learning the discriminative transformation matrix. These factors encourage the method to learn a more compact and discriminative transformation for regression and thus has the potential to perform better than other methods. Extensive experimental results show that the proposed method achieves the best performance in comparison with other methods for multi-class classification. Copyright © 2018 Elsevier Ltd. All rights reserved.

  7. Comparison between two race/skin color classifications in relation to health-related outcomes in Brazil

    PubMed Central

    2011-01-01

    Background This paper aims to compare the classification of race/skin color based on the discrete categories used by the Demographic Census of the Brazilian Institute of Geography and Statistics (IBGE) and a skin color scale with values ranging from 1 (lighter skin) to 10 (darker skin), examining whether choosing one alternative or the other can influence measures of self-evaluation of health status, health care service utilization and discrimination in the health services. Methods This is a cross-sectional study based on data from the World Health Survey carried out in Brazil in 2003 with a sample of 5000 individuals older than 18 years. Similarities between the two classifications were evaluated by means of correspondence analysis. The effect of the two classifications on health outcomes was tested through logistic regression models for each sex, using age, educational level and ownership of consumer goods as covariables. Results Both measures of race/skin color represent the same race/skin color construct. The results show a tendency among Brazilians to classify their skin color in shades closer to the center of the color gradient. Women tend to classify their race/skin color as a little lighter than men in the skin color scale, an effect not observed when IBGE categories are used. With regard to health and health care utilization, race/skin color was not relevant in explaining any of them, regardless of the race/skin color classification. Lack of money and social class were the most prevalent reasons for discrimination in healthcare reported in the survey, suggesting that in Brazil the discussion about discrimination in the health care must not be restricted to racial discrimination and should also consider class-based discrimination. The study shows that the differences of the two classifications of race/skin color are small. However, the interval scale measure appeared to increase the freedom of choice of the respondent. PMID:21867522

  8. Characteristic fingerprinting based on macamides for discrimination of maca (Lepidium meyenii) by LC/MS/MS and multivariate statistical analysis.

    PubMed

    Pan, Yu; Zhang, Ji; Li, Hong; Wang, Yuan-Zhong; Li, Wan-Yi

    2016-10-01

    Macamides with a benzylalkylamide nucleus are characteristic and major bioactive compounds in the functional food maca (Lepidium meyenii Walp). The aim of this study was to explore variations in macamide content among maca from China and Peru. Twenty-seven batches of maca hypocotyls with different phenotypes, sampled from different geographical origins, were extracted and profiled by liquid chromatography with ultraviolet detection/tandem mass spectrometry (LC-UV/MS/MS). Twelve macamides were identified by MS operated in multiple scanning modes. Similarity analysis showed that maca samples differed significantly in their macamide fingerprinting. Partial least squares discriminant analysis (PLS-DA) was used to differentiate samples according to their geographical origin and to identify the most relevant variables in the classification model. The prediction accuracy for raw maca was 91% and five macamides were selected and considered as chemical markers for sample classification. When combined with a PLS-DA model, characteristic fingerprinting based on macamides could be recommended for labelling for the authentication of maca from different geographical origins. The results provided potential evidence for the relationships between environmental or other factors and distribution of macamides. © 2016 Society of Chemical Industry.

  9. [Discrimination of Red Tide algae by fluorescence spectra and principle component analysis].

    PubMed

    Su, Rong-guo; Hu, Xu-peng; Zhang, Chuan-song; Wang, Xiu-lin

    2007-07-01

    A fluorescence discrimination technique for 11 species of Red Tide algae at the genus level was constructed using principal component analysis and non-negative least squares. Rayleigh and Raman scattering peaks of the 3D fluorescence spectra were eliminated by the Delaunay triangulation method. According to the results of Fisher linear discrimination, the first and second principal component scores of the 3D fluorescence spectra were chosen as discriminant features and the feature base was established. The 11 algae species were tested, and more than 85% of samples were accurately determined, especially for Prorocentrum donghaiense, Skeletonema costatum, and Gymnodinium sp., which have frequently caused red tides in the East China Sea. More than 95% of samples were correctly discriminated. The results showed that the genus discriminant features of 3D fluorescence spectra of Red Tide algae given by principal component analysis could work well.

  10. Hybrid analysis of multiaxis electromagnetic data for discrimination of munitions and explosives of concern

    USGS Publications Warehouse

    Friedel, M.J.; Asch, T.H.; Oden, C.

    2012-01-01

    The remediation of land containing munitions and explosives of concern, otherwise known as unexploded ordnance, is an ongoing problem facing the U.S. Department of Defense and similar agencies worldwide that have used or are transferring training ranges or munitions disposal areas to civilian control. The expense associated with cleanup of land previously used for military training and war provides impetus for research towards enhanced discrimination of buried unexploded ordnance. Towards reducing that expense, a multiaxis electromagnetic induction data collection and software system, called ALLTEM, was designed and tested with support from the U.S. Department of Defense Environmental Security Technology Certification Program. ALLTEM is an on-time time-domain system that uses a continuous triangle-wave excitation to measure the target-step response rather than traditional impulse response. The system cycles through three orthogonal transmitting loops and records a total of 19 different transmitting and receiving loop combinations with a nominal spatial data sampling interval of 20 cm. Recorded data are pre-processed and then used in a hybrid discrimination scheme involving both data-driven and numerical classification techniques. The data-driven classification scheme is accomplished in three steps. First, field observations are used to train a type of unsupervised artificial neural network, a self-organizing map (SOM). Second, the SOM is used to simultaneously estimate target parameters (depth, azimuth, inclination, item type and weight) by iterative minimization of the topographic error vectors. Third, the target classification is accomplished by evaluating histograms of the estimated parameters. The numerical classification scheme is also accomplished in three steps. First, the Biot–Savart law is used to model the primary magnetic fields from the transmitter coils and the secondary magnetic fields generated by currents induced in the target materials in the

  11. Hybrid analysis of multiaxis electromagnetic data for discrimination of munitions and explosives of concern

    NASA Astrophysics Data System (ADS)

    Friedel, M. J.; Asch, T. H.; Oden, C.

    2012-08-01

    The remediation of land containing munitions and explosives of concern, otherwise known as unexploded ordnance, is an ongoing problem facing the U.S. Department of Defense and similar agencies worldwide that have used or are transferring training ranges or munitions disposal areas to civilian control. The expense associated with cleanup of land previously used for military training and war provides impetus for research towards enhanced discrimination of buried unexploded ordnance. Towards reducing that expense, a multiaxis electromagnetic induction data collection and software system, called ALLTEM, was designed and tested with support from the U.S. Department of Defense Environmental Security Technology Certification Program. ALLTEM is an on-time time-domain system that uses a continuous triangle-wave excitation to measure the target-step response rather than traditional impulse response. The system cycles through three orthogonal transmitting loops and records a total of 19 different transmitting and receiving loop combinations with a nominal spatial data sampling interval of 20 cm. Recorded data are pre-processed and then used in a hybrid discrimination scheme involving both data-driven and numerical classification techniques. The data-driven classification scheme is accomplished in three steps. First, field observations are used to train a type of unsupervised artificial neural network, a self-organizing map (SOM). Second, the SOM is used to simultaneously estimate target parameters (depth, azimuth, inclination, item type and weight) by iterative minimization of the topographic error vectors. Third, the target classification is accomplished by evaluating histograms of the estimated parameters. The numerical classification scheme is also accomplished in three steps. First, the Biot-Savart law is used to model the primary magnetic fields from the transmitter coils and the secondary magnetic fields generated by currents induced in the target materials in the

  12. Classification Techniques for Multivariate Data Analysis.

    DTIC Science & Technology

    1980-03-28

    analysis among biologists, botanists, and ecologists, while some social scientists may refer "typology". Other frequently encountered terms are pattern...the determinantal equation: $|B - \lambda W| = 0$ (42) The solutions $\lambda_i$ are the eigenvalues of the matrix $W^{-1}B$, as in discriminant analysis. There are t non...Statistical Package for Social Sciences (SPSS) (14) subprogram FACTOR was used for the principal components analysis. It is designed both for the factor
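
    The determinantal equation quoted in the excerpt above, whose solutions are the eigenvalues of W^-1 B, can be illustrated with a minimal sketch for the 2x2 case. The excerpt does not define B and W; reading them as between-group and within-group scatter matrices is an assumption here, and the function name and the numeric matrices are hypothetical, chosen only for illustration.

```python
import math

def discriminant_eigenvalues_2x2(B, W):
    """Roots of det(B - lam * W) = 0 for 2x2 matrices given as nested lists."""
    (b11, b12), (b21, b22) = B
    (w11, w12), (w21, w22) = W
    # Expanding the determinant gives the quadratic a*lam^2 - c*lam + d = 0.
    a = w11 * w22 - w12 * w21                          # det(W)
    c = b11 * w22 + b22 * w11 - b12 * w21 - b21 * w12  # mixed term
    d = b11 * b22 - b12 * b21                          # det(B)
    if a == 0:
        raise ValueError("W must be non-singular")
    disc = c * c - 4 * a * d
    if disc < 0:
        raise ValueError("complex roots: inputs are not valid scatter matrices")
    root = math.sqrt(disc)
    # Largest eigenvalue first.
    return sorted([(c + root) / (2 * a), (c - root) / (2 * a)], reverse=True)

# Hypothetical scatter matrices (assumed, not taken from the report).
B = [[4.0, 2.0], [2.0, 1.0]]  # assumed between-group scatter, rank 1
W = [[2.0, 0.0], [0.0, 1.0]]  # assumed within-group scatter
eigenvalues = discriminant_eigenvalues_2x2(B, W)
print(eigenvalues)
```

    With a rank-1 B, as in a two-group problem, only one eigenvalue is non-zero, so a single discriminant direction carries the separation. Larger problems would normally use a linear algebra library rather than the closed-form quadratic.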

  13. Superiority of artificial neural networks for a genetic classification procedure.

    PubMed

    Sant'Anna, I C; Tomaz, R S; Silva, G N; Nascimento, M; Bhering, L L; Cruz, C D

    2015-08-19

    The correct classification of individuals is extremely important for the preservation of genetic variability and for maximization of yield in breeding programs using phenotypic traits and genetic markers. The Fisher and Anderson discriminant functions are commonly used multivariate statistical techniques for these situations, which allow for the allocation of an initially unknown individual to predefined groups. However, for higher levels of similarity, such as those found in backcrossed populations, these methods have proven to be inefficient. Recently, much research has been devoted to developing a new paradigm of computing known as artificial neural networks (ANNs), which can be used to solve many statistical problems, including classification problems. The aim of this study was to evaluate the feasibility of ANNs as an evaluation technique of genetic diversity by comparing their performance with that of traditional methods. The discriminant functions were equally ineffective in discriminating the populations, with error rates of 23-82%, thereby preventing the correct discrimination of individuals between populations. The ANN was effective in classifying populations with low and high differentiation, such as those derived from a genetic design established from backcrosses, even in cases of low differentiation of the data sets. The ANN appears to be a promising technique to solve classification problems, since the number of individuals classified incorrectly by the ANN was always lower than that of the discriminant functions. We envisage the potential relevant application of this improved procedure in the genomic classification of markers to distinguish between breeds and accessions.

  14. Real-Time Speech/Music Classification With a Hierarchical Oblique Decision Tree

    DTIC Science & Technology

    2008-04-01

    REAL-TIME SPEECH/MUSIC CLASSIFICATION WITH A HIERARCHICAL OBLIQUE DECISION TREE Jun Wang, Qiong Wu, Haojiang Deng, Qin Yan Institute of Acoustics...time speech/music classification with a hierarchical oblique decision tree. A set of discrimination features in frequency domain are selected...handle signals without discrimination and cannot work properly in the existence of multimedia signals. This paper proposes a real-time speech/music

  15. Discrimination of soft tissues using laser-induced breakdown spectroscopy in combination with k nearest neighbors (kNN) and support vector machine (SVM) classifiers

    NASA Astrophysics Data System (ADS)

    Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying

    2018-06-01

    In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with SVM classifier using Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique assisted with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues suffering minor clinical changes, thus may advance the diagnosis of early lesions and abnormalities.
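
    The evaluation measures named in this abstract (accuracy, sensitivity and specificity under 10-fold cross validation) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the one-vs-rest treatment of a multi-class problem, the unshuffled interleaved folds, the function names and the toy labels are all assumptions.

```python
def kfold_indices(n_samples, k=10):
    """Split range(n_samples) into k disjoint, nearly equal test folds.

    Folds are interleaved and unshuffled (an assumption for simplicity).
    """
    return [list(range(i, n_samples, k)) for i in range(k)]

def binary_metrics(y_true, y_pred, positive):
    """One-vs-rest accuracy, sensitivity and specificity for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }

# Hypothetical labels pooled over all test folds (not data from the paper).
y_true = ["fat", "fat", "fat", "skin", "skin", "muscle", "muscle", "muscle"]
y_pred = ["fat", "fat", "skin", "skin", "skin", "muscle", "muscle", "fat"]
metrics = binary_metrics(y_true, y_pred, positive="fat")
folds = kfold_indices(25, k=10)
print(metrics)
```

    In a real run, each fold in turn would be held out for testing while a classifier is trained on the remaining folds, and the held-out predictions would then be passed to binary_metrics. The sketch assumes each class appears at least once among the true labels; otherwise the sensitivity denominator is zero.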

  16. A Hybrid Sensing Approach for Pure and Adulterated Honey Classification

    PubMed Central

    Subari, Norazian; Saleh, Junita Mohamad; Shakaff, Ali Yeon Md; Zakaria, Ammar

    2012-01-01

    This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data. PMID:23202033

  17. A hybrid sensing approach for pure and adulterated honey classification.

    PubMed

    Subari, Norazian; Mohamad Saleh, Junita; Md Shakaff, Ali Yeon; Zakaria, Ammar

    2012-10-17

    This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data.

  18. Texture-Based Analysis of 100 MR Examinations of Head and Neck Tumors - Is It Possible to Discriminate Between Benign and Malignant Masses in a Multicenter Trial?

    PubMed

    Fruehwald-Pallamar, J; Hesselink, J R; Mafee, M F; Holzer-Fruehwald, L; Czerny, C; Mayerhoefer, M E

    2016-02-01

    To evaluate whether texture-based analysis of standard MRI sequences can help in the discrimination between benign and malignant head and neck tumors. The MR images of 100 patients with a histologically clarified head or neck mass, from two different institutions, were analyzed. Texture-based analysis was performed using texture analysis software, with region of interest measurements for 2D and 3D evaluation independently for all axial sequences. COC, RUN, GRA, ARM, and WAV features were calculated for all ROIs. Ten texture feature subsets were used for a linear discriminant analysis, in combination with k-nearest-neighbor classification. Benign and malignant tumors were compared with regard to texture-based values. There were differences in the images from different field-strength scanners, as well as from different vendors. For the differentiation of benign and malignant tumors, we found differences on STIR and T2-weighted images for 2D, and on contrast-enhanced T1-TSE with fat saturation for 3D evaluation. In a separate analysis of the subgroups 1.5 and 3 Tesla, more discriminating features were found. Texture-based analysis is a useful tool in the discrimination of benign and malignant tumors when performed on one scanner with the same protocol. We cannot recommend this technique for use in multicenter studies with clinical data. 2D/3D texture-based analysis can be performed in head and neck tumors. Texture-based analysis can differentiate between benign and malignant masses. Analyzed MR images should originate from one scanner with an identical protocol. © Georg Thieme Verlag KG Stuttgart · New York.

  19. Factors that Affect Poverty Areas in North Sumatera Using Discriminant Analysis

    NASA Astrophysics Data System (ADS)

    Nasution, D. H.; Bangun, P.; Sitepu, H. R.

    2018-04-01

    In Indonesia, and especially in North Sumatera, poverty is one of the fundamental problems on which both central and local government focus. Although the poverty rate has decreased, many people remain poor. Poverty spans several aspects, such as education, health, and demographics, as well as structural and cultural factors. This research discusses several factors that affect poverty in Indonesia: population density, unemployment rate, GDP per capita at constant prices (ADHK), GDP per capita at current prices (ADHB), economic growth, and life expectancy. Discriminant analysis was used to determine the factors that most influence and differentiate the level of poverty among the regencies/cities of North Sumatera. Discriminant analysis is a multivariate analysis technique used to classify data into groups based on a dependent variable and independent variables. Using discriminant analysis, it is evident that the factor affecting poverty is the unemployment rate.

  20. Tire traces - discrimination and classification of pyrolysis-GC/MS profiles.

    PubMed

    Gueissaz, Line; Massonnet, Geneviève

    2013-07-10

    Tire traces can be observed at several crime scenes as vehicles are often used by criminals. The tread abrasion on the road, while braking or skidding, leads to the production of small rubber particles which can be collected for comparison purposes. This research focused on the statistical comparison of Py-GC/MS profiles of tire traces and tire treads. The optimisation of the analytical method was carried out using experimental designs. The aim was to determine the best pyrolysis parameters regarding the repeatability of the results. Thus, the pyrolysis factor effect could also be calculated. The pyrolysis temperature was found to be five times more important than the pyrolysis time. Finally, a pyrolysis at 650 °C for 15 s was selected. Ten tires of different manufacturers and models were used for this study. Several samples were collected on each tire, and several replicates were carried out to study the variability within each tire (intravariability). More than eighty compounds were integrated for each analysis and the variability study showed that more than 75% presented a relative standard deviation (RSD) below 5% for the ten tires, thus supporting a low intravariability. The variability between the ten tires (intervariability) presented higher values and the ten most variant compounds had an RSD value above 13%, supporting their high potential of discrimination between the tires tested. Principal Component Analysis (PCA) was able to fully discriminate the ten tires with the help of the first three principal components. The ten tires were finally used to perform braking tests on a racetrack with a vehicle equipped with an anti-lock braking system. The resulting tire traces were adequately collected using sheets of white gelatine. As for tires, the intravariability for the traces was found to be lower than the intervariability. Clustering methods were carried out and the Ward's method based on the squared Euclidean distance was able to correctly group all of the tire traces
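
    The relative standard deviation (RSD) used in this abstract to compare intravariability and intervariability can be computed as in the sketch below. It assumes the common definition, 100 times the sample standard deviation divided by the mean; the abstract does not state which estimator was used, and the peak-area values are hypothetical.

```python
import statistics

def relative_std_dev(values):
    """Relative standard deviation in percent: 100 * sample SD / mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical peak areas for one compound (not data from the paper).
replicates_same_tire = [100.0, 102.0, 98.0]     # repeated analyses of one tire
means_across_tires = [100.0, 130.0, 70.0]       # per-tire means for three tires

intra_rsd = relative_std_dev(replicates_same_tire)
inter_rsd = relative_std_dev(means_across_tires)
print(intra_rsd, inter_rsd)
```

    A compound is a useful discriminator when its RSD between tires is clearly larger than its RSD within a tire, which is the pattern the abstract reports for the ten most variant compounds.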

  1. A Comparative Study of Land Cover Classification by Using Multispectral and Texture Data

    PubMed Central

    Qadri, Salman; Khan, Dost Muhammad; Ahmad, Farooq; Qadri, Syed Furqan; Babar, Masroor Ellahi; Shahid, Muhammad; Ul-Rehman, Muzammil; Razzaq, Abdul; Shah Muhammad, Syed; Fahad, Muhammad; Ahmad, Sarfraz; Pervez, Muhammad Tariq; Naveed, Nasir; Aslam, Naeem; Jamil, Mutiullah; Rehmani, Ejaz Ahmad; Ahmad, Nazir; Akhtar Khan, Naeem

    2016-01-01

    The main objective of this study is to assess the importance of a machine vision approach for the classification of five types of land cover data: bare land, desert rangeland, green pasture, fertile cultivated land, and Sutlej river land. A novel spectra-statistical framework is designed to classify the subjective land cover data types accurately. Multispectral data of these land covers were acquired with a handheld multispectral radiometer in the form of five spectral bands (blue, green, red, near infrared, and shortwave infrared), while texture data were acquired with a digital camera by transforming the acquired images into 229 texture features for each image. The 30 most discriminant features of each image were obtained by integrating three statistical feature selection techniques: Fisher, Probability of Error plus Average Correlation, and Mutual Information (F + PA + MI). Clustering of the selected texture data was verified by nonlinear discriminant analysis, while linear discriminant analysis was applied to the multispectral data. For classification, the texture and multispectral data were fed to an artificial neural network (ANN: n-class). By implementing a cross validation method (80-20), we obtained accuracies of 91.332% for texture data and 96.40% for multispectral data, respectively. PMID:27376088

  2. Cloud cover analysis with Arctic Advanced Very High Resolution Radiometer data. II - Classification with spectral and textural measures

    NASA Technical Reports Server (NTRS)

    Key, J.

    1990-01-01

    The spectral and textural characteristics of polar clouds and surfaces for a 7-day summer series of AVHRR data in two Arctic locations are examined, and the results used in the development of a cloud classification procedure for polar satellite data. Since spatial coherence and texture sensitivity tests indicate that a joint spectral-textural analysis based on the same cell size is inappropriate, cloud detection with AVHRR data and surface identification with passive microwave data are first done on the pixel level as described by Key and Barry (1989). Next, cloud patterns within 250-sq-km regions are described, then the spectral and local textural characteristics of cloud patterns in the image are determined and each cloud pixel is classified by statistical methods. Results indicate that both spectral and textural features can be utilized in the classification of cloudy pixels, although spectral features are most useful for the discrimination between cloud classes.

  3. Challenges in discriminating profanity from hate speech

    NASA Astrophysics Data System (ADS)

    Malmasi, Shervin; Zampieri, Marcos

    2018-03-01

    In this study, we approach the problem of distinguishing general profanity from hate speech in social media, something which has not been widely considered. Using a new dataset annotated specifically for this task, we employ supervised classification along with a set of features that includes n-grams, skip-grams and clustering-based word representations. We apply approaches based on single classifiers as well as more advanced ensemble classifiers and stacked generalisation, achieving the best result of ? accuracy for this 3-class classification task. Analysis of the results reveals that discriminating hate speech and profanity is not a simple task, which may require features that capture a deeper understanding of the text not always possible with surface n-grams. The variability of gold labels in the annotated data, due to differences in the subjective adjudications of the annotators, is also an issue. Other directions for future work are discussed.

  4. Forest tree species discrimination in western Himalaya using EO-1 Hyperion

    NASA Astrophysics Data System (ADS)

    George, Rajee; Padalia, Hitendra; Kushwaha, S. P. S.

    2014-05-01

    The information acquired in the narrow bands of hyperspectral remote sensing data has potential to capture plant species spectral variability, thereby improving forest tree species mapping. This study assessed the utility of spaceborne EO-1 Hyperion data in discrimination and classification of broadleaved evergreen and conifer forest tree species in western Himalaya. The pre-processing of 242 bands of Hyperion data resulted in 160 noise-free and vertical stripe corrected reflectance bands. Of these, 29 bands were selected through step-wise exclusion of bands (Wilks' lambda). Spectral Angle Mapper (SAM) and Support Vector Machine (SVM) algorithms were applied to the selected bands to assess their effectiveness in classification. SVM was also applied to broadband data (Landsat TM) to compare the variation in classification accuracy. All six commonly occurring gregarious tree species in western Himalaya, viz., white oak, brown oak, chir pine, blue pine, cedar and fir, could be effectively discriminated. SVM produced a better species classification (overall accuracy 82.27%, kappa statistic 0.79) than SAM (overall accuracy 74.68%, kappa statistic 0.70). Classification accuracy achieved with Hyperion bands was significantly higher than with Landsat TM bands (overall accuracy 69.62%, kappa statistic 0.65). The study demonstrated the potential utility of narrow spectral bands of Hyperion data in discriminating tree species in a hilly terrain.
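
    Overall accuracy and the kappa statistic, the two figures reported for each classifier in this abstract, are both derived from a confusion matrix. The sketch below shows the standard Cohen's kappa calculation; the function name and the 2x2 matrix are hypothetical and are not the study's data.

```python
def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] counts samples of reference class i assigned to class j.
    """
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical two-class confusion matrix (not from the study).
confusion = [[40, 10],
             [5, 45]]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
print(accuracy, kappa)
```

    Kappa is lower than overall accuracy because it discounts the agreement that would be expected by chance, which is why the abstract reports both values for SVM, SAM and the Landsat TM comparison.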

  5. Why Does Rebalancing Class-Unbalanced Data Improve AUC for Linear Discriminant Analysis?

    PubMed

    Xue, Jing-Hao; Hall, Peter

    2015-05-01

    Many established classifiers fail to identify the minority class when it is much smaller than the majority class. To tackle this problem, researchers often first rebalance the class sizes in the training dataset, through oversampling the minority class or undersampling the majority class, and then use the rebalanced data to train the classifiers. This leads to interesting empirical patterns. In particular, using the rebalanced training data can often improve the area under the receiver operating characteristic curve (AUC) for the original, unbalanced test data. The AUC is a widely-used quantitative measure of classification performance, but the property that it increases with rebalancing has, as yet, no theoretical explanation. In this note, using Gaussian-based linear discriminant analysis (LDA) as the classifier, we demonstrate that, at least for LDA, there is an intrinsic, positive relationship between the rebalancing of class sizes and the improvement of AUC. We show that the largest improvement of AUC is achieved, asymptotically, when the two classes are fully rebalanced to be of equal sizes.
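
    The two ingredients discussed in this abstract, random oversampling of the minority class and the AUC, can be sketched as follows. This is only an illustration of the quantities involved and does not reproduce the paper's theoretical result; the scores and samples are hypothetical.

```python
import random


def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking (Mann-Whitney formulation).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def oversample(minority, target_size, rng):
    """Randomly duplicate minority examples until target_size is reached."""
    extra = rng.choices(minority, k=target_size - len(minority))
    return list(minority) + extra


# Hypothetical classifier scores for the two classes.
pos_scores = [0.9, 0.8, 0.4]
neg_scores = [0.7, 0.3, 0.2, 0.1]
auc_value = auc(pos_scores, neg_scores)

# Hypothetical unbalanced training set: 2 minority versus 6 majority samples.
minority = [[2.1, 0.3], [1.8, 0.5]]
majority = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4], [0.0, 0.3], [0.4, 0.2], [0.1, 0.0]]
balanced_minority = oversample(minority, len(majority), random.Random(0))
```

    Here 11 of the 12 positive-negative pairs are ranked correctly, so the AUC is 11/12. After oversampling, both classes contribute six training samples each.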

  6. The role of social determinants on men's and women's mobility in Italy. A comparison of discriminant analysis and artificial neural networks.

    PubMed

    de Lillo, A; Meraviglia, C

    1998-02-01

    The paper focuses on the role of the spouse's occupation as a resource for mobile individuals, from the perspective that social positions are held by families, rather than by individuals. Three groups are compared in terms of the role of the key variables and other relevant factors: men whose spouse does not have a paid job (group 1), and men and women whose spouse has a paid job (groups 2 and 3). The data set is provided by the national survey on social mobility in Italy, carried out in 1985; social achievements of members of the three groups are considered, including social origins and destinations, social position corresponding to respondent's first job, cultural background (educational achievement of respondent's father and mother), respondent's education and spouse's social position. The techniques used are discriminant analysis and back propagation Neural Networks. Both techniques traced a clear boundary between group 1 and groups 2 and 3, which were discriminated mainly on the basis of the spouse's occupation; Artificial Neural Networks reached better classification results and allowed a deeper insight into the nonlinear effects of the discriminating variables for the three groups.

  7. Repertoire and classification of non-song calls in Southeast Alaskan humpback whales (Megaptera novaeangliae).

    PubMed

    Fournet, Michelle E; Szabo, Andy; Mellinger, David K

    2015-01-01

    On low-latitude breeding grounds, humpback whales produce complex and highly stereotyped songs as well as a range of non-song sounds associated with breeding behaviors. While on their Southeast Alaskan foraging grounds, humpback whales produce a range of previously unclassified non-song vocalizations. This study investigates the vocal repertoire of Southeast Alaskan humpback whales from a sample of 299 non-song vocalizations collected over a 3-month period on foraging grounds in Frederick Sound, Southeast Alaska. Three classification systems were used, including aural spectrogram analysis, statistical cluster analysis, and discriminant function analysis, to describe and classify vocalizations. A hierarchical acoustic structure was identified; vocalizations were classified into 16 individual call types nested within four vocal classes. The combined classification method shows promise for identifying variability in call stereotypy between vocal groupings and is recommended for future classification of broad vocal repertoires.

  8. JDINAC: joint density-based non-parametric differential interaction network analysis and classification using high-dimensional sparse omics data.

    PubMed

    Ji, Jiadong; He, Di; Feng, Yang; He, Yong; Xue, Fuzhong; Xie, Lei

    2017-10-01

    A complex disease is usually driven by a number of genes interwoven into networks, rather than a single gene product. Network comparison or differential network analysis has become an important means of revealing the underlying mechanism of pathogenesis and identifying clinical biomarkers for disease classification. Most studies, however, are limited to network correlations that mainly capture the linear relationship among genes, or rely on the assumption of a parametric probability distribution of gene measurements. These assumptions are restrictive in real applications. We propose a new Joint density based non-parametric Differential Interaction Network Analysis and Classification (JDINAC) method to identify differential interaction patterns of network activation between two groups. At the same time, JDINAC uses the network biomarkers to build a classification model. The novelty of JDINAC lies in its potential to capture non-linear relations between molecular interactions using high-dimensional sparse data as well as to adjust for confounding factors, without assuming a parametric probability distribution of gene measurements. Simulation studies demonstrate that JDINAC provides more accurate differential network estimation and lower classification error than that achieved by other state-of-the-art methods. We apply JDINAC to a Breast Invasive Carcinoma dataset, which includes 114 patients who have both tumor and matched normal samples. The hub genes and differential interaction patterns identified were consistent with existing experimental studies. Furthermore, JDINAC discriminated between tumor and normal samples with high accuracy by virtue of the identified biomarkers. JDINAC provides a general framework for feature selection and classification using high-dimensional sparse omics data. R scripts available at https://github.com/jijiadong/JDINAC. lxie@iscb.org. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford

  9. Sequential Ideal-Observer Analysis of Visual Discriminations.

    ERIC Educational Resources Information Center

    Geisler, Wilson S.

    1989-01-01

    A new analysis, based on the concept of the ideal observer in signal detection theory, is described. It allows: tracing of the flow of discrimination information through the initial physiological stages of visual processing for arbitrary spatio-chromatic stimuli, and measurement of the information content of said visual stimuli. (TJH)

  10. Classification of adulterated honeys by multivariate analysis.

    PubMed

    Amiry, Saber; Esmaiili, Mohsen; Alizadeh, Mohammad

    2017-06-01

    In this research, honey samples were adulterated with date syrup (DS) and invert sugar syrup (IS) at three concentrations (7%, 15% and 30%). 102 adulterated samples were prepared in six batches with 17 replications for each batch. For each sample, 32 parameters including color indices, rheological, physical, and chemical parameters were determined. To classify the samples, based on type and concentrations of adulterant, a multivariate analysis was applied using principal component analysis (PCA) followed by a linear discriminant analysis (LDA). Then, 21 principal components (PCs) were selected in five sets. Approximately two-thirds were identified correctly using color indices (62.75%) or rheological properties (67.65%). Strong discrimination was obtained using physical properties (97.06%), and the best separations were achieved using two sets of chemical properties (set 1: lactone, diastase activity, sucrose - 100%) (set 2: free acidity, HMF, ash - 95%). Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Toward optimal feature and time segment selection by divergence method for EEG signals classification.

    PubMed

    Wang, Jie; Feng, Zuren; Lu, Na; Luo, Jing

    2018-06-01

    Feature selection plays an important role in the field of EEG signals based motor imagery pattern classification. It is a process that aims to select an optimal feature subset from the original set. Two significant advantages involved are: lowering the computational burden so as to speed up the learning procedure and removing redundant and irrelevant features so as to improve the classification performance. Therefore, feature selection is widely employed in the classification of EEG signals in practical brain-computer interface systems. In this paper, we present a novel statistical model to select the optimal feature subset based on the Kullback-Leibler divergence measure, and automatically select the optimal subject-specific time segment. The proposed method comprises four successive stages: a broad frequency band filtering and common spatial pattern enhancement as preprocessing, feature extraction by autoregressive model and log-variance, the Kullback-Leibler divergence based optimal feature and time segment selection, and linear discriminant analysis classification. More importantly, this paper provides a potential framework for combining other feature extraction models and classification algorithms with the proposed method for EEG signals classification. Experiments on single-trial EEG signals from two public competition datasets not only demonstrate that the proposed method is effective in selecting discriminative features and time segment, but also show that the proposed method yields relatively better classification results in comparison with other competitive methods. Copyright © 2018 Elsevier Ltd. All rights reserved.
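
    Divergence-based feature ranking of the kind described above can be sketched under a simplifying assumption that is not stated in the abstract: each feature is modelled as a univariate Gaussian within each class, so the Kullback-Leibler divergence has a closed form. The function names and the toy trials are hypothetical, and the paper's actual statistical model may differ.

```python
import math
import statistics


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL divergence KL(p || q) between two univariate Gaussians."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)


def rank_features(class_a, class_b):
    """Rank feature indices by symmetric KL divergence, most separable first."""
    scores = []
    for index, (col_a, col_b) in enumerate(zip(zip(*class_a), zip(*class_b))):
        mu_a, var_a = statistics.mean(col_a), statistics.pvariance(col_a)
        mu_b, var_b = statistics.mean(col_b), statistics.pvariance(col_b)
        divergence = (gaussian_kl(mu_a, var_a, mu_b, var_b)
                      + gaussian_kl(mu_b, var_b, mu_a, var_a))
        scores.append((divergence, index))
    return [index for _, index in sorted(scores, reverse=True)]


# Hypothetical trials: rows are trials, columns are features.
class_a = [[1.0, 5.0], [1.2, 4.0], [0.8, 6.0]]
class_b = [[3.0, 5.5], [3.2, 4.5], [2.8, 5.0]]
ranking = rank_features(class_a, class_b)
```

    Feature 0 has well-separated class means and ranks first; feature 1 has identical class means and differs only in spread, so it ranks second.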

  12. Log-ratio transformed major element based multidimensional classification for altered High-Mg igneous rocks

    NASA Astrophysics Data System (ADS)

    Verma, Surendra P.; Rivera-Gómez, M. Abdelaly; Díaz-González, Lorena; Quiroz-Ruiz, Alfredo

    2016-12-01

    A new multidimensional classification scheme consistent with the chemical classification of the International Union of Geological Sciences (IUGS) is proposed for the nomenclature of High-Mg altered rocks. Our procedure is based on an extensive database of major element (SiO2, TiO2, Al2O3, Fe2O3t, MnO, MgO, CaO, Na2O, K2O, and P2O5) compositions of a total of 33,868 (920 High-Mg and 32,948 "Common") relatively fresh igneous rock samples. The database consisting of these multinormally distributed samples in terms of their isometric log-ratios was used to propose a set of 11 discriminant functions and 6 diagrams to facilitate High-Mg rock classification. The multinormality required by linear discriminant and canonical analysis was ascertained by a new computer program DOMuDaF. One multidimensional function can distinguish the High-Mg and Common igneous rocks with high percent success values of about 86.4% and 98.9%, respectively. Similarly, from 10 discriminant functions the High-Mg rocks can also be classified as one of the four rock types (komatiite, meimechite, picrite, and boninite), with high success values of about 88%-100%. Satisfactory functioning of this new classification scheme was confirmed by seven independent tests. Five further case studies involving application to highly altered rocks illustrate the usefulness of our proposal. A computer program HMgClaMSys was written to efficiently apply the proposed classification scheme, which will be available for online processing of igneous rock compositional data. Monte Carlo simulation modeling and mass-balance computations confirmed the robustness of our classification with respect to analytical errors and postemplacement compositional changes.
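
    The isometric log-ratio (ilr) transformation mentioned above maps a D-part composition to D - 1 unconstrained coordinates. The sketch below uses one common orthonormal basis; the abstract does not say which basis the authors used, so this choice and the example composition are assumptions.

```python
import math


def ilr(composition):
    """Isometric log-ratio coordinates of a composition with positive parts.

    Coordinate i compares the geometric mean of the first i parts
    with part i + 1, scaled by sqrt(i / (i + 1)).
    """
    total = sum(composition)
    logs = [math.log(part / total) for part in composition]
    coords = []
    for i in range(1, len(logs)):
        mean_log = sum(logs[:i]) / i
        coords.append(math.sqrt(i / (i + 1)) * (mean_log - logs[i]))
    return coords


# Hypothetical three-part composition in arbitrary units.
coords = ilr([1.0, 2.0, 3.0])
rescaled = ilr([2.0, 4.0, 6.0])  # same composition, different total
```

    Because only ratios between parts enter the calculation, rescaling the composition leaves the coordinates unchanged. For the ten major-element oxides listed in the abstract, the transform would yield nine coordinates per sample.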

  13. Weighted Discriminative Dictionary Learning based on Low-rank Representation

    NASA Astrophysics Data System (ADS)

    Chang, Heyou; Zheng, Hao

    2017-01-01

    Low-rank representation has been widely used in the field of pattern classification, especially when both training and testing images are corrupted with large noise. The dictionary plays an important role in low-rank representation. With respect to the semantic dictionary, the optimal representation matrix should be block-diagonal. However, traditional low-rank representation based dictionary learning methods cannot effectively exploit the discriminative information between data and dictionary. To address this problem, this paper proposes weighted discriminative dictionary learning based on low-rank representation, where a weighted representation regularization term is constructed. The regularization associates label information of both training samples and dictionary atoms, and encourages the generation of a discriminative representation with class-wise block-diagonal structure, which can further improve the classification performance where both training and testing images are corrupted with large noise. Experimental results demonstrate advantages of the proposed method over the state-of-the-art methods.

  14. Determining the saliency of feature measurements obtained from images of sedimentary organic matter for use in its classification

    NASA Astrophysics Data System (ADS)

    Weller, Andrew F.; Harris, Anthony J.; Ware, J. Andrew; Jarvis, Paul S.

    2006-11-01

    The classification of sedimentary organic matter (OM) images can be improved by determining the saliency of image analysis (IA) features measured from them. Knowing the saliency of IA feature measurements means that only the most significant discriminating features need be used in the classification process. This is an important consideration for classification techniques such as artificial neural networks (ANNs), where too many features can lead to the 'curse of dimensionality'. The classification scheme adopted in this work is a hybrid of morphologically and texturally descriptive features from previous manual classification schemes. Some of these descriptive features are assigned to IA features, along with several others built into the IA software (Halcon) to ensure that a valid cross-section is available. After an image is captured and segmented, a total of 194 features are measured for each particle. To reduce this number to a more manageable magnitude, the SPSS AnswerTree Exhaustive CHAID (χ² automatic interaction detector) classification tree algorithm is used to establish each measurement's saliency as a classification discriminator. In the case of continuous data as used here, the F-test is used as opposed to the published algorithm. The F-test checks various statistical hypotheses about the variance of groups of IA feature measurements obtained from the particles to be classified. The aim is to reduce the number of features required to perform the classification without reducing its accuracy. In the best-case scenario, 194 inputs are reduced to 8, with a subsequent multi-layer back-propagation ANN recognition rate of 98.65%. This paper demonstrates the ability of the algorithm to reduce noise, help overcome the curse of dimensionality, and facilitate an understanding of the saliency of IA features as discriminators for sedimentary OM classification.
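
    One way to read the F-test step is as a one-way analysis of variance applied to each feature, with a larger F statistic indicating a more salient discriminator. The sketch below rests on that assumption; it is not the AnswerTree procedure itself, and the measurements are hypothetical.

```python
def f_statistic(groups):
    """One-way ANOVA F statistic for one feature measured in several classes."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variability of the class means around the grand mean.
    between = sum(len(g) * (m - grand_mean) ** 2
                  for g, m in zip(groups, means)) / (k - 1)
    # Variability of the measurements around their own class mean.
    within = sum(sum((x - m) ** 2 for x in g)
                 for g, m in zip(groups, means)) / (n - k)
    return between / within


# Hypothetical measurements of two features for two particle classes.
salient_feature = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
weak_feature = [[1.0, 2.0, 3.0], [2.0, 3.0, 1.0]]
f_salient = f_statistic(salient_feature)
f_weak = f_statistic(weak_feature)
```

    The first feature separates the two classes and gives F = 13.5; the second has identical class means and gives F = 0, so it would be a candidate for removal.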

  15. Real-time image annotation by manifold-based biased Fisher discriminant analysis

    NASA Astrophysics Data System (ADS)

    Ji, Rongrong; Yao, Hongxun; Wang, Jicheng; Sun, Xiaoshuai; Liu, Xianming

    2008-01-01

    Automatic Linguistic Annotation is a promising solution to bridge the semantic gap in content-based image retrieval. However, two crucial issues are not well addressed in state-of-the-art annotation algorithms: 1. The Small Sample Size (3S) problem in keyword classifier/model learning; 2. Most annotation algorithms cannot extend to real-time online usage due to their low computational efficiencies. This paper presents a novel Manifold-based Biased Fisher Discriminant Analysis (MBFDA) algorithm to address these two issues by transductive semantic learning and keyword filtering. To address the 3S problem, Co-Training based Manifold learning is adopted for keyword model construction. To achieve real-time annotation, a Biased Fisher Discriminant Analysis (BFDA) based semantic feature reduction algorithm is presented for keyword confidence discrimination and semantic feature reduction. Different from all existing annotation methods, MBFDA views image annotation from a novel Eigen semantic feature (which corresponds to keywords) selection aspect. As demonstrated in experiments, our manifold-based biased Fisher discriminant analysis annotation algorithm outperforms classical and state-of-the-art annotation methods (1. K-NN Expansion; 2. One-to-All SVM; 3. PWC-SVM) in both computational time and annotation accuracy by a large margin.

  16. EEG-based classification of imaginary left and right foot movements using beta rebound.

    PubMed

    Hashimoto, Yasunari; Ushiba, Junichi

    2013-11-01

    The purpose of this study was to investigate cortical lateralization of event-related (de)synchronization during left and right foot motor imagery tasks and to determine classification accuracy of the two imaginary movements in a brain-computer interface (BCI) paradigm. We recorded 31-channel scalp electroencephalograms (EEGs) from nine healthy subjects during brisk imagery tasks of left and right foot movements. EEG was analyzed with time-frequency maps and topographies, and the accuracy rate of classification between left and right foot movements was calculated. Beta rebound at the end of imagination (increase of EEG beta rhythm amplitude) was identified from the two EEGs derived from the right-shift and left-shift bipolar pairs at the vertex. This process enabled discrimination between right or left foot imagery at a high accuracy rate (maximum 81.6% in single trial analysis). These data suggest that foot motor imagery has potential to elicit left-right differences in EEG, while BCI using the unilateral foot imagery can achieve high classification accuracy, similar to ordinary BCI, based on hand motor imagery. By combining conventional discrimination techniques, the left-right discrimination of unilateral foot motor imagery provides a novel BCI system that could control a foot neuroprosthesis or a robotic foot. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

  17. [Comparison of Discriminant Analysis and Decision Trees for the Detection of Subclinical Keratoconus].

    PubMed

    Kleinhans, Sonja; Herrmann, Eva; Kohnen, Thomas; Bühren, Jens

    2017-08-15

    Background: Iatrogenic keratectasia is one of the most dreaded complications of refractive surgery. In most cases, keratectasia develops after refractive surgery of eyes suffering from subclinical stages of keratoconus with few or no signs. Unfortunately, there has been no reliable procedure for the early detection of keratoconus. In this study, we used binary decision trees (recursive partitioning) to assess their suitability for discrimination between normal eyes and eyes with subclinical keratoconus. Patients and Methods: The method of decision tree analysis was compared with discriminant analysis which has shown good results in previous studies. Input data were 32 eyes of 32 patients with newly diagnosed keratoconus in the contralateral eye and preoperative data of 10 eyes of 5 patients with keratectasia after laser in-situ keratomileusis (LASIK). The control group was made up of 245 normal eyes after LASIK and 12-month follow-up without any signs of iatrogenic keratectasia. Results: Decision trees gave better accuracy and specificity than did discriminant analysis. The sensitivity of decision trees was lower than the sensitivity of discriminant analysis. Conclusion: On the basis of the patient population of this study, decision trees did not prove to be superior to linear discriminant analysis for the detection of subclinical keratoconus. Georg Thieme Verlag KG Stuttgart · New York.
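
    A classifier can have higher accuracy and specificity and still lower sensitivity when the control group is much larger than the case group, as in this study (42 case eyes versus 245 control eyes). The sketch below illustrates this with hypothetical counts; only the group sizes come from the abstract, and the individual cell counts are invented for illustration.

```python
def rates(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts for 42 case eyes and 245 control eyes.
tree = rates(tp=30, fn=12, tn=240, fp=5)
discriminant = rates(tp=38, fn=4, tn=225, fp=20)
```

    With these counts the tree is more accurate overall (about 0.94 versus 0.92) because it misclassifies fewer of the many control eyes, yet it misses more of the case eyes (sensitivity about 0.71 versus 0.90).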

  18. Three-dimensional passive sensing photon counting for object classification

    NASA Astrophysics Data System (ADS)

    Yeom, Seokwon; Javidi, Bahram; Watson, Edward

    2007-04-01

    In this keynote address, we address three-dimensional (3D) distortion-tolerant object recognition using photon-counting integral imaging (II). A photon-counting linear discriminant analysis (LDA) is discussed for classification of photon-limited images. We develop a compact distortion-tolerant recognition system based on the multiple-perspective imaging of II. Experimental and simulation results have shown that a low level of photons is sufficient to classify out-of-plane rotated objects.

  19. Single-trial classification of motor imagery differing in task complexity: a functional near-infrared spectroscopy study

    PubMed Central

    2011-01-01

    Background: For brain computer interfaces (BCIs), which may be valuable in neurorehabilitation, brain signals derived from mental activation can be monitored by non-invasive methods, such as functional near-infrared spectroscopy (fNIRS). Single-trial classification is important for this purpose and this was the aim of the presented study. In particular, we aimed to investigate a combined approach: 1) offline single-trial classification of brain signals derived from a novel wireless fNIRS instrument; 2) use of motor imagery (MI) as the mental task, thereby discriminating between MI signals in response to different task complexities, i.e. simple and complex MI tasks. Methods: 12 subjects were asked to imagine either a simple finger-tapping task using their right thumb or a complex sequential finger-tapping task using all fingers of their right hand. fNIRS was recorded over secondary motor areas of the contralateral hemisphere. Using Fisher's linear discriminant analysis (FLDA) and cross validation, we selected for each subject a best-performing feature combination consisting of 1) one out of three channels, 2) an analysis time interval ranging from 5-15 s after stimulation onset and 3) up to four Δ[O2Hb] signal features (Δ[O2Hb] mean signal amplitudes, variance, skewness and kurtosis). Results: The results of our single-trial classification showed that using the simple combination set of channels, time intervals and up to four Δ[O2Hb] signal features comprising Δ[O2Hb] mean signal amplitudes, variance, skewness and kurtosis, it was possible to discriminate single-trials of MI tasks differing in complexity, i.e. simple versus complex tasks (inter-task paired t-test p ≤ 0.001), over secondary motor areas with an average classification accuracy of 81%. Conclusions: Although the classification accuracies look promising they are nevertheless subject to considerable subject-to-subject variability. In the discussion we address each of these aspects, their limitations for
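
    The four signal features named in this abstract (mean, variance, skewness and kurtosis) are the first four moments of the samples in an analysis window. A minimal sketch follows, assuming population (biased) moment estimators and non-excess kurtosis; the abstract does not state which estimators were used, and the sample window is hypothetical.

```python
def signal_features(samples):
    """Mean, variance, skewness and kurtosis of one analysis window."""
    n = len(samples)
    mean = sum(samples) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Hypothetical window of signal amplitudes in arbitrary units.
features = signal_features([1.0, 2.0, 3.0, 4.0, 5.0])
```

    Each trial is thus reduced to at most four numbers per channel and time interval, which can then be passed to a linear discriminant classifier.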

  20. Identification of sexually abused female adolescents at risk for suicidal ideations: a classification and regression tree analysis.

    PubMed

    Brabant, Marie-Eve; Hébert, Martine; Chagnon, François

    2013-01-01

    This study explored the clinical profiles of 77 female teenage survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression, posttraumatic stress symptoms, and hopelessness discriminated profiles of suicidal and nonsuicidal survivors. The elevated prevalence of suicidal ideations among adolescent survivors of sexual abuse underscores the importance of investigating the presence of suicidal ideations in sexual abuse survivors. However, suicidal ideation is not the sole variable that needs to be investigated; depression, hopelessness and posttraumatic stress symptoms are also related to suicidal ideations in survivors and could therefore guide interventions.

  1. A Biomimetic Sensor for the Classification of Honeys of Different Floral Origin and the Detection of Adulteration

    PubMed Central

    Zakaria, Ammar; Shakaff, Ali Yeon Md; Masnan, Maz Jamilah; Ahmad, Mohd Noor; Adom, Abdul Hamid; Jaafar, Mahmad Nor; Ghani, Supri A.; Abdullah, Abu Hassan; Aziz, Abdul Hallis Abdul; Kamarudin, Latifah Munirah; Subari, Norazian; Fikri, Nazifah Ahmad

    2011-01-01

    The major compounds in honey are carbohydrates such as monosaccharides and disaccharides. The same compounds are found in cane-sugar concentrates. Unfortunately when sugar concentrate is added to honey, laboratory assessments are found to be ineffective in detecting this adulteration. Unlike tracing heavy metals in honey, sugar adulterated honey is much trickier and harder to detect, and traditionally it has been very challenging to come up with a suitable method to prove the presence of adulterants in honey products. This paper proposes a combination of array sensing and multi-modality sensor fusion that can effectively discriminate the samples not only based on the compounds present in the sample but also mimic the way humans perceive flavours and aromas. Conversely, analytical instruments are based on chemical separations which may alter the properties of the volatiles or flavours of a particular honey. The present work is focused on classifying 18 samples of different honeys, sugar syrups and adulterated samples using data fusion of electronic nose (e-nose) and electronic tongue (e-tongue) measurements. Each group of samples was evaluated separately by the e-nose and e-tongue. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) were able to separately discriminate monofloral honey from sugar syrup, and polyfloral honey from sugar and adulterated samples using the e-nose and e-tongue. The e-nose was observed to give better separation compared to e-tongue assessment, particularly when LDA was applied. However, when all samples were combined in one classification analysis, neither PCA nor LDA were able to discriminate between honeys of different floral origins, sugar syrup and adulterated samples. By applying a sensor fusion technique, the classification for the 18 different samples was improved. Significant improvement was observed using PCA, while LDA not only improved the discrimination but also gave better classification. 
An improvement

  2. A biomimetic sensor for the classification of honeys of different floral origin and the detection of adulteration.

    PubMed

    Zakaria, Ammar; Shakaff, Ali Yeon Md; Masnan, Maz Jamilah; Ahmad, Mohd Noor; Adom, Abdul Hamid; Jaafar, Mahmad Nor; Ghani, Supri A; Abdullah, Abu Hassan; Aziz, Abdul Hallis Abdul; Kamarudin, Latifah Munirah; Subari, Norazian; Fikri, Nazifah Ahmad

    2011-01-01

    The major compounds in honey are carbohydrates such as monosaccharides and disaccharides. The same compounds are found in cane-sugar concentrates. Unfortunately when sugar concentrate is added to honey, laboratory assessments are found to be ineffective in detecting this adulteration. Unlike tracing heavy metals in honey, sugar adulterated honey is much trickier and harder to detect, and traditionally it has been very challenging to come up with a suitable method to prove the presence of adulterants in honey products. This paper proposes a combination of array sensing and multi-modality sensor fusion that can effectively discriminate the samples not only based on the compounds present in the sample but also mimic the way humans perceive flavours and aromas. Conversely, analytical instruments are based on chemical separations which may alter the properties of the volatiles or flavours of a particular honey. The present work is focused on classifying 18 samples of different honeys, sugar syrups and adulterated samples using data fusion of electronic nose (e-nose) and electronic tongue (e-tongue) measurements. Each group of samples was evaluated separately by the e-nose and e-tongue. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) were able to separately discriminate monofloral honey from sugar syrup, and polyfloral honey from sugar and adulterated samples using the e-nose and e-tongue. The e-nose was observed to give better separation compared to e-tongue assessment, particularly when LDA was applied. However, when all samples were combined in one classification analysis, neither PCA nor LDA were able to discriminate between honeys of different floral origins, sugar syrup and adulterated samples. By applying a sensor fusion technique, the classification for the 18 different samples was improved. Significant improvement was observed using PCA, while LDA not only improved the discrimination but also gave better classification. 
An improvement

  3. Discrimination of Aspergillus isolates at the species and strain level by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry fingerprinting.

    PubMed

    Hettick, Justin M; Green, Brett J; Buskirk, Amanda D; Kashon, Michael L; Slaven, James E; Janotka, Erika; Blachere, Francoise M; Schmechel, Detlef; Beezhold, Donald H

    2008-09-15

    Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) was used to generate highly reproducible mass spectral fingerprints for 12 species of fungi of the genus Aspergillus and 5 different strains of Aspergillus flavus. Prior to MALDI-TOF MS analysis, the fungi were subjected to three 1-min bead beating cycles in an acetonitrile/trifluoroacetic acid solvent. The mass spectra contain abundant peaks in the range of 5 to 20 kDa and may be used to discriminate between species unambiguously. A discriminant analysis using all peaks from the MALDI-TOF MS data yielded error rates for classification of 0 and 18.75% for resubstitution and cross-validation methods, respectively. If a subset of 28 significant peaks is chosen, resubstitution and cross-validation error rates are 0%. Discriminant analysis of the MALDI-TOF MS data for 5 strains of A. flavus using all peaks yielded error rates for classification of 0 and 5% for resubstitution and cross-validation methods, respectively. These data indicate that MALDI-TOF MS data may be used for unambiguous identification of members of the genus Aspergillus at both the species and strain levels.
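
    The gap between resubstitution and cross-validation error rates reported above arises because resubstitution tests a classifier on the same samples it was trained on. The sketch below illustrates the difference using a nearest-centroid classifier as a simple stand-in for discriminant analysis and leave-one-out as the cross-validation scheme; both choices and the one-dimensional data are assumptions, not details from the study.

```python
def centroid(points):
    """Component-wise mean of a list of feature vectors."""
    return [sum(column) / len(points) for column in zip(*points)]


def predict(train, x):
    """Assign x to the class whose training centroid is nearest."""
    by_label = {}
    for features, label in train:
        by_label.setdefault(label, []).append(features)

    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(centroid(by_label[label]), x))

    return min(by_label, key=squared_distance)


def resubstitution_error(data):
    """Error rate when testing on the training samples themselves."""
    return sum(predict(data, x) != y for x, y in data) / len(data)


def leave_one_out_error(data):
    """Error rate when each sample is classified by a model trained without it."""
    errors = sum(predict(data[:i] + data[i + 1:], x) != y
                 for i, (x, y) in enumerate(data))
    return errors / len(data)


# Hypothetical one-feature samples for two classes.
data = [([0.0], "A"), ([1.0], "A"), ([2.4], "A"),
        ([2.6], "B"), ([4.0], "B"), ([5.0], "B")]
resub = resubstitution_error(data)
loo = leave_one_out_error(data)
```

    Here the resubstitution error is 0 while the leave-one-out error is 2/6: the two samples near the class boundary are classified correctly only when they help define their own class centroid.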

  4. New feature extraction method for classification of agricultural products from x-ray images

    NASA Astrophysics Data System (ADS)

    Talukder, Ashit; Casasent, David P.; Lee, Ha-Woon; Keagy, Pamela M.; Schatzki, Thomas F.

    1999-01-01

    Classification of real-time x-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a system for automated non-invasive detection of defective product items on a conveyor belt. We discuss the extraction of new features that allow better discrimination between damaged and clean items. This feature extraction and classification stage is the new aspect of this paper; our new maximum representation and discriminating feature (MRDF) extraction method computes nonlinear features that are used as inputs to a new modified k nearest neighbor classifier. In this work the MRDF is applied to standard features. The MRDF is robust to various probability distributions of the input class and is shown to provide good classification and new ROC data.

  5. Discrimination against Latina/os: A Meta-Analysis of Individual-Level Resources and Outcomes

    ERIC Educational Resources Information Center

    Lee, Debbiesiu L.; Ahn, Soyeon

    2012-01-01

    This meta-analysis synthesizes the findings of 60 independent samples from 51 studies examining racial/ethnic discrimination against Latina/os in the United States. The purpose was to identify individual-level resources and outcomes that most strongly relate to discrimination. Discrimination against Latina/os significantly results in outcomes…

  6. Similarity-dissimilarity plot for visualization of high dimensional data in biomedical pattern classification.

    PubMed

    Arif, Muhammad

    2012-06-01

    In pattern classification problems, feature extraction is an important step. Quality of features in discriminating different classes plays an important role in pattern classification problems. In real life, pattern classification may require high dimensional feature space and it is impossible to visualize the feature space if the dimension of feature space is greater than four. In this paper, we have proposed a Similarity-Dissimilarity plot which can project high dimensional space to a two dimensional space while retaining important characteristics required to assess the discrimination quality of the features. Similarity-dissimilarity plot can reveal information about the amount of overlap of features of different classes. Separable data points of different classes will also be visible on the plot which can be classified correctly using appropriate classifier. Hence, approximate classification accuracy can be predicted. Moreover, it is possible to see with which class the misclassified data points will be confused by the classifier. Outlier data points can also be located on the similarity-dissimilarity plot. Various examples of synthetic data are used to highlight important characteristics of the proposed plot. Some real life examples from biomedical data are also used for the analysis. The proposed plot is independent of number of dimensions of the feature space.

  7. Fourier transform infrared spectroscopy combined with chemometrics for discrimination of Curcuma longa, Curcuma xanthorrhiza and Zingiber cassumunar.

    PubMed

    Rohaeti, Eti; Rafi, Mohamad; Syafitri, Utami Dyah; Heryanto, Rudi

    2015-02-25

    Turmeric (Curcuma longa), java turmeric (Curcuma xanthorrhiza) and cassumunar ginger (Zingiber cassumunar) are widely used in traditional Indonesian medicines (jamu). Their rhizomes are similar in color and they share some uses, so it is possible to substitute one for the other. The identification and discrimination of these closely-related plants is a crucial task to ensure the quality of the raw materials. Therefore, an analytical method which is rapid, simple and accurate for discriminating these species using Fourier transform infrared spectroscopy (FTIR) combined with some chemometrics methods was developed. FTIR spectra were acquired in the mid-IR region (4000-400 cm⁻¹). Standard normal variate, first and second order derivative spectra were compared for the spectral data. Principal component analysis (PCA) and canonical variate analysis (CVA) were used for the classification of the three species. Samples could be discriminated by visual analysis of the FTIR spectra by using their marker bands. Discrimination of the three species was also possible through the combination of the pre-processed FTIR spectra with PCA and CVA, in which CVA gave clearer discrimination. Subsequently, the developed method could be used for the identification and discrimination of the three closely-related plant species. Copyright © 2014 Elsevier B.V. All rights reserved.

  8. Classification of fracture and non-fracture groups by analysis of coherent X-ray scatter

    PubMed Central

    Dicken, A. J.; Evans, J. P. O.; Rogers, K. D.; Stone, N.; Greenwood, C.; Godber, S. X.; Clement, J. G.; Lyburn, I. D.; Martin, R. M.; Zioupos, P.

    2016-01-01

    Osteoporotic fractures present a significant social and economic burden, which is set to rise commensurately with the aging population. Greater understanding of the physicochemical differences between osteoporotic and normal conditions will facilitate the development of diagnostic technologies with increased performance and treatments with increased efficacy. Using coherent X-ray scattering we have evaluated a population of 108 ex vivo human bone samples comprised of non-fracture and fracture groups. Principal component fed linear discriminant analysis was used to develop a classification model to discern each condition resulting in a sensitivity and specificity of 93% and 91%, respectively. Evaluating the coherent X-ray scatter differences from each condition supports the hypothesis that a causal physicochemical change has occurred in the fracture group. This work is a critical step along the path towards developing an in vivo diagnostic tool for fracture risk prediction. PMID:27363947
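
    For readers less familiar with the two figures of merit quoted above, here is a minimal sketch of how sensitivity and specificity follow from a two-class confusion table. The counts are hypothetical and are not the study's data.

```python
# Sensitivity and specificity from confusion-table counts.
# tp/fn/tn/fp values below are hypothetical, not taken from the abstract.

def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as fractions in [0, 1]."""
    sensitivity = tp / (tp + fn)   # share of true positives that were detected
    specificity = tn / (tn + fp)   # share of true negatives that were rejected
    return sensitivity, specificity

sens, spec = sensitivity_specificity(tp=9, fn=1, tn=8, fp=2)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```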

  9. Accurate classification of brain gliomas by discriminate dictionary learning based on projective dictionary pair learning of proton magnetic resonance spectra.

    PubMed

    Adebileje, Sikiru Afolabi; Ghasemi, Keyvan; Aiyelabegan, Hammed Tanimowo; Saligheh Rad, Hamidreza

    2017-04-01

    Proton magnetic resonance spectroscopy is a powerful noninvasive technique that complements the structural images of cMRI, which aids biomedical and clinical research, by identifying and visualizing the compositions of various metabolites within the tissues of interest. However, accurate classification of proton magnetic resonance spectroscopy is still a challenging issue in clinics due to low signal-to-noise ratio, overlapping peaks of metabolites, and the presence of background macromolecules. This paper evaluates the performance of a discriminate dictionary learning classifier based on the projective dictionary pair learning method for the brain glioma proton magnetic resonance spectroscopy spectra classification task, and the results were compared with the sub-dictionary learning methods. The proton magnetic resonance spectroscopy data contain a total of 150 spectra (74 healthy, 23 grade II, 23 grade III, and 30 grade IV) from two databases. The datasets from both databases were first coupled together, followed by column normalization. The Kennard-Stone algorithm was used to split the datasets into training and test sets. Performance comparison based on the overall accuracy, sensitivity, specificity, and precision was conducted. Based on the overall accuracy of our classification scheme, the dictionary pair learning method was found to outperform the sub-dictionary learning methods (97.78% compared with 68.89%). Copyright © 2016 John Wiley & Sons, Ltd.
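
    The Kennard-Stone split mentioned above can be sketched as follows. This is a generic textbook version using Euclidean distance, not the authors' code; the function name and the one-dimensional toy data are assumptions for illustration.

```python
import math

def kennard_stone_split(X, n_train):
    """Return (train_idx, test_idx) chosen by the Kennard-Stone rule."""
    n = len(X)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(X)")
    dist = [[math.dist(a, b) for b in X] for a in X]
    # Start with the two samples that are farthest apart.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda p: dist[p[0]][p[1]])
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]
    while len(selected) < n_train:
        # Pick the sample whose nearest selected neighbour is farthest away.
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining

# Hypothetical one-feature samples.
X = [[0.0], [1.0], [2.0], [10.0], [5.0]]
train_idx, test_idx = kennard_stone_split(X, n_train=3)
print(train_idx, test_idx)
```

    The rule spreads the training set across the feature space, so the held-out samples tend to lie inside the region the model was trained on.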

  10. Advanced signal processing analysis of laser-induced breakdown spectroscopy data for the discrimination of obsidian sources.

    PubMed

    Remus, Jeremiah J; Harmon, Russell S; Hark, Richard R; Haverstock, Gregory; Baron, Dirk; Potter, Ian K; Bristol, Samantha K; East, Lucille J

    2012-03-01

    Obsidian is a natural glass of volcanic origin and a primary resource used by indigenous peoples across North America for making tools. Geochemical studies of obsidian enhance understanding of artifact production and procurement and remain a priority activity within the archaeological community. Laser-induced breakdown spectroscopy (LIBS) is an analytical technique being examined as a means for identifying obsidian from different sources on the basis of its 'geochemical fingerprint'. This study tested whether two major California obsidian centers could be distinguished from other obsidian localities and the extent to which subsources could be recognized within each of these centers. LIBS data sets were collected in two different spectral bands (350±130 nm and 690±115 nm) using a Nd:YAG 1064 nm laser operated at ~23 mJ, a Czerny-Turner spectrograph with 0.2-0.3 nm spectral resolution and a high performance imaging charge-coupled device (ICCD) detector. Classification of the samples was performed using partial least-squares discriminant analysis (PLSDA), a common chemometric technique for performing statistical regression on high-dimensional data. Discrimination of samples from the Coso Volcanic Field, Bodie Hills, and other major obsidian areas in north-central California was possible with an accuracy of greater than 90% using either spectral band. © 2012 Optical Society of America

  11. Canonical Measure of Correlation (CMC) and Canonical Measure of Distance (CMD) between sets of data. Part 3. Variable selection in classification.

    PubMed

    Ballabio, Davide; Consonni, Viviana; Mauri, Andrea; Todeschini, Roberto

    2010-01-11

    In multivariate regression and classification issues, variable selection is an important procedure used to select an optimal subset of variables with the aim of producing more parsimonious and eventually more predictive models. Variable selection is often necessary when dealing with methodologies that produce thousands of variables, such as Quantitative Structure-Activity Relationships (QSARs) and highly dimensional analytical procedures. In this paper a novel method for variable selection for classification purposes is introduced. This method exploits the recently proposed Canonical Measure of Correlation between two sets of variables (CMC index). The CMC index is in this case calculated for two specific sets of variables, the former being comprised of the independent variables and the latter of the unfolded class matrix. The CMC values, calculated by considering one variable at a time, can be sorted and a ranking of the variables on the basis of their class discrimination capabilities results. Alternatively, CMC index can be calculated for all the possible combinations of variables and the variable subset with the maximal CMC can be selected, but this procedure is computationally more demanding and classification performance of the selected subset is not always the best one. The effectiveness of the CMC index in selecting variables with discriminative ability was compared with that of other well-known strategies for variable selection, such as the Wilks' Lambda, the VIP index based on the Partial Least Squares-Discriminant Analysis, and the selection provided by classification trees. A variable Forward Selection based on the CMC index was finally used in conjunction with Linear Discriminant Analysis. This approach was tested on several chemical data sets. Obtained results were encouraging.
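
    The idea of scoring one variable at a time and sorting the scores can be sketched with one of the comparison criteria named above. The sketch uses the univariate Wilks' Lambda (within-class sum of squares divided by total sum of squares), not the CMC index, whose definition is not given in this record; function names and data are illustrative assumptions.

```python
# One-variable-at-a-time ranking with univariate Wilks' Lambda.
# This is a stand-in criterion, not the CMC index; smaller Lambda = better separation.

def wilks_lambda(values, labels):
    """Within-class sum of squares divided by total sum of squares."""
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    ss_within = 0.0
    for members in groups.values():
        m = sum(members) / len(members)
        ss_within += sum((v - m) ** 2 for v in members)
    return ss_within / ss_total  # assumes the variable is not constant

def rank_variables(X, labels):
    """Return column indices sorted from most to least discriminating."""
    columns = list(zip(*X))
    scores = [wilks_lambda(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda j: scores[j])

# Hypothetical data: column 0 separates the classes, column 1 does not.
X = [[1.0, 5.0], [1.1, 1.0], [3.0, 5.1], [3.1, 0.9]]
labels = ["a", "a", "b", "b"]
print(rank_variables(X, labels))
```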

  12. Classification of illicit heroin by UPLC-Q-TOF analysis of acidic and neutral manufacturing impurities.

    PubMed

    Liu, Cuimei; Hua, Zhendong; Bai, Yanping

    2015-12-01

    The illicit manufacture of heroin results in the formation of trace levels of acidic and neutral manufacturing impurities that provide valuable information about the manufacturing process used. In this work, a new ultra performance liquid chromatography-quadrupole-time of flight mass spectrometry (UPLC-Q-TOF) method that features high resolution, mass accuracy and sensitivity for profiling neutral and acidic heroin manufacturing impurities was developed. After the UPLC-Q-TOF analysis, the retention times and m/z data pairs of acidic and neutral manufacturing impurities were detected, and 19 peaks were found to be evidently different between heroin samples from "Golden Triangle" and "Golden Crescent". Based on the data set of these 19 impurities in 150 authentic heroin samples, classification of heroin geographic origins was successfully achieved utilizing partial least squares discriminant analysis (PLS-DA). By analyzing another data set of 267 authentic heroin samples, the developed discriminant model was validated and proved to be accurate and reliable. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  13. Classification Based on Hierarchical Linear Models: The Need for Incorporation of Social Contexts in Classification Analysis

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qui

    2009-01-01

    Many areas in educational and psychological research involve the use of classification statistical analysis. For example, school districts might be interested in attaining variables that provide optimal prediction of school dropouts. In psychology, a researcher might be interested in the classification of a subject into a particular psychological…

  14. Semi-Supervised Marginal Fisher Analysis for Hyperspectral Image Classification

    NASA Astrophysics Data System (ADS)

    Huang, H.; Liu, J.; Pan, Y.

    2012-07-01

    The problem of learning with both labeled and unlabeled examples arises frequently in hyperspectral image (HSI) classification. However, marginal Fisher analysis is a supervised method and cannot be directly applied to semi-supervised classification. In this paper, we propose a novel method, called semi-supervised marginal Fisher analysis (SSMFA), to process HSI of natural scenes, which uses a combination of semi-supervised learning and manifold learning. In SSMFA, a new difference-based optimization objective function with unlabeled samples has been designed. SSMFA preserves the manifold structure of labeled and unlabeled samples in addition to separating labeled samples in different classes from each other. The semi-supervised method has an analytic form of the globally optimal solution, and it can be computed based on eigen decomposition. Classification experiments with a challenging HSI task demonstrate that this method outperforms current state-of-the-art HSI-classification methods.

  15. [Phoneme analysis and phoneme discrimination of juvenile speech therapy school students].

    PubMed

    Franz, S; Rosanowski, F; Eysholdt, U; Hoppe, U

    2011-05-01

    Phoneme analysis and phoneme discrimination, important factors in acquiring spoken and written language, have been evaluated in juvenile speech therapy school students. The results have been correlated with the results of a school achievement test. The following questions were of interest: Do students in the lower verbal skill segment show pathological phoneme analysis and phoneme discrimination skills? Do the results of the school achievement test differ from the results of students attending German "Hauptschule"? How does phoneme analysis and phoneme discrimination performance correlate with other tested parameters? 74 students of a speech therapy school ranging from 7th to 9th grade were examined (ages 12;10-17;04) with the Heidelberg Phoneme Discrimination Test H-LAD and the school achievement test "Prüfsystem für Schul- und Bildungsberatung PSB-R 6-13". Compared to 4th graders the juvenile speech therapy school students showed worse results in the H-LAD test with good differentiation in the lower measuring range. In the PSB-R 6-13 test the examined students did worse compared to students attending German "Hauptschule" for all grades except 9th grade. Comparing H-LAD and PSB-R 6-13 shows a significant correlation for the sub-tests covering language competence and intelligence but not for the concentration tests. Pathological phoneme analysis and phoneme discrimination skills suggest elevated need for counseling, but this needs to be corroborated through additional linguistic parameters and measuring non-verbal intelligence. Further trials are needed in order to clarify whether the results can lead to sophisticated therapy algorithms for educational purposes. © Georg Thieme Verlag KG Stuttgart · New York.

  16. Statistical classification techniques for engineering and climatic data samples

    NASA Technical Reports Server (NTRS)

    Temple, E. C.; Shipman, J. R.

    1981-01-01

    Fisher's sample linear discriminant function is modified through an appropriate alteration of the common sample variance-covariance matrix. The alteration consists of adding nonnegative values to the eigenvalues of the sample variance-covariance matrix. The desired result of this modification is to increase the number of correct classifications by the new linear discriminant function over Fisher's function. This study is limited to the two-group discriminant problem.
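
    A minimal sketch of the idea for two groups and two features follows. It covers only the special case in which the same nonnegative constant c is added to every eigenvalue, which is equivalent to replacing the pooled covariance S by S + cI; the report may use different values per eigenvalue. All names and data are illustrative assumptions.

```python
# Two-group Fisher discriminant with a modified pooled covariance matrix.
# Special case only: adding c >= 0 to every eigenvalue of S equals using S + c*I.

def mean_vec(rows):
    n = len(rows)
    return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]

def pooled_cov(g1, g2):
    """Pooled 2x2 sample covariance matrix of two groups."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (g1, g2):
        m = mean_vec(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(g1) + len(g2) - 2
    return [[s[i][j] / dof for j in range(2)] for i in range(2)]

def discriminant(g1, g2, c=0.0):
    """Return (w, threshold) with w = (S + c*I)^-1 (m1 - m2)."""
    m1, m2 = mean_vec(g1), mean_vec(g2)
    s = pooled_cov(g1, g2)
    a, b, d = s[0][0] + c, s[0][1], s[1][1] + c
    det = a * d - b * b
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    w = [(d * diff[0] - b * diff[1]) / det, (a * diff[1] - b * diff[0]) / det]
    mid = [(m1[0] + m2[0]) / 2, (m1[1] + m2[1]) / 2]
    return w, w[0] * mid[0] + w[1] * mid[1]

def classify(x, w, threshold):
    """Assign x to group 1 if its discriminant score exceeds the threshold."""
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 2

# Hypothetical samples for the two groups.
g1 = [[2.0, 2.0], [3.0, 2.5], [2.5, 3.0]]
g2 = [[0.0, 0.0], [1.0, 0.5], [0.5, 1.0]]
w, t = discriminant(g1, g2, c=1.0)
print(classify([2.4, 2.6], w, t))
```

    Setting c = 0 recovers Fisher's ordinary sample discriminant; larger c shrinks the weight vector and reduces the influence of poorly estimated small eigenvalues.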

  17. Multilevel image recognition using discriminative patches and kernel covariance descriptor

    NASA Astrophysics Data System (ADS)

    Lu, Le; Yao, Jianhua; Turkbey, Evrim; Summers, Ronald M.

    2014-03-01

    Computer-aided diagnosis of medical images has emerged as an important tool to objectively improve the performance, accuracy and consistency for clinical workflow. To computerize the medical image diagnostic recognition problem, there are three fundamental problems: where to look (i.e., where is the region of interest from the whole image/volume), image feature description/encoding, and similarity metrics for classification or matching. In this paper, we explore the motivation, implementation and performance evaluation of task-driven iterative, discriminative image patch mining; covariance matrix based descriptor via intensity, gradient and spatial layout; and log-Euclidean distance kernel for support vector machine, to address these three aspects respectively. To cope with often visually ambiguous image patterns for the region of interest in medical diagnosis, discovery of multilabel selective discriminative patches is desired. Covariance of several image statistics summarizes their second order interactions within an image patch and has proved to be an effective image descriptor, with low dimensionality compared with joint statistics and fast computation regardless of the patch size. We extensively evaluate two extended Gaussian kernels using affine-invariant Riemannian metric or log-Euclidean metric with support vector machines (SVM), on two medical image classification problems of degenerative disc disease (DDD) detection on cortical shell unwrapped CT maps and colitis detection on CT key images. The proposed approach is validated with promising quantitative results on these challenging tasks. Our experimental findings and discussion also unveil some interesting insights on the covariance feature composition with or without spatial layout for classification and retrieval, and different kernel constructions for SVM. This will also shed some light on future work using covariance feature and kernel classification for medical image analysis.
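
    For orientation, the two metrics named above are usually written as follows for symmetric positive-definite matrices A and B. The Gaussian-type kernel in the last line is a common construction and an assumption here; the paper's exact "extended" form and its width parameter are not given in this record.

```latex
% A, B: symmetric positive-definite covariance descriptors
% \gamma > 0: kernel width (assumed hyperparameter, not taken from the abstract)
\[
\begin{aligned}
d_{\mathrm{LE}}(A, B) &= \lVert \log A - \log B \rVert_F
  && \text{(log-Euclidean metric)} \\
d_{\mathrm{AI}}(A, B) &= \lVert \log\!\left( A^{-1/2} B A^{-1/2} \right) \rVert_F
  && \text{(affine-invariant Riemannian metric)} \\
k(A, B) &= \exp\!\left( -\gamma \, d(A, B)^2 \right)
  && \text{(Gaussian-type kernel, } d \in \{ d_{\mathrm{LE}}, d_{\mathrm{AI}} \} \text{)}
\end{aligned}
\]
```

    Here log denotes the matrix logarithm and the subscript F the Frobenius norm.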

  18. Applying Neural Networks to Hyperspectral and Multispectral Field Data for Discrimination of Cruciferous Weeds in Winter Crops

    PubMed Central

    de Castro, Ana-Isabel; Jurado-Expósito, Montserrat; Gómez-Casero, María-Teresa; López-Granados, Francisca

    2012-01-01

    In the context of detection of weeds in crops for site-specific weed control, on-ground spectral reflectance measurements are the first step to determine the potential of remote spectral data to classify weeds and crops. Field studies were conducted for four years at different locations in Spain. We aimed to distinguish cruciferous weeds in wheat and broad bean crops, using hyperspectral and multispectral readings in the visible and near-infrared spectrum. To identify differences in reflectance between cruciferous weeds, we applied three classification methods: stepwise discriminant (STEPDISC) analysis and two neural networks, specifically, multilayer perceptron (MLP) and radial basis function (RBF). Hyperspectral and multispectral signatures of cruciferous weeds and of wheat and broad bean crops can be classified using STEPDISC analysis and MLP and RBF neural networks with varying success; the MLP model was the most accurate, with classification performance of 100%, or higher than 98.1%, in all years. Classification accuracy from hyperspectral signatures was similar to that from multispectral and spectral indices, suggesting that little advantage would be obtained by using more expensive airborne hyperspectral imagery. Therefore, for future investigations, we recommend using multispectral remote imagery to explore whether they can potentially discriminate these weeds and crops. PMID:22629171

  19. Applying neural networks to hyperspectral and multispectral field data for discrimination of cruciferous weeds in winter crops.

    PubMed

    de Castro, Ana-Isabel; Jurado-Expósito, Montserrat; Gómez-Casero, María-Teresa; López-Granados, Francisca

    2012-01-01

    In the context of detection of weeds in crops for site-specific weed control, on-ground spectral reflectance measurements are the first step to determine the potential of remote spectral data to classify weeds and crops. Field studies were conducted for four years at different locations in Spain. We aimed to distinguish cruciferous weeds in wheat and broad bean crops, using hyperspectral and multispectral readings in the visible and near-infrared spectrum. To identify differences in reflectance between cruciferous weeds, we applied three classification methods: stepwise discriminant (STEPDISC) analysis and two neural networks, specifically, multilayer perceptron (MLP) and radial basis function (RBF). Hyperspectral and multispectral signatures of cruciferous weeds and of wheat and broad bean crops can be classified using STEPDISC analysis and MLP and RBF neural networks with varying success; the MLP model was the most accurate, with classification performance of 100%, or higher than 98.1%, in all years. Classification accuracy from hyperspectral signatures was similar to that from multispectral and spectral indices, suggesting that little advantage would be obtained by using more expensive airborne hyperspectral imagery. Therefore, for future investigations, we recommend using multispectral remote imagery to explore whether they can potentially discriminate these weeds and crops.

  20. Acoustic discrimination of Southern Ocean zooplankton

    NASA Astrophysics Data System (ADS)

    Brierley, Andrew S.; Ward, Peter; Watkins, Jonathan L.; Goss, Catherine

    Acoustic surveys in the vicinity of the sub-Antarctic island of South Georgia during a period of exceptionally calm weather revealed the existence of a number of horizontally extensive yet vertically discrete scattering layers in the upper 250 m of the water column. These layers were fished with a Longhurst-Hardy plankton recorder (LHPR) and a multiple-opening 8 m² rectangular mid-water trawl (RMT8). Analysis of catches suggested that each scattering layer was composed predominantly of a single species (biovolume > 95%) of either the euphausiids Euphausia frigida or Thysanöessa macrura, the hyperiid amphipod Themisto gaudichaudii, or the eucalaniid copepod Rhincalanus gigas. Instrumentation on the nets allowed their trajectories to be reconstructed precisely, and thus catch data to be related directly to the corresponding acoustic signals. Discriminant function analysis of differences between mean volume backscattering strength at 38, 120 and 200 kHz separated echoes originating from each of the dominant scattering layers, and other signals identified as originating from Antarctic krill (Euphausia superba), with an overall correct classification rate of 77%. Using echo intensity data alone, gathered using hardware commonly employed for fishery acoustics, it is therefore possible to discriminate in situ between several zooplanktonic taxa, taxa which in some instances exhibit similar gross morphological characteristics and have overlapping length-frequency distributions. Acoustic signals from the mysid Antarctomysis maxima could also be discriminated once information on target distribution was considered, highlighting the value of incorporating multiple descriptors of echo characteristics into signal identification procedures. The ability to discriminate acoustically between zooplankton taxa could be applied to provide improved acoustic estimates of species abundance, and to enhance field studies of zooplankton ecology, distribution and species interactions.

  1. Principal Component Clustering Approach to Teaching Quality Discriminant Analysis

    ERIC Educational Resources Information Center

    Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan

    2016-01-01

    Teaching quality is the lifeline of higher education. Many universities have made progress in evaluating teaching quality. In this paper, we establish the Students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…

  2. Persistent topographic quantitative EEG sequelae of chronic marihuana use: a replication study and initial discriminant function analysis.

    PubMed

    Struve, F A; Straumanis, J J; Patrick, G

    1994-04-01

    In a previous pilot study using psychiatric patients we reported that daily marihuana users had significant elevations of (1) Absolute Alpha Power, (2) Relative Alpha Power, and (3) Interhemispheric Alpha Coherence over both frontal and frontal-central areas when contrasted with subjects who did not use marihuana. We referred to this phenomenon as Hyperfrontality of Alpha. The study presented here is a successful replication of our previous findings using new samples of subjects and identical methods. Post hoc analyses based on the combined sample from both studies suggest that variables of psychiatric diagnoses and medication did not bias our results. In addition, a discriminant function analysis using quantitative EEG variables as candidate predictors generated a 95% correct THC user versus nonuser classification accuracy which received a successful jackknife replication.

  3. Video based object representation and classification using multiple covariance matrices.

    PubMed

    Zhang, Yurong; Liu, Quan

    2017-01-01

    Video based object recognition and classification has been widely studied in the computer vision and image processing area. One main issue of this task is to develop an effective representation for video. This problem can generally be formulated as image set representation. In this paper, we present a new method called Multiple Covariance Discriminative Learning (MCDL) for the image set representation and classification problem. The core idea of MCDL is to represent an image set using multiple covariance matrices with each covariance matrix representing one cluster of images. First, we use the Nonnegative Matrix Factorization (NMF) method to do image clustering within each image set, and then adopt Covariance Discriminative Learning on each cluster (subset) of images. Finally, we adopt KLDA and the nearest neighbor classification method for image set classification. Promising experimental results on several datasets show the effectiveness of our MCDL method.

  4. Histogram analysis of apparent diffusion coefficient maps for assessing thymic epithelial tumours: correlation with world health organization classification and clinical staging.

    PubMed

    Kong, Ling-Yan; Zhang, Wei; Zhou, Yue; Xu, Hai; Shi, Hai-Bin; Feng, Qing; Xu, Xiao-Quan; Yu, Tong-Fu

    2018-04-01

    To investigate the value of apparent diffusion coefficients (ADCs) histogram analysis for assessing World Health Organization (WHO) pathological classification and Masaoka clinical stages of thymic epithelial tumours. 37 patients with histologically confirmed thymic epithelial tumours were enrolled. ADC measurements were performed using hot-spot ROI (ADC_HS-ROI) and histogram-based approach. ADC histogram parameters included mean ADC (ADC_mean), median ADC (ADC_median), 10th and 90th percentile of ADC (ADC_10 and ADC_90), kurtosis and skewness. One-way ANOVA, independent-sample t-test, and receiver operating characteristic were used for statistical analyses. There were significant differences in ADC_mean, ADC_median, ADC_10, ADC_90 and ADC_HS-ROI among low-risk thymoma (type A, AB, B1; n = 14), high-risk thymoma (type B2, B3; n = 9) and thymic carcinoma (type C, n = 14) groups (all p-values <0.05), while no significant difference in skewness (p = 0.181) and kurtosis (p = 0.088). ADC_10 showed best differentiating ability (cut-off value, ≤0.689 × 10^-3 mm^2 s^-1; AUC, 0.957; sensitivity, 95.65%; specificity, 92.86%) for discriminating low-risk thymoma from high-risk thymoma and thymic carcinoma. Advanced Masaoka stages (Stage III and IV; n = 24) tumours showed significant lower ADC parameters and higher kurtosis than early Masaoka stage (Stage I and II; n = 13) tumours (all p-values <0.05), while no significant difference on skewness (p = 0.063). ADC_10 showed best differentiating ability (cut-off value, ≤0.689 × 10^-3 mm^2 s^-1; AUC, 0.913; sensitivity, 91.30%; specificity, 85.71%) for discriminating advanced and early Masaoka stage epithelial tumours. ADC histogram analysis may assist in assessing the WHO pathological classification and Masaoka clinical stages of thymic epithelial tumours. Advances in knowledge: 1. ADC histogram analysis could help to assess WHO pathological classification of thymic epithelial tumours. 2. ADC histogram analysis could

  5. Investigating the Potential of Deep Neural Networks for Large-Scale Classification of Very High Resolution Satellite Images

    NASA Astrophysics Data System (ADS)

    Postadjian, T.; Le Bris, A.; Sahbi, H.; Mallet, C.

    2017-05-01

    Semantic classification is a core remote sensing task as it provides the fundamental input for land-cover map generation. The very recent literature has shown the superior performance of deep convolutional neural networks (DCNN) for many classification tasks including the automatic analysis of Very High Spatial Resolution (VHR) geospatial images. Most of the recent initiatives have focused on very high discrimination capacity combined with accurate object boundary retrieval. Therefore, current architectures are perfectly tailored for urban areas over restricted areas but not designed for large-scale purposes. This paper presents an end-to-end automatic processing chain, based on DCNNs, that aims at performing large-scale classification of VHR satellite images (here SPOT 6/7). Since this work assesses, through various experiments, the potential of DCNNs for country-scale VHR land-cover map generation, a simple yet effective architecture is proposed, efficiently discriminating the main classes of interest (namely buildings, roads, water, crops, vegetated areas) by exploiting existing VHR land-cover maps for training.

  6. Discriminative factor analysis of juvenile delinquency in South Korea.

    PubMed

    Kim, Hyun Sil; Kim, Hun Soo

    2006-12-01

    The present study was intended to compare differences in research variables between delinquent adolescents and student adolescents, and to analyze discriminative factors of delinquent behaviors among Korean adolescents. The research design of this study was a questionnaire survey. Questionnaires were administered to 2,167 adolescents (1,196 students and 971 delinquents), sampled from 8 middle and high schools and 6 juvenile corrective institutions, using the proportional stratified random sampling method. Statistical methods employed were Chi-square, t-test, and logistic regression analysis. The discriminative factors of delinquent behaviors were smoking, alcohol use, other drug use, being sexually abused, and viewing time of media violence and pornography. Among these discriminative factors, the factor most strongly associated with delinquency was smoking (odds ratio: 32.32). That is, the odds of being a delinquent adolescent were about 32 times higher for smokers than for non-smokers. Our findings, that smoking was the strongest discriminative factor of delinquent behavior, suggest that educational strategies to prevent adolescent smoking may reduce the rate of juvenile delinquency. Antismoking educational efforts are therefore urgently needed in South Korea.
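
    Because an odds ratio is easy to misread as a ratio of probabilities, a short sketch may help. It computes an unadjusted odds ratio and the corresponding risk ratio from a 2x2 table; the counts are hypothetical, and the study's figure comes from logistic regression rather than from a raw table like this.

```python
# Unadjusted odds ratio vs. risk ratio from a 2x2 table.
# All counts are hypothetical and chosen only to show that the two measures differ.

def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of being a case among the exposed divided by odds among the unexposed."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed

def risk_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Probability of being a case among the exposed divided by that among the unexposed."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed / risk_unexposed

table = (80, 20, 20, 80)  # smokers: cases, non-cases; non-smokers: cases, non-cases
print(odds_ratio(*table), risk_ratio(*table))
```

    With these invented counts the odds ratio is several times larger than the risk ratio, which is why an odds ratio should be described in terms of odds rather than probability.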

  7. Improved classification accuracy in 1- and 2-dimensional NMR metabolomics data using the variance stabilising generalised logarithm transformation

    PubMed Central

    Parsons, Helen M; Ludwig, Christian; Günther, Ulrich L; Viant, Mark R

    2007-01-01

    Background Classifying nuclear magnetic resonance (NMR) spectra is a crucial step in many metabolomics experiments. Since several multivariate classification techniques depend upon the variance of the data, it is important to first minimise any contribution from unwanted technical variance arising from sample preparation and analytical measurements, and thereby maximise any contribution from wanted biological variance between different classes. The generalised logarithm (glog) transform was developed to stabilise the variance in DNA microarray datasets, but has rarely been applied to metabolomics data. In particular, it has not been rigorously evaluated against other scaling techniques used in metabolomics, nor tested on all forms of NMR spectra including 1-dimensional (1D) 1H, projections of 2D 1H, 1H J-resolved (pJRES), and intact 2D J-resolved (JRES). Results Here, the effects of the glog transform are compared against two commonly used variance stabilising techniques, autoscaling and Pareto scaling, as well as unscaled data. The four methods are evaluated in terms of the effects on the variance of NMR metabolomics data and on the classification accuracy following multivariate analysis, the latter achieved using principal component analysis followed by linear discriminant analysis. For two of three datasets analysed, classification accuracies were highest following glog transformation: 100% accuracy for discriminating 1D NMR spectra of hypoxic and normoxic invertebrate muscle, and 100% accuracy for discriminating 2D JRES spectra of fish livers sampled from two rivers. For the third dataset, pJRES spectra of urine from two breeds of dog, the glog transform and autoscaling achieved equal highest accuracies. Additionally we extended the glog algorithm to effectively suppress noise, which proved critical for the analysis of 2D JRES spectra. 
Conclusion We have demonstrated that the glog and extended glog transforms stabilise the technical variance in NMR metabolomics
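
    The scaling methods compared in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes one common form of the generalised logarithm, glog(x) = ln(x + sqrt(x^2 + lambda)), and the helper names (glog, autoscale, pareto_scale) and the value of lam are hypothetical. The calibration of lambda and the extended, noise-suppressing glog described in the abstract are not reproduced here.

```python
import math
import statistics


def glog(x, lam):
    """Generalised log, assumed form ln(x + sqrt(x^2 + lam)).

    For lam > 0 the argument of the log is positive even when x <= 0,
    so the transform stays defined for zero or negative intensities.
    """
    return math.log(x + math.sqrt(x * x + lam))


def autoscale(column):
    """Mean-centre one variable and divide by its sample standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    return [(v - mean) / sd for v in column]


def pareto_scale(column):
    """Mean-centre one variable and divide by the square root of its standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    return [(v - mean) / math.sqrt(sd) for v in column]


# Hypothetical intensities of one spectral bin across four samples.
intensities = [1.0, 2.0, 4.0, 8.0]
transformed = [glog(v, lam=1.0) for v in intensities]
scaled = autoscale(intensities)
```

    For large intensities the assumed glog behaves like ln(2x), while near zero it flattens out instead of diverging, which is the property that makes it attractive for noisy low-intensity bins.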

  8. Tree species classification in subtropical forests using small-footprint full-waveform LiDAR data

    NASA Astrophysics Data System (ADS)

    Cao, Lin; Coops, Nicholas C.; Innes, John L.; Dai, Jinsong; Ruan, Honghua; She, Guanghui

    2016-07-01

    The accurate classification of tree species is critical for the management of forest ecosystems, particularly subtropical forests, which are highly diverse and complex ecosystems. While airborne Light Detection and Ranging (LiDAR) technology offers significant potential to estimate forest structural attributes, the capacity of this new tool to classify species is less well known. In this research, full-waveform metrics were extracted by a voxel-based composite waveform approach and examined with a Random Forests classifier to discriminate six subtropical tree species (i.e., Masson pine (Pinus massoniana Lamb.), Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.), Slash pines (Pinus elliottii Engelm.), Sawtooth oak (Quercus acutissima Carruth.) and Chinese holly (Ilex chinensis Sims.)) at three levels of discrimination. As part of the analysis, the optimal voxel size for modelling the composite waveforms was investigated, the most important predictor metrics for species classification assessed and the effect of scan angle on species discrimination examined. Results demonstrate that all tree species were classified with relatively high accuracy (68.6% for six classes, 75.8% for four main species and 86.2% for conifers and broadleaved trees). Full-waveform metrics (based on height of median energy, waveform distance and number of waveform peaks) demonstrated high classification importance and were stable among various voxel sizes. The results also suggest that the voxel-based approach can alleviate some of the issues associated with large scan angles. In summary, the results indicate that full-waveform LiDAR data have significant potential for tree species classification in the subtropical forests.

  9. Polarization-based material classification technique using passive millimeter-wave polarimetric imagery.

    PubMed

    Hu, Fei; Cheng, Yayun; Gui, Liangqi; Wu, Liang; Zhang, Xinyi; Peng, Xiaohui; Su, Jinlong

    2016-11-01

    The polarization properties of thermal millimeter-wave emission capture inherent information of objects, e.g., material composition, shape, and surface features. In this paper, a polarization-based material-classification technique using passive millimeter-wave polarimetric imagery is presented. Linear polarization ratio (LPR) is created to be a new feature discriminator that is sensitive to material type and to remove the reflected ambient radiation effect. The LPR characteristics of several common natural and artificial materials are investigated by theoretical and experimental analysis. Based on a priori information about LPR characteristics, the optimal range of incident angle and the classification criterion are discussed. Simulation and measurement results indicate that the presented classification technique is effective for distinguishing between metals and dielectrics. This technique suggests possible applications for outdoor metal target detection in open scenes.

  10. Rapid discrimination of plastic packaging materials using MIR spectroscopy coupled with independent components analysis (ICA)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kassouf, Amine, E-mail: amine.kassouf@agroparistech.fr; INRA, UMR1145 Ingénierie Procédés Aliments, 1 Avenue des Olympiades, 91300 Massy; AgroParisTech, UMR1145 Ingénierie Procédés Aliments, 16 rue Claude Bernard, 75005 Paris

    2014-11-15

    Highlights: • An innovative technique, MIR-ICA, was applied to plastic packaging separation. • This study was carried out on PE, PP, PS, PET and PLA plastic packaging materials. • ICA was applied to discriminate plastics and 100% separation rates were obtained. • Analyses performed on two spectrometers proved the reproducibility of the method. • MIR-ICA is a simple and fast technique allowing plastic identification/classification. - Abstract: Plastic packaging wastes increased considerably in recent decades, raising a major and serious public concern on political, economical and environmental levels. Dealing with this kind of problems is generally done by landfilling and energy recovery. However, these two methods are becoming more and more expensive, hazardous to the public health and the environment. Therefore, recycling is gaining worldwide consideration as a solution to decrease the growing volume of plastic packaging wastes and simultaneously reduce the consumption of oil required to produce virgin resin. Nevertheless, a major shortage is encountered in recycling which is related to the sorting of plastic wastes. In this paper, a feasibility study was performed in order to test the potential of an innovative approach combining mid infrared (MIR) spectroscopy with independent components analysis (ICA), as a simple and fast approach which could achieve high separation rates. This approach (MIR-ICA) gave 100% discrimination rates in the separation of all studied plastics: polyethylene terephthalate (PET), polyethylene (PE), polypropylene (PP), polystyrene (PS) and polylactide (PLA). In addition, some more specific discriminations were obtained separating plastic materials belonging to the same polymer family e.g. high density polyethylene (HDPE) from low density polyethylene (LDPE). High discrimination rates were obtained despite the heterogeneity among samples especially differences in colors, thicknesses and surface textures. The

  11. Deep neural networks for texture classification-A theoretical analysis.

    PubMed

    Basu, Saikat; Mukhopadhyay, Supratik; Karki, Manohar; DiBiano, Robert; Ganguly, Sangram; Nemani, Ramakrishna; Gayaka, Shreekant

    2018-01-01

    We investigate the use of Deep Neural Networks for the classification of image datasets where texture features are important for generating class-conditional discriminative representations. To this end, we first derive the size of the feature space for some standard textural features extracted from the input dataset and then use the theory of Vapnik-Chervonenkis dimension to show that hand-crafted feature extraction creates low-dimensional representations which help in reducing the overall excess error rate. As a corollary to this analysis, we derive for the first time upper bounds on the VC dimension of Convolutional Neural Network as well as Dropout and Dropconnect networks and the relation between excess error rate of Dropout and Dropconnect networks. The concept of intrinsic dimension is used to validate the intuition that texture-based datasets are inherently higher dimensional as compared to handwritten digits or other object recognition datasets and hence more difficult to be shattered by neural networks. We then derive the mean distance from the centroid to the nearest and farthest sampling points in an n-dimensional manifold and show that the Relative Contrast of the sample data vanishes as dimensionality of the underlying vector space tends to infinity. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. Learning Robust and Discriminative Subspace With Low-Rank Constraints.

    PubMed

    Li, Sheng; Fu, Yun

    2016-11-01

    In this paper, we aim at learning robust and discriminative subspaces from noisy data. Subspace learning is widely used in extracting discriminative features for classification. However, when data are contaminated with severe noise, the performance of most existing subspace learning methods would be limited. Recent advances in low-rank modeling provide effective solutions for removing noise or outliers contained in sample sets, which motivates us to take advantage of low-rank constraints in order to exploit robust and discriminative subspace for classification. In particular, we present a discriminative subspace learning method called the supervised regularization-based robust subspace (SRRS) approach, by incorporating the low-rank constraint. SRRS seeks low-rank representations from the noisy data, and learns a discriminative subspace from the recovered clean data jointly. A supervised regularization function is designed to make use of the class label information, and therefore to enhance the discriminability of subspace. Our approach is formulated as a constrained rank-minimization problem. We design an inexact augmented Lagrange multiplier optimization algorithm to solve it. Unlike the existing sparse representation and low-rank learning methods, our approach learns a low-dimensional subspace from recovered data, and explicitly incorporates the supervised information. Our approach and some baselines are evaluated on the COIL-100, ALOI, Extended YaleB, FERET, AR, and KinFace databases. The experimental results demonstrate the effectiveness of our approach, especially when the data contain considerable noise or variations.

  13. Discrimination and classification of acute lymphoblastic leukemia cells by Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Managò, Stefano; Valente, Carmen; Mirabelli, Peppino; De Luca, Anna Chiara

    2015-05-01

    Currently, a combination of technologies is typically required to identify and classify leukemia cells. These methods often lack the specificity and sensitivity necessary for early and accurate diagnosis. Here, we demonstrate the use of Raman spectroscopy to identify normal B cells, collected from healthy patients, and three ALL cell lines (RS4;11, REH and MN60 at different differentiation level, respectively). Raman markers associated with DNA and protein vibrational modes have been identified that exhibit excellent discriminating power for leukemia cell identification. Principal Component Analysis was finally used to confirm the significance of these markers for identifying leukemia cells and classifying the data. The obtained results indicate a sorting accuracy of 96% between the three leukemia cell lines.

  14. Classification of typical and atypical antipsychotic drugs on the basis of dopamine D-1, D-2 and serotonin2 pKi values.

    PubMed

    Meltzer, H Y; Matsubara, S; Lee, J C

    1989-10-01

    The pKi values of 13 reference typical and 7 reference atypical antipsychotic drugs (APDs) for rat striatal dopamine D-1 and D-2 receptor binding sites and cortical serotonin (5-HT2) receptor binding sites were determined. The atypical antipsychotics had significantly lower pKi values for the D-2 but not 5-HT2 binding sites. There was a trend for a lower pKi value for the D-1 binding site for the atypical APD. The 5-HT2 and D-1 pKi values were correlated for the typical APD whereas the 5-HT2 and D-2 pKi values were correlated for the atypical APD. A stepwise discriminant function analysis to determine the independent contribution of each pKi value for a given binding site to the classification as a typical or atypical APD entered the D-2 pKi value first, followed by the 5-HT2 pKi value. The D-1 pKi value was not entered. A discriminant function analysis correctly classified 19 of 20 of these compounds plus 14 of 17 additional test compounds as typical or atypical APD for an overall correct classification rate of 89.2%. The major contributors to the discriminant function were the D-2 and 5-HT2 pKi values. A cluster analysis based only on the 5-HT2/D2 ratio grouped 15 of 17 atypical + one typical APD in one cluster and 19 of 20 typical + two atypical APDs in a second cluster, for an overall correct classification rate of 91.9%. When the stepwise discriminant function was repeated for all 37 compounds, only the D-2 and 5-HT2 pKi values were entered into the discriminant function.(ABSTRACT TRUNCATED AT 250 WORDS)
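
    The two overall correct classification rates quoted in this abstract follow directly from the counts it reports. The short check below is only an arithmetic illustration; the helper name correct_rate is hypothetical.

```python
def correct_rate(correct, total):
    """Percentage of compounds assigned to the right class."""
    return 100.0 * correct / total


# Discriminant function: 19 of 20 reference compounds plus 14 of 17 test compounds.
discriminant = correct_rate(19 + 14, 20 + 17)

# Cluster analysis: 15 of 17 atypical and 19 of 20 typical compounds grouped correctly.
cluster = correct_rate(15 + 19, 17 + 20)

print(f"{discriminant:.1f}% {cluster:.1f}%")  # prints "89.2% 91.9%"
```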

  15. Application of Neural Networks to Seismic Signal Discrimination

    DTIC Science & Technology

    1993-05-15

    AD-A276 626 PL-TR-93-2154 Application of Neural Networks to Seismic Signal Discrimination James A. Cercone V. Shane Foster W. Mike Clark Larry... Networks to Seismic Signal Discrimination PE 61101E PR 1DMO TA DA WU AA .AUTHOR(S) Stephen Goodman John Martin C James A. Cercone Don J. Smith G...of Technology Applications of Neural Networks to Seismic Classification project. The first year of research focused on identification and collection

  16. Industrial defect discrimination applying infrared imaging spectroscopy and artificial neural networks

    NASA Astrophysics Data System (ADS)

    Garcia-Allende, Pilar Beatriz; Conde, Olga M.; Madruga, Francisco J.; Cubillas, Ana M.; Lopez-Higuera, Jose M.

    2008-03-01

    A non-intrusive infrared sensor for the detection of spurious elements in an industrial raw material chain has been developed. The system is an extension to the whole near infrared range of the spectrum of a previously designed system based on the Vis-NIR range (400 - 1000 nm). It incorporates a hyperspectral imaging spectrograph able to register simultaneously the NIR reflected spectrum of the material under study along all the points of an image line. The working material has been different tobacco leaf blends mixed with typical spurious elements of this field such as plastics, cardboards, etc. Spurious elements are discriminated automatically by an artificial neural network able to perform the classification with a high degree of accuracy. Due to the high amount of information involved in the process, Principal Component Analysis is first applied to perform data redundancy removal. By means of the extension to the whole NIR range of the spectrum, from 1000 to 2400 nm, the characterization of the material under test is highly improved. The developed technique could be applied to the classification and discrimination of other materials, and, as a consequence of its non-contact operation it is particularly suitable for food quality control.

  17. Multivariate qualitative analysis of banned additives in food safety using surface enhanced Raman scattering spectroscopy

    NASA Astrophysics Data System (ADS)

    He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei

    2015-02-01

    A novel strategy which combines iteratively cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectroscopy of banned food additives, such as Sudan I dye and Rhodamine B in food, Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, using the combination of spectra preprocessing iteratively cubic spline fitting (ICSF) baseline correction with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, the DPLS classification can discriminate the class assignment of unknown banned additives using the information of differences in relative intensities. The results demonstrate that SERS spectroscopy combined with ICSF baseline correction method and exploratory analysis methodology DPLS classification can be potentially used for distinguishing the banned food additives in field of food safety.

  18. Improving the analysis of near-infrared spectroscopy data with multivariate classification of hemodynamic patterns: a theoretical formulation and validation.

    PubMed

    Gemignani, Jessica; Middell, Eike; Barbour, Randall L; Graber, Harry L; Blankertz, Benjamin

    2018-04-04

    The statistical analysis of functional near infrared spectroscopy (fNIRS) data based on the general linear model (GLM) is often made difficult by serial correlations, high inter-subject variability of the hemodynamic response, and the presence of motion artifacts. In this work we propose to extract information on the pattern of hemodynamic activations without using any a priori model for the data, by classifying the channels as 'active' or 'not active' with a multivariate classifier based on linear discriminant analysis (LDA). This work is developed in two steps. First we compared the performance of the two analyses, using a synthetic approach in which simulated hemodynamic activations were combined with either simulated or real resting-state fNIRS data. This procedure allowed for exact quantification of the classification accuracies of GLM and LDA. In the case of real resting-state data, the correlations between classification accuracy and demographic characteristics were investigated by means of a Linear Mixed Model. In the second step, to further characterize the reliability of the newly proposed analysis method, we conducted an experiment in which participants had to perform a simple motor task and data were analyzed with the LDA-based classifier as well as with the standard GLM analysis. The results of the simulation study show that the LDA-based method achieves higher classification accuracies than the GLM analysis, and that the LDA results are more uniform across different subjects and, in contrast to the accuracies achieved by the GLM analysis, have no significant correlations with any of the demographic characteristics. Findings from the real-data experiment are consistent with the results of the real-plus-simulation study, in that the GLM-analysis results show greater inter-subject variability than do the corresponding LDA results. 
The results obtained suggest that the outcome of GLM analysis is highly vulnerable to violations of theoretical assumptions

  19. Harassment and discrimination in medical training: a systematic review and meta-analysis.

    PubMed

    Fnais, Naif; Soobiah, Charlene; Chen, Maggie Hong; Lillie, Erin; Perrier, Laure; Tashkhandi, Mariam; Straus, Sharon E; Mamdani, Muhammad; Al-Omran, Mohammed; Tricco, Andrea C

    2014-05-01

    Harassment and discrimination include a wide range of behaviors that medical trainees perceive as being humiliating, hostile, or abusive. To understand the significance of such mistreatment and to explore potential preventive strategies, the authors conducted a systematic review and meta-analysis to examine the prevalence, risk factors, and sources of harassment and discrimination among medical trainees. In 2011, the authors identified relevant studies by searching MEDLINE and EMBASE, scanning reference lists of relevant studies, and contacting experts. They included studies that reported the prevalence, risk factors, and sources of harassment and discrimination among medical trainees. Two reviewers independently screened all articles and abstracted study and participant characteristics and study results. The authors assessed the methodological quality in individual studies using the Newcastle-Ottawa Scale. They also conducted a meta-analysis. The authors included 57 cross-sectional and 2 cohort studies in their review. The meta-analysis of 51 studies demonstrated that 59.4% of medical trainees had experienced at least one form of harassment or discrimination during their training (95% confidence interval [CI]: 52.0%-66.7%). Verbal harassment was the most commonly cited form of harassment (prevalence: 63.0%; 95% CI: 54.8%-71.2%). Consultants were the most commonly cited source of harassment and discrimination, followed by patients or patients' families (34.4% and 21.9%, respectively). This review demonstrates the surprisingly high prevalence of harassment and discrimination among medical trainees that has not declined over time. The authors recommend both drafting policies and promoting cultural change within academic institutions to prevent future abuse.

  20. Classification of EEG Signals Based on Pattern Recognition Approach.

    PubMed

    Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed

    2017-01-01

    Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a "pattern recognition" approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet-based features, such as multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy, were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high-density EEG dataset (128 channels) validated the proposed method by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90-7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11-89.63% and 91.60-81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied to a public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.
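
    The normalisation and feature-ranking steps mentioned in this abstract can be illustrated with a small sketch. It assumes the usual two-class form of Fisher's discriminant ratio, (mean_a - mean_b)^2 / (var_a + var_b), computed per feature; the abstract does not state the exact formula the authors used, and the function names and feature values below are hypothetical.

```python
import statistics


def zscore(values):
    """Normalise one feature to zero mean and unit (population) variance."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def fisher_ratio(class_a, class_b):
    """Assumed two-class Fisher discriminant ratio for a single feature."""
    mean_gap = statistics.fmean(class_a) - statistics.fmean(class_b)
    spread = statistics.pvariance(class_a) + statistics.pvariance(class_b)
    return mean_gap ** 2 / spread


# Hypothetical feature values per condition: (task samples, baseline samples).
features = {
    "energy_A5": ([0.0, 2.0], [4.0, 6.0]),
    "energy_D5": ([1.0, 3.0], [2.0, 4.0]),
}

# Rank features from most to least discriminative.
ranking = sorted(features, key=lambda name: fisher_ratio(*features[name]), reverse=True)
```

    A larger ratio means the class means are far apart relative to the within-class spread, so features at the top of the ranking are the ones a classifier would typically keep.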

  1. Classification of EEG Signals Based on Pattern Recognition Approach

    PubMed Central

    Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed

    2017-01-01

    Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet-based features, such as multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy, were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high-density EEG dataset (128 channels) validated the proposed method by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90–7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied to a public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.

  2. Noncontact discrimination of animal and human blood with vacuum blood vessel and factors affect the discrimination

    NASA Astrophysics Data System (ADS)

    Zhang, Linna; Zhang, Shengzhao; Sun, Meixiu; Li, Hongxiao; Li, Yingxin; Fu, Zhigang; Guan, Yang; Li, Gang; Lin, Ling

    2017-03-01

    Discrimination of human and nonhuman blood is crucial for import-export ports and inspection and quarantine departments. Current methods are usually destructive, complicated and time-consuming. We had previously demonstrated that visible diffuse reflectance spectroscopy combined with the PLS-DA method can successfully discriminate human blood. In that research, the spectra were measured with the fiber probe under the surface of blood samples. However, open sampling may contaminate the blood samples, and virulence factors in blood samples can also endanger inspectors. In this paper, we explored the classification performance with the blood samples measured in their original containers, vacuum blood vessels. Furthermore, we studied the impact of different conditions of blood samples, such as coagulation and hemolysis, on the prediction ability of the discrimination model. The calibration model built with blood samples in different conditions displayed a satisfactory prediction result. This research demonstrated that the visible and near-infrared diffuse reflectance spectroscopy method has potential for noncontact discrimination of human blood.

  3. Using non-targeted direct analysis in real time-mass spectrometry (DART-MS) to discriminate seeds based on endogenous or exogenous chemicals.

    PubMed

    Subbaraj, Arvind K; Barrett, Brent A; Wakelin, Steve A; Fraser, Karl

    2015-10-01

    Forage seeds are a highly traded agricultural commodity, and therefore, quality control and assurance is high priority. In this study, we have used direct analysis in real time-mass spectrometry (DART-MS) as a tool to discriminate forage seeds based on their non-targeted chemical profiles. In the first experiment, two lots of perennial ryegrass (Lolium perenne L.) seed were discriminated based on exogenous residues of N-(3, 4-dichlorophenyl)-N,N-dimethylurea (Diuron(TM)), a herbicide. In a separate experiment, washed and unwashed seeds of the forage legumes white clover (Trifolium repens L.) and alfalfa (Medicago sativa L.) were discriminated based on the presence or absence of oxylipins, a class of endogenous antimicrobial compounds. Unwashed seeds confer toxicity towards symbiotic, nitrogen-fixing rhizobia which are routinely coated on legume seeds before planting, resulting in reduced rhizobial count. This is the first report of automatic introduction of intact seeds in the DART ion source and detecting oxylipins using DART-MS. Apart from providing scope to investigate legume-rhizobia symbiosis further in the context of oxylipins, the results presented here will enable future studies aimed at classification of seeds based on chemicals bound to the seed coat, thereby offering an efficient screening device for industry.

  4. Robust L1-norm two-dimensional linear discriminant analysis.

    PubMed

    Li, Chun-Na; Shao, Yuan-Hai; Deng, Nai-Yang

    2015-05-01

    In this paper, we propose an L1-norm two-dimensional linear discriminant analysis (L1-2DLDA) with robust performance. Different from the conventional two-dimensional linear discriminant analysis with L2-norm (L2-2DLDA), where the optimization problem is transferred to a generalized eigenvalue problem, the optimization problem in our L1-2DLDA is solved by a simple justifiable iterative technique, and its convergence is guaranteed. Compared with L2-2DLDA, our L1-2DLDA is more robust to outliers and noises since the L1-norm is used. This is supported by our preliminary experiments on toy example and face datasets, which show the improvement of our L1-2DLDA over L2-2DLDA. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. Two-dimensional statistical linear discriminant analysis for real-time robust vehicle-type recognition

    NASA Astrophysics Data System (ADS)

    Zafar, I.; Edirisinghe, E. A.; Acar, S.; Bez, H. E.

    2007-02-01

    Automatic vehicle Make and Model Recognition (MMR) systems provide useful performance enhancements to vehicle recognition systems that are solely based on Automatic License Plate Recognition (ALPR) systems. Several car MMR systems have been proposed in the literature. However, these approaches are based on feature detection algorithms that can perform sub-optimally under adverse lighting and/or occlusion conditions. In this paper we propose a real-time, appearance-based car MMR approach using Two-Dimensional Linear Discriminant Analysis that is capable of addressing this limitation. We provide experimental results to analyse the proposed algorithm's robustness under varying illumination and occlusion conditions. We have shown that the best performance with the proposed 2D-LDA based car MMR approach is obtained when the eigenvectors of lower significance are ignored. For the given database of 200 car images of 25 different make-model classifications, a best accuracy of 91% was obtained with the 2D-LDA approach. We use a direct Principal Component Analysis (PCA) based approach as a benchmark to compare and contrast the performance of the proposed 2D-LDA approach to car MMR. We conclude that in general the 2D-LDA based algorithm outperforms the PCA based approach.

  6. Sex discrimination from the acetabulum in a twentieth-century skeletal sample from France using digital photogrammetry.

    PubMed

    Macaluso, P J

    2011-02-01

    Digital photogrammetric methods were used to collect diameter, area, and perimeter data of the acetabulum for a twentieth-century skeletal sample from France (Georges Olivier Collection, Musée de l'Homme, Paris) consisting of 46 males and 36 females. The measurements were then subjected to both discriminant function and logistic regression analyses in order to develop osteometric standards for sex assessment. Univariate discriminant functions and logistic regression equations yielded overall correct classification accuracy rates for both the left and the right acetabula ranging from 84.1% to 89.6%. The multivariate models developed in this study did not provide increased accuracy over those using only a single variable. Classification sex bias ratios ranged between 1.1% and 7.3% for the majority of models. The results of this study, therefore, demonstrate that metric analysis of acetabular size provides a highly accurate, and easily replicable, method of discriminating sex in this documented skeletal collection. The results further suggest that the addition of area and perimeter data derived from digital images may provide a more effective method of sex assessment than that offered by traditional linear measurements alone. Copyright © 2010 Elsevier GmbH. All rights reserved.

  7. Structural vibration-based damage classification of delaminated smart composite laminates

    NASA Astrophysics Data System (ADS)

    Khan, Asif; Kim, Heung Soo; Sohn, Jung Woo

    2018-03-01

    Separation along the interfaces of layers (delamination) is a principal mode of failure in laminated composites and its detection is of prime importance for structural integrity of composite materials. In this work, structural vibration response is employed to detect and classify delaminations in piezo-bonded laminated composites. Improved layerwise theory and finite element method are adopted to develop the electromechanically coupled governing equation of a smart composite laminate with and without delaminations. Transient responses of the healthy and damaged structures are obtained through a surface bonded piezoelectric sensor by solving the governing equation in the time domain. Wavelet packet transform (WPT) and linear discriminant analysis (LDA) are employed to extract discriminative features from the structural vibration response of the healthy and delaminated structures. Dendrogram-based support vector machine (DSVM) is used to classify the discriminative features. The confusion matrix of the classification algorithm provided physically consistent results.

  8. Information analysis of a spatial database for ecological land classification

    NASA Technical Reports Server (NTRS)

    Davis, Frank W.; Dozier, Jeff

    1990-01-01

    An ecological land classification was developed for a complex region in southern California using geographic information system techniques of map overlay and contingency table analysis. Land classes were identified by mutual information analysis of vegetation pattern in relation to other mapped environmental variables. The analysis was weakened by map errors, especially errors in the digital elevation data. Nevertheless, the resulting land classification was ecologically reasonable and performed well when tested with higher quality data from the region.

  9. [Nondestructive discrimination of strawberry varieties by NIR and BP-ANN].

    PubMed

    Niu, Xiao-ying; Shao, Li-min; Zhao, Zhi-lei; Zhang, Xiao-yu

    2012-08-01

    Strawberry variety is a main factor that can influence strawberry fruit quality. The use of near-infrared reflectance spectroscopy was explored to discriminate among samples of strawberry of different varieties. The significance of the differences among varieties was analyzed by comparing the chemical composition of samples of the different varieties. The performance of models established using back propagation-artificial neural networks (BP-ANN), least squares-support vector machine and discriminant analysis was evaluated on the spectral range of 4545-9090 cm(-1). The optimal model was obtained by BP-ANN with a topology of 12-18-3, which correctly classified 96.68% of the calibration set and 97.14% of the prediction set. Correct classification rates of 94.95%, 97% and 98.29% were obtained for "Tianbao" (n=99), "Fengxiang" (n=100) and "Mingxing" (n=117), respectively. One-way analysis of variance was used to compare the mean values for soluble solids content (SSC), titratable acid (TA), pH value and SSC-TA ratio, and statistically significant differences were found. Principal component analysis was performed on the four chemical compositions, and obvious clustering tendencies for different varieties were found. These results showed that NIR combined with BP-ANN can discriminate strawberries of different varieties effectively, and that the difference in chemical composition among strawberry varieties might be a chemical validation for the NIR results.

  10. Joint recognition and discrimination in nonlinear feature space

    NASA Astrophysics Data System (ADS)

    Talukder, Ashit; Casasent, David P.

    1997-09-01

    A new general method for linear and nonlinear feature extraction is presented. It is novel since it provides both representation and discrimination while most other methods are concerned with only one of these issues. We call this approach the maximum representation and discrimination feature (MRDF) method and show that the Bayes classifier and the Karhunen-Loeve transform are special cases of it. We refer to our nonlinear feature extraction technique as nonlinear eigenfeature extraction. It is new since it has a closed-form solution and produces nonlinear decision surfaces with higher rank than do iterative methods. Results on synthetic databases are shown and compared with results from standard Fukunaga-Koontz transform and Fisher discriminant function methods. The method is also applied to an automated product inspection problem (discrimination) and to the classification and pose estimation of two similar objects (representation and discrimination).

  11. Performance of International Classification of Diseases-based injury severity measures used to predict in-hospital mortality: A systematic review and meta-analysis.

    PubMed

    Gagné, Mathieu; Moore, Lynne; Beaudoin, Claudia; Batomen Kuimi, Brice Lionel; Sirois, Marie-Josée

    2016-03-01

    The International Classification of Diseases (ICD) is the main classification system used for population-based injury surveillance activities but does not contain information on injury severity. ICD-based injury severity measures can be empirically derived or mapped, but no single approach has been formally recommended. This study aimed to compare the performance of ICD-based injury severity measures to predict in-hospital mortality among injury-related admissions. A systematic review and a meta-analysis were conducted. MEDLINE, EMBASE, and Global Health databases were searched from their inception through September 2014. Observational studies that assessed the performance of ICD-based injury severity measures to predict in-hospital mortality and reported discriminative ability using the area under a receiver operating characteristic curve (AUC) were included. Metrics of model performance were extracted. Pooled AUC were estimated under random-effects models. Twenty-two eligible studies reported 72 assessments of discrimination on ICD-based injury severity measures. Reported AUC ranged from 0.681 to 0.958. Of the 72 assessments, 46 showed excellent (0.80 ≤ AUC < 0.90) and 6 outstanding (AUC ≥ 0.90) discriminative ability. Pooled AUC for ICD-based Injury Severity Score (ICISS) based on the product of traditional survival proportions was significantly higher than measures based on ICD mapped to Abbreviated Injury Scale (AIS) scores (0.863 vs. 0.825 for ICDMAP-ISS [p = 0.005] and ICDMAP-NISS [p = 0.016]). Similar results were observed when studies were stratified by the type of data used (trauma registry or hospital discharge) or the provenance of survival proportions (internally or externally derived). However, among studies published after 2003 the Trauma Mortality Prediction Model based on ICD-9 codes (TMPM-9) demonstrated superior discriminative ability than ICISS using the product of traditional survival proportions (0.850 vs. 0.802, p = 0.002). Models

  12. Discrimination of transgenic soybean seeds by terahertz spectroscopy

    NASA Astrophysics Data System (ADS)

    Liu, Wei; Liu, Changhong; Chen, Feng; Yang, Jianbo; Zheng, Lei

    2016-10-01

    Discrimination of genetically modified organisms is increasingly demanded by legislation and consumers worldwide. The feasibility of a non-destructive discrimination of glyphosate-resistant and conventional soybean seeds and their hybrid descendants was examined by a terahertz time-domain spectroscopy system combined with chemometrics. Principal component analysis (PCA), least squares-support vector machines (LS-SVM) and PCA-back propagation neural network (PCA-BPNN) models with the first and second derivative and standard normal variate (SNV) transformation pre-treatments were applied to classify soybean seeds based on genotype. Results demonstrated that clear differences among glyphosate-resistant, hybrid descendant and conventional non-transformed soybean seeds could easily be visualized, with an excellent classification (88.33% accuracy in the validation set) using LS-SVM on the spectra with SNV pre-treatment. The results indicated that THz spectroscopy techniques together with chemometrics would be a promising technique to distinguish transgenic soybean seeds from non-transformed seeds with high efficiency and without any major sample preparation.

  13. Diverse Region-Based CNN for Hyperspectral Image Classification.

    PubMed

    Zhang, Mengmeng; Li, Wei; Du, Qian

    2018-06-01

    Convolutional neural network (CNN) is of great interest in machine learning and has demonstrated excellent performance in hyperspectral image classification. In this paper, we propose a classification framework, called diverse region-based CNN, which can encode semantic context-aware representation to obtain promising features. By merging a diverse set of discriminative appearance factors, the resulting CNN-based representation exhibits spatial-spectral context sensitivity that is essential for accurate pixel classification. The proposed method, which exploits diverse region-based inputs to learn contextual interactional features, is expected to have more discriminative power. The joint representation containing rich spectral and spatial information is then fed to a fully connected network and the label of each pixel vector is predicted by a softmax layer. Experimental results with widely used hyperspectral image data sets demonstrate that the proposed method can surpass any other conventional deep learning-based classifiers and other state-of-the-art classifiers.

  14. Application of different classification methods for litho-fluid facies prediction: a case study from the offshore Nile Delta

    NASA Astrophysics Data System (ADS)

    Aleardi, Mattia; Ciabarri, Fabio

    2017-10-01

    In this work we test four classification methods for litho-fluid facies identification in a clastic reservoir located in the offshore Nile Delta. The ultimate goal of this study is to find an optimal classification method for the area under examination. The geologic context of the investigated area allows us to consider three different facies in the classification: shales, brine sands and gas sands. The depth at which the reservoir zone is located (2300-2700 m) produces a significant overlap of the P- and S-wave impedances of brine sands and gas sands that makes discrimination between these two litho-fluid classes particularly problematic. The classification is performed on the feature space defined by the elastic properties that are derived from recorded reflection seismic data by means of amplitude versus angle Bayesian inversion. As classification methods we test both deterministic and probabilistic approaches: the quadratic discriminant analysis and the neural network methods belong to the first group, whereas the standard Bayesian approach and the Bayesian approach that includes a 1D Markov chain a priori model to constrain the vertical continuity of litho-fluid facies belong to the second group. The ability of each method to discriminate the different facies is evaluated both on synthetic seismic data (computed on the basis of available borehole information) and on field seismic data. The outcomes of each classification method are compared with the known facies profile derived from well log data and the goodness of the results is quantitatively evaluated using the so-called confusion matrix. The results show that all methods return vertical facies profiles in which the main reservoir zone is correctly identified. However, the consideration of as much prior information as possible in the classification process is the winning choice for deriving a reliable and physically plausible predicted facies profile.
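
    The confusion-matrix evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the three facies names come from the abstract, while the function names and the label sequences are hypothetical and made up for the example.

```python
from collections import Counter

# Facies classes named in the abstract.
FACIES = ["shale", "brine sand", "gas sand"]


def confusion_matrix(true_labels, predicted_labels, classes=FACIES):
    """Return a matrix whose rows are true classes and columns predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]


def overall_accuracy(matrix):
    """Fraction of samples that fall on the diagonal of the matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Made-up facies profiles, for illustration only (not data from the study).
true_profile = ["shale", "shale", "brine sand", "gas sand", "gas sand", "brine sand"]
predicted_profile = ["shale", "shale", "gas sand", "gas sand", "gas sand", "brine sand"]

matrix = confusion_matrix(true_profile, predicted_profile)
accuracy = overall_accuracy(matrix)
```

    Off-diagonal entries in the brine sand and gas sand rows would show the kind of overlap between those two classes that the abstract calls problematic.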

  15. Application of quantum-behaved particle swarm optimization to motor imagery EEG classification.

    PubMed

    Hsu, Wei-Yen

    2013-12-01

    In this study, we propose a recognition system for single-trial analysis of motor imagery (MI) electroencephalogram (EEG) data. Applying event-related brain potential (ERP) data acquired from the sensorimotor cortices, the system chiefly consists of automatic artifact elimination, feature extraction, feature selection and classification. In addition to the use of independent component analysis, a similarity measure is proposed to further remove the electrooculographic (EOG) artifacts automatically. Several potential features, such as wavelet-fractal features, are then extracted for subsequent classification. Next, quantum-behaved particle swarm optimization (QPSO) is used to select features from the feature combination. Finally, selected sub-features are classified by support vector machine (SVM). Compared with processing without artifact elimination, with feature selection using a genetic algorithm (GA), and with feature classification using Fisher's linear discriminant (FLD) on MI data from two data sets for eight subjects, the results indicate that the proposed method is promising in brain-computer interface (BCI) applications.

  16. Classification of the Correct Quranic Letters Pronunciation of Male and Female Reciters

    NASA Astrophysics Data System (ADS)

    Khairuddin, Safiah; Ahmad, Salmiah; Embong, Abdul Halim; Nur Wahidah Nik Hashim, Nik; Altamas, Tareq M. K.; Nuratikah Syd Badaruddin, Syarifah; Shahbudin Hassan, Surul

    2017-11-01

    Recitation of the Holy Quran with the correct Tajweed is essential for every Muslim. Islam has encouraged Quranic education from an early age, as reciting the Quran correctly conveys the correct meaning of the words of Allah. It is important to recite the Quranic verses according to their characteristics (sifaat) and from their points of articulation (makhraj). This paper presents the identification and classification analysis of Quranic letters pronunciation for both male and female reciters, to obtain the unique representation of each letter by male as compared to female expert reciters. Linear Discriminant Analysis (LDA) was used as the classifier to classify the data with Formants and Power Spectral Density (PSD) as the acoustic features. The result shows that a linear classifier of PSD with band 1 and band 2 power spectral combinations gives a high percentage of classification accuracy for most of the Quranic letters. It is also shown that the pronunciation by male reciters gives better results in the classification of the Quranic letters.

  17. Unambiguous discrimination between linearly dependent equidistant states with multiple copies

    NASA Astrophysics Data System (ADS)

    Zhang, Wen-Hai; Ren, Gang

    2018-07-01

    Linearly independent quantum states can be unambiguously discriminated, but linearly dependent ones cannot. For linearly dependent quantum states, however, if C copies of the single states are available, then they may form linearly independent states, and can be unambiguously discriminated. We consider unambiguous discrimination among N = D + 1 linearly dependent states given that C copies are available and that the single copies span a D-dimensional space with equal inner products. The maximum unambiguous discrimination probability is derived for all C with equal a priori probabilities. For this classification of the linearly dependent equidistant states, our result shows that if C is even then adding a further copy fails to increase the maximum discrimination probability.
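
    The statement that copies can turn linearly dependent states into linearly independent ones can be checked numerically on a small case. The sketch below assumes N = 3 equidistant states in a D = 2 real space with pairwise overlap -1/2 (the "trine" configuration), which is an illustrative choice and not taken from the paper. It uses two standard facts: the overlap of C-copy product states is the single-copy overlap raised to the power C, and a set of states is linearly independent exactly when its Gram matrix has a nonzero determinant.

```python
def gram_matrix(n_states, overlap, copies):
    """Gram matrix of n equidistant states when `copies` copies of each are available.

    The inner product of two C-copy product states is the single-copy
    inner product raised to the power C.
    """
    a = overlap ** copies
    return [[1.0 if i == j else a for j in range(n_states)] for i in range(n_states)]


def determinant(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col, value in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * value * determinant(minor)
    return total


# Assumed example: three unit vectors at 120 degrees in a plane, overlap -1/2.
det_one_copy = determinant(gram_matrix(3, -0.5, copies=1))
det_two_copies = determinant(gram_matrix(3, -0.5, copies=2))
```

    With one copy the determinant vanishes, so the three states are linearly dependent; with two copies it is positive, so the two-copy states are linearly independent. The code does not compute the optimal discrimination probability derived in the paper.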

  18. Characterization and classification of patients with different levels of cardiac death risk by using Poincaré plot analysis.

    PubMed

    Rodriguez, Javier; Voss, Andreas; Caminal, Pere; Bayes-Genis, Antoni; Giraldo, Beatriz F

    2017-07-01

    Cardiac death risk is still a major problem for a significant part of the population, especially elderly patients. In this study, we propose to characterize and analyze the cardiovascular and cardiorespiratory systems using the Poincaré plot. A total of 46 cardiomyopathy patients and 36 healthy subjects were analyzed. Left ventricular ejection fraction (LVEF) was used to stratify patients with low risk (LR: LVEF > 35%, 16 patients), and high risk (HR: LVEF ≤ 35%, 30 patients) of heart attack. RR, SBP and T Tot time series were extracted from the ECG, blood pressure and respiratory flow signals, respectively. Parameters that describe the scatterplot of the Poincaré method, related to short- and long-term variabilities, acceleration and deceleration of the dynamic system, and the complex correlation index were extracted. The linear discriminant analysis (LDA) and the support vector machines (SVM) classification methods were used to analyze the results of the extracted parameters. The results showed that cardiac parameters were the best to discriminate between HR and LR groups, especially the complex correlation index (p = 0.009). Analysing the interaction, the best result was obtained with the relation between the difference of the standard deviation of the cardiac and respiratory system (p = 0.003). When comparing HR vs LR groups, the best classification was obtained applying the SVM method, using an ANOVA kernel, with an accuracy of 98.12%. An accuracy of 97.01% was obtained by comparing patients versus healthy subjects, with an SVM classifier and Laplacian kernel. The morphology of the Poincaré plot introduces parameters that allow the characterization of the cardiorespiratory system dynamics.
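
    The short- and long-term variability parameters of a Poincaré plot are commonly summarized as SD1 and SD2, the spreads of the point cloud across and along the line of identity. The sketch below uses those common textbook definitions; the abstract does not list the exact parameter set the authors computed, so the function name, the use of the population standard deviation, and the RR values are assumptions for illustration. The complex correlation index is not implemented here.

```python
import math
from statistics import pstdev


def poincare_sd1_sd2(series):
    """Return (SD1, SD2) for the Poincaré plot of x[n+1] against x[n].

    SD1 measures spread perpendicular to the line of identity (short-term
    variability); SD2 measures spread along it (long-term variability).
    """
    if len(series) < 2:
        raise ValueError("need at least two samples")
    pairs = list(zip(series[:-1], series[1:]))
    across = [(nxt - cur) / math.sqrt(2) for cur, nxt in pairs]
    along = [(nxt + cur) / math.sqrt(2) for cur, nxt in pairs]
    return pstdev(across), pstdev(along)


# Made-up RR intervals in milliseconds, for illustration only.
rr_intervals = [800.0, 810.0, 790.0, 805.0, 795.0]
sd1, sd2 = poincare_sd1_sd2(rr_intervals)
```

    Descriptors of this kind, computed per subject, could then serve as the input features for an LDA or SVM classifier.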

  19. Data fusion for food authentication. Combining rare earth elements and trace metals to discriminate "Fava Santorinis" from other yellow split peas using chemometric tools.

    PubMed

    Drivelos, Spiros A; Higgins, Kevin; Kalivas, John H; Haroutounian, Serkos A; Georgiou, Constantinos A

    2014-12-15

    "Fava Santorinis" is a protected designation of origin (PDO) yellow split pea species growing only on the island of Santorini in Greece. Due to its nutritional quality and taste, it has gained a high monetary value. Thus, it is prone to adulteration with other yellow split peas. In order to discriminate "Fava Santorinis" from other yellow split peas, four classification methods utilising rare earth elements (REEs) measured through inductively coupled plasma-mass spectrometry (ICP-MS) are studied. The four classification processes are orthogonal projection analysis (OPA), Mahalanobis distance (MD), partial least squares discriminant analysis (PLS-DA) and k nearest neighbours (KNN). Since it is known that trace elements are often useful to determine geographical origin of food products, we further quantitated for trace elements using ICP-MS. Presented in this paper are results using the four classification processes based on the fusion of the REEs data with the trace element data. Overall, the OPA method was found to perform best with up to 100% accuracy using the fused data. Copyright © 2014 Elsevier Ltd. All rights reserved.

  20. A face and palmprint recognition approach based on discriminant DCT feature extraction.

    PubMed

    Jing, Xiao-Yuan; Zhang, David

    2004-12-01

    In the field of image processing and recognition, discrete cosine transform (DCT) and linear discrimination are two widely used techniques. Based on them, we present a new face and palmprint recognition approach in this paper. It first uses a two-dimensional separability judgment to select the DCT frequency bands with favorable linear separability. Then from the selected bands, it extracts the linear discriminative features by an improved Fisherface method and performs the classification by the nearest neighbor classifier. We analyze in detail the theoretical advantages of our approach in feature extraction. The experiments on face databases and a palmprint database demonstrate that, compared to the state-of-the-art linear discrimination methods, our approach obtains better classification performance. It can significantly improve the recognition rates for face and palmprint data and effectively reduce the dimension of feature space.

  1. Geographical identification of saffron (Crocus sativus L.) by linear discriminant analysis applied to the UV-visible spectra of aqueous extracts.

    PubMed

    D'Archivio, Angelo Antonio; Maggi, Maria Anna

    2017-03-15

    We attempted geographical classification of saffron using UV-visible spectroscopy, conventionally adopted for quality grading according to the ISO Normative 3632. We investigated 81 saffron samples produced in L'Aquila, Città della Pieve, Cascia, and Sardinia (Italy) and commercial products purchased in various supermarkets. Exploratory principal component analysis applied to the UV-vis spectra of saffron aqueous extracts revealed a clear differentiation of the samples belonging to different quality categories, but a poor separation according to the geographical origin of the spices. On the other hand, linear discriminant analysis based on 8 selected absorbance values, concentrated near 279, 305 and 328nm, allowed a good distinction of the spices coming from different sites. Under severe validation conditions (30% and 50% of saffron samples in the evaluation set), correct predictions were 85 and 83%, respectively. Copyright © 2016 Elsevier Ltd. All rights reserved.

  2. Exploration of computational methods for classification of movement intention during human voluntary movement from single trial EEG.

    PubMed

    Bai, Ou; Lin, Peter; Vorbach, Sherry; Li, Jiang; Furlani, Steve; Hallett, Mark

    2007-12-01

    To explore effective combinations of computational methods for the prediction of movement intention preceding the production of self-paced right and left hand movements from single trial scalp electroencephalogram (EEG). Twelve naïve subjects performed self-paced movements consisting of three key strokes with either hand. EEG was recorded from 128 channels. The exploration was performed offline on single trial EEG data. We proposed that a successful computational procedure for classification would consist of spatial filtering, temporal filtering, feature selection, and pattern classification. A systematic investigation was performed with combinations of spatial filtering using principal component analysis (PCA), independent component analysis (ICA), common spatial patterns analysis (CSP), and surface Laplacian derivation (SLD); temporal filtering using power spectral density estimation (PSD) and discrete wavelet transform (DWT); pattern classification using linear Mahalanobis distance classifier (LMD), quadratic Mahalanobis distance classifier (QMD), Bayesian classifier (BSC), multi-layer perceptron neural network (MLP), probabilistic neural network (PNN), and support vector machine (SVM). A robust multivariate feature selection strategy using a genetic algorithm was employed. The combinations of spatial filtering using ICA and SLD, temporal filtering using PSD and DWT, and classification methods using LMD, QMD, BSC and SVM provided higher performance than those of other combinations. Utilizing one of the better combinations of ICA, PSD and SVM, the discrimination accuracy was as high as 75%. Further feature analysis showed that beta band EEG activity of the channels over right sensorimotor cortex was most appropriate for discrimination of right and left hand movement intention. Effective combinations of computational methods provide possible classification of human movement intention from single trial EEG. Such a method could be the basis for a potential brain

  3. Eye-gaze control of the computer interface: Discrimination of zoom intent

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goldberg, J.H.; Schryver, J.C.

    1993-10-01

    An analysis methodology and associated experiment were developed to assess whether definable and repeatable signatures of eye-gaze characteristics are evident, preceding a decision to zoom-in, zoom-out, or not to zoom at a computer interface. This user intent discrimination procedure can have broad application in disability aids and telerobotic control. Eye-gaze was collected from 10 subjects in a controlled experiment, requiring zoom decisions. The eye-gaze data were clustered, then fed into a multiple discriminant analysis (MDA) for optimal definition of heuristics separating the zoom-in, zoom-out, and no-zoom conditions. Confusion matrix analyses showed that a number of variable combinations classified at a statistically significant level, but practical significance was more difficult to establish. Composite contour plots demonstrated the regions in parameter space consistently assigned by the MDA to unique zoom conditions. Peak classification occurred at about 1200-1600 msec. Improvements in the methodology to achieve practical real-time zoom control are considered.

  4. Study of a Vocal Feature Selection Method and Vocal Properties for Discriminating Four Constitution Types

    PubMed Central

    Kim, Keun Ho; Ku, Boncho; Kang, Namsik; Kim, Young-Su; Jang, Jun-Su; Kim, Jong Yeol

    2012-01-01

    The voice has been used to classify the four constitution types, and to recognize a subject's health condition by extracting meaningful physical quantities, in traditional Korean medicine. In this paper, we propose a method of selecting the reliable variables from various voice features, such as frequency derivative features, frequency band ratios, and intensity, from vowels and a sentence. Further, we suggest a process to extract independent variables by eliminating explanatory variables and reducing their correlation and remove outlying data to enable reliable discriminant analysis. Moreover, the suitable division of data for analysis, according to the gender and age of subjects, is discussed. Finally, the vocal features are applied to a discriminant analysis to classify each constitution type. This method of voice classification can be widely used in the u-Healthcare system of personalized medicine and for improving diagnostic accuracy. PMID:22529874

  5. Identification of important image features for pork and turkey ham classification using colour and wavelet texture features and genetic selection.

    PubMed

    Jackman, Patrick; Sun, Da-Wen; Allen, Paul; Valous, Nektarios A; Mendoza, Fernando; Ward, Paddy

    2010-04-01

    A method to discriminate between various grades of pork and turkey ham was developed using colour and wavelet texture features. Image analysis methods originally developed for predicting the palatability of beef were applied to rapidly identify the ham grade. With high quality digital images of 50-94 slices per ham it was possible to identify the greyscale that best expressed the differences between the various ham grades. The best 10 discriminating image features were then found with a genetic algorithm. Using the best 10 image features, simple linear discriminant analysis models produced 100% correct classifications for both pork and turkey on both calibration and validation sets. 2009 Elsevier Ltd. All rights reserved.

  6. Graphene Nanoplatelet-Polymer Chemiresistive Sensor Arrays for the Detection and Discrimination of Chemical Warfare Agent Simulants.

    PubMed

    Wiederoder, Michael S; Nallon, Eric C; Weiss, Matt; McGraw, Shannon K; Schnee, Vincent P; Bright, Collin J; Polcha, Michael P; Paffenroth, Randy; Uzarski, Joshua R

    2017-11-22

    A cross-reactive array of semiselective chemiresistive sensors made of polymer-graphene nanoplatelet (GNP) composite coated electrodes was examined for detection and discrimination of chemical warfare agents (CWA). The arrays employ a set of chemically diverse polymers to generate a unique response signature for multiple CWA simulants and background interferents. The developed sensors' signal remains consistent after repeated exposures to multiple analytes for up to 5 days with a similar signal magnitude across different replicate sensors with the same polymer-GNP coating. An array of 12 sensors each coated with a different polymer-GNP mixture was exposed 100 times to a cycle of single analyte vapors consisting of 5 chemically similar CWA simulants and 8 common background interferents. The collected data was vector normalized to reduce concentration dependency, z-scored to account for baseline drift and signal-to-noise ratio, and Kalman filtered to reduce noise. The processed data was dimensionally reduced with principal component analysis and analyzed with four different machine learning algorithms to evaluate discrimination capabilities. For 5 similarly structured CWA simulants alone 100% classification accuracy was achieved. For all analytes tested 99% classification accuracy was achieved demonstrating the CWA discrimination capabilities of the developed system. The novel sensor fabrication methods and data processing techniques are attractive for development of sensor platforms for discrimination of CWA and other classes of chemical vapors.
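
    Two of the preprocessing steps named above, vector normalization and z-scoring, can be sketched as below. The abstract does not specify the axes or order used, so this assumes each exposure is one row of sensor responses, that vector normalization divides a row by its Euclidean norm, and that z-scoring is applied per sensor (column) across exposures. The function names and numbers are hypothetical; Kalman filtering and principal component analysis are left out.

```python
import math
from statistics import mean, pstdev


def vector_normalize(response):
    """Scale one array response to unit Euclidean length.

    Two exposures whose responses differ only by a common scale factor
    (for example, concentration) map to the same normalized vector.
    """
    norm = math.sqrt(sum(v * v for v in response))
    return [v / norm for v in response] if norm else list(response)


def z_score_columns(rows):
    """Z-score each sensor (column) across all exposures (rows)."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) for c in columns]
    return [
        [(v - mu) / sd if sd else 0.0 for v, mu, sd in zip(row, mus, sigmas)]
        for row in rows
    ]


# Made-up responses of a 3-sensor array to one analyte at two concentrations.
low = vector_normalize([1.0, 2.0, 2.0])
high = vector_normalize([2.0, 4.0, 4.0])
scaled = z_score_columns([[1.0, 10.0], [3.0, 10.0]])
```

    In this toy case the low- and high-concentration exposures normalize to the same unit vector, which is the sense in which vector normalization reduces concentration dependency.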

  7. General methodology for simultaneous representation and discrimination of multiple object classes

    NASA Astrophysics Data System (ADS)

    Talukder, Ashit; Casasent, David P.

    1998-03-01

    We address a new general method for linear and nonlinear feature extraction for simultaneous representation and classification. We call this approach the maximum representation and discrimination feature (MRDF) method. We develop a novel nonlinear eigenfeature extraction technique to represent data with closed-form solutions and use it to derive a nonlinear MRDF algorithm. Results of the MRDF method on synthetic databases are shown and compared with results from standard Fukunaga-Koontz transform and Fisher discriminant function methods. The method is also applied to an automated product inspection problem and for classification and pose estimation of two similar objects under 3D aspect angle variations.

  8. Multi-factorial analysis of class prediction error: estimating optimal number of biomarkers for various classification rules.

    PubMed

    Khondoker, Mizanur R; Bachmann, Till T; Mewissen, Muriel; Dickinson, Paul; Dobrzelecki, Bartosz; Campbell, Colin J; Mount, Andrew R; Walton, Anthony J; Crain, Jason; Schulze, Holger; Giraud, Gerard; Ross, Alan J; Ciani, Ilenia; Ember, Stuart W J; Tlili, Chaker; Terry, Jonathan G; Grant, Eilidh; McDonnell, Nicola; Ghazal, Peter

    2010-12-01

    Machine learning and statistical model based classifiers have increasingly been used with more complex and high dimensional biological data obtained from high-throughput technologies. Understanding the impact of various factors associated with large and complex microarray datasets on the predictive performance of classifiers is computationally intensive, under investigated, yet vital in determining the optimal number of biomarkers for various classification purposes aimed towards improved detection, diagnosis, and therapeutic monitoring of diseases. We investigate the impact of microarray based data characteristics on the predictive performance for various classification rules using simulation studies. Our investigation using Random Forest, Support Vector Machines, Linear Discriminant Analysis and k-Nearest Neighbour shows that the predictive performance of classifiers is strongly influenced by training set size, biological and technical variability, replication, fold change and correlation between biomarkers. Optimal number of biomarkers for a classification problem should therefore be estimated taking account of the impact of all these factors. A database of average generalization errors is built for various combinations of these factors. The database of generalization errors can be used for estimating the optimal number of biomarkers for given levels of predictive accuracy as a function of these factors. Examples show that curves from actual biological data resemble that of simulated data with corresponding levels of data characteristics. An R package optBiomarker implementing the method is freely available for academic use from the Comprehensive R Archive Network (http://www.cran.r-project.org/web/packages/optBiomarker/).

  9. Application of musical timbre discrimination features to active sonar classification

    NASA Astrophysics Data System (ADS)

    Young, Victor W.; Hines, Paul C.; Pecknold, Sean

    2005-04-01

    In musical acoustics significant effort has been devoted to uncovering the physical basis of timbre perception. Most investigations into timbre rely on multidimensional scaling (MDS), in which different musical sounds are arranged as points in multidimensional space. The Euclidean distance between points corresponds to the perceptual distance between sounds and the multidimensional axes are linked to measurable properties of the sounds. MDS has identified numerous temporal and spectral features believed to be important to timbre perception. There is reason to believe that some of these features may have wider application in the disparate field of underwater acoustics, since anecdotal evidence suggests active sonar returns from metallic objects sound different than natural clutter returns when auralized by human operators. This is particularly encouraging since attempts to develop robust automatic classifiers capable of target-clutter discrimination over a wide range of operational conditions have met with limited success. Spectral features relevant to target-clutter discrimination are believed to include click-pitch and envelope irregularity; relevant temporal features are believed to include duration, sub-band attack/decay time, and time separation pitch. Preliminary results from an investigation into the role of these timbre features in target-clutter discrimination will be presented. [Work supported by NSERC and GDC.]

  10. Discriminant Analysis of Time Series in the Presence of Within-Group Spectral Variability.

    PubMed

    Krafty, Robert T

    2016-07-01

    Many studies record replicated time series epochs from different groups with the goal of using frequency domain properties to discriminate between the groups. In many applications, there exists variation in cyclical patterns from time series in the same group. Although a number of frequency domain methods for the discriminant analysis of time series have been explored, there is a dearth of models and methods that account for within-group spectral variability. This article proposes a model for groups of time series in which transfer functions are modeled as stochastic variables that can account for both between-group and within-group differences in spectra that are identified from individual replicates. An ensuing discriminant analysis of stochastic cepstra under this model is developed to obtain parsimonious measures of relative power that optimally separate groups in the presence of within-group spectral variability. The approach possesses favorable properties in classifying new observations and can be consistently estimated through a simple discriminant analysis of a finite number of estimated cepstral coefficients. Benefits in accounting for within-group spectral variability are empirically illustrated in a simulation study and through an analysis of gait variability.

  11. A Computational Discriminability Analysis on Twin Fingerprints

    NASA Astrophysics Data System (ADS)

    Liu, Yu; Srihari, Sargur N.

    Sharing similar genetic traits makes the investigation of twins an important study in forensics and biometrics. Fingerprints are one of the most commonly found types of forensic evidence. The similarity between twins’ prints is critical to establishing the reliability of fingerprint identification. We present a quantitative analysis of the discriminability of twin fingerprints on a new data set (227 pairs of identical twins and fraternal twins) recently collected from a twin population using both level 1 and level 2 features. Although the patterns of minutiae among twins are more similar than in the general population, the similarity of fingerprints of twins is significantly different from that between genuine prints of the same finger. Twins' fingerprints are discriminable with a 1.5% to 1.7% higher EER than non-twins. Identical twins can be distinguished by examining fingerprints, with a slightly higher error rate than fraternal twins.
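
    The equal error rate (EER) quoted in this abstract is the operating point at which the false accept rate equals the false reject rate. The following is a minimal sketch of how an EER can be approximated from match scores. The scores and the function name `equal_error_rate` are hypothetical illustrations, not data or code from the study, and the threshold scan is only one simple way to estimate the crossing point.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by scanning every observed score as a threshold."""
    best_gap, eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        # A comparison is accepted as a match when its score is >= t.
        frr = sum(s < t for s in genuine) / len(genuine)      # false reject rate
        far = sum(s >= t for s in impostor) / len(impostor)   # false accept rate
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical similarity scores (higher means more similar).
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.5, 0.65]
print(equal_error_rate(genuine_scores, impostor_scores))
```

    With real data the impostor set would hold twin-versus-twin (or unrelated) comparisons, so a higher EER for twins means their prints are harder to tell apart.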

  12. The software application and classification algorithms for welds radiograms analysis

    NASA Astrophysics Data System (ADS)

    Sikora, R.; Chady, T.; Baniukiewicz, P.; Grzywacz, B.; Lopato, P.; Misztal, L.; Napierała, L.; Piekarczyk, B.; Pietrusewicz, T.; Psuj, G.

    2013-01-01

    The paper presents a software implementation of an Intelligent System for Radiogram Analysis (ISAR). The system is intended to support radiologists in weld quality inspection. The image processing part of the software, with a graphical user interface, and the weld classification part are described, along with selected classification results. Classification was based on several algorithms: an artificial neural network, k-means clustering, a simplified k-means, and rough set theory.

  13. Racial classification in the evolutionary sciences: a comparative analysis.

    PubMed

    Billinger, Michael S

    2007-01-01

    Human racial classification has long been a problem for the discipline of anthropology, but much of the criticism of the race concept has focused on its social and political connotations. The central argument of this paper is that race is not a specifically human problem, but one that exists in evolutionary thought in general. This paper looks at various disciplinary approaches to racial or subspecies classification, extending its focus beyond the anthropological race concept by providing a comparative analysis of the use of racial classification in evolutionary biology, genetics, and anthropology.

  14. [Discriminant Analysis of Lavender Essential Oil by Attenuated Total Reflectance Infrared Spectroscopy].

    PubMed

    Tang, Jun; Wang, Qing; Tong, Hong; Liao, Xiang; Zhang, Zheng-fang

    2016-03-01

    This work aimed to use attenuated total reflectance Fourier transform infrared spectroscopy to identify lavender essential oil by establishing a lavender variety and quality analysis model. In total, 96 samples were tested. For all samples, the raw spectra were pretreated with a second derivative, and the 1 750-900 cm(-1) spectral range was selected for pattern recognition analysis on the basis of the variance calculation. The results showed that principal component analysis (PCA) can basically discriminate lavender oil cultivars and that the first three principal components mainly represent the ester, alcohol and terpenoid substances. When the orthogonal partial least-squares discriminant analysis (OPLS-DA) model was established, 68 samples were used as the calibration set. Determination coefficients of the OPLS-DA regression curve were 0.959 2, 0.976 4, and 0.958 8, respectively, for the three varieties of lavender essential oil. The root mean square errors of prediction (RMSEP) in the validation set for the three varieties were 0.142 9, 0.127 3, and 0.124 9, respectively. The discriminant rate of the calibration set and the prediction rate of the validation set both reached 100%. The model has very good recognition capability for detecting the variety and quality of lavender essential oil. The result indicated that a model providing a quick, intuitive and feasible method had been built to discriminate lavender oils.

  15. Malay sentiment analysis based on combined classification approaches and Senti-lexicon algorithm.

    PubMed

    Al-Saffar, Ahmed; Awang, Suryanti; Tao, Hai; Omar, Nazlia; Al-Saiagh, Wafaa; Al-Bared, Mohammed

    2018-01-01

    Sentiment analysis techniques are increasingly exploited to categorize opinion text into one or more predefined sentiment classes for the creation and automated maintenance of review-aggregation websites. In this paper, a Malay sentiment analysis classification model is proposed to improve classification performance based on the semantic orientation and machine learning approaches. First, a total of 2,478 Malay sentiment-lexicon phrases and words are assigned a synonym and stored with the help of more than one Malay native speaker, and the polarity is manually allotted a score. In addition, the supervised machine learning approaches and the lexicon knowledge method are combined for Malay sentiment classification, evaluating thirteen features. Finally, three individual classifiers and a combined classifier are used to evaluate the classification accuracy. In the experimental results, a wide range of comparative experiments is conducted on a Malay Reviews Corpus (MRC), and it demonstrates that the feature extraction improves the performance of Malay sentiment analysis based on the combined classification. However, the results depend on three factors: the features, the number of features, and the classification approach.

  16. Malay sentiment analysis based on combined classification approaches and Senti-lexicon algorithm

    PubMed Central

    Awang, Suryanti; Tao, Hai; Omar, Nazlia; Al-Saiagh, Wafaa; Al-bared, Mohammed

    2018-01-01

    Sentiment analysis techniques are increasingly exploited to categorize opinion text into one or more predefined sentiment classes for the creation and automated maintenance of review-aggregation websites. In this paper, a Malay sentiment analysis classification model is proposed to improve classification performance based on the semantic orientation and machine learning approaches. First, a total of 2,478 Malay sentiment-lexicon phrases and words are assigned a synonym and stored with the help of more than one Malay native speaker, and the polarity is manually allotted a score. In addition, the supervised machine learning approaches and the lexicon knowledge method are combined for Malay sentiment classification, evaluating thirteen features. Finally, three individual classifiers and a combined classifier are used to evaluate the classification accuracy. In the experimental results, a wide range of comparative experiments is conducted on a Malay Reviews Corpus (MRC), and it demonstrates that the feature extraction improves the performance of Malay sentiment analysis based on the combined classification. However, the results depend on three factors: the features, the number of features, and the classification approach. PMID:29684036

  17. Sex determination from the calcaneus in a 20th century Greek population using discriminant function analysis.

    PubMed

    Peckmann, Tanya R; Orr, Kayla; Meek, Susan; Manolis, Sotiris K

    2015-12-01

    The skull and post-cranium have been used for the determination of sex for unknown human remains. However, in forensic cases where skeletal remains often exhibit postmortem damage and taphonomic changes the calcaneus may be used for the determination of sex as it is a preservationally favored bone. The goal of the present research was to derive discriminant function equations from the calcaneus for estimation of sex from a contemporary Greek population. Nine parameters were measured on 198 individuals (103 males and 95 females), ranging in age from 20 to 99 years old, from the University of Athens Human Skeletal Reference Collection. The statistical analyses showed that all variables were sexually dimorphic. Discriminant function score equations were generated for use in sex determination. The average accuracy of sex classification ranged from 70% to 90% for the univariate analysis, 82.9% to 87.5% for the direct method, and 86.2% for the stepwise method. Comparisons to other populations were made. Overall, the cross-validated accuracies ranged from 48.6% to 56.1% with males most often identified correctly and females most often misidentified. The calcaneus was shown to be useful for sex determination in the twentieth century Greek population. Copyright © 2015 The Chartered Society of Forensic Sciences. Published by Elsevier Ireland Ltd. All rights reserved.

  18. Feature selection and classification of multiparametric medical images using bagging and SVM

    NASA Astrophysics Data System (ADS)

    Fan, Yong; Resnick, Susan M.; Davatzikos, Christos

    2008-03-01

    This paper presents a framework for brain classification based on multi-parametric medical images. This method takes advantage of multi-parametric imaging to provide a set of discriminative features for classifier construction by using a regional feature extraction method which takes into account joint correlations among different image parameters; in the experiments herein, MRI and PET images of the brain are used. Support vector machine classifiers are then trained based on the most discriminative features selected from the feature set. To facilitate robust classification and optimal selection of parameters involved in classification, in view of the well-known "curse of dimensionality", base classifiers are constructed in a bagging (bootstrap aggregating) framework for building an ensemble classifier and the classification parameters of these base classifiers are optimized by means of maximizing the area under the ROC (receiver operating characteristic) curve estimated from their prediction performance on left-out samples of bootstrap sampling. This classification system is tested on a sex classification problem, where it yields over 90% classification rates for unseen subjects. The proposed classification method is also compared with other commonly used classification algorithms, with favorable results. These results illustrate that the methods built upon information jointly extracted from multi-parametric images have the potential to perform individual classification with high sensitivity and specificity.

  19. Metabolite Profiling and Classification of DNA-Authenticated Licorice Botanicals

    PubMed Central

    Simmler, Charlotte; Anderson, Jeffrey R.; Gauthier, Laura; Lankin, David C.; McAlpine, James B.; Chen, Shao-Nong; Pauli, Guido F.

    2015-01-01

    Raw licorice roots represent heterogeneous materials obtained from mainly three Glycyrrhiza species. G. glabra, G. uralensis, and G. inflata exhibit marked metabolite differences in terms of flavanones (Fs), chalcones (Cs), and other phenolic constituents. The principal objective of this work was to develop complementary chemometric models for the metabolite profiling, classification, and quality control of authenticated licorice. A total of 51 commercial and macroscopically verified samples were DNA authenticated. Principal component analysis and canonical discriminant analysis were performed on 1H NMR spectra and area under the curve values obtained from UHPLC-UV chromatograms, respectively. The developed chemometric models enable the identification and classification of Glycyrrhiza species according to their composition in major Fs, Cs, and species specific phenolic compounds. Further key outcomes demonstrated that DNA authentication combined with chemometric analyses enabled the characterization of mixtures, hybrids, and species outliers. This study provides a new foundation for the botanical and chemical authentication, classification, and metabolomic characterization of crude licorice botanicals and derived materials. Collectively, the proposed methods offer a comprehensive approach for the quality control of licorice as one of the most widely used botanical dietary supplements. PMID:26244884

  20. Three-Way Analysis of Spectrospatial Electromyography Data: Classification and Interpretation

    PubMed Central

    Kauppi, Jukka-Pekka; Hahne, Janne; Müller, Klaus-Robert; Hyvärinen, Aapo

    2015-01-01

    Classifying multivariate electromyography (EMG) data is an important problem in prosthesis control as well as in neurophysiological studies and diagnosis. With modern high-density EMG sensor technology, it is possible to capture the rich spectrospatial structure of the myoelectric activity. We hypothesize that multi-way machine learning methods can efficiently utilize this structure in classification as well as reveal interesting patterns in it. To this end, we investigate the suitability of existing three-way classification methods to EMG-based hand movement classification in spectrospatial domain, as well as extend these methods by sparsification and regularization. We propose to use Fourier-domain independent component analysis as preprocessing to improve classification and interpretability of the results. In high-density EMG experiments on hand movements across 10 subjects, three-way classification yielded higher average performance compared with state-of-the art classification based on temporal features, suggesting that the three-way analysis approach can efficiently utilize detailed spectrospatial information of high-density EMG. Phase and amplitude patterns of features selected by the classifier in finger-movement data were found to be consistent with known physiology. Thus, our approach can accurately resolve hand and finger movements on the basis of detailed spectrospatial information, and at the same time allows for physiological interpretation of the results. PMID:26039100

  1. Can early hepatic fibrosis stages be discriminated by combining ultrasonic parameters?

    PubMed

    Bouzitoune, Razika; Meziri, Mahmoud; Machado, Christiano Bittencourt; Padilla, Frédéric; Pereira, Wagner Coelho de Albuquerque

    2016-05-01

    In this study, we put forward a new approach to classify early stages of fibrosis based on a multiparametric characterization using backscatter ultrasonic signals. Ultrasonic parameters, such as backscatter coefficient (Bc), speed of sound (SoS), attenuation coefficient (Ac), mean scatterer spacing (MSS), and spectral slope (SS), have shown their potential to differentiate between healthy and pathologic samples in different organs (eye, breast, prostate, liver). Recently, our group looked into the characterization of stages of hepatic fibrosis using the parameters cited above. The results showed that none of them could individually distinguish between the different stages. Therefore, we explored a multiparametric approach by combining these parameters in two and three, to test their potential to discriminate between the stages of liver fibrosis: F0 (normal), F1, F3, and/without F4 (cirrhosis), according to METAVIR Score. Discriminant analysis showed that the most relevant individual parameter was Bc, followed by SoS, SS, MSS, and Ac. The combination of (Bc, SoS) along with the four stages was the best in differentiating between the stages of fibrosis and correctly classified 85% of the liver samples with a high level of significance (p<0.0001). Nevertheless, when taking into account only stages F0, F1, and F3, the discriminant analysis showed that the parameters (Bc, SoS) and (Bc, Ac) had a better classification (93%) with a high level of significance (p<0.0001). The combination of the three parameters (Bc, SoS, and Ac) led to a 100% correct classification. In conclusion, the current findings show that the multiparametric approach has great potential in differentiating between the stages of fibrosis, and thus could play an important role in the diagnosis and follow-up of hepatic fibrosis. Copyright © 2016 Elsevier B.V. All rights reserved.

  2. Classification of Sporting Activities Using Smartphone Accelerometers

    PubMed Central

    Mitchell, Edmond; Monaghan, David; O'Connor, Noel E.

    2013-01-01

    In this paper we present a framework that allows for the automatic identification of sporting activities using commonly available smartphones. We extract discriminative informational features from smartphone accelerometers using the Discrete Wavelet Transform (DWT). Despite the poor quality of their accelerometers, smartphones were used as capture devices due to their prevalence in today's society. Successful classification on this basis potentially makes the technology accessible to both elite and non-elite athletes. Extracted features are used to train different categories of classifiers. No one classifier family has a reportable direct advantage in activity classification problems to date; thus we examine classifiers from each of the most widely used classifier families. We investigate three classification approaches; a commonly used SVM-based approach, an optimized classification model and a fusion of classifiers. We also investigate the effect of changing several of the DWT input parameters, including mother wavelets, window lengths and DWT decomposition levels. During the course of this work we created a challenging sports activity analysis dataset, comprised of soccer and field-hockey activities. The average maximum F-measure accuracy of 87% was achieved using a fusion of classifiers, which was 6% better than a single classifier model and 23% better than a standard SVM approach. PMID:23604031
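
    To illustrate the kind of DWT feature extraction this abstract describes, here is a minimal sketch that decomposes one window of accelerometer samples with a Haar wavelet and uses the energy in each sub-band as a feature vector. The Haar wavelet, the sub-band energy features, the window values and the function names are all assumptions made for illustration. The abstract does not state which mother wavelets or features the authors finally used.

```python
import math


def haar_dwt(signal, levels):
    """Multi-level Haar DWT.

    Returns (approximation, [detail_level_1, detail_level_2, ...]).
    """
    approx, details = list(signal), []
    for _ in range(levels):
        if len(approx) < 2 or len(approx) % 2:
            break  # stop when the signal can no longer be halved evenly
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx, details


def subband_energies(window, levels=3):
    """Feature vector: energy of each detail band, then of the approximation."""
    approx, details = haar_dwt(window, levels)
    return [sum(d * d for d in band) for band in details] + [sum(a * a for a in approx)]


# Hypothetical 8-sample window of accelerometer magnitudes.
window = [0.1, 0.4, 0.9, 0.3, -0.2, -0.6, 0.0, 0.5]
print(subband_energies(window, levels=3))
```

    Changing `levels` or the window length changes the length and content of the feature vector, which is the kind of DWT input parameter the study reports varying.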

  3. From learning taxonomies to phylogenetic learning: integration of 16S rRNA gene data into FAME-based bacterial classification.

    PubMed

    Slabbinck, Bram; Waegeman, Willem; Dawyndt, Peter; De Vos, Paul; De Baets, Bernard

    2010-01-30

    Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the resolution of FAME data for the discrimination of bacterial

  4. From learning taxonomies to phylogenetic learning: Integration of 16S rRNA gene data into FAME-based bacterial classification

    PubMed Central

    2010-01-01

    Background: Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. Results: In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. Conclusions: FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the resolution of FAME data for

  5. Automated Texture Classification of the Mawrth Vallis Landing Site Region

    NASA Astrophysics Data System (ADS)

    Parente, M.; Bayley, L.; Hunkins, L.; McKeown, N. K.; Bishop, J. L.

    2009-12-01

    Supervised classification techniques have been developed to discriminate geomorphologic units in HiRISE images of Mawrth Vallis on Mars, one of the MSL candidate landing sites. A variety of clay minerals that indicate water was once present have been identified in the ancient bedrock at Mawrth Vallis [1-7]. These clay-rich rocks exhibit distinct surface textures in HiRISE images, where the nontronite-bearing unit consists of two primary textures: 2-5 m irregular inverted polygons and irregular parallel fracture sets ([8,13], Fig. b-c). In contrast, the montmorillonite-bearing unit consists of 0.5-1.5 m regular polygons ([8,13], Fig. e). We also characterized dunes (Fig. d), and the spectrally unremarkable caprock unit (Fig. a). Classification of these textures was performed by extracting discriminatory features from gray-level run length matrices (GLRLMs) [9], gray-level co-occurrence matrices (GLCMs) [10], and semivariograms [11] calculated for small blocks of data in HiRISE images. Preliminary results using an algorithm containing eight of these classification features produced a texture classification technique that is 85 percent accurate. The discriminant analysis (e.g. [12]) classifier we used modeled a linear discriminant function for each class based on the training feature vectors for that class. Each test vector was then assigned to the class whose discriminant function gave the largest value. We assumed linear functions were acceptable for small training sets and we performed automated selection in order to identify the most discriminative features for the textures in Mawrth Vallis. Continued efforts are underway to test and refine this procedure in order to optimize texture recognition on a broader collection of textures, representing additional surface components from Mawrth Vallis and other landing sites on Mars. [1] Bibring, J.-P., et al. (2005) Science, 307, 1576-1581. [2] Poulet, F., et al. (2005) Nature, 438, 632-627. [3] Bishop, J. L., et al
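
    The classifier described in this abstract builds one linear discriminant function per class and assigns a test vector to the class with the largest function value. The sketch below shows that rule using the textbook form with a pooled covariance matrix, restricted to two features so the matrix inverse can be written by hand. The pooled-covariance form, the two-feature restriction, the class names and the training values are assumptions for illustration, since the abstract does not give the authors' exact formulation or data.

```python
import math


def fit_lda_2d(samples):
    """Fit linear discriminants. samples: dict label -> list of (x, y) vectors."""
    n_total = sum(len(v) for v in samples.values())
    means = {k: (sum(p[0] for p in v) / len(v), sum(p[1] for p in v) / len(v))
             for k, v in samples.items()}
    # Pooled within-class covariance (symmetric 2x2 matrix).
    sxx = sxy = syy = 0.0
    for k, v in samples.items():
        mx, my = means[k]
        for x, y in v:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = n_total - len(samples)
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof
    det = sxx * syy - sxy * sxy
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    model = {}
    for k, (mx, my) in means.items():
        # g_k(x) = w . x + b, with w = inv(S) mu_k and
        # b = -0.5 * mu_k . w + log(prior_k)
        w = (inv[0][0] * mx + inv[0][1] * my, inv[1][0] * mx + inv[1][1] * my)
        b = -0.5 * (mx * w[0] + my * w[1]) + math.log(len(samples[k]) / n_total)
        model[k] = (w, b)
    return model


def classify(model, point):
    """Return the class whose discriminant function is largest at `point`."""
    scores = {k: w[0] * point[0] + w[1] * point[1] + b
              for k, (w, b) in model.items()}
    return max(scores, key=scores.get)


# Hypothetical two-feature texture descriptors for two surface units.
training = {
    "polygons": [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (2.5, 2.5)],
    "dunes": [(6.0, 5.0), (6.5, 6.5), (7.0, 6.0), (7.5, 5.5)],
}
model = fit_lda_2d(training)
print(classify(model, (2.0, 2.0)), classify(model, (7.0, 6.0)))
```

    With eight features, as in the abstract, the same rule applies but the covariance inverse would need a general linear solver.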

  6. Rock classification based on resistivity patterns in electrical borehole wall images

    NASA Astrophysics Data System (ADS)

    Linek, Margarete; Jungmann, Matthias; Berlage, Thomas; Pechnig, Renate; Clauser, Christoph

    2007-06-01

    Electrical borehole wall images represent grey-level-coded micro-resistivity measurements at the borehole wall. Different scientific methods have been implemented to transform image data into quantitative log curves. We introduce a pattern recognition technique applying texture analysis, which uses second-order statistics based on studying the occurrence of pixel pairs. We calculate so-called Haralick texture features such as contrast, energy, entropy and homogeneity. The supervised classification method is used for assigning characteristic texture features to different rock classes and assessing the discriminative power of these image features. We use classifiers obtained from training intervals to characterize the entire image data set recovered in ODP hole 1203A. This yields a synthetic lithology profile based on computed texture data. We show that Haralick features accurately classify 89.9% of the training intervals. We obtained misclassification for vesicular basaltic rocks. Hence, further image analysis tools are used to improve the classification reliability. We decompose the 2D image signal by the application of wavelet transformation in order to enhance image objects horizontally, diagonally and vertically. The resulting filtered images are used for further texture analysis. This combined classification based on Haralick features and wavelet transformation improved our classification up to a level of 98%. The application of wavelet transformation increases the consistency between standard logging profiles and texture-derived lithology. Texture analysis of borehole wall images offers the potential to facilitate objective analysis of multiple boreholes with the same lithology.
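
    The second-order statistics mentioned in this abstract can be sketched as follows: pixel pairs at a fixed offset are counted into a grey-level co-occurrence matrix, from which contrast, energy, entropy and homogeneity are computed. The offset, the symmetric counting, the tiny example images and the function names are assumptions for illustration. Definitions of "energy" also vary between texts; the sum of squared matrix entries is used here.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalised, symmetric grey-level co-occurrence matrix for offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pixel pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def haralick_features(p):
    """Four Haralick-style texture features from a normalised GLCM."""
    feats = {"contrast": 0.0, "energy": 0.0, "entropy": 0.0, "homogeneity": 0.0}
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            feats["contrast"] += (i - j) ** 2 * v
            feats["energy"] += v * v
            feats["homogeneity"] += v / (1 + (i - j) ** 2)
            if v > 0:
                feats["entropy"] -= v * math.log(v)
    return feats


# Hypothetical 2-level image patches: a flat patch and a checkerboard.
flat = [[0, 0], [0, 0]]
checker = [[0, 1], [1, 0]]
print(haralick_features(glcm(flat, levels=2)))
print(haralick_features(glcm(checker, levels=2)))
```

    A flat patch gives zero contrast and maximal homogeneity, while the checkerboard gives high contrast, which is the kind of difference a supervised classifier can exploit to separate rock classes.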

  7. Metabolic Profiling and Classification of Propolis Samples from Southern Brazil: An NMR-Based Platform Coupled with Machine Learning.

    PubMed

    Maraschin, Marcelo; Somensi-Zeggio, Amélia; Oliveira, Simone K; Kuhnen, Shirley; Tomazzoli, Maíra M; Raguzzoni, Josiane C; Zeri, Ana C M; Carreira, Rafael; Correia, Sara; Costa, Christopher; Rocha, Miguel

    2016-01-22

    The chemical composition of propolis is affected by environmental factors and harvest season, making it difficult to standardize its extracts for medicinal usage. By detecting a typical chemical profile associated with propolis from a specific production region or season, certain types of propolis may be used to obtain a specific pharmacological activity. In this study, propolis from three agroecological regions (plain, plateau, and highlands) from southern Brazil, collected over the four seasons of 2010, were investigated through a novel NMR-based metabolomics data analysis workflow. Chemometrics and machine learning algorithms (PLS-DA and RF), including methods to estimate variable importance in classification, were used in this study. The machine learning and feature selection methods permitted construction of models for propolis sample classification with high accuracy (>75%, reaching ∼90% in the best case), better discriminating samples regarding their collection seasons comparatively to the harvest regions. PLS-DA and RF allowed the identification of biomarkers for sample discrimination, expanding the set of discriminating features and adding relevant information for the identification of the class-determining metabolites. The NMR-based metabolomics analytical platform, coupled to bioinformatic tools, allowed characterization and classification of Brazilian propolis samples regarding the metabolite signature of important compounds, i.e., chemical fingerprint, harvest seasons, and production regions.

  8. Automated classification of mouse pup isolation syllables: from cluster analysis to an Excel-based "mouse pup syllable classification calculator".

    PubMed

    Grimsley, Jasmine M S; Gadziola, Marie A; Wenstrup, Jeffrey J

    2012-01-01

    Mouse pups vocalize at high rates when they are cold or isolated from the nest. The proportions of each syllable type produced carry information about disease state and are being used as behavioral markers for the internal state of animals. Manual classifications of these vocalizations identified 10 syllable types based on their spectro-temporal features. However, manual classification of mouse syllables is time consuming and vulnerable to experimenter bias. This study uses an automated cluster analysis to identify acoustically distinct syllable types produced by CBA/CaJ mouse pups, and then compares the results to prior manual classification methods. The cluster analysis identified two syllable types, based on their frequency bands, that have continuous frequency-time structure, and two syllable types featuring abrupt frequency transitions. Although cluster analysis computed fewer syllable types than manual classification, the clusters represented well the probability distributions of the acoustic features within syllables. These probability distributions indicate that some of the manually classified syllable types are not statistically distinct. The characteristics of the four classified clusters were used to generate a Microsoft Excel-based mouse syllable classifier that rapidly categorizes syllables, with over a 90% match, into the syllable types determined by cluster analysis.

  9. Choice-Based Conjoint Analysis: Classification vs. Discrete Choice Models

    NASA Astrophysics Data System (ADS)

    Giesen, Joachim; Mueller, Klaus; Taneva, Bilyana; Zolliker, Peter

    Conjoint analysis is a family of techniques that originated in psychology and later became popular in market research. The main objective of conjoint analysis is to measure an individual's or a population's preferences on a class of options that can be described by parameters and their levels. We consider preference data obtained in choice-based conjoint analysis studies, where one observes test persons' choices on small subsets of the options. There are many ways to analyze choice-based conjoint analysis data. Here we discuss the intuition behind a classification based approach, and compare this approach to one based on statistical assumptions (discrete choice models) and to a regression approach. Our comparison on real and synthetic data indicates that the classification approach outperforms the discrete choice models.

  10. Neural CMOS-integrated circuit and its application to data classification.

    PubMed

    Göknar, Izzet Cem; Yildiz, Merih; Minaei, Shahram; Deniz, Engin

    2012-05-01

    Implementation and new applications of a tunable complementary metal-oxide-semiconductor integrated circuit (CMOS-IC) of a recently proposed classifier core-cell (CC) are presented and tested with two different datasets. Two algorithms, one based on Fisher's linear discriminant analysis and the other on perceptron learning, are used to obtain the CCs' tunable parameters, and the Haberman and Iris datasets are classified. The parameters so obtained are used for hard classification of the datasets with a neural-network-structured circuit. Classification performance and coefficient calculation times for both algorithms are given. The CC has a 6-ns response time and 1.8-mW power consumption. The fabrication parameters used for the IC are taken from CMOS AMS 0.35-μm technology.
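
    As a rough illustration of the perceptron learning mentioned in the abstract above, the following sketch trains a linear decision rule on a tiny, made-up two-feature dataset. The data, the function names, the learning rate, and the epoch limit are all assumptions chosen for illustration; the sketch does not describe how the core-cell parameters were actually computed.

```python
# Minimal perceptron sketch (illustrative only; toy data and names are hypothetical).

def predict(weights, bias, x):
    """Hard classification: +1 if the linear score is positive, else -1."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1


def train_perceptron(samples, learning_rate=1.0, max_epochs=100):
    """Classic perceptron rule: adjust weights only on misclassified samples."""
    n_features = len(samples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, label in samples:
            if predict(weights, bias, x) != label:
                weights = [w + learning_rate * label * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * label
                errors += 1
        if errors == 0:  # every sample classified correctly; stop early
            break
    return weights, bias


# Hypothetical, linearly separable toy data: (features, label in {+1, -1}).
toy_samples = [
    ((2.0, 2.0), 1), ((3.0, 3.0), 1), ((2.5, 3.0), 1),
    ((0.0, 0.0), -1), ((1.0, 0.0), -1), ((0.0, 1.0), -1),
]

weights, bias = train_perceptron(toy_samples)
predictions = [predict(weights, bias, x) for x, _ in toy_samples]
```

    The perceptron rule is only guaranteed to converge when the classes are linearly separable, which is why the toy data were chosen that way.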

  11. Proposition of a Classification of Adult Patients with Hemiparesis in Chronic Phase.

    PubMed

    Chantraine, Frédéric; Filipetti, Paul; Schreiber, Céline; Remacle, Angélique; Kolanowski, Elisabeth; Moissenet, Florent

    2016-01-01

    Patients who have developed hemiparesis as a result of a central nervous system lesion, often experience reduced walking capacity and worse gait quality. Although clinically, similar gait patterns have been observed, presently, no clinically driven classification has been validated to group these patients' gait abnormalities at the level of the hip, knee and ankle joints. This study has thus intended to put forward a new gait classification for adult patients with hemiparesis in chronic phase, and to validate its discriminatory capacity. Twenty-six patients with hemiparesis were included in this observational study. Following a clinical examination, a clinical gait analysis, complemented by a video analysis, was performed whereby participants were requested to walk spontaneously on a 10m walkway. A patient's classification was established from clinical examination data and video analysis. This classification was made up of three groups, including two sub-groups, defined with key abnormalities observed whilst walking. Statistical analysis was achieved on the basis of 25 parameters resulting from the clinical gait analysis in order to assess the discriminatory characteristic of the classification as displayed by the walking speed and kinematic parameters. Results revealed that the parameters related to the discriminant criteria of the proposed classification were all significantly different between groups and subgroups. More generally, nearly two thirds of the 25 parameters showed significant differences (p<0.05) between the groups and sub-groups. However, prior to being fully validated, this classification must still be tested on a larger number of patients, and the repeatability of inter-operator measures must be assessed. This classification enables patients to be grouped on the basis of key abnormalities observed whilst walking and has the advantage of being able to be used in clinical routines without necessitating complex apparatus. In the midterm, this

  12. Proposition of a Classification of Adult Patients with Hemiparesis in Chronic Phase

    PubMed Central

    Filipetti, Paul; Remacle, Angélique; Kolanowski, Elisabeth

    2016-01-01

    Background: Patients who have developed hemiparesis as a result of a central nervous system lesion, often experience reduced walking capacity and worse gait quality. Although clinically, similar gait patterns have been observed, presently, no clinically driven classification has been validated to group these patients’ gait abnormalities at the level of the hip, knee and ankle joints. This study has thus intended to put forward a new gait classification for adult patients with hemiparesis in chronic phase, and to validate its discriminatory capacity. Methods and Findings: Twenty-six patients with hemiparesis were included in this observational study. Following a clinical examination, a clinical gait analysis, complemented by a video analysis, was performed whereby participants were requested to walk spontaneously on a 10 m walkway. A patient’s classification was established from clinical examination data and video analysis. This classification was made up of three groups, including two sub-groups, defined with key abnormalities observed whilst walking. Statistical analysis was achieved on the basis of 25 parameters resulting from the clinical gait analysis in order to assess the discriminatory characteristic of the classification as displayed by the walking speed and kinematic parameters. Results revealed that the parameters related to the discriminant criteria of the proposed classification were all significantly different between groups and subgroups. More generally, nearly two thirds of the 25 parameters showed significant differences (p<0.05) between the groups and sub-groups. However, prior to being fully validated, this classification must still be tested on a larger number of patients, and the repeatability of inter-operator measures must be assessed. Conclusions: This classification enables patients to be grouped on the basis of key abnormalities observed whilst walking and has the advantage of being able to be used in clinical routines without necessitating

  13. Synthesis and analysis of discriminators under influence of broadband non-Gaussian noise

    NASA Astrophysics Data System (ADS)

    Artyushenko, V. M.; Volovach, V. I.

    2018-01-01

    We considered the problems of the synthesis and analysis of discriminators, when the useful signal is exposed to non-Gaussian additive broadband noise. It is shown that in this case, the discriminator of the tracking meter should contain the nonlinear transformation unit, the characteristics of which are determined by the Fisher information relative to the probability density function of the mixture of non-Gaussian broadband noise and mismatch errors. The parameters of the discriminatory and phase characteristics of the discriminators working under the above conditions are obtained. It is shown that the efficiency of non-linear processing depends on the ratio of power of FM noise to the power of Gaussian noise. The analysis of the information loss of signal transformation caused by the linear section of discriminatory characteristics of the unit of nonlinear transformations of the discriminator is carried out. It is shown that the average slope of the nonlinear transformation characteristic is determined by the Fisher information relative to the probability density function of the mixture of non-Gaussian noise and mismatch errors.

  14. Chemical Discrimination of Cortex Phellodendri amurensis and Cortex Phellodendri chinensis by Multivariate Analysis Approach.

    PubMed

    Sun, Hui; Wang, Huiyu; Zhang, Aihua; Yan, Guangli; Han, Ying; Li, Yuan; Wu, Xiuhong; Meng, Xiangcai; Wang, Xijun

    2016-01-01

    Herbal medicines have an important position in health care systems worldwide, yet their assessment and quality control remain a major bottleneck. Cortex Phellodendri chinensis (CPC) and Cortex Phellodendri amurensis (CPA) are widely used in China; however, a way to distinguish CPA from CPC is urgently needed. In this study, a multivariate analysis approach was applied to the chemical discrimination of CPA and CPC. Principal component analysis showed that the two herbs could be separated clearly. Chemical markers such as berberine, palmatine, phellodendrine, magnoflorine, obacunone, and obaculactone were found through orthogonal partial least squares discriminant analysis and were identified tentatively by their accurate mass from quadrupole time-of-flight mass spectrometry. A total of 29 components can be used as chemical markers for the discrimination of CPA and CPC. Of these, phellodendrine is significantly higher in CPC than in CPA, whereas obacunone and obaculactone are significantly higher in CPA than in CPC. The present study shows that chemical analysis based on a multivariate analysis approach greatly contributes to the investigation of CPA and CPC, that the identified chemical markers as a whole should be used to discriminate the two herbal medicines, and that the results also provide chemical information for their quality assessment. Summary: A multivariate analysis approach was used to investigate the herbal medicines. The chemical markers were identified through the multivariate analysis approach. A total of 29 components can be used as the chemical markers. UPLC-Q/TOF-MS-based multivariate analysis method for the herbal medicine samples. Abbreviations used: CPC: Cortex Phellodendri chinensis, CPA: Cortex Phellodendri amurensis, PCA: Principal component analysis, OPLS-DA: Orthogonal partial least squares discriminant analysis, BPI: Base peaks ion intensity.

  15. Active microwave responses - An aid in improved crop classification

    NASA Technical Reports Server (NTRS)

    Rosenthal, W. D.; Blanchard, B. J.

    1984-01-01

    A study determined the feasibility of using visible, infrared, and active microwave data to classify agricultural crops such as corn, sorghum, alfalfa, wheat stubble, millet, shortgrass pasture and bare soil. Visible through microwave data were collected by instruments on board the NASA C-130 aircraft over 40 agricultural fields near Guymon, OK in 1978 and Dalhart, TX in 1980. Results from stepwise and discriminant analysis techniques indicated 4.75 GHz, 1.6 GHz, and 0.4 GHz cross-polarized microwave frequencies were the microwave frequencies most sensitive to crop type differences. Inclusion of microwave data in visible and infrared classification models improved classification accuracy from 73 percent to 92 percent. Despite the results, further studies are needed during different growth stages to validate the visible, infrared, and active microwave responses to vegetation.

  16. Empirical Analysis and Automated Classification of Security Bug Reports

    NASA Technical Reports Server (NTRS)

    Tyo, Jacob P.

    2016-01-01

    With the ever-expanding amount of sensitive data being placed into computer systems, the need for effective cybersecurity is of utmost importance. However, there is a shortage of detailed empirical studies of security vulnerabilities from which cybersecurity metrics and best practices could be determined. This thesis has two main research goals: (1) to explore the distribution and characteristics of security vulnerabilities based on the information provided in bug tracking systems and (2) to develop data analytics approaches for automatic classification of bug reports as security or non-security related. This work is based on using three NASA datasets as case studies. The empirical analysis showed that the majority of software vulnerabilities belong to only a small number of types. Addressing these types of vulnerabilities will consequently lead to cost-efficient improvement of software security. Since this analysis requires labeling of each bug report in the bug tracking system, we explored using machine learning to automate the classification of each bug report as security or non-security related (two-class classification), as well as of each security-related bug report as a specific security type (multiclass classification). In addition to using supervised machine learning algorithms, a novel unsupervised machine learning approach is proposed. An accuracy of 92%, recall of 96%, precision of 92%, probability of false alarm of 4%, F-Score of 81% and G-Score of 90% were the best results achieved during two-class classification. Furthermore, an accuracy of 80%, recall of 80%, precision of 94%, and F-score of 85% were the best results achieved during multiclass classification.
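
    For readers unfamiliar with the two-class metrics quoted in the abstract above, the sketch below computes them from confusion-matrix counts. The counts are invented for illustration and are not taken from the thesis. The G-score formula used here (the harmonic mean of recall and one minus the probability of false alarm) is an assumption about the intended definition; the abstract does not state it.

```python
# Two-class metrics from confusion-matrix counts (illustrative sketch).
# All counts below are hypothetical; none come from the NASA datasets.

def classification_metrics(tp, fp, tn, fn):
    """Return common two-class metrics; assumes every denominator is nonzero."""
    recall = tp / (tp + fn)            # share of true positives that were found
    precision = tp / (tp + fp)         # share of positive predictions that were right
    pfa = fp / (fp + tn)               # probability of false alarm
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_score = 2 * precision * recall / (precision + recall)
    # Assumed definition: harmonic mean of recall and (1 - pfa).
    g_score = 2 * recall * (1 - pfa) / (recall + (1 - pfa))
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "pfa": pfa,
        "f_score": f_score,
        "g_score": g_score,
    }


# Hypothetical example: 10 positive reports out of 100.
metrics = classification_metrics(tp=8, fp=2, tn=88, fn=2)
```

    With strongly imbalanced classes such as these, accuracy alone can look high even when recall is modest, which is one reason to report several metrics together.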

  17. Employment discrimination, segregation, and health.

    PubMed

    Darity, William A

    2003-02-01

    The author examines available evidence on the effects of exposure to joblessness on emotional well-being according to race and sex. The impact of racism on general health outcomes also is considered, particularly racism in the specific form of wage discrimination. Perceptions of racism and measured exposures to racism may be distinct triggers for adverse health outcomes. Whether the effects of racism are best evaluated on the basis of self-classification or social classification of racial identity is unclear. Some research sorts between the effects of race and socioeconomic status on health. The development of a new longitudinal database will facilitate more accurate identification of connections between racism and negative health effects.

  18. Employment Discrimination, Segregation, and Health

    PubMed Central

    Darity, William A.

    2003-01-01

    The author examines available evidence on the effects of exposure to joblessness on emotional well-being according to race and sex. The impact of racism on general health outcomes also is considered, particularly racism in the specific form of wage discrimination. Perceptions of racism and measured exposures to racism may be distinct triggers for adverse health outcomes. Whether the effects of racism are best evaluated on the basis of self-classification or social classification of racial identity is unclear. Some research sorts between the effects of race and socioeconomic status on health. The development of a new longitudinal database will facilitate more accurate identification of connections between racism and negative health effects. PMID:12554574

  19. Discrimination of complex mixtures by a colorimetric sensor array: coffee aromas.

    PubMed

    Suslick, Benjamin A; Feng, Liang; Suslick, Kenneth S

    2010-03-01

    The analysis of complex mixtures presents a difficult challenge even for modern analytical techniques, and the ability to discriminate among closely similar such mixtures often remains problematic. Coffee provides a readily available archetype of such highly multicomponent systems. The use of a low-cost, sensitive colorimetric sensor array for the detection and identification of coffee aromas is reported. The color changes of the sensor array were used as a digital representation of the array response and analyzed with standard statistical methods, including principal component analysis (PCA) and hierarchical clustering analysis (HCA). PCA revealed that the sensor array has exceptionally high dimensionality with 18 dimensions required to define 90% of the total variance. In quintuplicate runs of 10 commercial coffees and controls, no confusions or errors in classification by HCA were observed in 55 trials. In addition, the effects of temperature and time in the roasting of green coffee beans were readily observed and distinguishable with a resolution better than 10 degrees C and 5 min, respectively. Colorimetric sensor arrays demonstrate excellent potential for complex systems analysis in real-world applications and provide a novel method for discrimination among closely similar complex mixtures.
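
    The statement that 18 dimensions are required to define 90% of the total variance refers to a cumulative-variance count over principal components. A minimal sketch of that count is given below; the function name and the example variances are hypothetical, and the per-component variances would in practice come from a PCA of the sensor-array data.

```python
# Count how many principal components are needed to reach a variance target.
# Illustrative sketch; the example variances are made up.

def components_for_variance(variances, target=0.90):
    """Smallest number of components whose cumulative variance share >= target."""
    ordered = sorted(variances, reverse=True)  # largest-variance components first
    total = sum(ordered)
    cumulative = 0.0
    for count, variance in enumerate(ordered, start=1):
        cumulative += variance
        if cumulative / total >= target:
            return count
    return len(ordered)  # target not reached (e.g. rounding); use all components


# Hypothetical per-component variances: shares of 0.6, 0.3 and 0.1.
example_variances = [6.0, 3.0, 1.0]
needed = components_for_variance(example_variances, target=0.85)
```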

  20. Discrimination of Complex Mixtures by a Colorimetric Sensor Array: Coffee Aromas

    PubMed Central

    Suslick, Benjamin A.; Feng, Liang; Suslick, Kenneth S.

    2010-01-01

    The analysis of complex mixtures presents a difficult challenge even for modern analytical techniques, and the ability to discriminate among closely similar such mixtures often remains problematic. Coffee provides a readily available archetype of such highly multicomponent systems. The use of a low-cost, sensitive colorimetric sensor array for the detection and identification of coffee aromas is reported. The color changes of the sensor array were used as a digital representation of the array response and analyzed with standard statistical methods, including principal component analysis (PCA) and hierarchical clustering analysis (HCA). PCA revealed that the sensor array has exceptionally high dimensionality with 18 dimensions required to define 90% of the total variance. In quintuplicate runs of 10 commercial coffees and controls, no confusions or errors in classification by HCA were observed in 55 trials. In addition, the effects of temperature and time in the roasting of green coffee beans were readily observed and distinguishable with a resolution better than 10 °C and 5 min, respectively. Colorimetric sensor arrays demonstrate excellent potential for complex systems analysis in real-world applications and provide a novel method for discrimination among closely similar complex mixtures. PMID:20143838

  1. A comparison of autonomous techniques for multispectral image analysis and classification

    NASA Astrophysics Data System (ADS)

    Valdiviezo-N., Juan C.; Urcid, Gonzalo; Toxqui-Quitl, Carina; Padilla-Vivanco, Alfonso

    2012-10-01

    Multispectral imaging has given rise to important applications related to the classification and identification of objects in a scene. Because multispectral instruments can be used to estimate the reflectance of materials in the scene, these techniques constitute fundamental tools for materials analysis and quality control. In recent years, a variety of algorithms have been developed to work with multispectral data, with the main purpose of correctly classifying the objects in the scene. The present study gives a brief review of some classical techniques, as well as a novel one, that have been used for such purposes. The use of principal component analysis and K-means clustering techniques as important classification algorithms is discussed here. Moreover, a recent method based on the min-W and max-M lattice auto-associative memories, which was proposed for endmember determination in hyperspectral imagery, is introduced as a classification method. Besides a discussion of their mathematical foundation, we emphasize their main characteristics and the results achieved for two exemplar images composed of objects similar in appearance but spectrally different. The classification results show that the first components computed from principal component analysis can be used to highlight areas with different spectral characteristics. In addition, the use of lattice auto-associative memories provides good results for materials classification even in cases where some spectral similarities appear in their spectral responses.
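
    To make the K-means step mentioned in the abstract above concrete, here is a small sketch that clusters pixel spectra with Lloyd's algorithm. The spectra, the number of bands, and the choice to initialise centroids from the first k pixels are assumptions made for illustration; the study's own implementation is not described in the abstract.

```python
# K-means (Lloyd's algorithm) on pixel spectra: illustrative sketch only.
# Spectra and initialisation strategy are hypothetical.

def squared_distance(a, b):
    """Squared Euclidean distance between two spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Cluster points into k groups; centroids start at the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(band) / len(members) for band in zip(*members)]
    return labels, centroids


# Hypothetical three-band reflectance spectra for four pixels.
pixels = [
    (0.10, 0.20, 0.10),
    (0.90, 0.80, 0.90),
    (0.15, 0.25, 0.10),
    (0.85, 0.80, 0.95),
]
labels, centroids = kmeans(pixels, k=2)
```

    The deterministic initialisation keeps the sketch reproducible; in practice K-means results depend on the starting centroids, so several random restarts are common.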

  2. Identification, classification, and discrimination of agave syrups from natural sweeteners by infrared spectroscopy and HPAEC-PAD.

    PubMed

    Mellado-Mojica, Erika; López, Mercedes G

    2015-01-15

    Agave syrups are gaining popularity as new natural sweeteners. Identification, classification and discrimination by infrared spectroscopy coupled to chemometrics (NIR-MIR-SIMCA-PCA) and HPAEC-PAD of agave syrups from natural sweeteners were achieved. MIR-SIMCA-PCA allowed us to classify the natural sweeteners according to their natural source. Natural syrups exhibited differences in the MIR spectra region 1500-900 cm(-1). The agave syrups displayed strong absorption in the MIR spectra region 1061-1063 cm(-1), in agreement with their high fructose content. Additionally, MIR-SIMCA-PCA allowed us to differentiate among syrups from different Agave species (Agave tequilana and Agave salmiana). Thin-layer chromatography and HPAEC-PAD revealed glucose, fructose, and sucrose as the principal carbohydrates in all of the syrups. Oligosaccharide profiles showed that A. tequilana syrups are mainly composed of fructose (>60%) and fructooligosaccharides, while A. salmiana syrups contain more sucrose (28-32%). We conclude that MIR-SIMCA-PCA and HPAEC-PAD can be used to unequivocally identify and classify agave syrups. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.

  3. Regional Moment Tensor Source-Type Discrimination Analysis

    DTIC Science & Technology

    2015-11-16

    Research Laboratory Space Vehicles Directorate 3550 Aberdeen Avenue SE Kirtland AFB, NM 87117-5776 AFRL /RVBYE 11. SPONSOR/MONITOR’S REPORT...OCP 8725 John J. Kingman Rd, Suite 0944 Ft Belvoir, VA 22060-6218 1 cy AFRL /RVIL Kirtland AFB, NM 87117-5776 2 cys Official Record Copy... AFRL -RV-PS- AFRL -RV-PS- TR-2016-0014 TR-2016-0014 REGIONAL MOMENT TENSOR SOURCE-TYPE DISCRIMINATION ANALYSIS Douglas S. Dreger, et al

  4. Introduction to multivariate discrimination

    NASA Astrophysics Data System (ADS)

    Kégl, Balázs

    2013-07-01

    Multivariate discrimination or classification is one of the best-studied problems in machine learning, with a plethora of well-tested and well-performing algorithms. There are also several good general textbooks [1-9] on the subject written for an average engineering, computer science, or statistics graduate student; most of them are also accessible to an average physics student with some background in computer science and statistics. Hence, instead of writing a generic introduction, we concentrate here on relating the subject to a practicing experimental physicist. After a short introduction on the basic setup (Section 1) we delve into the practical issues of complexity regularization, model selection, and hyperparameter optimization (Section 2), since it is this step that makes high-complexity non-parametric fitting so different from low-dimensional parametric fitting. To emphasize that this issue is not restricted to classification, we illustrate the concept on a low-dimensional but non-parametric regression example (Section 2.1). Section 3 describes the common algorithmic-statistical formal framework that unifies the main families of multivariate classification algorithms. We explain here the large-margin principle that partly explains why these algorithms work. Section 4 is devoted to the description of the three main (families of) classification algorithms: neural networks, the support vector machine, and AdaBoost. We do not go into the algorithmic details; the goal is to give an overview of the form of the functions these methods learn and of the objective functions they optimize. Besides their technical description, we also make an attempt to put these algorithms into a socio-historical context. We then briefly describe some rather heterogeneous applications to illustrate the pattern recognition pipeline and to show how widespread the use of these methods is (Section 5). We conclude the chapter with three essentially open research problems that are either

  5. Discrimination of Cynanchum wilfordii and Cynanchum auriculatum by terahertz spectroscopic analysis.

    PubMed

    Ham, Woo Sik; Kim, Jinju; Park, Dae Joon; Ryu, Han-Cheol; Jang, Young Pyo

    2018-02-12

    Precise identification of the botanical origin of plant species is crucial for the quality control of herbal medicine. In Korea, the root part of Cynanchum auriculatum has been misused for C. wilfordii in the herbal drug market due to their morphological similarities. Currently, DNA analysis using the polymerase chain reaction (PCR) method is employed to discriminate between these species. In order to develop a new analytical tool for the rapid discrimination of C. wilfordii and C. auriculatum, terahertz (THz) spectroscopy was employed. Authentic samples of C. wilfordii and C. auriculatum were provided by the National Institute, and standardized pellets for each species were prepared to get optimum results with terahertz time-domain spectroscopy (THz-TDS) in the frequency range 0.2-1.20 THz. The C. wilfordii pellet showed a longer time delay compared to the C. auriculatum sample, and this was due to the difference in permittivity. The pellet samples of C. wilfordii and C. auriculatum showed a permittivity difference of about 0.08 at 0.2-1.20 THz. The experimental results indicated that THz-TDS analysis can be an effective and rapid method for the discrimination of C. wilfordii and C. auriculatum, and this application can be expanded for the discrimination of other similar herbal medicines. Copyright © 2018 John Wiley & Sons, Ltd.

  6. Improved Hierarchical Optimization-Based Classification of Hyperspectral Images Using Shape Analysis

    NASA Technical Reports Server (NTRS)

    Tarabalka, Yuliya; Tilton, James C.

    2012-01-01

    A new spectral-spatial method for classification of hyperspectral images is proposed. The HSegClas method is based on the integration of probabilistic classification and shape analysis within the hierarchical step-wise optimization algorithm. First, probabilistic support vector machines classification is applied. Then, at each iteration two neighboring regions with the smallest Dissimilarity Criterion (DC) are merged, and classification probabilities are recomputed. The important contribution of this work consists in estimating a DC between regions as a function of statistical, classification and geometrical (area and rectangularity) features. Experimental results are presented on a 102-band ROSIS image of the Center of Pavia, Italy. The developed approach yields more accurate classification results when compared to previously proposed methods.

  7. Multivariate qualitative analysis of banned additives in food safety using surface enhanced Raman scattering spectroscopy.

    PubMed

    He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei

    2015-02-25

    A novel strategy which combines iteratively cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectroscopy of banned food additives, such as Sudan I dye and Rhodamine B in food, Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, using the combination of spectra preprocessing iteratively cubic spline fitting (ICSF) baseline correction with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, the DPLS classification can discriminate the class assignment of unknown banned additives using the information of differences in relative intensities. The results demonstrate that SERS spectroscopy combined with ICSF baseline correction method and exploratory analysis methodology DPLS classification can be potentially used for distinguishing the banned food additives in field of food safety. Copyright © 2014 Elsevier B.V. All rights reserved.

  8. Fourier Transform Infrared (FT-IR) and Laser Ablation Inductively Coupled Plasma-Mass Spectrometry (LA-ICP-MS) Imaging of Cerebral Ischemia: Combined Analysis of Rat Brain Thin Cuts Toward Improved Tissue Classification.

    PubMed

    Balbekova, Anna; Lohninger, Hans; van Tilborg, Geralda A F; Dijkhuizen, Rick M; Bonta, Maximilian; Limbeck, Andreas; Lendl, Bernhard; Al-Saad, Khalid A; Ali, Mohamed; Celikic, Minja; Ofner, Johannes

    2018-02-01

    Microspectroscopic techniques are widely used to complement histological studies. Due to recent developments in the field of chemical imaging, combined chemical analysis has become attractive. This technique facilitates a deepened analysis compared to single techniques or side-by-side analysis. In this study, rat brains harvested one week after induction of photothrombotic stroke were investigated. Adjacent thin cuts from rats' brains were imaged using Fourier transform infrared (FT-IR) microspectroscopy and laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS). The LA-ICP-MS data were normalized using an internal standard (a thin gold layer). The acquired hyperspectral data cubes were fused and subjected to multivariate analysis. Brain regions affected by stroke as well as unaffected gray and white matter were identified and classified using a model based on either partial least squares discriminant analysis (PLS-DA) or random decision forest (RDF) algorithms. The RDF algorithm demonstrated the best results for classification. Improved classification was observed in the case of fused data in comparison to individual data sets (either FT-IR or LA-ICP-MS). Variable importance analysis demonstrated that both molecular and elemental content contribute to the improved RDF classification. Univariate spectral analysis identified biochemical properties of the assigned tissue types. Classification of multisensor hyperspectral data sets using an RDF algorithm allows access to a novel and in-depth understanding of biochemical processes and solid chemical allocation of different brain regions.

  9. Classification of edible oils by employing 31P and 1H NMR spectroscopy in combination with multivariate statistical analysis. A proposal for the detection of seed oil adulteration in virgin olive oils.

    PubMed

    Vigli, Georgia; Philippidis, Angelos; Spyros, Apostolos; Dais, Photis

    2003-09-10

    A combination of (1)H NMR and (31)P NMR spectroscopy and multivariate statistical analysis was used to classify 192 samples from 13 types of vegetable oils, namely, hazelnut, sunflower, corn, soybean, sesame, walnut, rapeseed, almond, palm, groundnut, safflower, coconut, and virgin olive oils from various regions of Greece. 1,2-Diglycerides, 1,3-diglycerides, the ratio of 1,2-diglycerides to total diglycerides, acidity, iodine value, and fatty acid composition determined upon analysis of the respective (1)H NMR and (31)P NMR spectra were selected as variables to establish a classification/prediction model by employing discriminant analysis. This model, obtained from the training set of 128 samples, resulted in a significant discrimination among the different classes of oils, whereas 100% of correct validated assignments for 64 samples were obtained. Different artificial mixtures of olive-hazelnut, olive-corn, olive-sunflower, and olive-soybean oils were prepared and analyzed by (1)H NMR and (31)P NMR spectroscopy. Subsequent discriminant analysis of the data allowed detection of adulteration as low as 5% w/w, provided that fresh virgin olive oil samples were used, as reflected by their high 1,2-diglycerides to total diglycerides ratio (D ≥ 0.90).

  10. Prostate segmentation by sparse representation based classification

    PubMed Central

    Gao, Yaozong; Liao, Shu; Shen, Dinggang

    2012-01-01

    Purpose: The segmentation of prostate in CT images is of essential importance to external beam radiotherapy, which is one of the major treatments for prostate cancer nowadays. During the radiotherapy, the prostate is radiated by high-energy x rays from different directions. In order to maximize the dose to the cancer and minimize the dose to the surrounding healthy tissues (e.g., bladder and rectum), the prostate in the new treatment image needs to be accurately localized. Therefore, the effectiveness and efficiency of external beam radiotherapy highly depend on the accurate localization of the prostate. However, due to the low contrast of the prostate with its surrounding tissues (e.g., bladder), the unpredicted prostate motion, and the large appearance variations across different treatment days, it is challenging to segment the prostate in CT images. In this paper, the authors present a novel classification based segmentation method to address these problems. Methods: To segment the prostate, the proposed method first uses sparse representation based classification (SRC) to enhance the prostate in CT images by pixel-wise classification, in order to overcome the limitation of poor contrast of the prostate images. Then, based on the classification results, previous segmented prostates of the same patient are used as patient-specific atlases to align onto the current treatment image and the majority voting strategy is finally adopted to segment the prostate. In order to address the limitations of the traditional SRC in pixel-wise classification, especially for the purpose of segmentation, the authors extend SRC from the following four aspects: (1) A discriminant subdictionary learning method is proposed to learn a discriminant and compact representation of training samples for each class so that the discriminant power of SRC can be increased and also SRC can be applied to the large-scale pixel-wise classification. (2) The L1 regularized sparse coding is replaced by

  11. Prostate segmentation by sparse representation based classification.

    PubMed

    Gao, Yaozong; Liao, Shu; Shen, Dinggang

    2012-10-01

    The segmentation of prostate in CT images is of essential importance to external beam radiotherapy, which is one of the major treatments for prostate cancer nowadays. During the radiotherapy, the prostate is radiated by high-energy x rays from different directions. In order to maximize the dose to the cancer and minimize the dose to the surrounding healthy tissues (e.g., bladder and rectum), the prostate in the new treatment image needs to be accurately localized. Therefore, the effectiveness and efficiency of external beam radiotherapy highly depend on the accurate localization of the prostate. However, due to the low contrast of the prostate with its surrounding tissues (e.g., bladder), the unpredicted prostate motion, and the large appearance variations across different treatment days, it is challenging to segment the prostate in CT images. In this paper, the authors present a novel classification based segmentation method to address these problems. To segment the prostate, the proposed method first uses sparse representation based classification (SRC) to enhance the prostate in CT images by pixel-wise classification, in order to overcome the limitation of poor contrast of the prostate images. Then, based on the classification results, previous segmented prostates of the same patient are used as patient-specific atlases to align onto the current treatment image and the majority voting strategy is finally adopted to segment the prostate. In order to address the limitations of the traditional SRC in pixel-wise classification, especially for the purpose of segmentation, the authors extend SRC from the following four aspects: (1) A discriminant subdictionary learning method is proposed to learn a discriminant and compact representation of training samples for each class so that the discriminant power of SRC can be increased and also SRC can be applied to the large-scale pixel-wise classification. 
(2) The L1 regularized sparse coding is replaced by the elastic net in

  12. Mapping forested wetlands in the Great Zhan River Basin through integrating optical, radar, and topographical data classification techniques.

    PubMed

    Na, X D; Zang, S Y; Wu, C S; Li, W L

    2015-11-01

    Knowledge of the spatial extent of forested wetlands is essential to many studies including wetland functioning assessment, greenhouse gas flux estimation, and wildlife suitable habitat identification. For discriminating forested wetlands from their adjacent land cover types, researchers have resorted to image analysis techniques applied to numerous remotely sensed data. Despite some success, there is still no consensus on the optimal approaches for mapping forested wetlands. To address this problem, we examined two machine learning approaches, random forest (RF) and K-nearest neighbor (KNN) algorithms, and applied these two approaches to the framework of pixel-based and object-based classifications. The RF and KNN algorithms were constructed using predictors derived from Landsat 8 imagery, Radarsat-2 advanced synthetic aperture radar (SAR), and topographical indices. The results show that the object-based classifications performed better than per-pixel classifications using the same algorithm (RF) in terms of overall accuracy, and the difference between their kappa coefficients is statistically significant (p<0.01). There were noticeable omissions for forested and herbaceous wetlands based on the per-pixel classifications using the RF algorithm. As for the object-based image analysis, there were also statistically significant differences (p<0.01) in the kappa coefficient between results based on the RF and KNN algorithms. The object-based classification using RF provided a more visually adequate distribution of the land cover types of interest, while the object-based classifications using the KNN algorithm showed noticeable commissions for forested wetlands and omissions for agricultural land. This research proves that the object-based classification with RF using optical, radar, and topographical data improved the mapping accuracy of land covers and provided a feasible approach to discriminate the forested wetlands from the other land cover types in forestry areas.
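
    The kappa coefficient that this abstract uses to compare classifiers can be computed from a confusion matrix. The following is a minimal sketch only: the function name `cohen_kappa` and the two-class counts are hypothetical illustrations and are not taken from the study.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: product of the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts (not from the paper): wetland vs. other land cover.
example = [[40, 10],
           [5, 45]]
kappa = cohen_kappa(example)
print(round(kappa, 3))
```

    Here the overall accuracy is 0.85 and the chance agreement is 0.5, so kappa discounts the part of the accuracy that would be expected from the class marginals alone.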

  13. Perinatal mortality classification: an analysis of 112 cases of stillbirth.

    PubMed

    Reis, Ana Paula; Rocha, Ana; Lebre, Andrea; Ramos, Umbelina; Cunha, Ana

    2017-10-01

    This was a retrospective cohort analysis of stillbirths that occurred from January 2004 to December 2013 in our institution. We compared the Tulip and Wigglesworth classification systems on a cohort of stillbirths and analysed the main differences between these two classifications. In this period, there were 112 stillbirths of a total of 31,758 births (stillbirth rate of 3.5 per 1000 births). There were 99 antepartum deaths and 13 intrapartum deaths. Foetal autopsy was performed in 99 cases and placental histopathological examination in all of the cases. The Wigglesworth classification found 'unknown' causes in 47 cases, of which the Tulip classification allocated 33 to a defined cause. Fourteen cases remained in the group of 'unknown' causes. Therefore, the Wigglesworth classification of stillbirths results in a higher proportion of unexplained stillbirths. We suggest that the traditional Wigglesworth classification should be substituted by a classification that manages the available information.
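
    The figures quoted in this abstract can be checked with a few lines of arithmetic. This is a sketch that uses only the counts stated above; the variable names and the derived percentages are illustrative and do not appear in the source.

```python
# Counts as stated in the abstract.
stillbirths = 112
total_births = 31758
antepartum, intrapartum = 99, 13
wigglesworth_unknown = 47
tulip_allocated = 33

# Stillbirth rate per 1000 births.
rate_per_1000 = 1000 * stillbirths / total_births

# Cases still unexplained after the Tulip classification is applied.
remaining_unknown = wigglesworth_unknown - tulip_allocated

# Derived shares of unexplained stillbirths under each system.
share_wigglesworth = 100 * wigglesworth_unknown / stillbirths
share_tulip = 100 * remaining_unknown / stillbirths

print(round(rate_per_1000, 1), remaining_unknown)
print(round(share_wigglesworth, 1), round(share_tulip, 1))
```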

  14. Classification of upper limb disability levels of children with spastic unilateral cerebral palsy using K-means algorithm.

    PubMed

    Raouafi, Sana; Achiche, Sofiane; Begon, Mickael; Sarcher, Aurélie; Raison, Maxime

    2018-01-01

    Treatment for cerebral palsy depends upon the severity of the child's condition and requires knowledge about upper limb disability. The aim of this study was to develop a systematic quantitative classification method of the upper limb disability levels for children with spastic unilateral cerebral palsy based on upper limb movements and muscle activation. Thirteen children with spastic unilateral cerebral palsy and six typically developing children participated in this study. Patients were matched on age and manual ability classification system levels I to III. Twenty-three kinematic and electromyographic variables were collected from two tasks. Discriminant analysis and the K-means clustering algorithm were applied using 23 kinematic and EMG variables of each participant. Among the 23 kinematic and electromyographic variables, only two variables containing the most relevant information for the prediction of the four levels of severity of spastic unilateral cerebral palsy, which are fixed by manual ability classification system, were identified by discriminant analysis: (1) the Falconer index (CAI_E) which represents the ratio of biceps to triceps brachii activity during extension and (2) the maximal angle extension (θ_Extension,max). A good correlation (Kendall Rank correlation coefficient = -0.53, p = 0.01) was found between levels fixed by manual ability classification system and the obtained classes. These findings suggest that the cost and effort needed to assess and characterize the disability level of a child can be further reduced.
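
    The K-means step mentioned in this abstract can be sketched on two variables. This is a minimal illustration of Lloyd's algorithm, not the authors' implementation: the data points (an activity ratio and an extension angle in degrees), the starting centroids, and the choice of two clusters are all hypothetical.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = []
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels.append(dists.index(min(dists)))
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical (activity ratio, maximal extension angle) pairs.
points = [(0.20, 170.0), (0.25, 165.0), (0.80, 120.0), (0.85, 115.0)]
centroids, labels = kmeans(points, centroids=[points[0], points[2]])
print(labels)
```

    In this toy data the angle dominates the squared distance because the two variables are on different scales; standardizing the variables first is a common choice, which the abstract neither confirms nor rules out.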

  15. Study design in high-dimensional classification analysis.

    PubMed

    Sánchez, Brisa N; Wu, Meihua; Song, Peter X K; Wang, Wen

    2016-10-01

    Advances in high throughput technology have accelerated the use of hundreds to millions of biomarkers to construct classifiers that partition patients into different clinical conditions. Prior to classifier development in actual studies, a critical need is to determine the sample size required to reach a specified classification precision. We develop a systematic approach for sample size determination in high-dimensional (large [Formula: see text] small [Formula: see text]) classification analysis. Our method utilizes the probability of correct classification (PCC) as the optimization objective function and incorporates the higher criticism thresholding procedure for classifier development. Further, we derive the theoretical bound of maximal PCC gain from feature augmentation (e.g. when molecular and clinical predictors are combined in classifier development). Our methods are motivated and illustrated by a study using proteomics markers to classify post-kidney transplantation patients into stable and rejecting classes. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  16. SCOWLP classification: Structural comparison and analysis of protein binding regions

    PubMed Central

    Teyra, Joan; Paszkowski-Rogacz, Maciej; Anders, Gerd; Pisabarro, M Teresa

    2008-01-01

    Background: Detailed information about protein interactions is critical for our understanding of the principles governing protein recognition mechanisms. The structures of many proteins have been experimentally determined in complex with different ligands bound either in the same or different binding regions. Thus, the structural interactome requires the development of tools to classify protein binding regions. A proper classification may provide a general view of the regions that a protein uses to bind others and also facilitate a detailed comparative analysis of the interacting information for specific protein binding regions at atomic level. Such classification might be of potential use for deciphering protein interaction networks, understanding protein function, rational engineering and design. Description: Protein binding regions (PBRs) might be ideally described as well-defined separated regions that share no interacting residues with one another. However, PBRs are often irregular, discontinuous and can share a wide range of interacting residues among them. The criteria to define an individual binding region can be often arbitrary and may differ from other binding regions within a protein family. Therefore, the rationale behind protein interface classification should aim to fulfil the requirements of the analysis to be performed. We extract detailed interaction information of protein domains, peptides and interfacial solvent from the SCOWLP database and we classify the PBRs of each domain family. For this purpose, we define a similarity index based on the overlapping of interacting residues mapped in pair-wise structural alignments. We perform our classification with agglomerative hierarchical clustering using the complete-linkage method. Our classification is calculated at different similarity cut-offs to allow flexibility in the analysis of PBRs, a feature especially interesting for those protein families with conflictive binding regions. The hierarchical

  17. Discrimination of biological and chemical threat simulants in residue mixtures on multiple substrates.

    PubMed

    Gottfried, Jennifer L

    2011-07-01

    The potential of laser-induced breakdown spectroscopy (LIBS) to discriminate biological and chemical threat simulant residues prepared on multiple substrates and in the presence of interferents has been explored. The simulant samples tested include Bacillus atrophaeus spores, Escherichia coli, MS-2 bacteriophage, α-hemolysin from Staphylococcus aureus, 2-chloroethyl ethyl sulfide, and dimethyl methylphosphonate. The residue samples were prepared on polycarbonate, stainless steel and aluminum foil substrates by Battelle Eastern Science and Technology Center. LIBS spectra were collected by Battelle on a portable LIBS instrument developed by A3 Technologies. This paper presents the chemometric analysis of the LIBS spectra using partial least-squares discriminant analysis (PLS-DA). The performance of PLS-DA models developed based on the full LIBS spectra, and selected emission intensities and ratios have been compared. The full-spectra models generally provided better classification results based on the inclusion of substrate emission features; however, the intensity/ratio models were able to correctly identify more types of simulant residues in the presence of interferents. The fusion of the two types of PLS-DA models resulted in a significant improvement in classification performance for models built using multiple substrates. In addition to identifying the major components of residue mixtures, minor components such as growth media and solvents can be identified with an appropriately designed PLS-DA model.

  18. Evaluation of volatile metabolites as markers in Lycopersicon esculentum L. cultivars discrimination by multivariate analysis of headspace solid phase microextraction and mass spectrometry data.

    PubMed

    Figueira, José; Câmara, Hugo; Pereira, Jorge; Câmara, José S

    2014-02-15

    To gain insights on the effects of cultivar on the volatile metabolomic expression of different tomato (Lycopersicon esculentum L.) cultivars--Plum, Campari, Grape, Cherry and Regional, cultivated under similar edafoclimatic conditions, and to identify the most discriminating volatile marker metabolites related to the cultivar, the chromatographic profiles resulting from headspace solid phase microextraction (HS-SPME) and gas chromatography-mass spectrometry (GC-qMS) analysis, combined with multivariate analysis were investigated. The data set composed of the 77 volatile metabolites identified in the target tomato cultivars, 5 of which (2,2,6-trimethylcyclohexanone, 2-methyl-6-methyleneoctan-2-ol, 4-octadecyl-morpholine, (Z)-methyl-3-hexenoate and 3-octanone) are reported for the first time in tomato volatile metabolomic composition, was evaluated by chemometrics. Firstly, principal component analysis was carried out in order to visualise data trends and clusters, and then, linear discriminant analysis in order to detect the set of volatile metabolites able to differentiate groups according to tomato cultivars. The results obtained revealed a perfect discrimination between the different Lycopersicon esculentum L. cultivars considered. The assignment success rate was 100% in classification and 80% in prediction ability by using "leave-one-out" cross-validation procedure. The volatile profile was able to differentiate all five cultivars and revealed complex interactions between them including the participation in the same biosynthetic pathway. The volatile metabolomic platform for tomato samples obtained by HS-SPME/GC-qMS here described, and the interrelationship detected among the volatile metabolites can be used as a roadmap for biotechnological applications, namely to improve tomato aroma and their acceptance by the final consumer, and for traceability studies. Copyright © 2013 Elsevier Ltd. All rights reserved.
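
    The "leave-one-out" cross-validation procedure named in this abstract holds out each sample in turn, trains on the rest, and scores the held-out prediction. The sketch below illustrates that loop only. It swaps in a simple nearest-centroid classifier in place of linear discriminant analysis to stay dependency-free, and the feature values and labels are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x."""
    sums, counts = {}, {}
    for features, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [s / counts[label] for s in acc]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(X, y):
    """Fraction of samples predicted correctly when each is held out once."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical two-feature samples for two made-up groups "A" and "B".
X = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0], [3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]
y = ["A", "A", "A", "B", "B", "B"]
accuracy = leave_one_out_accuracy(X, y)
print(accuracy)
```

    Because the held-out sample never contributes to the centroids, this estimate is typically lower than the accuracy measured on the training data itself, which is consistent with the gap between the 100% and 80% figures reported above.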

  19. Classification of Hand Grasp Kinetics and Types Using Movement-Related Cortical Potentials and EEG Rhythms.

    PubMed

    Jochumsen, Mads; Rovsing, Cecilie; Rovsing, Helene; Niazi, Imran Khan; Dremstrup, Kim; Kamavuako, Ernest Nlandu

    2017-01-01

    Detection of single-trial movement intentions from EEG is paramount for brain-computer interfacing in neurorehabilitation. These movement intentions contain task-related information and if this is decoded, the neurorehabilitation could potentially be optimized. The aim of this study was to classify single-trial movement intentions associated with two levels of force and speed and three different grasp types using EEG rhythms and components of the movement-related cortical potential (MRCP) as features. The feature importance was used to estimate encoding of discriminative information. Two data sets were used. 29 healthy subjects executed and imagined different hand movements, while EEG was recorded over the contralateral sensorimotor cortex. The following features were extracted: delta, theta, mu/alpha, beta, and gamma rhythms, readiness potential, negative slope, and motor potential of the MRCP. Sequential forward selection was performed, and classification was performed using linear discriminant analysis and support vector machines. Limited classification accuracies were obtained from the EEG rhythms and MRCP-components: 0.48 ± 0.05 (grasp types), 0.41 ± 0.07 (kinetic profiles, motor execution), and 0.39 ± 0.08 (kinetic profiles, motor imagination). Delta activity contributed the most but all features provided discriminative information. These findings suggest that information from the entire EEG spectrum is needed to discriminate between task-related parameters from single-trial movement intentions.

  20. Social Status Correlates of Reporting Racial Discrimination and Gender Discrimination among Racially Diverse Women

    PubMed Central

    Ro, Annie E.; Choi, Kyung-Hee

    2009-01-01

    The growing body of research on discrimination and health indicates a deleterious effect of discrimination on various health outcomes. However, less is known about the sociodemographic correlates of reporting racial discrimination and gender discrimination among racially diverse women. We examined the associations of social status characteristics with lifetime experiences of racial discrimination and gender discrimination using a racially-diverse sample of 754 women attending family planning clinics in Northern California (11.4% African American, 16.8% Latina, 10.1% Asian and 61.7% Caucasian). A multivariate analysis revealed that race, financial difficulty and marital status were significantly correlated with higher reports of racial discrimination, while race, education, financial difficulty and nativity were significantly correlated with gender discrimination scores. Our findings suggest that the social patterning of perceiving racial discrimination is somewhat different from that of gender discrimination. This has implications in the realm of discrimination research and applied interventions, as different forms of discrimination may have unique covariates that should be accounted for in research analysis or program design. PMID:19485231

  1. Social status correlates of reporting gender discrimination and racial discrimination among racially diverse women.

    PubMed

    Ro, Annie E; Choi, Kyung-Hee

    2009-01-01

    The growing body of research on discrimination and health indicates a deleterious effect of discrimination on various health outcomes. However, less is known about the sociodemographic correlates of reporting racial discrimination and gender discrimination among racially diverse women. We examined the associations of social status characteristics with lifetime experiences of racial discrimination and gender discrimination using a racially-diverse sample of 754 women attending family planning clinics in Northern California (11.4% African American, 16.8% Latina, 10.1% Asian and 61.7% Caucasian). A multivariate analysis revealed that race, financial difficulty and marital status were significantly correlated with higher reports of racial discrimination, while race, education, financial difficulty and nativity were significantly correlated with gender discrimination scores. Our findings suggest that the social patterning of perceiving racial discrimination is somewhat different from that of gender discrimination. This has implications in the realm of discrimination research and applied interventions, as different forms of discrimination may have unique covariates that should be accounted for in research analysis or program design.

  2. A neural network approach to cloud classification

    NASA Technical Reports Server (NTRS)

    Lee, Jonathan; Weger, Ronald C.; Sengupta, Sailes K.; Welch, Ronald M.

    1990-01-01

    It is shown that, using high-spatial-resolution data, very high cloud classification accuracies can be obtained with a neural network approach. A texture-based neural network classifier using only single-channel visible Landsat MSS imagery achieves an overall cloud identification accuracy of 93 percent. Cirrus can be distinguished from boundary layer cloudiness with an accuracy of 96 percent, without the use of an infrared channel. Stratocumulus is retrieved with an accuracy of 92 percent, cumulus at 90 percent. The use of the neural network does not improve cirrus classification accuracy. Rather, its main effect is in the improved separation between stratocumulus and cumulus cloudiness. While most cloud classification algorithms rely on linear parametric schemes, the present study is based on a nonlinear, nonparametric four-layer neural network approach. A three-layer neural network architecture, the nonparametric K-nearest neighbor approach, and the linear stepwise discriminant analysis procedure are compared. A significant finding is that significantly higher accuracies are attained with the nonparametric approaches using only 20 percent of the database as training data, compared to 67 percent of the database in the linear approach.

  3. Estuarial fingerprinting through multidimensional fluorescence and multivariate analysis.

    PubMed

    Hall, Gregory J; Clow, Kerin E; Kenny, Jonathan E

    2005-10-01

    As part of a strategy for preventing the introduction of aquatic nuisance species (ANS) to U.S. estuaries, ballast water exchange (BWE) regulations have been imposed. Enforcing these regulations requires a reliable method for determining the port of origin of water in the ballast tanks of ships entering U.S. waters. This study shows that a three-dimensional fluorescence fingerprinting technique, excitation emission matrix (EEM) spectroscopy, holds great promise as a ballast water analysis tool. In our technique, EEMs are analyzed by multivariate classification and curve resolution methods, such as N-way partial least squares regression-discriminant analysis (NPLS-DA) and parallel factor analysis (PARAFAC). We demonstrate that classification techniques can be used to discriminate among sampling sites less than 10 miles apart, encompassing Boston Harbor and two tributaries in the Mystic River Watershed. To our knowledge, this work is the first to use multivariate analysis to classify water as to location of origin. Furthermore, it is shown that curve resolution can show seasonal features within the multidimensional fluorescence data sets, which correlate with difficulty in classification.

  4. A composite sensor array impedentiometric electronic tongue Part II. Discrimination of basic tastes.

    PubMed

    Pioggia, G; Di Francesco, F; Marchetti, A; Ferro, M; Leardi, R; Ahluwalia, A

    2007-05-15

    An impedentiometric electronic tongue based on the combination of a composite sensor array and chemometric techniques aimed at the discrimination of soluble compounds able to elicit different gustative perceptions is presented. A composite array consisting of chemo-sensitive layers based on carbon nanotubes or carbon black dispersed in polymeric matrices and doped polythiophenes was used. The electrical impedance of the sensor array was measured at a frequency of 150 Hz by means of an impedance meter. The experimental set-up was designed in order to allow the automatic selection of a test solution and dipping of the sensor array following a dedicated measurement protocol. Measurements were carried out on 15 different solutions eliciting 5 different tastes (sodium chloride, citric acid, glucose, glutamic acid and sodium dehydrocholate for salty, sour, sweet, umami and bitter, respectively) at 3 concentration levels comprising the human perceptive range. In order to avoid over-fitting, more than 100 repetitions for each sample were carried out over a 4-month period. Principal component analysis (PCA) was used to detect and remove outliers. Classification was performed by linear discriminant analysis (LDA). A fairly good degree of discrimination was obtained.

  5. Some sequential, distribution-free pattern classification procedures with applications

    NASA Technical Reports Server (NTRS)

    Poage, J. L.

    1971-01-01

    Some sequential, distribution-free pattern classification techniques are presented. The decision problem to which the proposed classification methods are applied is that of discriminating between two kinds of electroencephalogram responses recorded from a human subject: spontaneous EEG and EEG driven by a stroboscopic light stimulus at the alpha frequency. The classification procedures proposed make use of the theory of order statistics. Estimates of the probabilities of misclassification are given. The procedures were tested on Gaussian samples and the EEG responses.

  6. Simultaneous fecal microbial and metabolite profiling enables accurate classification of pediatric irritable bowel syndrome.

    PubMed

    Shankar, Vijay; Reo, Nicholas V; Paliy, Oleg

    2015-12-09

    We previously showed that stool samples of pre-adolescent and adolescent US children diagnosed with diarrhea-predominant IBS (IBS-D) had different compositions of microbiota and metabolites compared to healthy age-matched controls. Here we explored whether observed fecal microbiota and metabolite differences between these two adolescent populations can be used to discriminate between IBS and health. We constructed individual microbiota- and metabolite-based sample classification models based on the partial least squares multivariate analysis and then applied a Bayesian approach to integrate individual models into a single classifier. The resulting combined classification achieved 84 % accuracy of correct sample group assignment and 86 % prediction for IBS-D in cross-validation tests. The performance of the cumulative classification model was further validated by the de novo analysis of stool samples from a small independent IBS-D cohort. High-throughput microbial and metabolite profiling of subject stool samples can be used to facilitate IBS diagnosis.

  7. Semantic and topological classification of images in magnetically guided capsule endoscopy

    NASA Astrophysics Data System (ADS)

    Mewes, P. W.; Rennert, P.; Juloski, A. L.; Lalande, A.; Angelopoulou, E.; Kuth, R.; Hornegger, J.

    2012-03-01

    Magnetically-guided capsule endoscopy (MGCE) is a nascent technology with the goal to allow the steering of a capsule endoscope inside a water filled stomach through an external magnetic field. We developed a classification cascade for MGCE images which groups images in semantic and topological categories. Results can be used in a post-procedure review or as a starting point for algorithms classifying pathologies. The first semantic classification step discards over-/under-exposed images as well as images with a large amount of debris. The second topological classification step groups images with respect to their position in the upper gastrointestinal tract (mouth, esophagus, stomach, duodenum). In the third stage two parallel classification steps distinguish topologically different regions inside the stomach (cardia, fundus, pylorus, antrum, peristaltic view). For image classification, global image features and local texture features were applied and their performance was evaluated. We show that the third classification step can be improved by a bubble and debris segmentation because it limits feature extraction to discriminative areas only. We also investigated the impact of segmenting intestinal folds on the identification of different semantic camera positions. The results of classifications with a support-vector-machine show the significance of color histogram features for the classification of corrupted images (97%). Features extracted from intestinal fold segmentation lead only to a minor improvement (3%) in discriminating different camera positions.

  8. Human Vision-Motivated Algorithm Allows Consistent Retinal Vessel Classification Based on Local Color Contrast for Advancing General Diagnostic Exams.

    PubMed

    Ivanov, Iliya V; Leitritz, Martin A; Norrenberg, Lars A; Völker, Michael; Dynowski, Marek; Ueffing, Marius; Dietter, Johannes

    2016-02-01

    Abnormalities of blood vessel anatomy, morphology, and ratio can serve as important diagnostic markers for retinal diseases such as AMD or diabetic retinopathy. Large cohort studies demand automated and quantitative image analysis of vascular abnormalities. Therefore, we developed an analytical software tool to enable automated standardized classification of blood vessels supporting clinical reading. A dataset of 61 images was collected from a total of 33 women and 8 men with a median age of 38 years. The pupils were not dilated, and images were taken after dark adaption. In contrast to current methods in which classification is based on vessel profile intensity averages, and similar to human vision, local color contrast was chosen as a discriminator to allow artery vein discrimination and arterial-venous ratio (AVR) calculation without vessel tracking. With 83% ± 1 standard error of the mean for our dataset, we achieved best classification for weighted lightness information from a combination of the red, green, and blue channels. Tested on an independent dataset, our method reached 89% correct classification, which, when benchmarked against conventional ophthalmologic classification, shows significantly improved classification scores. Our study demonstrates that vessel classification based on local color contrast can cope with inter- or intraimage lightness variability and allows consistent AVR calculation. We offer an open-source implementation of this method upon request, which can be integrated into existing tool sets and applied to general diagnostic exams.

  9. Fully-automated identification of fish species based on otolith contour: using short-time Fourier transform and discriminant analysis (STFT-DA).

    PubMed

    Salimi, Nima; Loh, Kar Hoe; Kaur Dhillon, Sarinder; Chong, Ving Ching

    2016-01-01

    Background. Fish species may be identified based on their unique otolith shape or contour. Several pattern recognition methods have been proposed to classify fish species through morphological features of the otolith contours. However, there has been no fully-automated species identification model with an accuracy higher than 80%. The purpose of the current study is to develop a fully-automated model, based on the otolith contours, to identify the fish species with high classification accuracy. Methods. Images of the right sagittal otoliths of 14 fish species from three families namely Sciaenidae, Ariidae, and Engraulidae were used to develop the proposed identification model. Short-time Fourier transform (STFT) was used, for the first time in the area of otolith shape analysis, to extract important features of the otolith contours. Discriminant Analysis (DA), as a classification technique, was used to train and test the model based on the extracted features. Results. Performance of the model was demonstrated using species from three families separately, as well as all species combined. Overall classification accuracy of the model was greater than 90% for all cases. In addition, effects of STFT variables on the performance of the identification model were explored in this study. Conclusions. Short-time Fourier transform could determine important features of the otolith outlines. The fully-automated model proposed in this study (STFT-DA) could predict species of an unknown specimen with acceptable identification accuracy. The model codes can be accessed at http://mybiodiversityontologies.um.edu.my/Otolith/ and https://peerj.com/preprints/1517/. The current model has flexibility to be used for more species and families in future studies.

  10. Kin discrimination within honey bee (Apis mellifera) colonies: An analysis of the evidence.

    PubMed

    Breed, M D; Welch, C K; Cruz, R

    1994-12-01

    Compelling evolutionary arguments lead to the prediction that honey bee workers should discriminate between supersisters and half-sisters within colonies. We review the theoretical support for discrimination during swarming, queen rearing, feeding, and grooming. A survey of the data that tests whether such discrimination takes place shows that, despite substantial effort in a number of laboratories, there is no conclusive evidence for intracolony discrimination in any of the postulated contexts. The strongest suggestive data is in the critical context of queen rearing, but flaws in experimental design or analysis make the best available tests inconclusive. We present new data that shows that cues exist on which discriminations can be made among adult workers in nestmate recognition interactions and in feeding interactions, but our data does not differentiate between subfamily recognition and recognition associated with color phenotypes. We conclude that while selection may favor discrimination between supersisters and half-sisters, as a practical matter such discriminations play no role, or only a minor role, in the biology of the honey bee. Copyright © 1994. Published by Elsevier B.V.

  11. Joint deconvolution and classification with applications to passive acoustic underwater multipath.

    PubMed

    Anderson, Hyrum S; Gupta, Maya R

    2008-11-01

    This paper addresses the problem of classifying signals that have been corrupted by noise and unknown linear time-invariant (LTI) filtering such as multipath, given labeled uncorrupted training signals. A maximum a posteriori approach to the deconvolution and classification is considered, which produces estimates of the desired signal, the unknown channel, and the class label. For cases in which only a class label is needed, the classification accuracy can be improved by not committing to an estimate of the channel or signal. A variant of the quadratic discriminant analysis (QDA) classifier is proposed that probabilistically accounts for the unknown LTI filtering, and which avoids deconvolution. The proposed QDA classifier can work either directly on the signal or on features whose transformation by LTI filtering can be analyzed; as an example a classifier for subband-power features is derived. Results on simulated data and real Bowhead whale vocalizations show that jointly considering deconvolution with classification can dramatically improve classification performance over traditional methods over a range of signal-to-noise ratios.

  12. Comparing two metabolic profiling approaches (liquid chromatography and gas chromatography coupled to mass spectrometry) for extra-virgin olive oil phenolic compounds analysis: A botanical classification perspective.

    PubMed

    Bajoub, Aadil; Pacchiarotta, Tiziana; Hurtado-Fernández, Elena; Olmo-García, Lucía; García-Villalba, Rocío; Fernández-Gutiérrez, Alberto; Mayboroda, Oleg A; Carrasco-Pancorbo, Alegría

    2016-01-08

    Over the last decades, the phenolic compounds from virgin olive oil (VOO) have become the subject of intensive research because of their biological activities and their influence on some of the most relevant attributes of this interesting matrix. Developing metabolic profiling approaches to determine them in monovarietal virgin olive oils could help to gain a deeper insight into the phenolic composition of olive oil as well as to promote their use for botanical origin tracing purposes. To this end, two approaches (LC-ESI-TOF MS and GC-APCI-TOF MS) were comparatively investigated to evaluate their capacity to properly classify 25 olive oil samples belonging to five different varieties (Arbequina, Cornicabra, Hojiblanca, Frantoio and Picual), using the entire chromatographic phenolic profiles combined with chemometrics (principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA)). The application of PCA to LC-MS and GC-MS data showed the natural clustering of the samples, with two varieties (Arbequina and Frantoio) dominating the models and suppressing any possible discrimination among the other cultivars. Afterwards, PLS-DA was used to build four different efficient predictive models for varietal classification of the samples under study. The varietal markers pointed out by each platform were compared. In general, with the exception of one GC-MS model, all exhibited proper quality parameters. The models constructed by using the LC-MS data demonstrated superior classification ability. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Automated fine structure image analysis method for discrimination of diabetic retinopathy stage using conjunctival microvasculature images

    PubMed Central

    Khansari, Maziyar M; O’Neill, William; Penn, Richard; Chau, Felix; Blair, Norman P; Shahidi, Mahnaz

    2016-01-01

    The conjunctiva is a densely vascularized mucous membrane covering the sclera of the eye with a unique advantage of accessibility for direct visualization and non-invasive imaging. The purpose of this study is to apply an automated quantitative method for discrimination of different stages of diabetic retinopathy (DR) using conjunctival microvasculature images. Fine structural analysis of conjunctival microvasculature images was performed by ordinary least squares regression and Fisher linear discriminant analysis. Conjunctival images between groups of non-diabetic and diabetic subjects at different stages of DR were discriminated. The automated method’s discrimination rates were higher than those determined by human observers. The method allowed sensitive and rapid discrimination by assessment of conjunctival microvasculature images and can be potentially useful for DR screening and monitoring. PMID:27446692
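
    As an illustration of the Fisher linear discriminant step named in this abstract, the following is a minimal two-class sketch in two dimensions. The feature values are hypothetical and are not taken from the study; the function names (`fisher_direction`, `classify`) and the midpoint threshold are illustrative assumptions, not the authors' implementation.

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def mean_vec(rows):
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(2)]

def scatter(rows, m):
    """Within-class scatter matrix (2x2) around the class mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s

def fisher_direction(class0, class1):
    """Return (w, threshold) with w = Sw^{-1} (m1 - m0)."""
    m0, m1 = mean_vec(class0), mean_vec(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[s0[a][b] + s1[a][b] for b in range(2)] for a in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [dot(inv[0], diff), dot(inv[1], diff)]
    # Assumption: decide at the midpoint of the projected class means.
    threshold = 0.5 * (dot(w, m0) + dot(w, m1))
    return w, threshold

def classify(x, w, threshold):
    return 1 if dot(w, x) > threshold else 0

# Hypothetical 2-D feature vectors for two groups.
group0 = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (2.0, 2.5)]
group1 = [(4.0, 5.0), (5.0, 4.0), (4.5, 5.5), (5.5, 4.5)]
w, threshold = fisher_direction(group0, group1)
labels0 = [classify(x, w, threshold) for x in group0]
labels1 = [classify(x, w, threshold) for x in group1]
```

    The direction `w` maximizes the ratio of between-class to within-class scatter along the projection; with more than two features the explicit 2x2 inverse would be replaced by a general linear solve.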

  14. Principal Component Analysis for pulse-shape discrimination of scintillation radiation detectors

    NASA Astrophysics Data System (ADS)

    Alharbi, T.

    2016-01-01

    In this paper, we report on the application of principal component analysis (PCA) for pulse-shape discrimination (PSD) of scintillation radiation detectors. The details of the method are described and its performance is experimentally examined by discriminating between neutrons and gamma-rays with a liquid scintillation detector in a mixed radiation field. The performance of the method is also compared against that of the conventional charge-comparison method, demonstrating the superior performance of the method, particularly in the low light output range. PCA has the important advantage of automatic extraction of the pulse-shape characteristics, which makes the PSD method directly applicable to various scintillation detectors without the need for the adjustment of a PSD parameter.
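
    For context on the baseline mentioned in this abstract, the conventional charge-comparison method can be sketched as a tail-to-total charge ratio. The sketch below uses synthetic two-exponential pulses; the decay constants, slow-component fractions, and the gate position `TAIL_START` are all illustrative assumptions, not values from the paper. The gate position is the kind of hand-tuned PSD parameter that the abstract says PCA avoids.

```python
import math

def charge_comparison_ratio(pulse, tail_start):
    """Tail-to-total charge ratio; tail_start is the hand-tuned gate index."""
    total = sum(pulse)
    if total <= 0:
        raise ValueError("pulse must have positive total charge")
    return sum(pulse[tail_start:]) / total

def synthetic_pulse(slow_fraction, n=200, fast_tau=5.0, slow_tau=50.0):
    """Hypothetical pulse: a fast plus a slow exponential decay component."""
    return [(1.0 - slow_fraction) * math.exp(-t / fast_tau)
            + slow_fraction * math.exp(-t / slow_tau)
            for t in range(n)]

# Assumption: neutron-like pulses carry a larger slow component than
# gamma-like pulses, as is typical for liquid scintillators.
gamma_like = synthetic_pulse(slow_fraction=0.02)
neutron_like = synthetic_pulse(slow_fraction=0.10)

TAIL_START = 20  # hypothetical gate position, in samples
gamma_ratio = charge_comparison_ratio(gamma_like, TAIL_START)
neutron_ratio = charge_comparison_ratio(neutron_like, TAIL_START)
```

    A threshold on this ratio separates the two pulse types, but the best gate position depends on the detector, which is why the ratio has to be re-tuned for each setup.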

  15. DTI measurements for Alzheimer’s classification

    NASA Astrophysics Data System (ADS)

    Maggipinto, Tommaso; Bellotti, Roberto; Amoroso, Nicola; Diacono, Domenico; Donvito, Giacinto; Lella, Eufemia; Monaco, Alfonso; Scelsi, Marzia Antonella; Tangaro, Sabina; Alzheimer's Disease Neuroimaging Initiative

    2017-03-01

    Diffusion tensor imaging (DTI) is a promising imaging technique that provides insight into white matter microstructure integrity, and it has greatly helped to identify white matter regions affected by Alzheimer’s disease (AD) in its early stages. DTI can therefore be a valuable source of information when designing machine-learning strategies to discriminate between healthy control (HC) subjects, AD patients and subjects with mild cognitive impairment (MCI). Nonetheless, several studies have so far reported conflicting results, especially because of the adoption of biased feature selection strategies. In this paper, we first analyzed DTI scans of 150 subjects from the Alzheimer’s disease neuroimaging initiative (ADNI) database. We measured a significant effect of the feature selection bias on the classification performance (p-value < 0.01), leading to overoptimistic results (10% up to 30% relative increase in AUC). We observed that this effect is manifest regardless of the choice of diffusion index, specifically fractional anisotropy and mean diffusivity. Secondly, we performed a test on an independent mixed cohort consisting of 119 ADNI scans; thus, we evaluated the informative content provided by DTI measurements for AD classification. Classification performances and biological insight, concerning brain regions related to the disease, provided by cross-validation analysis were both confirmed on the independent test.

  16. Evaluation of Classifier Performance for Multiclass Phenotype Discrimination in Untargeted Metabolomics.

    PubMed

    Trainor, Patrick J; DeFilippis, Andrew P; Rai, Shesh N

    2017-06-21

    Statistical classification is a critical component of utilizing metabolomics data for examining the molecular determinants of phenotypes. Despite this, a comprehensive and rigorous evaluation of the accuracy of classification techniques for phenotype discrimination given metabolomics data has not been conducted. We conducted such an evaluation using both simulated and real metabolomics datasets, comparing Partial Least Squares-Discriminant Analysis (PLS-DA), Sparse PLS-DA, Random Forests, Support Vector Machines (SVM), Artificial Neural Network, k-Nearest Neighbors (k-NN), and Naïve Bayes classification techniques for discrimination. We evaluated the techniques on simulated data generated to mimic global untargeted metabolomics data by incorporating realistic block-wise correlation and partial correlation structures for mimicking the correlations and metabolite clustering generated by biological processes. Over the simulation studies, covariance structures, means, and effect sizes were stochastically varied to provide consistent estimates of classifier performance over a wide range of possible scenarios. The effects of the presence of non-normal error distributions, the introduction of biological and technical outliers, unbalanced phenotype allocation, missing values due to abundances below a limit of detection, and the effect of prior-significance filtering (dimension reduction) were evaluated via simulation. In each simulation, classifier parameters, such as the number of hidden nodes in a Neural Network, were optimized by cross-validation to minimize the probability of detecting spurious results due to poorly tuned classifiers. Classifier performance was then evaluated using real metabolomics datasets of varying sample medium, sample size, and experimental design. We report that in the most realistic simulation studies that incorporated non-normal error distributions, unbalanced phenotype allocation, outliers, missing values, and dimension reduction

  17. Detection and classification of concealed weapons using a magnetometer-based portal

    NASA Astrophysics Data System (ADS)

    Kotter, Dale K.; Roybal, Lyle G.; Polk, Robert E.

    2002-08-01

    A concealed weapons detection technology was developed through the support of the National Institute of Justice (NIJ) to provide a non-intrusive means for rapid detection, location, and archiving of data (including visual) of potential suspects and weapon threats. This technology, developed by the Idaho National Engineering and Environmental Laboratory (INEEL), has been applied in a portal-style weapons detection system using passive magnetic sensors as its basis. This paper will report on enhancements to the weapon detection system to enable weapon classification and to discriminate threats from non-threats. Advanced signal processing algorithms were used to analyze the magnetic spectrum generated when a person passes through a portal. These algorithms analyzed multiple variables, including variance in the magnetic signature from random weapon placement and/or orientation. They performed pattern recognition and calculated the probability that the collected magnetic signature correlates to a known database of weapon versus non-weapon responses. Neural networks were used to further discriminate weapon type and identify controlled electronic items such as cell phones and pagers. False alarms were further reduced by analyzing the magnetic detector response using a Joint Time Frequency Analysis digital signal processing technique. The frequency components and power spectrum for a given sensor response were derived. This unique fingerprint provided additional information to aid in signal analysis. This technology has the potential to produce major improvements in weapon detection and classification.

  18. Using spectrotemporal indices to improve the fruit-tree crop classification accuracy

    NASA Astrophysics Data System (ADS)

    Peña, M. A.; Liao, R.; Brenning, A.

    2017-06-01

    This study assesses the potential of spectrotemporal indices derived from satellite image time series (SITS) to improve the classification accuracy of fruit-tree crops. Six major fruit-tree crop types in the Aconcagua Valley, Chile, were classified by applying various linear discriminant analysis (LDA) techniques on a Landsat-8 time series of nine images corresponding to the 2014-15 growing season. As features we not only used the complete spectral resolution of the SITS, but also all possible normalized difference indices (NDIs) that can be constructed from any two bands of the time series, a novel approach to derive features from SITS. Due to the high dimensionality of this "enhanced" feature set we used the lasso and ridge penalized variants of LDA (PLDA). Although classification accuracies yielded by the standard LDA applied on the full-band SITS were good (misclassification error rate, MER = 0.13), they were further improved by 23% (MER = 0.10) with ridge PLDA using the enhanced feature set. The most important bands to discriminate the crops of interest were mainly concentrated on the first two image dates of the time series, corresponding to the crops' greenup stage. Despite the high predictor weights provided by the red and near infrared bands, typically used to construct greenness spectral indices, other spectral regions were also found important for the discrimination, such as the shortwave infrared band at 2.11-2.19 μm, sensitive to foliar water changes. These findings support the usefulness of spectrotemporal indices in the context of SITS-based crop type classifications, which until now have been mainly constructed by the arithmetic combination of two bands of the same image date in order to derive greenness temporal profiles like those from the normalized difference vegetation index.
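
    The feature construction described in this abstract can be sketched as follows: a normalized difference index (NDI) is formed from every pair of band values, and the reported improvement is the relative reduction in misclassification error rate (MER). The reflectance values are hypothetical, and the function names and the zero-denominator convention are illustrative assumptions rather than the authors' code.

```python
from itertools import combinations

def normalized_difference_indices(bands):
    """Return {(i, j): (b_i - b_j) / (b_i + b_j)} for every pair i < j."""
    ndis = {}
    for (i, bi), (j, bj) in combinations(enumerate(bands), 2):
        denom = bi + bj
        # Assumption: define the index as 0.0 when both bands are zero.
        ndis[(i, j)] = (bi - bj) / denom if denom != 0 else 0.0
    return ndis

def relative_error_reduction(mer_before, mer_after):
    """Relative reduction in misclassification error rate."""
    return (mer_before - mer_after) / mer_before

# Hypothetical reflectances for one pixel; in a time series the list would
# hold one entry per band per image date.
pixel = [0.05, 0.08, 0.40]
ndis = normalized_difference_indices(pixel)

# MER values quoted in the abstract: 0.13 (standard LDA) and 0.10 (ridge PLDA).
improvement = relative_error_reduction(0.13, 0.10)
```

    With n band values the enhanced set gains n(n-1)/2 indices, which grows quadratically and is the reason the abstract gives for using penalized LDA variants. The quoted 23% improvement corresponds to (0.13 - 0.10) / 0.13.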

  19. Modeling time-to-event (survival) data using classification tree analysis.

    PubMed

    Linden, Ariel; Yarnold, Paul R

    2017-12-01

    Time to the occurrence of an event is often studied in health research. Survival analysis differs from other designs in that follow-up times for individuals who do not experience the event by the end of the study (called censored) are accounted for in the analysis. Cox regression is the standard method for analysing censored data, but the assumptions required of these models are easily violated. In this paper, we introduce classification tree analysis (CTA) as a flexible alternative for modelling censored data. Classification tree analysis is a "decision-tree"-like classification model that provides parsimonious, transparent (ie, easy to visually display and interpret) decision rules that maximize predictive accuracy, derives exact P values via permutation tests, and evaluates model cross-generalizability. Using empirical data, we identify all statistically valid, reproducible, longitudinally consistent, and cross-generalizable CTA survival models and then compare their predictive accuracy to estimates derived via Cox regression and an unadjusted naïve model. Model performance is assessed using integrated Brier scores and a comparison between estimated survival curves. The Cox regression model best predicts average incidence of the outcome over time, whereas CTA survival models best predict either relatively high, or low, incidence of the outcome over time. Classification tree analysis survival models offer many advantages over Cox regression, such as explicit maximization of predictive accuracy, parsimony, statistical robustness, and transparency. Therefore, researchers interested in accurate prognoses and clear decision rules should consider developing models using the CTA-survival framework. © 2017 John Wiley & Sons, Ltd.
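
    The integrated Brier score used here to assess model performance can be sketched in a simplified form. The version below assumes no censored observations, so it omits the censoring weights that a full survival analysis would need, and it applies a single predicted survival curve to all subjects, as an unadjusted model would. The event times, time grid, and function names are hypothetical.

```python
def brier_score(event_times, survival_prob, t):
    """Mean squared gap between observed status at t and predicted S(t).

    Assumption: every event time is fully observed (no censoring).
    """
    errors = [((1.0 if T > t else 0.0) - survival_prob) ** 2
              for T in event_times]
    return sum(errors) / len(errors)

def integrated_brier_score(event_times, survival_fn, grid):
    """Trapezoidal average of the Brier score over an increasing time grid."""
    scores = [brier_score(event_times, survival_fn(t), t) for t in grid]
    area = sum(0.5 * (scores[k] + scores[k + 1]) * (grid[k + 1] - grid[k])
               for k in range(len(grid) - 1))
    return area / (grid[-1] - grid[0])

# Hypothetical event times (e.g. months) and evaluation grid.
times = [2.0, 5.0, 7.0, 11.0]
grid = [1.0, 3.0, 6.0, 9.0, 12.0]

def empirical_survival(t):
    """Fraction of subjects still event-free at time t."""
    return sum(1 for T in times if T > t) / len(times)

ibs_uninformative = integrated_brier_score(times, lambda t: 0.5, grid)
ibs_empirical = integrated_brier_score(times, empirical_survival, grid)
```

    Lower values indicate better calibrated survival predictions; a constant prediction of 0.5 gives exactly 0.25 at every time point, which serves as a reference for an uninformative model.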

  20. Improving oil classification quality from oil spill fingerprint beyond six sigma approach.

    PubMed

    Juahir, Hafizan; Ismail, Azimah; Mohamed, Saiful Bahri; Toriman, Mohd Ekhwan; Kassim, Azlina Md; Zain, Sharifuddin Md; Ahmad, Wan Kamaruzaman Wan; Wah, Wong Kok; Zali, Munirah Abdul; Retnam, Ananthy; Taib, Mohd Zaki Mohd; Mokhtar, Mazlin

    2017-07-15

    This study applies quality engineering to oil spill classification, based on oil spill fingerprinting from GC-FID and GC-MS, employing the six-sigma approach. The oil spills were recovered from various water areas of Peninsular Malaysia and Sabah (East Malaysia). The study used six sigma methodologies, which served effectively as a problem-solving tool for classifying oils extracted from the complex mixtures in the oil spill dataset. Linking the six sigma analysis with quality engineering improved organizational performance toward achieving the objectives of environmental forensics. The study reveals that oil spills are discriminated into four groups, viz. diesel, hydrocarbon fuel oil (HFO), mixture oil lubricant and fuel oil (MOLFO) and waste oil (WO), according to the similarity of their intrinsic chemical properties. Validation confirmed that the four discriminant components (diesel, HFO, MOLFO and WO) dominate the oil types with a total variance of 99.51%, with ANOVA giving F stat > F critical at the 95% confidence level and a chi-square goodness test value of 74.87. The results of this study reveal that employing the six-sigma approach in a data-driven problem, such as oil spill classification, can expedite good decision making. Copyright © 2017. Published by Elsevier Ltd.