Discriminant forest classification method and system
Chen, Barry Y.; Hanley, William G.; Lemmond, Tracy D.; Hiller, Lawrence J.; Knapp, David A.; Mugge, Marshall J.
2012-11-06
A hybrid machine learning methodology and system for classification that combines classical random forest (RF) methodology with discriminant analysis (DA) techniques to provide enhanced classification capability. A DA technique which uses feature measurements of an object to predict its class membership, such as linear discriminant analysis (LDA) or Andersen-Bahadur linear discriminant technique (AB), is used to split the data at each node in each of its classification trees to train and grow the trees and the forest. When training is finished, a set of n DA-based decision trees of a discriminant forest is produced for use in predicting the classification of new samples of unknown class.
NASA Astrophysics Data System (ADS)
Kurniawan, Dian; Suparti; Sugito
2018-05-01
Population growth in Indonesia has increased every year. According to the population census conducted by the Central Bureau of Statistics (BPS) in 2010, the population of Indonesia has reached 237.6 million people. Therefore, to control the population growth rate, the government hold Family Planning or Keluarga Berencana (KB) program for couples of childbearing age. The purpose of this program is to improve the health of mothers and children in order to manifest prosperous society by controlling births while ensuring control of population growth. The data used in this study is the updated family data of Semarang city in 2016 that conducted by National Family Planning Coordinating Board (BKKBN). From these data, classifiers with kernel discriminant analysis will be obtained, and also classification accuracy will be obtained from that method. The result of the analysis showed that normal kernel discriminant analysis gives 71.05 % classification accuracy with 28.95 % classification error. Whereas triweight kernel discriminant analysis gives 73.68 % classification accuracy with 26.32 % classification error. Using triweight kernel discriminant for data preprocessing of family planning participation of childbearing age couples in Semarang City of 2016 can be stated better than with normal kernel discriminant.
Monakhova, Yulia B; Godelmann, Rolf; Kuballa, Thomas; Mushtakova, Svetlana P; Rutledge, Douglas N
2015-08-15
Discriminant analysis (DA) methods, such as linear discriminant analysis (LDA) or factorial discriminant analysis (FDA), are well-known chemometric approaches for solving classification problems in chemistry. In most applications, principle components analysis (PCA) is used as the first step to generate orthogonal eigenvectors and the corresponding sample scores are utilized to generate discriminant features for the discrimination. Independent components analysis (ICA) based on the minimization of mutual information can be used as an alternative to PCA as a preprocessing tool for LDA and FDA classification. To illustrate the performance of this ICA/DA methodology, four representative nuclear magnetic resonance (NMR) data sets of wine samples were used. The classification was performed regarding grape variety, year of vintage and geographical origin. The average increase for ICA/DA in comparison with PCA/DA in the percentage of correct classification varied between 6±1% and 8±2%. The maximum increase in classification efficiency of 11±2% was observed for discrimination of the year of vintage (ICA/FDA) and geographical origin (ICA/LDA). The procedure to determine the number of extracted features (PCs, ICs) for the optimum DA models was discussed. The use of independent components (ICs) instead of principle components (PCs) resulted in improved classification performance of DA methods. The ICA/LDA method is preferable to ICA/FDA for recognition tasks based on NMR spectroscopic measurements. Copyright © 2015 Elsevier B.V. All rights reserved.
Study on bayes discriminant analysis of EEG data.
Shi, Yuan; He, DanDan; Qin, Fang
2014-01-01
In this paper, we have done Bayes Discriminant analysis to EEG data of experiment objects which are recorded impersonally come up with a relatively accurate method used in feature extraction and classification decisions. In accordance with the strength of α wave, the head electrodes are divided into four species. In use of part of 21 electrodes EEG data of 63 people, we have done Bayes Discriminant analysis to EEG data of six objects. Results In use of part of EEG data of 63 people, we have done Bayes Discriminant analysis, the electrode classification accuracy rates is 64.4%. Bayes Discriminant has higher prediction accuracy, EEG features (mainly αwave) extract more accurate. Bayes Discriminant would be better applied to the feature extraction and classification decisions of EEG data.
NASA Astrophysics Data System (ADS)
Aidi, Muhammad Nur; Sari, Resty Indah
2012-05-01
A decision of credit that given by bank or another creditur must have a risk and it called credit risk. Credit risk is an investor's risk of loss arising from a borrower who does not make payments as promised. The substantial of credit risk can lead to losses for the banks and the debtor. To minimize this problem need a further study to identify a potential new customer before the decision given. Identification of debtor can using various approaches analysis, one of them is by using discriminant analysis. Discriminant analysis in this study are used to classify whether belonging to the debtor's good credit or bad credit. The result of this study are two discriminant functions that can identify new debtor. Before step built the discriminant function, selection of explanatory variables should be done. Purpose of selection independent variable is to choose the variable that can discriminate the group maximally. Selection variables in this study using different test, for categoric variable selection of variable using proportion chi-square test, and stepwise discriminant for numeric variable. The result of this study are two discriminant functions that can identify new debtor. The selected variables that can discriminating two groups of debtor maximally are status of existing checking account, credit history, credit amount, installment rate in percentage of disposable income, sex, age in year, other installment plans, and number of people being liable to provide maintenance. This classification produce a classification accuracy rate is good enough, that is equal to 74,70%. Debtor classification using discriminant analysis has risk level that is small enough, and it ranged beetwen 14,992% and 17,608%. Based on that credit risk rate, using discriminant analysis on the classification of credit status can be used effectively.
Spatial-temporal discriminant analysis for ERP-based brain-computer interface.
Zhang, Yu; Zhou, Guoxu; Zhao, Qibin; Jin, Jing; Wang, Xingyu; Cichocki, Andrzej
2013-03-01
Linear discriminant analysis (LDA) has been widely adopted to classify event-related potential (ERP) in brain-computer interface (BCI). Good classification performance of the ERP-based BCI usually requires sufficient data recordings for effective training of the LDA classifier, and hence a long system calibration time which however may depress the system practicability and cause the users resistance to the BCI system. In this study, we introduce a spatial-temporal discriminant analysis (STDA) to ERP classification. As a multiway extension of the LDA, the STDA method tries to maximize the discriminant information between target and nontarget classes through finding two projection matrices from spatial and temporal dimensions collaboratively, which reduces effectively the feature dimensionality in the discriminant analysis, and hence decreases significantly the number of required training samples. The proposed STDA method was validated with dataset II of the BCI Competition III and dataset recorded from our own experiments, and compared to the state-of-the-art algorithms for ERP classification. Online experiments were additionally implemented for the validation. The superior classification performance in using few training samples shows that the STDA is effective to reduce the system calibration time and improve the classification accuracy, thereby enhancing the practicability of ERP-based BCI.
NASA Technical Reports Server (NTRS)
Quattrochi, D. A.
1984-01-01
An initial analysis of LANDSAT 4 Thematic Mapper (TM) data for the discrimination of agricultural, forested wetland, and urban land covers is conducted using a scene of data collected over Arkansas and Tennessee. A classification of agricultural lands derived from multitemporal LANDSAT Multispectral Scanner (MSS) data is compared with a classification of TM data for the same area. Results from this comparative analysis show that the multitemporal MSS classification produced an overall accuracy of 80.91% while the TM classification yields an overall classification accuracy of 97.06% correct.
Fast classification of hazelnut cultivars through portable infrared spectroscopy and chemometrics
NASA Astrophysics Data System (ADS)
Manfredi, Marcello; Robotti, Elisa; Quasso, Fabio; Mazzucco, Eleonora; Calabrese, Giorgio; Marengo, Emilio
2018-01-01
The authentication and traceability of hazelnuts is very important for both the consumer and the food industry, to safeguard the protected varieties and the food quality. This study investigates the use of a portable FTIR spectrometer coupled to multivariate statistical analysis for the classification of raw hazelnuts. The method discriminates hazelnuts from different origins/cultivars based on differences of the signal intensities of their IR spectra. The multivariate classification methods, namely principal component analysis (PCA) followed by linear discriminant analysis (LDA) and partial least square discriminant analysis (PLS-DA), with or without variable selection, allowed a very good discrimination among the groups, with PLS-DA coupled to variable selection providing the best results. Due to the fast analysis, high sensitivity, simplicity and no sample preparation, the proposed analytical methodology could be successfully used to verify the cultivar of hazelnuts, and the analysis can be performed quickly and directly on site.
Jiménez-Carvelo, Ana M; González-Casado, Antonio; Pérez-Castaño, Estefanía; Cuadros-Rodríguez, Luis
2017-03-01
A new analytical method for the differentiation of olive oil from other vegetable oils using reversed-phase LC and applying chemometric techniques was developed. A 3 cm short column was used to obtain the chromatographic fingerprint of the methyl-transesterified fraction of each vegetable oil. The chromatographic analysis took only 4 min. The multivariate classification methods used were k-nearest neighbors, partial least-squares (PLS) discriminant analysis, one-class PLS, support vector machine classification, and soft independent modeling of class analogies. The discrimination of olive oil from other vegetable edible oils was evaluated by several classification quality metrics. Several strategies for the classification of the olive oil were used: one input-class, two input-class, and pseudo two input-class.
NASA Astrophysics Data System (ADS)
Zhu, Ying; Tan, Tuck Lee
2016-04-01
An effective and simple analytical method using Fourier transform infrared (FTIR) spectroscopy to distinguish wild-grown high-quality Ganoderma lucidum (G. lucidum) from cultivated one is of essential importance for its quality assurance and medicinal value estimation. Commonly used chemical and analytical methods using full spectrum are not so effective for the detection and interpretation due to the complex system of the herbal medicine. In this study, two penalized discriminant analysis models, penalized linear discriminant analysis (PLDA) and elastic net (Elnet),using FTIR spectroscopy have been explored for the purpose of discrimination and interpretation. The classification performances of the two penalized models have been compared with two widely used multivariate methods, principal component discriminant analysis (PCDA) and partial least squares discriminant analysis (PLSDA). The Elnet model involving a combination of L1 and L2 norm penalties enabled an automatic selection of a small number of informative spectral absorption bands and gave an excellent classification accuracy of 99% for discrimination between spectra of wild-grown and cultivated G. lucidum. Its classification performance was superior to that of the PLDA model in a pure L1 setting and outperformed the PCDA and PLSDA models using full wavelength. The well-performed selection of informative spectral features leads to substantial reduction in model complexity and improvement of classification accuracy, and it is particularly helpful for the quantitative interpretations of the major chemical constituents of G. lucidum regarding its anti-cancer effects.
Davis, Philip A.; Grolier, Maurice J.
1984-01-01
Landsat multispectral scanner (MSS) band and band-ratio databases of two scenes covering the Midyan region of northwestern Saudi Arabia were examined quantitatively and qualitatively to determine which databases best discriminate the geologic units of this semi-arid and arid region. Unsupervised, linear-discriminant cluster-analysis was performed on these two band-ratio combinations and on the MSS bands for both scenes. The results for granitoid-rock discrimination indicated that the classification images using the MSS bands are superior to the band-ratio classification images for two reasons, discussed in the paper. Yet, the effects of topography and material type (including desert varnish) on the MSS-band data produced ambiguities in the MSS-band classification results. However, these ambiguities were clarified by using a simulated natural-color image in conjunction with the MSS-band classification image.
Zhu, Ying; Tan, Tuck Lee
2016-04-15
An effective and simple analytical method using Fourier transform infrared (FTIR) spectroscopy to distinguish wild-grown high-quality Ganoderma lucidum (G. lucidum) from cultivated one is of essential importance for its quality assurance and medicinal value estimation. Commonly used chemical and analytical methods using full spectrum are not so effective for the detection and interpretation due to the complex system of the herbal medicine. In this study, two penalized discriminant analysis models, penalized linear discriminant analysis (PLDA) and elastic net (Elnet),using FTIR spectroscopy have been explored for the purpose of discrimination and interpretation. The classification performances of the two penalized models have been compared with two widely used multivariate methods, principal component discriminant analysis (PCDA) and partial least squares discriminant analysis (PLSDA). The Elnet model involving a combination of L1 and L2 norm penalties enabled an automatic selection of a small number of informative spectral absorption bands and gave an excellent classification accuracy of 99% for discrimination between spectra of wild-grown and cultivated G. lucidum. Its classification performance was superior to that of the PLDA model in a pure L1 setting and outperformed the PCDA and PLSDA models using full wavelength. The well-performed selection of informative spectral features leads to substantial reduction in model complexity and improvement of classification accuracy, and it is particularly helpful for the quantitative interpretations of the major chemical constituents of G. lucidum regarding its anti-cancer effects. Copyright © 2016 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Fan, Xitao; Wang, Lin
The Monte Carlo study compared the performance of predictive discriminant analysis (PDA) and that of logistic regression (LR) for the two-group classification problem. Prior probabilities were used for classification, but the cost of misclassification was assumed to be equal. The study used a fully crossed three-factor experimental design (with…
Singularity and Nonnormality in the Classification of Compositional Data
Bohling, Geoffrey C.; Davis, J.C.; Olea, R.A.; Harff, Jan
1998-01-01
Geologists may want to classify compositional data and express the classification as a map. Regionalized classification is a tool that can be used for this purpose, but it incorporates discriminant analysis, which requires the computation and inversion of a covariance matrix. Covariance matrices of compositional data always will be singular (noninvertible) because of the unit-sum constraint. Fortunately, discriminant analyses can be calculated using a pseudo-inverse of the singular covariance matrix; this is done automatically by some statistical packages such as SAS. Granulometric data from the Darss Sill region of the Baltic Sea is used to explore how the pseudo-inversion procedure influences discriminant analysis results, comparing the algorithm used by SAS to the more conventional Moore-Penrose algorithm. Logratio transforms have been recommended to overcome problems associated with analysis of compositional data, including singularity. A regionalized classification of the Darss Sill data after logratio transformation is different only slightly from one based on raw granulometric data, suggesting that closure problems do not influence severely regionalized classification of compositional data.
Finn, James E.; Burger, Carl V.; Holland-Bartels, Leslie E.
1997-01-01
We used otolith banding patterns formed during incubation to discriminate among hatchery- and wild-incubated fry of sockeye salmon Oncorhynchus nerka from Tustumena Lake, Alaska. Fourier analysis of otolith luminance profiles was used to describe banding patterns: the amplitudes of individual Fourier harmonics were discriminant variables. Correct classification of otoliths to either hatchery or wild origin was 83.1% (cross-validation) and 72.7% (test data) with the use of quadratic discriminant function analysts on 10 Fourier amplitudes. Overall classification rates among the six test groups (one hatchery and five wild groups) were 46.5% (cross-validation) and 39.3% (test data) with the use of linear discriminant function analysis on 16 Fourier amplitudes. Although classification rates for wild-incubated fry from any one site never exceeded 67% (cross-validation) or 60% (test data), location-specific information was evident for all groups because the probability of classifying an individual to its true incubation location was significantly greater than chance. Results indicate phenotypic differences in otolith microstructure among incubation sites separated by less than 10 km. Analysis of otolith luminance profiles is a potentially useful technique for discriminating among and between various populations of hatchery and wild fish.
Canizo, Brenda V; Escudero, Leticia B; Pérez, María B; Pellerano, Roberto G; Wuilloud, Rodolfo G
2018-03-01
The feasibility of the application of chemometric techniques associated with multi-element analysis for the classification of grape seeds according to their provenance vineyard soil was investigated. Grape seed samples from different localities of Mendoza province (Argentina) were evaluated. Inductively coupled plasma mass spectrometry (ICP-MS) was used for the determination of twenty-nine elements (Ag, As, Ce, Co, Cs, Cu, Eu, Fe, Ga, Gd, La, Lu, Mn, Mo, Nb, Nd, Ni, Pr, Rb, Sm, Te, Ti, Tl, Tm, U, V, Y, Zn and Zr). Once the analytical data were collected, supervised pattern recognition techniques such as linear discriminant analysis (LDA), partial least square discriminant analysis (PLS-DA), k-nearest neighbors (k-NN), support vector machine (SVM) and Random Forest (RF) were applied to construct classification/discrimination rules. The results indicated that nonlinear methods, RF and SVM, perform best with up to 98% and 93% accuracy rate, respectively, and therefore are excellent tools for classification of grapes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Discriminative Nonlinear Analysis Operator Learning: When Cosparse Model Meets Image Classification.
Wen, Zaidao; Hou, Biao; Jiao, Licheng
2017-05-03
Linear synthesis model based dictionary learning framework has achieved remarkable performances in image classification in the last decade. Behaved as a generative feature model, it however suffers from some intrinsic deficiencies. In this paper, we propose a novel parametric nonlinear analysis cosparse model (NACM) with which a unique feature vector will be much more efficiently extracted. Additionally, we derive a deep insight to demonstrate that NACM is capable of simultaneously learning the task adapted feature transformation and regularization to encode our preferences, domain prior knowledge and task oriented supervised information into the features. The proposed NACM is devoted to the classification task as a discriminative feature model and yield a novel discriminative nonlinear analysis operator learning framework (DNAOL). The theoretical analysis and experimental performances clearly demonstrate that DNAOL will not only achieve the better or at least competitive classification accuracies than the state-of-the-art algorithms but it can also dramatically reduce the time complexities in both training and testing phases.
ERIC Educational Resources Information Center
Spearing, Debra; Woehlke, Paula
To assess the effect on discriminant analysis in terms of correct classification into two groups, the following parameters were systematically altered using Monte Carlo techniques: sample sizes; proportions of one group to the other; number of independent variables; and covariance matrices. The pairing of the off diagonals (or covariances) with…
Chance-corrected classification for use in discriminant analysis: Ecological applications
Titus, K.; Mosher, J.A.; Williams, B.K.
1984-01-01
A method for evaluating the classification table from a discriminant analysis is described. The statistic, kappa, is useful to ecologists in that it removes the effects of chance. It is useful even with equal group sample sizes although the need for a chance-corrected measure of prediction becomes greater with more dissimilar group sample sizes. Examples are presented.
Drivelos, Spiros A; Danezis, Georgios P; Haroutounian, Serkos A; Georgiou, Constantinos A
2016-12-15
This study examines the trace and rare earth elemental (REE) fingerprint variations of PDO (Protected Designation of Origin) "Fava Santorinis" over three consecutive harvesting years (2011-2013). Classification of samples in harvesting years was studied by performing discriminant analysis (DA), k nearest neighbours (κ-NN), partial least squares (PLS) analysis and probabilistic neural networks (PNN) using rare earth elements and trace metals determined using ICP-MS. DA performed better than κ-NN, producing 100% discrimination using trace elements and 79% using REEs. PLS was found to be superior to PNN, achieving 99% and 90% classification for trace and REEs, respectively, while PNN achieved 96% and 71% classification for trace and REEs, respectively. The information obtained using REEs did not enhance classification, indicating that REEs vary minimally per harvesting year, providing robust geographical origin discrimination. The results show that seasonal patterns can occur in the elemental composition of "Fava Santorinis", probably reflecting seasonality of climate. Copyright © 2016 Elsevier Ltd. All rights reserved.
Lee, Ga-Young; Kim, Jeonghun; Kim, Ju Han; Kim, Kiwoong; Seong, Joon-Kyung
2014-01-01
Mobile healthcare applications are becoming a growing trend. Also, the prevalence of dementia in modern society is showing a steady growing trend. Among degenerative brain diseases that cause dementia, Alzheimer disease (AD) is the most common. The purpose of this study was to identify AD patients using magnetic resonance imaging in the mobile environment. We propose an incremental classification for mobile healthcare systems. Our classification method is based on incremental learning for AD diagnosis and AD prediction using the cortical thickness data and hippocampus shape. We constructed a classifier based on principal component analysis and linear discriminant analysis. We performed initial learning and mobile subject classification. Initial learning is the group learning part in our server. Our smartphone agent implements the mobile classification and shows various results. With use of cortical thickness data analysis alone, the discrimination accuracy was 87.33% (sensitivity 96.49% and specificity 64.33%). When cortical thickness data and hippocampal shape were analyzed together, the achieved accuracy was 87.52% (sensitivity 96.79% and specificity 63.24%). In this paper, we presented a classification method based on online learning for AD diagnosis by employing both cortical thickness data and hippocampal shape analysis data. Our method was implemented on smartphone devices and discriminated AD patients for normal group.
NASA Astrophysics Data System (ADS)
Song, Biao; Lu, Dan; Peng, Ming; Li, Xia; Zou, Ye; Huang, Meizhen; Lu, Feng
2017-02-01
Raman spectroscopy is developed as a fast and non-destructive method for the discrimination and classification of hydroxypropyl methyl cellulose (HPMC) samples. 44 E series and 41 K series of HPMC samples are measured by a self-developed portable Raman spectrometer (Hx-Raman) which is excited by a 785 nm diode laser and the spectrum range is 200-2700 cm-1 with a resolution (FWHM) of 6 cm-1. Multivariate analysis is applied for discrimination of E series from K series. By methods of principal components analysis (PCA) and Fisher discriminant analysis (FDA), a discrimination result with sensitivity of 90.91% and specificity of 95.12% is achieved. The corresponding receiver operating characteristic (ROC) is 0.99, indicting the accuracy of the predictive model. This result demonstrates the prospect of portable Raman spectrometer for rapid, non-destructive classification and discrimination of E series and K series samples of HPMC.
ASTM clustering for improving coal analysis by near-infrared spectroscopy.
Andrés, J M; Bona, M T
2006-11-15
Multivariate analysis techniques have been applied to near-infrared (NIR) spectra coals to investigate the relationship between nine coal properties (moisture (%), ash (%), volatile matter (%), fixed carbon (%), heating value (kcal/kg), carbon (%), hydrogen (%), nitrogen (%) and sulphur (%)) and the corresponding predictor variables. In this work, a whole set of coal samples was grouped into six more homogeneous clusters following the ASTM reference method for classification prior to the application of calibration methods to each coal set. The results obtained showed a considerable improvement of the error determination compared with the calibration for the whole sample set. For some groups, the established calibrations approached the quality required by the ASTM/ISO norms for laboratory analysis. To predict property values for a new coal sample it is necessary the assignation of that sample to its respective group. Thus, the discrimination and classification ability of coal samples by Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS) in the NIR range was also studied by applying Soft Independent Modelling of Class Analogy (SIMCA) and Linear Discriminant Analysis (LDA) techniques. Modelling of the groups by SIMCA led to overlapping models that cannot discriminate for unique classification. On the other hand, the application of Linear Discriminant Analysis improved the classification of the samples but not enough to be satisfactory for every group considered.
NASA Astrophysics Data System (ADS)
Yu, Xin; Cao, Liang; Liu, Jinhu; Zhao, Bo; Shan, Xiujuan; Dou, Shuozeng
2014-09-01
We tested the use of otolith shape analysis to discriminate between species and stocks of five goby species ( Ctenotrypauchen chinensis, Odontamblyopus lacepedii, Amblychaeturichthys hexanema, Chaeturichthys stigmatias, and Acanthogobius hasta) found in northern Chinese coastal waters. The five species were well differentiated with high overall classification success using shape indices (83.7%), elliptic Fourier coefficients (98.6%), or the combination of both methods (94.9%). However, shape analysis alone was only moderately successful at discriminating among the four stocks (Liaodong Bay, LD; Bohai Bay, BH; Huanghe (Yellow) River estuary HRE, and Jiaozhou Bay, JZ stocks) of A. hasta (50%-54%) and C. stigmatias (65.7%-75.8%). For these two species, shape analysis was moderately successful at discriminating the HRE or JZ stocks from other stocks, but failed to effectively identify the LD and BH stocks. A large number of otoliths were misclassified between the HRE and JZ stocks, which are geographically well separated. The classification success for stock discrimination was higher using elliptic Fourier coefficients alone (70.2%) or in combination with shape indices (75.8%) than using only shape indices (65.7%) in C. stigmatias whereas there was little difference among the three methods for A. hasta. Our results supported the common belief that otolith shape analysis is generally more effective for interspecific identification than intraspecific discrimination. Moreover, compared with shape indices analysis, Fourier analysis improves classification success during inter- and intra-species discrimination by otolith shape analysis, although this did not necessarily always occur in all fish species.
MATRIX DISCRIMINANT ANALYSIS WITH APPLICATION TO COLORIMETRIC SENSOR ARRAY DATA
Suslick, Kenneth S.
2014-01-01
With the rapid development of nano-technology, a “colorimetric sensor array” (CSA) which is referred to as an optical electronic nose has been developed for the identification of toxicants. Unlike traditional sensors which rely on a single chemical interaction, CSA can measure multiple chemical interactions by using chemo-responsive dyes. The color changes of the chemo-responsive dyes are recorded before and after exposure to toxicants and serve as a template for classification. The color changes are digitalized in the form of a matrix with rows representing dye effects and columns representing the spectrum of colors. Thus, matrix-classification methods are highly desirable. In this article, we develop a novel classification method, matrix discriminant analysis (MDA), which is a generalization of linear discriminant analysis (LDA) for the data in matrix form. By incorporating the intrinsic matrix-structure of the data in discriminant analysis, the proposed method can improve CSA’s sensitivity and more importantly, specificity. A penalized MDA method, PMDA, is also introduced to further incorporate sparsity structure in discriminant function. Numerical studies suggest that the proposed MDA and PMDA methods outperform LDA and other competing discriminant methods for matrix predictors. The asymptotic consistency of MDA is also established. R code and data are available online as supplementary material. PMID:26783371
Kernel PLS-SVC for Linear and Nonlinear Discrimination
NASA Technical Reports Server (NTRS)
Rosipal, Roman; Trejo, Leonard J.; Matthews, Bryan
2003-01-01
A new methodology for discrimination is proposed. This is based on kernel orthonormalized partial least squares (PLS) dimensionality reduction of the original data space followed by support vector machines for classification. Close connection of orthonormalized PLS and Fisher's approach to linear discrimination or equivalently with canonical correlation analysis is described. This gives preference to use orthonormalized PLS over principal component analysis. Good behavior of the proposed method is demonstrated on 13 different benchmark data sets and on the real world problem of the classification finger movement periods versus non-movement periods based on electroencephalogram.
NASA Astrophysics Data System (ADS)
Szuflitowska, B.; Orlowski, P.
2017-08-01
Automated detection system consists of two key steps: extraction of features from EEG signals and classification for detection of pathology activity. The EEG sequences were analyzed using Short-Time Fourier Transform and the classification was performed using Linear Discriminant Analysis. The accuracy of the technique was tested on three sets of EEG signals: epilepsy, healthy and Alzheimer's Disease. The classification error below 10% has been considered a success. The higher accuracy are obtained for new data of unknown classes than testing data. The methodology can be helpful in differentiation epilepsy seizure and disturbances in the EEG signal in Alzheimer's Disease.
NASA Astrophysics Data System (ADS)
YangDai, Tianyi; Zhang, Li
2016-02-01
Energy dispersive X-ray diffraction (EDXRD) combined with hybrid discriminant analysis (HDA) has been utilized for classifying the liquid materials for the first time. The XRD spectra of 37 kinds of liquid contrabands and daily supplies were obtained using an EDXRD test bed facility. The unique spectra of different samples reveal XRD's capability to distinguish liquid contrabands from daily supplies. In order to create a system to detect liquid contrabands, the diffraction spectra were subjected to HDA which is the combination of principal components analysis (PCA) and linear discriminant analysis (LDA). Experiments based on the leave-one-out method demonstrate that HDA is a practical method with higher classification accuracy and lower noise sensitivity than the other methods in this application. The study shows the great capability and potential of the combination of XRD and HDA for liquid contrabands classification.
Parametric Time-Frequency Analysis and Its Applications in Music Classification
NASA Astrophysics Data System (ADS)
Shen, Ying; Li, Xiaoli; Ma, Ngok-Wah; Krishnan, Sridhar
2010-12-01
Analysis of nonstationary signals, such as music signals, is a challenging task. The purpose of this study is to explore an efficient and powerful technique to analyze and classify music signals in higher frequency range (44.1 kHz). The pursuit methods are good tools for this purpose, but they aimed at representing the signals rather than classifying them as in Y. Paragakin et al., 2009. Among the pursuit methods, matching pursuit (MP), an adaptive true nonstationary time-frequency signal analysis tool, is applied for music classification. First, MP decomposes the sample signals into time-frequency functions or atoms. Atom parameters are then analyzed and manipulated, and discriminant features are extracted from atom parameters. Besides the parameters obtained using MP, an additional feature, central energy, is also derived. Linear discriminant analysis and the leave-one-out method are used to evaluate the classification accuracy rate for different feature sets. The study is one of the very few works that analyze atoms statistically and extract discriminant features directly from the parameters. From our experiments, it is evident that the MP algorithm with the Gabor dictionary decomposes nonstationary signals, such as music signals, into atoms in which the parameters contain strong discriminant information sufficient for accurate and efficient signal classifications.
Theory and analysis of statistical discriminant techniques as applied to remote sensing data
NASA Technical Reports Server (NTRS)
Odell, P. L.
1973-01-01
Classification of remote earth resources sensing data according to normed exponential density statistics is reported. The use of density models appropriate for several physical situations provides an exact solution for the probabilities of classifications associated with the Bayes discriminant procedure even when the covariance matrices are unequal.
USDA-ARS?s Scientific Manuscript database
Fisher’s linear discriminant (FLD) models for wheat variety classification were developed and validated. The inputs to the FLD models were the capacitance (C), impedance (Z), and phase angle ('), measured at two frequencies. Classification of wheat varieties was obtained as output of the FLD mod...
Discrimination Enhancement with Transient Feature Analysis of a Graphene Chemical Sensor.
Nallon, Eric C; Schnee, Vincent P; Bright, Collin J; Polcha, Michael P; Li, Qiliang
2016-01-19
A graphene chemical sensor is subjected to a set of structurally and chemically similar hydrocarbon compounds consisting of toluene, o-xylene, p-xylene, and mesitylene. The fractional change in resistance of the sensor upon exposure to these compounds exhibits a similar response magnitude among compounds, whereas large variation is observed within repetitions for each compound, causing a response overlap. Therefore, traditional features depending on maximum response change will cause confusion during further discrimination and classification analysis. More robust features that are less sensitive to concentration, sampling, and drift variability would provide higher quality information. In this work, we have explored the advantage of using transient-based exponential fitting coefficients to enhance the discrimination of similar compounds. The advantages of such feature analysis to discriminate each compound is evaluated using principle component analysis (PCA). In addition, machine learning-based classification algorithms were used to compare the prediction accuracies when using fitting coefficients as features. The additional features greatly enhanced the discrimination between compounds while performing PCA and also improved the prediction accuracy by 34% when using linear discrimination analysis.
NASA Astrophysics Data System (ADS)
He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei
2015-02-01
A novel strategy which combines iteratively cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectroscopy of banned food additives, such as Sudan I dye and Rhodamine B in food, Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, using the combination of spectra preprocessing iteratively cubic spline fitting (ICSF) baseline correction with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, the DPLS classification can discriminate the class assignment of unknown banned additives using the information of differences in relative intensities. The results demonstrate that SERS spectroscopy combined with ICSF baseline correction method and exploratory analysis methodology DPLS classification can be potentially used for distinguishing the banned food additives in field of food safety.
Real-Time Classification of Exercise Exertion Levels Using Discriminant Analysis of HRV Data.
Jeong, In Cheol; Finkelstein, Joseph
2015-01-01
Heart rate variability (HRV) was shown to reflect activation of sympathetic nervous system however it is not clear which set of HRV parameters is optimal for real-time classification of exercise exertion levels. There is no studies that compared potential of two types of HRV parameters (time-domain and frequency-domain) in predicting exercise exertion level using discriminant analysis. The main goal of this study was to compare potential of HRV time-domain parameters versus HRV frequency-domain parameters in classifying exercise exertion level. Rest, exercise, and recovery categories were used in classification models. Overall 79.5% classification agreement by the time-domain parameters as compared to overall 52.8% classification agreement by frequency-domain parameters demonstrated that the time-domain parameters had higher potential in classifying exercise exertion levels.
ERIC Educational Resources Information Center
Cohen, Ira L.; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N. S.; Romanczyk, Raymond G.; Karmel, Bernard Z.; Gardner, Judith M.
2016-01-01
In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%,…
Aursand, Marit; Standal, Inger B; Praël, Angelika; McEvoy, Lesley; Irvine, Joe; Axelson, David E
2009-05-13
(13)C nuclear magnetic resonance (NMR) in combination with multivariate data analysis was used to (1) discriminate between farmed and wild Atlantic salmon ( Salmo salar L.), (2) discriminate between different geographical origins, and (3) verify the origin of market samples. Muscle lipids from 195 Atlantic salmon of known origin (wild and farmed salmon from Norway, Scotland, Canada, Iceland, Ireland, the Faroes, and Tasmania) in addition to market samples were analyzed by (13)C NMR spectroscopy and multivariate analysis. Both probabilistic neural networks (PNN) and support vector machines (SVM) provided excellent discrimination (98.5 and 100.0%, respectively) between wild and farmed salmon. Discrimination with respect to geographical origin was somewhat more difficult, with correct classification rates ranging from 82.2 to 99.3% by PNN and SVM, respectively. In the analysis of market samples, five fish labeled and purchased as wild salmon were classified as farmed salmon (indicating mislabeling), and there were also some discrepancies between the classification and the product declaration with regard to geographical origin.
2011-01-01
Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the larger overall classification accuracy (Median (Me) = 0.76) an area under the ROC (Me = 0.90). However this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73) specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72) specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most sensitivity was around or even lower than a median value of 0.5. Conclusions When taking into account sensitivity, specificity and overall classification accuracy Random Forests and Linear Discriminant analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of Dementia predictions from neuropsychological testing. PMID:21849043
Multi-level discriminative dictionary learning with application to large scale image classification.
Shen, Li; Sun, Gang; Huang, Qingming; Wang, Shuhui; Lin, Zhouchen; Wu, Enhua
2015-10-01
The sparse coding technique has shown flexibility and capability in image representation and analysis. It is a powerful tool in many visual applications. Some recent work has shown that incorporating the properties of task (such as discrimination for classification task) into dictionary learning is effective for improving the accuracy. However, the traditional supervised dictionary learning methods suffer from high computation complexity when dealing with large number of categories, making them less satisfactory in large scale applications. In this paper, we propose a novel multi-level discriminative dictionary learning method and apply it to large scale image classification. Our method takes advantage of hierarchical category correlation to encode multi-level discriminative information. Each internal node of the category hierarchy is associated with a discriminative dictionary and a classification model. The dictionaries at different layers are learnt to capture the information of different scales. Moreover, each node at lower layers also inherits the dictionary of its parent, so that the categories at lower layers can be described with multi-scale information. The learning of dictionaries and associated classification models is jointly conducted by minimizing an overall tree loss. The experimental results on challenging data sets demonstrate that our approach achieves excellent accuracy and competitive computation cost compared with other sparse coding methods for large scale image classification.
NASA Astrophysics Data System (ADS)
Phinyomark, A.; Hu, H.; Phukpattaranont, P.; Limsakul, C.
2012-01-01
The classification of upper-limb movements based on surface electromyography (EMG) signals is an important issue in the control of assistive devices and rehabilitation systems. Increasing the number of EMG channels and features in order to increase the number of control commands can yield a high dimensional feature vector. To cope with the accuracy and computation problems associated with high dimensionality, it is commonplace to apply a processing step that transforms the data to a space of significantly lower dimensions with only a limited loss of useful information. Linear discriminant analysis (LDA) has been successfully applied as an EMG feature projection method. Recently, a number of extended LDA-based algorithms have been proposed, which are more competitive in terms of both classification accuracy and computational costs/times with classical LDA. This paper presents the findings of a comparative study of classical LDA and five extended LDA methods. From a quantitative comparison based on seven multi-feature sets, three extended LDA-based algorithms, consisting of uncorrelated LDA, orthogonal LDA and orthogonal fuzzy neighborhood discriminant analysis, produce better class separability when compared with a baseline system (without feature projection), principle component analysis (PCA), and classical LDA. Based on a 7-dimension time domain and time-scale feature vectors, these methods achieved respectively 95.2% and 93.2% classification accuracy by using a linear discriminant classifier.
Stability and bias of classification rates in biological applications of discriminant analysis
Williams, B.K.; Titus, K.; Hines, J.E.
1990-01-01
We assessed the sampling stability of classification rates in discriminant analysis by using a factorial design with factors for multivariate dimensionality, dispersion structure, configuration of group means, and sample size. A total of 32,400 discriminant analyses were conducted, based on data from simulated populations with appropriate underlying statistical distributions. Simulation results indicated strong bias in correct classification rates when group sample sizes were small and when overlap among groups was high. We also found that stability of the correct classification rates was influenced by these factors, indicating that the number of samples required for a given level of precision increases with the amount of overlap among groups. In a review of 60 published studies, we found that 57% of the articles presented results on classification rates, though few of them mentioned potential biases in their results. Wildlife researchers should choose the total number of samples per group to be at least 2 times the number of variables to be measured when overlap among groups is low. Substantially more samples are required as the overlap among groups increases
NASA Astrophysics Data System (ADS)
Mohammadimanesh, F.; Salehi, B.; Mahdianpari, M.; Homayouni, S.
2016-06-01
Polarimetric Synthetic Aperture Radar (PolSAR) imagery is a complex multi-dimensional dataset, which is an important source of information for various natural resources and environmental classification and monitoring applications. PolSAR imagery produces valuable information by observing scattering mechanisms from different natural and man-made objects. Land cover mapping using PolSAR data classification is one of the most important applications of SAR remote sensing earth observations, which have gained increasing attention in the recent years. However, one of the most challenging aspects of classification is selecting features with maximum discrimination capability. To address this challenge, a statistical approach based on the Fisher Linear Discriminant Analysis (FLDA) and the incorporation of physical interpretation of PolSAR data into classification is proposed in this paper. After pre-processing of PolSAR data, including the speckle reduction, the H/α classification is used in order to classify the basic scattering mechanisms. Then, a new method for feature weighting, based on the fusion of FLDA and physical interpretation, is implemented. This method proves to increase the classification accuracy as well as increasing between-class discrimination in the final Wishart classification. The proposed method was applied to a full polarimetric C-band RADARSAT-2 data set from Avalon area, Newfoundland and Labrador, Canada. This imagery has been acquired in June 2015, and covers various types of wetlands including bogs, fens, marshes and shallow water. The results were compared with the standard Wishart classification, and an improvement of about 20% was achieved in the overall accuracy. This method provides an opportunity for operational wetland classification in northern latitude with high accuracy using only SAR polarimetric data.
ERIC Educational Resources Information Center
Zwick, Rebecca; Lenaburg, Lubella
2009-01-01
In certain data analyses (e.g., multiple discriminant analysis and multinomial log-linear modeling), classification decisions are made based on the estimated posterior probabilities that individuals belong to each of several distinct categories. In the Bayesian network literature, this type of classification is often accomplished by assigning…
Yang, Jun-Ho; Yoh, Jack J
2018-01-01
A novel technique is reported for separating overlapping latent fingerprints using chemometric approaches that combine laser-induced breakdown spectroscopy (LIBS) and multivariate analysis. The LIBS technique provides the capability of real time analysis and high frequency scanning as well as the data regarding the chemical composition of overlapping latent fingerprints. These spectra offer valuable information for the classification and reconstruction of overlapping latent fingerprints by implementing appropriate statistical multivariate analysis. The current study employs principal component analysis and partial least square methods for the classification of latent fingerprints from the LIBS spectra. This technique was successfully demonstrated through a classification study of four distinct latent fingerprints using classification methods such as soft independent modeling of class analogy (SIMCA) and partial least squares discriminant analysis (PLS-DA). The novel method yielded an accuracy of more than 85% and was proven to be sufficiently robust. Furthermore, through laser scanning analysis at a spatial interval of 125 µm, the overlapping fingerprints were reconstructed as separate two-dimensional forms.
He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei
2015-02-25
A novel strategy which combines iteratively cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectroscopy of banned food additives, such as Sudan I dye and Rhodamine B in food, Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, using the combination of spectra preprocessing iteratively cubic spline fitting (ICSF) baseline correction with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, the DPLS classification can discriminate the class assignment of unknown banned additives using the information of differences in relative intensities. The results demonstrate that SERS spectroscopy combined with ICSF baseline correction method and exploratory analysis methodology DPLS classification can be potentially used for distinguishing the banned food additives in field of food safety. Copyright © 2014 Elsevier B.V. All rights reserved.
A Comparison of Two-Group Classification Methods
ERIC Educational Resources Information Center
Holden, Jocelyn E.; Finch, W. Holmes; Kelley, Ken
2011-01-01
The statistical classification of "N" individuals into "G" mutually exclusive groups when the actual group membership is unknown is common in the social and behavioral sciences. The results of such classification methods often have important consequences. Among the most common methods of statistical classification are linear discriminant analysis,…
Multi-class ERP-based BCI data analysis using a discriminant space self-organizing map.
Onishi, Akinari; Natsume, Kiyohisa
2014-01-01
Emotional or non-emotional image stimulus is recently applied to event-related potential (ERP) based brain computer interfaces (BCI). Though the classification performance is over 80% in a single trial, a discrimination between those ERPs has not been considered. In this research we tried to clarify the discriminability of four-class ERP-based BCI target data elicited by desk, seal, spider images and letter intensifications. A conventional self organizing map (SOM) and newly proposed discriminant space SOM (ds-SOM) were applied, then the discriminabilites were visualized. We also classify all pairs of those ERPs by stepwise linear discriminant analysis (SWLDA) and verify the visualization of discriminabilities. As a result, the ds-SOM showed understandable visualization of the data with a shorter computational time than the traditional SOM. We also confirmed the clear boundary between the letter cluster and the other clusters. The result was coherent with the classification performances by SWLDA. The method might be helpful not only for developing a new BCI paradigm, but also for the big data analysis.
NASA Astrophysics Data System (ADS)
Luna, Aderval S.; da Silva, Arnaldo P.; Pinho, Jéssica S. A.; Ferré, Joan; Boqué, Ricard
Near infrared (NIR) spectroscopy and multivariate classification were applied to discriminate soybean oil samples into non-transgenic and transgenic. Principal Component Analysis (PCA) was applied to extract relevant features from the spectral data and to remove the anomalous samples. The best results were obtained when with Support Vectors Machine-Discriminant Analysis (SVM-DA) and Partial Least Squares-Discriminant Analysis (PLS-DA) after mean centering plus multiplicative scatter correction. For SVM-DA the percentage of successful classification was 100% for the training group and 100% and 90% in validation group for non transgenic and transgenic soybean oil samples respectively. For PLS-DA the percentage of successful classification was 95% and 100% in training group for non transgenic and transgenic soybean oil samples respectively and 100% and 80% in validation group for non transgenic and transgenic respectively. The results demonstrate that NIR spectroscopy can provide a rapid, nondestructive and reliable method to distinguish non-transgenic and transgenic soybean oils.
Jaiswara, Ranjana; Nandi, Diptarup; Balakrishnan, Rohini
2013-01-01
Traditional taxonomy based on morphology has often failed in accurate species identification owing to the occurrence of cryptic species, which are reproductively isolated but morphologically identical. Molecular data have thus been used to complement morphology in species identification. The sexual advertisement calls in several groups of acoustically communicating animals are species-specific and can thus complement molecular data as non-invasive tools for identification. Several statistical tools and automated identifier algorithms have been used to investigate the efficiency of acoustic signals in species identification. Despite a plethora of such methods, there is a general lack of knowledge regarding the appropriate usage of these methods in specific taxa. In this study, we investigated the performance of two commonly used statistical methods, discriminant function analysis (DFA) and cluster analysis, in identification and classification based on acoustic signals of field cricket species belonging to the subfamily Gryllinae. Using a comparative approach we evaluated the optimal number of species and calling song characteristics for both the methods that lead to most accurate classification and identification. The accuracy of classification using DFA was high and was not affected by the number of taxa used. However, a constraint in using discriminant function analysis is the need for a priori classification of songs. Accuracy of classification using cluster analysis, which does not require a priori knowledge, was maximum for 6-7 taxa and decreased significantly when more than ten taxa were analysed together. We also investigated the efficacy of two novel derived acoustic features in improving the accuracy of identification. Our results show that DFA is a reliable statistical tool for species identification using acoustic signals. Our results also show that cluster analysis of acoustic signals in crickets works effectively for species classification and identification.
Wang, Kun; Jiang, Tianzi; Liang, Meng; Wang, Liang; Tian, Lixia; Zhang, Xinqing; Li, Kuncheng; Liu, Zhening
2006-01-01
In this work, we proposed a discriminative model of Alzheimer's disease (AD) on the basis of multivariate pattern classification and functional magnetic resonance imaging (fMRI). This model used the correlation/anti-correlation coefficients of two intrinsically anti-correlated networks in resting brains, which have been suggested by two recent studies, as the feature of classification. Pseudo-Fisher Linear Discriminative Analysis (pFLDA) was then performed on the feature space and a linear classifier was generated. Using leave-one-out (LOO) cross validation, our results showed a correct classification rate of 83%. We also compared the proposed model with another one based on the whole brain functional connectivity. Our proposed model outperformed the other one significantly, and this implied that the two intrinsically anti-correlated networks may be a more susceptible part of the whole brain network in the early stage of AD.
Jiménez-Carvelo, Ana M; Pérez-Castaño, Estefanía; González-Casado, Antonio; Cuadros-Rodríguez, Luis
2017-04-15
A new method for differentiation of olive oil (independently of the quality category) from other vegetable oils (canola, safflower, corn, peanut, seeds, grapeseed, palm, linseed, sesame and soybean) has been developed. The analytical procedure for chromatographic fingerprinting of the methyl-transesterified fraction of each vegetable oil, using normal-phase liquid chromatography, is described and the chemometric strategies applied and discussed. Some chemometric methods, such as k-nearest neighbours (kNN), partial least squared-discriminant analysis (PLS-DA), support vector machine classification analysis (SVM-C), and soft independent modelling of class analogies (SIMCA), were applied to build classification models. Performance of the classification was evaluated and ranked using several classification quality metrics. The discriminant analysis, based on the use of one input-class, (plus a dummy class) was applied for the first time in this study. Copyright © 2016 Elsevier Ltd. All rights reserved.
Gender classification of running subjects using full-body kinematics
NASA Astrophysics Data System (ADS)
Williams, Christina M.; Flora, Jeffrey B.; Iftekharuddin, Khan M.
2016-05-01
This paper proposes novel automated gender classification of subjects while engaged in running activity. The machine learning techniques include preprocessing steps using principal component analysis followed by classification with linear discriminant analysis, and nonlinear support vector machines, and decision-stump with AdaBoost. The dataset consists of 49 subjects (25 males, 24 females, 2 trials each) all equipped with approximately 80 retroreflective markers. The trials are reflective of the subject's entire body moving unrestrained through a capture volume at a self-selected running speed, thus producing highly realistic data. The classification accuracy using leave-one-out cross validation for the 49 subjects is improved from 66.33% using linear discriminant analysis to 86.74% using the nonlinear support vector machine. Results are further improved to 87.76% by means of implementing a nonlinear decision stump with AdaBoost classifier. The experimental findings suggest that the linear classification approaches are inadequate in classifying gender for a large dataset with subjects running in a moderately uninhibited environment.
Sauerbruch, T; Ansari, H; Wotzka, R; Soehendra, N; Köpcke, W
1988-01-08
Prospective prognosis systems for predicting half-year death-rate after bleeding from oesophageal varices and sclerotherapy were tested on 129 patients. The receiver-operating-characteristic curves of three discriminant scores were compared with the Child-Pugh classification. It was found that the latter is still the best for prognosticating the course of the disease. A simplified discriminant score which contains as its only factors bilirubin and the Quick value does, however, give nearly as good information.
Lee, Sang Min; Kim, Hye-Jin; Jang, Young Pyo
2012-01-01
It needs many years of special training to gain expertise on the organoleptic classification of botanical raw materials and, even for those experts, discrimination among Umbelliferae medicinal herbs remains an intricate challenge due to their morphological similarity. To develop a new chemometric classification method using a direct analysis in real time-time of flight-mass spectrometry (DART-TOF-MS) fingerprinting for Umbelliferae medicinal herbs and to provide a platform for its application to the discrimination of other herbal medicines. Angelica tenuissima, Angelica gigas, Angelica dahurica and Cnidium officinale were chosen for this study and ten samples of each species were purchased from various Korean markets. DART-TOF-MS was employed on powdered raw materials to obtain a chemical fingerprint of each sample and the orthogonal partial-least squares method in discriminant analysis (OPLS-DA) was used for multivariate analysis. All samples of collected species were successfully discriminated from each other according to their characteristic DART-TOF-MS fingerprint. Decursin (or decursinol angelate) and byakangelicol were identified as marker molecules for Angelica gigas and A. dahurica, respectively. Using the OPLS method for discriminant analysis, Angelica tenuissima and Cnidium officinale were clearly separated into two groups. Angelica tenuissima was characterised by the presence of ligustilide and unidentified molecular ions of m/z 239 and 283, while senkyunolide A together with signals with m/z 387 and 389 were the marker compounds for Cnidium officinale. Elaborating with chemoinformatics, DART-TOF-MS fingerprinting with chemoinformatic tools results in a powerful method for the classification of morphologically similar Umbelliferae medicinal herbs and quality control of medicinal herbal products, including the extracts of these crude drugs. Copyright © 2012 John Wiley & Sons, Ltd.
Semi-supervised learning for ordinal Kernel Discriminant Analysis.
Pérez-Ortiz, M; Gutiérrez, P A; Carbonero-Ruz, M; Hervás-Martínez, C
2016-12-01
Ordinal classification considers those classification problems where the labels of the variable to predict follow a given order. Naturally, labelled data is scarce or difficult to obtain in this type of problems because, in many cases, ordinal labels are given by a user or expert (e.g. in recommendation systems). Firstly, this paper develops a new strategy for ordinal classification where both labelled and unlabelled data are used in the model construction step (a scheme which is referred to as semi-supervised learning). More specifically, the ordinal version of kernel discriminant learning is extended for this setting considering the neighbourhood information of unlabelled data, which is proposed to be computed in the feature space induced by the kernel function. Secondly, a new method for semi-supervised kernel learning is devised in the context of ordinal classification, which is combined with our developed classification strategy to optimise the kernel parameters. The experiments conducted compare 6 different approaches for semi-supervised learning in the context of ordinal classification in a battery of 30 datasets, showing (1) the good synergy of the ordinal version of discriminant analysis and the use of unlabelled data and (2) the advantage of computing distances in the feature space induced by the kernel function. Copyright © 2016 Elsevier Ltd. All rights reserved.
Ranking procedure for partial discriminant analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beckman, R.J.; Johnson, M.E.
1981-09-01
A rank procedure developed by Broffitt, Randles, and Hogg (1976) is modified to control the conditional probability of misclassification given that classification has been attempted. This modification leads to a useful solution to the two-population partial discriminant analysis problem for even moderately sized training sets.
NASA Astrophysics Data System (ADS)
Leka, K. D.; Barnes, Graham; Wagner, Eric
2018-04-01
A classification infrastructure built upon Discriminant Analysis (DA) has been developed at NorthWest Research Associates for examining the statistical differences between samples of two known populations. Originating to examine the physical differences between flare-quiet and flare-imminent solar active regions, we describe herein some details of the infrastructure including: parametrization of large datasets, schemes for handling "null" and "bad" data in multi-parameter analysis, application of non-parametric multi-dimensional DA, an extension through Bayes' theorem to probabilistic classification, and methods invoked for evaluating classifier success. The classifier infrastructure is applicable to a wide range of scientific questions in solar physics. We demonstrate its application to the question of distinguishing flare-imminent from flare-quiet solar active regions, updating results from the original publications that were based on different data and much smaller sample sizes. Finally, as a demonstration of "Research to Operations" efforts in the space-weather forecasting context, we present the Discriminant Analysis Flare Forecasting System (DAFFS), a near-real-time operationally-running solar flare forecasting tool that was developed from the research-directed infrastructure.
Feature Extraction of Electronic Nose Signals Using QPSO-Based Multiple KFDA Signal Processing
Wen, Tailai; Huang, Daoyu; Lu, Kun; Deng, Changjian; Zeng, Tanyue; Yu, Song; He, Zhiyi
2018-01-01
The aim of this research was to enhance the classification accuracy of an electronic nose (E-nose) in different detecting applications. During the learning process of the E-nose to predict the types of different odors, the prediction accuracy was not quite satisfying because the raw features extracted from sensors’ responses were regarded as the input of a classifier without any feature extraction processing. Therefore, in order to obtain more useful information and improve the E-nose’s classification accuracy, in this paper, a Weighted Kernels Fisher Discriminant Analysis (WKFDA) combined with Quantum-behaved Particle Swarm Optimization (QPSO), i.e., QWKFDA, was presented to reprocess the original feature matrix. In addition, we have also compared the proposed method with quite a few previously existing ones including Principal Component Analysis (PCA), Locality Preserving Projections (LPP), Fisher Discriminant Analysis (FDA) and Kernels Fisher Discriminant Analysis (KFDA). Experimental results proved that QWKFDA is an effective feature extraction method for E-nose in predicting the types of wound infection and inflammable gases, which shared much higher classification accuracy than those of the contrast methods. PMID:29382146
Feature Extraction of Electronic Nose Signals Using QPSO-Based Multiple KFDA Signal Processing.
Wen, Tailai; Yan, Jia; Huang, Daoyu; Lu, Kun; Deng, Changjian; Zeng, Tanyue; Yu, Song; He, Zhiyi
2018-01-29
The aim of this research was to enhance the classification accuracy of an electronic nose (E-nose) in different detecting applications. During the learning process of the E-nose to predict the types of different odors, the prediction accuracy was not quite satisfying because the raw features extracted from sensors' responses were regarded as the input of a classifier without any feature extraction processing. Therefore, in order to obtain more useful information and improve the E-nose's classification accuracy, in this paper, a Weighted Kernels Fisher Discriminant Analysis (WKFDA) combined with Quantum-behaved Particle Swarm Optimization (QPSO), i.e., QWKFDA, was presented to reprocess the original feature matrix. In addition, we have also compared the proposed method with quite a few previously existing ones including Principal Component Analysis (PCA), Locality Preserving Projections (LPP), Fisher Discriminant Analysis (FDA) and Kernels Fisher Discriminant Analysis (KFDA). Experimental results proved that QWKFDA is an effective feature extraction method for E-nose in predicting the types of wound infection and inflammable gases, which shared much higher classification accuracy than those of the contrast methods.
Local classification: Locally weighted-partial least squares-discriminant analysis (LW-PLS-DA).
Bevilacqua, Marta; Marini, Federico
2014-08-01
The possibility of devising a simple, flexible and accurate non-linear classification method, by extending the locally weighted partial least squares (LW-PLS) approach to the cases where the algorithm is used in a discriminant way (partial least squares discriminant analysis, PLS-DA), is presented. In particular, to assess which category an unknown sample belongs to, the proposed algorithm operates by identifying which training objects are most similar to the one to be predicted and building a PLS-DA model using these calibration samples only. Moreover, the influence of the selected training samples on the local model can be further modulated by adopting a not uniform distance-based weighting scheme which allows the farthest calibration objects to have less impact than the closest ones. The performances of the proposed locally weighted-partial least squares-discriminant analysis (LW-PLS-DA) algorithm have been tested on three simulated data sets characterized by a varying degree of non-linearity: in all cases, a classification accuracy higher than 99% on external validation samples was achieved. Moreover, when also applied to a real data set (classification of rice varieties), characterized by a high extent of non-linearity, the proposed method provided an average correct classification rate of about 93% on the test set. By the preliminary results, showed in this paper, the performances of the proposed LW-PLS-DA approach have proved to be comparable and in some cases better than those obtained by other non-linear methods (k nearest neighbors, kernel-PLS-DA and, in the case of rice, counterpropagation neural networks). Copyright © 2014 Elsevier B.V. All rights reserved.
Chen, Xue; Li, Xiaohui; Yang, Sibo; Yu, Xin; Liu, Aichun
2018-01-01
Lymphoma is a significant cancer that affects the human lymphatic and hematopoietic systems. In this work, discrimination of lymphoma using laser-induced breakdown spectroscopy (LIBS) conducted on whole blood samples is presented. The whole blood samples collected from lymphoma patients and healthy controls are deposited onto standard quantitative filter papers and ablated with a 1064 nm Q-switched Nd:YAG laser. 16 atomic and ionic emission lines of calcium (Ca), iron (Fe), magnesium (Mg), potassium (K) and sodium (Na) are selected to discriminate the cancer disease. Chemometric methods, including principal component analysis (PCA), linear discriminant analysis (LDA) classification, and k nearest neighbor (kNN) classification are used to build the discrimination models. Both LDA and kNN models have achieved very good discrimination performances for lymphoma, with an accuracy of over 99.7%, a sensitivity of over 0.996, and a specificity of over 0.997. These results demonstrate that the whole-blood-based LIBS technique in combination with chemometric methods can serve as a fast, less invasive, and accurate method for detection and discrimination of human malignancies. PMID:29541503
NASA Astrophysics Data System (ADS)
Meerdink, S.; Roberts, D. A.; Roth, K. L.
2015-12-01
Accurate knowledge of the spatial distribution of plant species is required for many research and management agendas that track ecosystem health. Because of this, there is continuous development of research focused on remotely-sensed species classifications for many diverse ecosystems. While plant species have been mapped using airborne imaging spectroscopy, the geographic extent has been limited due to data availability and spectrally similar species continue to be difficult to separate. The proposed Hyperspectral Infrared Imager (HyspIRI) space-borne mission, which includes a visible near infrared/shortwave infrared (VSWIR) imaging spectrometer and thermal infrared (TIR) multi-spectral imager, would present an opportunity to improve species discrimination over a much broader scale. Here we evaluate: 1) the capability of VSWIR and/or TIR spectra to discriminate plant species; 2) the accuracy of species classifications within an ecosystem; and 3) the potential for discriminating among species across a range of ecosystems. Simulated HyspIRI imagery was acquired in spring/summer of 2013 spanning from Santa Barbara to Bakersfield, CA with the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) and the MODIS/ASTER Airborne Simulator (MASTER) instruments. Three spectral libraries were created from these images: AVIRIS (224 bands from 0.4 - 2.5 µm), MASTER (8 bands from 7.5 - 12 µm), and AVIRIS + MASTER. We used canonical discriminant analysis (CDA) as a dimension reduction technique and then classified plant species using linear discriminant analysis (LDA). Our results show the inclusion of TIR spectra improved species discrimination, but only for plant species with emissivities departing from that of a gray body. Ecosystems with species that have high spectral contrast had higher classification accuracies. Mapping plant species across all ecosystems resulted in a classification with lower accuracies than a single ecosystem due to the complex nature of incorporating more plant species.
ERIC Educational Resources Information Center
Ahrens, Steve
Predictor variables that could be used effectively to place entering freshmen methematics students into courses of instruction in mathematics were investigated at West Virginia University. Multiple discriminant analysis was used with nearly 6,000 student records collected over a three-year period, and a series of predictive equations were…
The UXO Classification Demonstration at the Former Camp Butner, NC
2011-07-01
Symposium and Workshop, Technical Session 2D: Classification Methods for Military Munitions Response. 1 December 2010. [49] Pasion , L. Personal...Communication. 15 June 2011. [50] Pasion , L. “Practical Strategies for UXO Discrimination: Camp Butner Analysis.” ESTCP Munitions Management In-Progress...Review. 9 February 2011. [51] Pasion , L., et al. “UXO Discrimination Using Full Coverage and Cued Interrogation Data Sets at Camp Butner, NC.” Partners
Highly Accurate Classification of Watson-Crick Basepairs on Termini of Single DNA Molecules
Winters-Hilt, Stephen; Vercoutere, Wenonah; DeGuzman, Veronica S.; Deamer, David; Akeson, Mark; Haussler, David
2003-01-01
We introduce a computational method for classification of individual DNA molecules measured by an α-hemolysin channel detector. We show classification with better than 99% accuracy for DNA hairpin molecules that differ only in their terminal Watson-Crick basepairs. Signal classification was done in silico to establish performance metrics (i.e., where train and test data were of known type, via single-species data files). It was then performed in solution to assay real mixtures of DNA hairpins. Hidden Markov Models (HMMs) were used with Expectation/Maximization for denoising and for associating a feature vector with the ionic current blockade of the DNA molecule. Support Vector Machines (SVMs) were used as discriminators, and were the focus of off-line training. A multiclass SVM architecture was designed to place less discriminatory load on weaker discriminators, and novel SVM kernels were used to boost discrimination strength. The tuning on HMMs and SVMs enabled biophysical analysis of the captured molecule states and state transitions; structure revealed in the biophysical analysis was used for better feature selection. PMID:12547778
The prediction of swimming performance in competition from behavioral information.
Rushall, B S; Leet, D
1979-06-01
The swimming performances of the Canadian Team at the 1976 Olympic Games were categorized as being improved or worse than previous best times in the events contested. The two groups had been previously assessed on the Psychological Inventories for Competitive Swimmers. A stepwise multiple-discriminant analysis of the inventory responses revealed that 13 test questions produced a perfect discrimination of group membership. The resultant discriminant functions for predicting performance classification were applied to the test responses of 157 swimmers at the 1977 Canadian Winter National Swimming Championships. Using the same performance classification criteria the accuracy of prediction was not better than chance in three of four sex by performance classifications. This yielded a failure to locate a set of behavioral factors which determine swimming performance improvements in elite competitive circumstances. The possibility of sets of factors which do not discriminate between performances in similar environments or between similar groups of swimmers was raised.
A multiple maximum scatter difference discriminant criterion for facial feature extraction.
Song, Fengxi; Zhang, David; Mei, Dayong; Guo, Zhongwei
2007-12-01
Maximum scatter difference (MSD) discriminant criterion was a recently presented binary discriminant criterion for pattern classification that utilizes the generalized scatter difference rather than the generalized Rayleigh quotient as a class separability measure, thereby avoiding the singularity problem when addressing small-sample-size problems. MSD classifiers based on this criterion have been quite effective on face-recognition tasks, but as they are binary classifiers, they are not as efficient on large-scale classification tasks. To address the problem, this paper generalizes the classification-oriented binary criterion to its multiple counterpart--multiple MSD (MMSD) discriminant criterion for facial feature extraction. The MMSD feature-extraction method, which is based on this novel discriminant criterion, is a new subspace-based feature-extraction method. Unlike most other subspace-based feature-extraction methods, the MMSD computes its discriminant vectors from both the range of the between-class scatter matrix and the null space of the within-class scatter matrix. The MMSD is theoretically elegant and easy to calculate. Extensive experimental studies conducted on the benchmark database, FERET, show that the MMSD out-performs state-of-the-art facial feature-extraction methods such as null space method, direct linear discriminant analysis (LDA), eigenface, Fisherface, and complete LDA.
NASA Astrophysics Data System (ADS)
Teye, Ernest; Huang, Xingyi; Dai, Huang; Chen, Quansheng
2013-10-01
Quick, accurate and reliable technique for discrimination of cocoa beans according to geographical origin is essential for quality control and traceability management. This current study presents the application of Near Infrared Spectroscopy technique and multivariate classification for the differentiation of Ghana cocoa beans. A total of 194 cocoa bean samples from seven cocoa growing regions were used. Principal component analysis (PCA) was used to extract relevant information from the spectral data and this gave visible cluster trends. The performance of four multivariate classification methods: Linear discriminant analysis (LDA), K-nearest neighbors (KNN), Back propagation artificial neural network (BPANN) and Support vector machine (SVM) were compared. The performances of the models were optimized by cross validation. The results revealed that; SVM model was superior to all the mathematical methods with a discrimination rate of 100% in both the training and prediction set after preprocessing with Mean centering (MC). BPANN had a discrimination rate of 99.23% for the training set and 96.88% for prediction set. While LDA model had 96.15% and 90.63% for the training and prediction sets respectively. KNN model had 75.01% for the training set and 72.31% for prediction set. The non-linear classification methods used were superior to the linear ones. Generally, the results revealed that NIR Spectroscopy coupled with SVM model could be used successfully to discriminate cocoa beans according to their geographical origins for effective quality assurance.
Jackman, Patrick; Sun, Da-Wen; Allen, Paul; Valous, Nektarios A; Mendoza, Fernando; Ward, Paddy
2010-04-01
A method to discriminate between various grades of pork and turkey ham was developed using colour and wavelet texture features. Image analysis methods originally developed for predicting the palatability of beef were applied to rapidly identify the ham grade. With high quality digital images of 50-94 slices per ham it was possible to identify the greyscale that best expressed the differences between the various ham grades. The best 10 discriminating image features were then found with a genetic algorithm. Using the best 10 image features, simple linear discriminant analysis models produced 100% correct classifications for both pork and turkey on both calibration and validation sets. 2009 Elsevier Ltd. All rights reserved.
Shayan, Zahra; Mohammad Gholi Mezerji, Naser; Shayan, Leila; Naseri, Parisa
2015-11-03
Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, the LDA makes more assumptions about the data. When categorical and continuous variables used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. CE revealed the a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.
Discrimination of genetically modified sugar beets based on terahertz spectroscopy
NASA Astrophysics Data System (ADS)
Chen, Tao; Li, Zhi; Yin, Xianhua; Hu, Fangrong; Hu, Cong
2016-01-01
The objective of this paper was to apply terahertz (THz) spectroscopy combined with chemometrics techniques for discrimination of genetically modified (GM) and non-GM sugar beets. In this paper, the THz spectra of 84 sugar beet samples (36 GM sugar beets and 48 non-GM ones) were obtained by using terahertz time-domain spectroscopy (THz-TDS) system in the frequency range from 0.2 to 1.2 THz. Three chemometrics methods, principal component analysis (PCA), discriminant analysis (DA) and discriminant partial least squares (DPLS), were employed to classify sugar beet samples into two groups: genetically modified organisms (GMOs) and non-GMOs. The DPLS method yielded the best classification result, and the percentages of successful classification for GM and non-GM sugar beets were both 100%. Results of the present study demonstrate the usefulness of THz spectroscopy together with chemometrics methods as a powerful tool to distinguish GM and non-GM sugar beets.
Invariant approach to the character classification
NASA Astrophysics Data System (ADS)
Šariri, Kristina; Demoli, Nazif
2008-04-01
Image moments analysis is a very useful tool which allows image description invariant to translation and rotation, scale change and some types of image distortions. The aim of this work was development of simple method for fast and reliable classification of characters by using Hu's and affine moment invariants. Measure of Eucleidean distance was used as a discrimination feature with statistical parameters estimated. The method was tested in classification of Times New Roman font letters as well as sets of the handwritten characters. It is shown that using all Hu's and three affine invariants as discrimination set improves recognition rate by 30%.
Moscetti, Roberto; Radicetti, Emanuele; Monarca, Danilo; Cecchini, Massimo; Massantini, Riccardo
2015-10-01
This study investigates the possibility of using near infrared spectroscopy for the authentication of the 'Nocciola Romana' hazelnut (Corylus avellana L. cvs Tonda Gentile Romana and Nocchione) as a Protected Designation of Origin (PDO) hazelnut from central Italy. Algorithms for the selection of the optimal pretreatments were tested in combination with the following discriminant routines: k-nearest neighbour, soft independent modelling of class analogy, partial least squares discriminant analysis and support vector machine discriminant analysis. The best results were obtained using a support vector machine discriminant analysis routine. Thus, classification performance rates with specificities, sensitivities and accuracies as high as 96.0%, 95.0% and 95.5%, respectively, were achieved. Various pretreatments, such as standard normal variate, mean centring and a Savitzky-Golay filter with seven smoothing points, were used. The optimal wavelengths for classification were mainly correlated with lipids, although some contribution from minor constituents, such as proteins and carbohydrates, was also observed. Near infrared spectroscopy could classify hazelnut according to the PDO 'Nocciola Romana' designation. Thus, the experimentation lays the foundations for a rapid, online, authentication system for hazelnut. However, model robustness should be improved taking into account agro-pedo-climatic growing conditions. © 2014 Society of Chemical Industry.
NASA Astrophysics Data System (ADS)
Squiers, John J.; Li, Weizhi; King, Darlene R.; Mo, Weirong; Zhang, Xu; Lu, Yang; Sellke, Eric W.; Fan, Wensheng; DiMaio, J. Michael; Thatcher, Jeffrey E.
2016-03-01
The clinical judgment of expert burn surgeons is currently the standard on which diagnostic and therapeutic decisionmaking regarding burn injuries is based. Multispectral imaging (MSI) has the potential to increase the accuracy of burn depth assessment and the intraoperative identification of viable wound bed during surgical debridement of burn injuries. A highly accurate classification model must be developed using machine-learning techniques in order to translate MSI data into clinically-relevant information. An animal burn model was developed to build an MSI training database and to study the burn tissue classification ability of several models trained via common machine-learning algorithms. The algorithms tested, from least to most complex, were: K-nearest neighbors (KNN), decision tree (DT), linear discriminant analysis (LDA), weighted linear discriminant analysis (W-LDA), quadratic discriminant analysis (QDA), ensemble linear discriminant analysis (EN-LDA), ensemble K-nearest neighbors (EN-KNN), and ensemble decision tree (EN-DT). After the ground-truth database of six tissue types (healthy skin, wound bed, blood, hyperemia, partial injury, full injury) was generated by histopathological analysis, we used 10-fold cross validation to compare the algorithms' performances based on their accuracies in classifying data against the ground truth, and each algorithm was tested 100 times. The mean test accuracy of the algorithms were KNN 68.3%, DT 61.5%, LDA 70.5%, W-LDA 68.1%, QDA 68.9%, EN-LDA 56.8%, EN-KNN 49.7%, and EN-DT 36.5%. LDA had the highest test accuracy, reflecting the bias-variance tradeoff over the range of complexities inherent to the algorithms tested. Several algorithms were able to match the current standard in burn tissue classification, the clinical judgment of expert burn surgeons. These results will guide further development of an MSI burn tissue classification system. Given that there are few surgeons and facilities specializing in burn care, this technology may improve the standard of burn care for patients without access to specialized facilities.
Ethnicity identification from face images
NASA Astrophysics Data System (ADS)
Lu, Xiaoguang; Jain, Anil K.
2004-08-01
Human facial images provide the demographic information, such as ethnicity and gender. Conversely, ethnicity and gender also play an important role in face-related applications. Image-based ethnicity identification problem is addressed in a machine learning framework. The Linear Discriminant Analysis (LDA) based scheme is presented for the two-class (Asian vs. non-Asian) ethnicity classification task. Multiscale analysis is applied to the input facial images. An ensemble framework, which integrates the LDA analysis for the input face images at different scales, is proposed to further improve the classification performance. The product rule is used as the combination strategy in the ensemble. Experimental results based on a face database containing 263 subjects (2,630 face images, with equal balance between the two classes) are promising, indicating that LDA and the proposed ensemble framework have sufficient discriminative power for the ethnicity classification problem. The normalized ethnicity classification scores can be helpful in the facial identity recognition. Useful as a "soft" biometric, face matching scores can be updated based on the output of ethnicity classification module. In other words, ethnicity classifier does not have to be perfect to be useful in practice.
Yourganov, Grigori; Schmah, Tanya; Churchill, Nathan W; Berman, Marc G; Grady, Cheryl L; Strother, Stephen C
2014-08-01
The field of fMRI data analysis is rapidly growing in sophistication, particularly in the domain of multivariate pattern classification. However, the interaction between the properties of the analytical model and the parameters of the BOLD signal (e.g. signal magnitude, temporal variance and functional connectivity) is still an open problem. We addressed this problem by evaluating a set of pattern classification algorithms on simulated and experimental block-design fMRI data. The set of classifiers consisted of linear and quadratic discriminants, linear support vector machine, and linear and nonlinear Gaussian naive Bayes classifiers. For linear discriminant, we used two methods of regularization: principal component analysis, and ridge regularization. The classifiers were used (1) to classify the volumes according to the behavioral task that was performed by the subject, and (2) to construct spatial maps that indicated the relative contribution of each voxel to classification. Our evaluation metrics were: (1) accuracy of out-of-sample classification and (2) reproducibility of spatial maps. In simulated data sets, we performed an additional evaluation of spatial maps with ROC analysis. We varied the magnitude, temporal variance and connectivity of simulated fMRI signal and identified the optimal classifier for each simulated environment. Overall, the best performers were linear and quadratic discriminants (operating on principal components of the data matrix) and, in some rare situations, a nonlinear Gaussian naïve Bayes classifier. The results from the simulated data were supported by within-subject analysis of experimental fMRI data, collected in a study of aging. This is the first study that systematically characterizes interactions between analysis model and signal parameters (such as magnitude, variance and correlation) on the performance of pattern classifiers for fMRI. Copyright © 2014 Elsevier Inc. All rights reserved.
Jaiswara, Ranjana; Nandi, Diptarup; Balakrishnan, Rohini
2013-01-01
Traditional taxonomy based on morphology has often failed in accurate species identification owing to the occurrence of cryptic species, which are reproductively isolated but morphologically identical. Molecular data have thus been used to complement morphology in species identification. The sexual advertisement calls in several groups of acoustically communicating animals are species-specific and can thus complement molecular data as non-invasive tools for identification. Several statistical tools and automated identifier algorithms have been used to investigate the efficiency of acoustic signals in species identification. Despite a plethora of such methods, there is a general lack of knowledge regarding the appropriate usage of these methods in specific taxa. In this study, we investigated the performance of two commonly used statistical methods, discriminant function analysis (DFA) and cluster analysis, in identification and classification based on acoustic signals of field cricket species belonging to the subfamily Gryllinae. Using a comparative approach we evaluated the optimal number of species and calling song characteristics for both the methods that lead to most accurate classification and identification. The accuracy of classification using DFA was high and was not affected by the number of taxa used. However, a constraint in using discriminant function analysis is the need for a priori classification of songs. Accuracy of classification using cluster analysis, which does not require a priori knowledge, was maximum for 6–7 taxa and decreased significantly when more than ten taxa were analysed together. We also investigated the efficacy of two novel derived acoustic features in improving the accuracy of identification. Our results show that DFA is a reliable statistical tool for species identification using acoustic signals. Our results also show that cluster analysis of acoustic signals in crickets works effectively for species classification and identification. PMID:24086666
Yang, Mingxing; Li, Xiumin; Li, Zhibin; Ou, Zhimin; Liu, Ming; Liu, Suhuan; Li, Xuejun; Yang, Shuyu
2013-01-01
DNA microarray analysis is characterized by obtaining a large number of gene variables from a small number of observations. Cluster analysis is widely used to analyze DNA microarray data to make classification and diagnosis of disease. Because there are so many irrelevant and insignificant genes in a dataset, a feature selection approach must be employed in data analysis. The performance of cluster analysis of this high-throughput data depends on whether the feature selection approach chooses the most relevant genes associated with disease classes. Here we proposed a new method using multiple Orthogonal Partial Least Squares-Discriminant Analysis (mOPLS-DA) models and S-plots to select the most relevant genes to conduct three-class disease classification and prediction. We tested our method using Golub's leukemia microarray data. For three classes with subtypes, we proposed hierarchical orthogonal partial least squares-discriminant analysis (OPLS-DA) models and S-plots to select features for two main classes and their subtypes. For three classes in parallel, we employed three OPLS-DA models and S-plots to choose marker genes for each class. The power of feature selection to classify and predict three-class disease was evaluated using cluster analysis. Further, the general performance of our method was tested using four public datasets and compared with those of four other feature selection methods. The results revealed that our method effectively selected the most relevant features for disease classification and prediction, and its performance was better than that of the other methods.
Benign-malignant mass classification in mammogram using edge weighted local texture features
NASA Astrophysics Data System (ADS)
Rabidas, Rinku; Midya, Abhishek; Sadhu, Anup; Chakraborty, Jayasree
2016-03-01
This paper introduces novel Discriminative Robust Local Binary Pattern (DRLBP) and Discriminative Robust Local Ternary Pattern (DRLTP) for the classification of mammographic masses as benign or malignant. Mass is one of the common, however, challenging evidence of breast cancer in mammography and diagnosis of masses is a difficult task. Since DRLBP and DRLTP overcome the drawbacks of Local Binary Pattern (LBP) and Local Ternary Pattern (LTP) by discriminating a brighter object against the dark background and vice-versa, in addition to the preservation of the edge information along with the texture information, several edge-preserving texture features are extracted, in this study, from DRLBP and DRLTP. Finally, a Fisher Linear Discriminant Analysis method is incorporated with discriminating features, selected by stepwise logistic regression method, for the classification of benign and malignant masses. The performance characteristics of DRLBP and DRLTP features are evaluated using a ten-fold cross-validation technique with 58 masses from the mini-MIAS database, and the best result is observed with DRLBP having an area under the receiver operating characteristic curve of 0.982.
ERIC Educational Resources Information Center
Montoya, Isaac D.
2008-01-01
Three classification techniques (Chi-square Automatic Interaction Detection [CHAID], Classification and Regression Tree [CART], and discriminant analysis) were tested to determine their accuracy in predicting Temporary Assistance for Needy Families program recipients' future employment. Technique evaluation was based on proportion of correctly…
John Hogland; Nedret Billor; Nathaniel Anderson
2013-01-01
Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...
A discrimlnant function approach to ecological site classification in northern New England
James M. Fincher; Marie-Louise Smith
1994-01-01
Describes one approach to ecologically based classification of upland forest community types of the White and Green Mountain physiographic regions. The classification approach is based on an intensive statistical analysis of the relationship between the communities and soil-site factors. Discriminant functions useful in distinguishing between types based on soil-site...
Signal peptide discrimination and cleavage site identification using SVM and NN.
Kazemian, H B; Yusuf, S A; White, K
2014-02-01
About 15% of all proteins in a genome contain a signal peptide (SP) sequence, at the N-terminus, that targets the protein to intracellular secretory pathways. Once the protein is targeted correctly in the cell, the SP is cleaved, releasing the mature protein. Accurate prediction of the presence of these short amino-acid SP chains is crucial for modelling the topology of membrane proteins, since SP sequences can be confused with transmembrane domains due to similar composition of hydrophobic amino acids. This paper presents a cascaded Support Vector Machine (SVM)-Neural Network (NN) classification methodology for SP discrimination and cleavage site identification. The proposed method utilises a dual phase classification approach using SVM as a primary classifier to discriminate SP sequences from Non-SP. The methodology further employs NNs to predict the most suitable cleavage site candidates. In phase one, a SVM classification utilises hydrophobic propensities as a primary feature vector extraction using symmetric sliding window amino-acid sequence analysis for discrimination of SP and Non-SP. In phase two, a NN classification uses asymmetric sliding window sequence analysis for prediction of cleavage site identification. The proposed SVM-NN method was tested using Uni-Prot non-redundant datasets of eukaryotic and prokaryotic proteins with SP and Non-SP N-termini. Computer simulation results demonstrate an overall accuracy of 0.90 for SP and Non-SP discrimination based on Matthews Correlation Coefficient (MCC) tests using SVM. For SP cleavage site prediction, the overall accuracy is 91.5% based on cross-validation tests using the novel SVM-NN model. © 2013 Published by Elsevier Ltd.
Wang, Changming; Xiong, Shi; Hu, Xiaoping; Yao, Li; Zhang, Jiacai
2012-10-01
Categorization of images containing visual objects can be successfully recognized using single-trial electroencephalograph (EEG) measured when subjects view images. Previous studies have shown that task-related information contained in event-related potential (ERP) components could discriminate two or three categories of object images. In this study, we investigated whether four categories of objects (human faces, buildings, cats and cars) could be mutually discriminated using single-trial EEG data. Here, the EEG waveforms acquired while subjects were viewing four categories of object images were segmented into several ERP components (P1, N1, P2a and P2b), and then Fisher linear discriminant analysis (Fisher-LDA) was used to classify EEG features extracted from ERP components. Firstly, we compared the classification results using features from single ERP components, and identified that the N1 component achieved the highest classification accuracies. Secondly, we discriminated four categories of objects using combining features from multiple ERP components, and showed that combination of ERP components improved four-category classification accuracies by utilizing the complementarity of discriminative information in ERP components. These findings confirmed that four categories of object images could be discriminated with single-trial EEG and could direct us to select effective EEG features for classifying visual objects.
NASA Technical Reports Server (NTRS)
Lillesand, T. M.; Werth, L. F. (Principal Investigator)
1980-01-01
A 25% improvement in average classification accuracy was realized by processing double-date vs. single-date data. Under the spectrally and spatially complex site conditions characterizing the geographical area used, further improvement in wetland classification accuracy is apparently precluded by the spectral and spatial resolution restrictions of the LANDSAT MSS. Full scene analysis of scanning densitometer data extracted from scale infrared photography failed to permit discrimination of many wetland and nonwetland cover types. When classification of photographic data was limited to wetland areas only, much more detailed and accurate classification could be made. The integration of conventional image interpretation (to simply delineate wetland boundaries) and machine assisted classification (to discriminate among cover types present within the wetland areas) appears to warrant further research to study the feasibility and cost of extending this methodology over a large area using LANDSAT and/or small scale photography.
Li, X C; Li, J S; Meng, L; Bai, Y N; Yu, D S; Liu, X N; Liu, X F; Jiang, X J; Ren, X W; Yang, X T; Shen, X P; Zhang, J W
2017-08-10
Objective: To understand the dominant pathogens of febrile respiratory syndrome (FRS) patients in Gansu province and to establish the Bayes discriminant function in order to identify the patients infected with the dominant pathogens. Methods: FRS patients were collected in various sentinel hospitals of Gansu province from 2009 to 2015 and the dominant pathogens were determined by describing the composition of pathogenic profile. Significant clinical variables were selected by stepwise discriminant analysis to establish the Bayes discriminant function. Results: In the detection of pathogens for FRS, both influenza virus and rhinovirus showed higher positive rates than those caused by other viruses (13.79%, 8.63%), that accounting for 54.38%, 13.73% of total viral positive patients. Most frequently detected bacteria would include Streptococcus pneumoniae , and haemophilus influenza (44.41%, 18.07%) that accounting for 66.21% and 24.55% among the bacterial positive patients. The original-validated rate of discriminant function, established by 11 clinical variables, was 73.1%, with the cross-validated rate as 70.6%. Conclusion: Influenza virus, Rhinovirus, Streptococcus pneumoniae and Haemophilus influenzae were the dominant pathogens of FRS in Gansu province. Results from the Bayes discriminant analysis showed both higher accuracy in the classification of dominant pathogens, and applicative value for FRS.
NASA Astrophysics Data System (ADS)
Ogruc Ildiz, G.; Arslan, M.; Unsalan, O.; Araujo-Andrade, C.; Kurt, E.; Karatepe, H. T.; Yilmaz, A.; Yalcinkaya, O. B.; Herken, H.
2016-01-01
In this study, a methodology based on Fourier-transform infrared spectroscopy and principal component analysis and partial least square methods is proposed for the analysis of blood plasma samples in order to identify spectral changes correlated with some biomarkers associated with schizophrenia and bipolarity. Our main goal was to use the spectral information for the calibration of statistical models to discriminate and classify blood plasma samples belonging to bipolar and schizophrenic patients. IR spectra of 30 samples of blood plasma obtained from each, bipolar and schizophrenic patients and healthy control group were collected. The results obtained from principal component analysis (PCA) show a clear discrimination between the bipolar (BP), schizophrenic (SZ) and control group' (CG) blood samples that also give possibility to identify three main regions that show the major differences correlated with both mental disorders (biomarkers). Furthermore, a model for the classification of the blood samples was calibrated using partial least square discriminant analysis (PLS-DA), allowing the correct classification of BP, SZ and CG samples. The results obtained applying this methodology suggest that it can be used as a complimentary diagnostic tool for the detection and discrimination of these mental diseases.
Cao, Longlong; Guo, Shuixia; Xue, Zhimin; Hu, Yong; Liu, Haihong; Mwansisya, Tumbwene E; Pu, Weidan; Yang, Bo; Liu, Chang; Feng, Jianfeng; Chen, Eric Y H; Liu, Zhening
2014-02-01
Aberrant brain functional connectivity patterns have been reported in major depressive disorder (MDD). It is unknown whether they can be used in discriminant analysis for diagnosis of MDD. In the present study we examined the efficiency of discriminant analysis of MDD by individualized computer-assisted diagnosis. Based on resting-state functional magnetic resonance imaging data, a new approach was adopted to investigate functional connectivity changes in 39 MDD patients and 37 well-matched healthy controls. By using the proposed feature selection method, we identified significant altered functional connections in patients. They were subsequently applied to our analysis as discriminant features using a support vector machine classification method. Furthermore, the relative contribution of functional connectivity was estimated. After subset selection of high-dimension features, the support vector machine classifier reached up to approximately 84% with leave-one-out training during the discrimination process. Through summarizing the classification contribution of functional connectivities, we obtained four obvious contribution modules: inferior orbitofrontal module, supramarginal gyrus module, inferior parietal lobule-posterior cingulated gyrus module and middle temporal gyrus-inferior temporal gyrus module. The experimental results demonstrated that the proposed method is effective in discriminating MDD patients from healthy controls. Functional connectivities might be useful as new biomarkers to assist clinicians in computer auxiliary diagnosis of MDD. © 2013 The Authors. Psychiatry and Clinical Neurosciences © 2013 Japanese Society of Psychiatry and Neurology.
Detection of stress factors in crop and weed species using hyperspectral remote sensing reflectance
NASA Astrophysics Data System (ADS)
Henry, William Brien
The primary objective of this work was to determine if stress factors such as moisture stress or herbicide injury stress limit the ability to distinguish between weeds and crops using remotely sensed data. Additional objectives included using hyperspectral reflectance data to measure moisture content within a species, and to measure crop injury in response to drift rates of non-selective herbicides. Moisture stress did not reduce the ability to discriminate between species. Regardless of analysis technique, the trend was that as moisture stress increased, so too did the ability to distinguish between species. Signature amplitudes (SA) of the top 5 bands, discrete wavelet transforms (DWT), and multiple indices were promising analysis techniques. Discriminant models created from one year's data set and validated on additional data sets provided, on average, approximately 80% accurate classification among weeds and crop. This suggests that these models are relatively robust and could potentially be used across environmental conditions in field scenarios. Distinguishing between leaves grown at high-moisture stress and no-stress was met with limited success, primarily because there was substantial variation among samples within the treatments. Leaf water potential (LWP) was measured, and these were classified into three categories using indices. Classification accuracies were as high as 68%. The 10 bands most highly correlated to LWP were selected; however, there were no obvious trends or patterns in these top 10 bands with respect to time, species or moisture level, suggesting that LWP is an elusive parameter to quantify spectrally. In order to address herbicide injury stress and its impact on species discrimination, discriminant models were created from combinations of multiple indices. The model created from the second experimental run's data set and validated on the first experimental run's data provided an average of 97% correct classification of soybean and an overall average classification accuracy of 65% for all species. This suggests that these models are relatively robust and could potentially be used across a wide range of herbicide applications in field scenarios. From the pooled data set, a single discriminant model was created with multiple indices that discriminated soybean from weeds 88%, on average, regardless of herbicide, rate or species. Several analysis techniques including multiple indices, signature amplitude with spectral bands as features, and wavelet analysis were employed to distinguish between herbicide-treated and nontreated plants. Classification accuracy using signature amplitude (SA) analysis of paraquat injury on soybean was better than 75% for both 1/2 and 1/8X rates at 1, 4, and 7 DAA. Classification accuracy of paraquat injury on corn was better than 72% for the 1/2X rate at 1, 4, and 7 DAA. These data suggest that hyperspectral reflectance may be used to distinguish between healthy plants and injured plants to which herbicides have been applied; however, the classification accuracies remained at 75% or higher only when the higher rates of herbicide were applied. (Abstract shortened by UMI.)
NASA Astrophysics Data System (ADS)
Ramos, M. Rosário; Carolino, E.; Viegas, Carla; Viegas, Sandra
2016-06-01
Health effects associated with occupational exposure to particulate matter have been studied by several authors. In this study were selected six industries of five different areas: Cork company 1, Cork company 2, poultry, slaughterhouse for cattle, riding arena and production of animal feed. The measurements tool was a portable device for direct reading. This tool provides information on the particle number concentration for six different diameters, namely 0.3 µm, 0.5 µm, 1 µm, 2.5 µm, 5 µm and 10 µm. The focus on these features is because they might be more closely related with adverse health effects. The aim is to identify the particles that better discriminate the industries, with the ultimate goal of classifying industries regarding potential negative effects on workers' health. Several methods of discriminant analysis were applied to data of occupational exposure to particulate matter and compared with respect to classification accuracy. The selected methods were linear discriminant analyses (LDA); linear quadratic discriminant analysis (QDA), robust linear discriminant analysis with selected estimators (MLE (Maximum Likelihood Estimators), MVE (Minimum Volume Elipsoid), "t", MCD (Minimum Covariance Determinant), MCD-A, MCD-B), multinomial logistic regression and artificial neural networks (ANN). The predictive accuracy of the methods was accessed through a simulation study. ANN yielded the highest rate of classification accuracy in the data set under study. Results indicate that the particle number concentration of diameter size 0.5 µm is the parameter that better discriminates industries.
Vavougios, George D; Doskas, Triantafyllos; Konstantopoulos, Kostas
2018-05-01
Dysarthrophonia is a predominant symptom in many neurological diseases, affecting the quality of life of the patients. In this study, we produced a discriminant function equation that can differentiate MS patients from healthy controls, using electroglottographic variables not analyzed in a previous study. We applied stepwise linear discriminant function analysis in order to produce a function and score derived from electroglottographic variables extracted from a previous study. The derived discriminant function's statistical significance was determined via Wilk's λ test (and the associated p value). Finally, a 2 × 2 confusion matrix was used to determine the function's predictive accuracy, whereas the cross-validated predictive accuracy is estimated via the "leave-one-out" classification process. Discriminant function analysis (DFA) was used to create a linear function of continuous predictors. DFA produced the following model (Wilk's λ = 0.043, χ2 = 388.588, p < 0.0001, Tables 3 and 4): D (MS vs controls) = 0.728*DQx1 mean monologue + 0.325*CQx monologue + 0.298*DFx1 90% range monologue + 0.443*DQx1 90% range reading - 1.490*DQx1 90% range monologue. The derived discriminant score (S1) was used subsequently in order to form the coordinates of a ROC curve. Thus, a cutoff score of - 0.788 for S1 corresponded to a perfect classification (100% sensitivity and 100% specificity, p = 1.67e -22 ). Consistent with previous findings, electroglottographic evaluation represents an easy to implement and potentially important assessment in MS patients, achieving adequate classification accuracy. Further evaluation is needed to determine its use as a biomarker.
NASA Astrophysics Data System (ADS)
Prochazka, D.; Mazura, M.; Samek, O.; Rebrošová, K.; Pořízka, P.; Klus, J.; Prochazková, P.; Novotný, J.; Novotný, K.; Kaiser, J.
2018-01-01
In this work, we investigate the impact of data provided by complementary laser-based spectroscopic methods on multivariate classification accuracy. Discrimination and classification of five Staphylococcus bacterial strains and one strain of Escherichia coli is presented. The technique that we used for measurements is a combination of Raman spectroscopy and Laser-Induced Breakdown Spectroscopy (LIBS). Obtained spectroscopic data were then processed using Multivariate Data Analysis algorithms. Principal Components Analysis (PCA) was selected as the most suitable technique for visualization of bacterial strains data. To classify the bacterial strains, we used Neural Networks, namely a supervised version of Kohonen's self-organizing maps (SOM). We were processing results in three different ways - separately from LIBS measurements, from Raman measurements, and we also merged data from both mentioned methods. The three types of results were then compared. By applying the PCA to Raman spectroscopy data, we observed that two bacterial strains were fully distinguished from the rest of the data set. In the case of LIBS data, three bacterial strains were fully discriminated. Using a combination of data from both methods, we achieved the complete discrimination of all bacterial strains. All the data were classified with a high success rate using SOM algorithm. The most accurate classification was obtained using a combination of data from both techniques. The classification accuracy varied, depending on specific samples and techniques. As for LIBS, the classification accuracy ranged from 45% to 100%, as for Raman Spectroscopy from 50% to 100% and in case of merged data, all samples were classified correctly. Based on the results of the experiments presented in this work, we can assume that the combination of Raman spectroscopy and LIBS significantly enhances discrimination and classification accuracy of bacterial species and strains. The reason is the complementarity in obtained chemical information while using these two methods.
Insausti, Matías; Gomes, Adriano A; Cruz, Fernanda V; Pistonesi, Marcelo F; Araujo, Mario C U; Galvão, Roberto K H; Pereira, Claudete F; Band, Beatriz S F
2012-08-15
This paper investigates the use of UV-vis, near infrared (NIR) and synchronous fluorescence (SF) spectrometries coupled with multivariate classification methods to discriminate biodiesel samples with respect to the base oil employed in their production. More specifically, the present work extends previous studies by investigating the discrimination of corn-based biodiesel from two other biodiesel types (sunflower and soybean). Two classification methods are compared, namely full-spectrum SIMCA (soft independent modelling of class analogies) and SPA-LDA (linear discriminant analysis with variables selected by the successive projections algorithm). Regardless of the spectrometric technique employed, full-spectrum SIMCA did not provide an appropriate discrimination of the three biodiesel types. In contrast, all samples were correctly classified on the basis of a reduced number of wavelengths selected by SPA-LDA. It can be concluded that UV-vis, NIR and SF spectrometries can be successfully employed to discriminate corn-based biodiesel from the two other biodiesel types, but wavelength selection by SPA-LDA is key to the proper separation of the classes. Copyright © 2012 Elsevier B.V. All rights reserved.
Motor Oil Classification using Color Histograms and Pattern Recognition Techniques.
Ahmadi, Shiva; Mani-Varnosfaderani, Ahmad; Habibi, Biuck
2018-04-20
Motor oil classification is important for quality control and the identification of oil adulteration. In thiswork, we propose a simple, rapid, inexpensive and nondestructive approach based on image analysis and pattern recognition techniques for the classification of nine different types of motor oils according to their corresponding color histograms. For this, we applied color histogram in different color spaces such as red green blue (RGB), grayscale, and hue saturation intensity (HSI) in order to extract features that can help with the classification procedure. These color histograms and their combinations were used as input for model development and then were statistically evaluated by using linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM) techniques. Here, two common solutions for solving a multiclass classification problem were applied: (1) transformation to binary classification problem using a one-against-all (OAA) approach and (2) extension from binary classifiers to a single globally optimized multilabel classification model. In the OAA strategy, LDA, QDA, and SVM reached up to 97% in terms of accuracy, sensitivity, and specificity for both the training and test sets. In extension from binary case, despite good performances by the SVM classification model, QDA and LDA provided better results up to 92% for RGB-grayscale-HSI color histograms and up to 93% for the HSI color map, respectively. In order to reduce the numbers of independent variables for modeling, a principle component analysis algorithm was used. Our results suggest that the proposed method is promising for the identification and classification of different types of motor oils.
NASA Astrophysics Data System (ADS)
Hutchings, Joanne; Kendall, Catherine; Shepherd, Neil; Barr, Hugh; Stone, Nicholas
2010-11-01
Rapid Raman mapping has the potential to be used for automated histopathology diagnosis, providing an adjunct technique to histology diagnosis. The aim of this work is to evaluate the feasibility of automated and objective pathology classification of Raman maps using linear discriminant analysis. Raman maps of esophageal tissue sections are acquired. Principal component (PC)-fed linear discriminant analysis (LDA) is carried out using subsets of the Raman map data (6483 spectra). An overall (validated) training classification model performance of 97.7% (sensitivity 95.0 to 100% and specificity 98.6 to 100%) is obtained. The remainder of the map spectra (131,672 spectra) are projected onto the classification model resulting in Raman images, demonstrating good correlation with contiguous hematoxylin and eosin (HE) sections. Initial results suggest that LDA has the potential to automate pathology diagnosis of esophageal Raman images, but since the classification of test spectra is forced into existing training groups, further work is required to optimize the training model. A small pixel size is advantageous for developing the training datasets using mapping data, despite lengthy mapping times, due to additional morphological information gained, and could facilitate differentiation of further tissue groups, such as the basal cells/lamina propria, in the future, but larger pixels sizes (and faster mapping) may be more feasible for clinical application.
NASA Astrophysics Data System (ADS)
De Lucia, Frank C., Jr.; Gottfried, Jennifer L.
2011-02-01
Using a series of thirteen organic materials that includes novel high-nitrogen energetic materials, conventional organic military explosives, and benign organic materials, we have demonstrated the importance of variable selection for maximizing residue discrimination with partial least squares discriminant analysis (PLS-DA). We built several PLS-DA models using different variable sets based on laser induced breakdown spectroscopy (LIBS) spectra of the organic residues on an aluminum substrate under an argon atmosphere. The model classification results for each sample are presented and the influence of the variables on these results is discussed. We found that using the whole spectra as the data input for the PLS-DA model gave the best results. However, variables due to the surrounding atmosphere and the substrate contribute to discrimination when the whole spectra are used, indicating this may not be the most robust model. Further iterative testing with additional validation data sets is necessary to determine the most robust model.
Spectral Regression Discriminant Analysis for Hyperspectral Image Classification
NASA Astrophysics Data System (ADS)
Pan, Y.; Wu, J.; Huang, H.; Liu, J.
2012-08-01
Dimensionality reduction algorithms, which aim to select a small set of efficient and discriminant features, have attracted great attention for Hyperspectral Image Classification. The manifold learning methods are popular for dimensionality reduction, such as Locally Linear Embedding, Isomap, and Laplacian Eigenmap. However, a disadvantage of many manifold learning methods is that their computations usually involve eigen-decomposition of dense matrices which is expensive in both time and memory. In this paper, we introduce a new dimensionality reduction method, called Spectral Regression Discriminant Analysis (SRDA). SRDA casts the problem of learning an embedding function into a regression framework, which avoids eigen-decomposition of dense matrices. Also, with the regression based framework, different kinds of regularizes can be naturally incorporated into our algorithm which makes it more flexible. It can make efficient use of data points to discover the intrinsic discriminant structure in the data. Experimental results on Washington DC Mall and AVIRIS Indian Pines hyperspectral data sets demonstrate the effectiveness of the proposed method.
EXTRACTING PRINCIPLE COMPONENTS FOR DISCRIMINANT ANALYSIS OF FMRI IMAGES
Liu, Jingyu; Xu, Lai; Caprihan, Arvind; Calhoun, Vince D.
2009-01-01
This paper presents an approach for selecting optimal components for discriminant analysis. Such an approach is useful when further detailed analyses for discrimination or characterization requires dimensionality reduction. Our approach can accommodate a categorical variable such as diagnosis (e.g. schizophrenic patient or healthy control), or a continuous variable like severity of the disorder. This information is utilized as a reference for measuring a component’s discriminant power after principle component decomposition. After sorting each component according to its discriminant power, we extract the best components for discriminant analysis. An application of our reference selection approach is shown using a functional magnetic resonance imaging data set in which the sample size is much less than the dimensionality. The results show that the reference selection approach provides an improved discriminant component set as compared to other approaches. Our approach is general and provides a solid foundation for further discrimination and classification studies. PMID:20582334
EXTRACTING PRINCIPLE COMPONENTS FOR DISCRIMINANT ANALYSIS OF FMRI IMAGES.
Liu, Jingyu; Xu, Lai; Caprihan, Arvind; Calhoun, Vince D
2008-05-12
This paper presents an approach for selecting optimal components for discriminant analysis. Such an approach is useful when further detailed analyses for discrimination or characterization requires dimensionality reduction. Our approach can accommodate a categorical variable such as diagnosis (e.g. schizophrenic patient or healthy control), or a continuous variable like severity of the disorder. This information is utilized as a reference for measuring a component's discriminant power after principle component decomposition. After sorting each component according to its discriminant power, we extract the best components for discriminant analysis. An application of our reference selection approach is shown using a functional magnetic resonance imaging data set in which the sample size is much less than the dimensionality. The results show that the reference selection approach provides an improved discriminant component set as compared to other approaches. Our approach is general and provides a solid foundation for further discrimination and classification studies.
Khanmohammadi, Mohammadreza; Bagheri Garmarudi, Amir; Samani, Simin; Ghasemi, Keyvan; Ashuri, Ahmad
2011-06-01
Attenuated Total Reflectance Fourier Transform Infrared (ATR-FTIR) microspectroscopy was applied for detection of colon cancer according to the spectral features of colon tissues. Supervised classification models can be trained to identify the tissue type based on the spectroscopic fingerprint. A total of 78 colon tissues were used in spectroscopy studies. Major spectral differences were observed in 1,740-900 cm(-1) spectral region. Several chemometric methods such as analysis of variance (ANOVA), cluster analysis (CA) and linear discriminate analysis (LDA) were applied for classification of IR spectra. Utilizing the chemometric techniques, clear and reproducible differences were observed between the spectra of normal and cancer cases, suggesting that infrared microspectroscopy in conjunction with spectral data processing would be useful for diagnostic classification. Using LDA technique, the spectra were classified into cancer and normal tissue classes with an accuracy of 95.8%. The sensitivity and specificity was 100 and 93.1%, respectively.
Progress toward the determination of correct classification rates in fire debris analysis.
Waddell, Erin E; Song, Emma T; Rinke, Caitlin N; Williams, Mary R; Sigman, Michael E
2013-07-01
Principal components analysis (PCA), linear discriminant analysis (LDA), and quadratic discriminant analysis (QDA) were used to develop a multistep classification procedure for determining the presence of ignitable liquid residue in fire debris and assigning any ignitable liquid residue present into the classes defined under the American Society for Testing and Materials (ASTM) E 1618-10 standard method. A multistep classification procedure was tested by cross-validation based on model data sets comprised of the time-averaged mass spectra (also referred to as total ion spectra) of commercial ignitable liquids and pyrolysis products from common building materials and household furnishings (referred to simply as substrates). Fire debris samples from laboratory-scale and field test burns were also used to test the model. The optimal model's true-positive rate was 81.3% for cross-validation samples and 70.9% for fire debris samples. The false-positive rate was 9.9% for cross-validation samples and 8.9% for fire debris samples. © 2013 American Academy of Forensic Sciences.
Classification and prediction of pilot weather encounters: A discriminant function analysis.
O'Hare, David; Hunter, David R; Martinussen, Monica; Wiggins, Mark
2011-05-01
Flight into adverse weather continues to be a significant hazard for General Aviation (GA) pilots. Weather-related crashes have a significantly higher fatality rate than other GA crashes. Previous research has identified lack of situational awareness, risk perception, and risk tolerance as possible explanations for why pilots would continue into adverse weather. However, very little is known about the nature of these encounters or the differences between pilots who avoid adverse weather and those who do not. Visitors to a web site described an experience with adverse weather and completed a range of measures of personal characteristics. The resulting data from 364 pilots were carefully screened and subject to a discriminant function analysis. Two significant functions were found. The first, accounting for 69% of the variance, reflected measures of risk awareness and pilot judgment while the second differentiated pilots in terms of their experience levels. The variables measured in this study enabled us to correctly discriminate between the three groups of pilots considerably better (53% correct classifications) than would have been possible by chance (33% correct classifications). The implications of these findings for targeting safety interventions are discussed.
Weakly Supervised Dictionary Learning
NASA Astrophysics Data System (ADS)
You, Zeyu; Raich, Raviv; Fern, Xiaoli Z.; Kim, Jinsub
2018-05-01
We present a probabilistic modeling and inference framework for discriminative analysis dictionary learning under a weak supervision setting. Dictionary learning approaches have been widely used for tasks such as low-level signal denoising and restoration as well as high-level classification tasks, which can be applied to audio and image analysis. Synthesis dictionary learning aims at jointly learning a dictionary and corresponding sparse coefficients to provide accurate data representation. This approach is useful for denoising and signal restoration, but may lead to sub-optimal classification performance. By contrast, analysis dictionary learning provides a transform that maps data to a sparse discriminative representation suitable for classification. We consider the problem of analysis dictionary learning for time-series data under a weak supervision setting in which signals are assigned with a global label instead of an instantaneous label signal. We propose a discriminative probabilistic model that incorporates both label information and sparsity constraints on the underlying latent instantaneous label signal using cardinality control. We present the expectation maximization (EM) procedure for maximum likelihood estimation (MLE) of the proposed model. To facilitate a computationally efficient E-step, we propose both a chain and a novel tree graph reformulation of the graphical model. The performance of the proposed model is demonstrated on both synthetic and real-world data.
NASA Astrophysics Data System (ADS)
Åberg Lindell, M.; Andersson, P.; Grape, S.; Hellesen, C.; Håkansson, A.; Thulin, M.
2018-03-01
This paper investigates how concentrations of certain fission products and their related gamma-ray emissions can be used to discriminate between uranium oxide (UOX) and mixed oxide (MOX) type fuel. Discrimination of irradiated MOX fuel from irradiated UOX fuel is important in nuclear facilities and for transport of nuclear fuel, for purposes of both criticality safety and nuclear safeguards. Although facility operators keep records on the identity and properties of each fuel, tools for nuclear safeguards inspectors that enable independent verification of the fuel are critical in the recovery of continuity of knowledge, should it be lost. A discrimination methodology for classification of UOX and MOX fuel, based on passive gamma-ray spectroscopy data and multivariate analysis methods, is presented. Nuclear fuels and their gamma-ray emissions were simulated in the Monte Carlo code Serpent, and the resulting data was used as input to train seven different multivariate classification techniques. The trained classifiers were subsequently implemented and evaluated with respect to their capabilities to correctly predict the classes of unknown fuel items. The best results concerning successful discrimination of UOX and MOX-fuel were acquired when using non-linear classification techniques, such as the k nearest neighbors method and the Gaussian kernel support vector machine. For fuel with cooling times up to 20 years, when it is considered that gamma-rays from the isotope 134Cs can still be efficiently measured, success rates of 100% were obtained. A sensitivity analysis indicated that these methods were also robust.
Shawky, Eman; Abou El Kheir, Rasha M
2018-02-11
Species of Apiaceae are used in folk medicine as spices and in officinal medicinal preparations of drugs. They are an excellent source of phenolics exhibiting antioxidant activity, which are of great benefit to human health. Discrimination among Apiaceae medicinal herbs remains an intricate challenge due to their morphological similarity. In this study, a combined "untargeted" and "targeted" approach to investigate different Apiaceae plants species was proposed by using the merging of high-performance thin layer chromatography (HPTLC)-image analysis and pattern recognition methods which were used for fingerprinting and classification of 42 different Apiaceae samples collected from Egypt. Software for image processing was applied for fingerprinting and data acquisition. HPTLC fingerprint assisted by principal component analysis (PCA) and hierarchical cluster analysis (HCA)-heat maps resulted in a reliable untargeted approach for discrimination and classification of different samples. The "targeted" approach was performed by developing and validating an HPTLC method allowing the quantification of eight flavonoids. The combination of quantitative data with PCA and HCA-heat-maps allowed the different samples to be discriminated from each other. The use of chemometrics tools for evaluation of fingerprints reduced expense and analysis time. The proposed method can be adopted for routine discrimination and evaluation of the phytochemical variability in different Apiaceae species extracts. Copyright © 2018 John Wiley & Sons, Ltd.
Fernández, Katherina; Labarca, Ximena; Bordeu, Edmundo; Guesalaga, Andrés; Agosin, Eduardo
2007-11-01
Wine tannins are fundamental to the determination of wine quality. However, the chemical and sensorial analysis of these compounds is not straightforward and a simple and rapid technique is necessary. We analyzed the mid-infrared spectra of white, red, and model wines spiked with known amounts of skin or seed tannins, collected using Fourier transform mid-infrared (FT-MIR) transmission spectroscopy (400-4000 cm(-1)). The spectral data were classified according to their tannin source, skin or seed, and tannin concentration by means of discriminant analysis (DA) and soft independent modeling of class analogy (SIMCA) to obtain a probabilistic classification. Wines were also classified sensorially by a trained panel and compared with FT-MIR. SIMCA models gave the most accurate classification (over 97%) and prediction (over 60%) among the wine samples. The prediction was increased (over 73%) using the leave-one-out cross-validation technique. Sensory classification of the wines was less accurate than that obtained with FT-MIR and SIMCA. Overall, these results show the potential of FT-MIR spectroscopy, in combination with adequate statistical tools, to discriminate wines with different tannin levels.
Discrimination of almonds (Prunus dulcis) geographical origin by minerals and fatty acids profiling.
Amorello, Diana; Orecchio, Santino; Pace, Andrea; Barreca, Salvatore
2016-09-01
Twenty-one almond samples from three different geographical origins (Sicily, Spain and California) were investigated by determining minerals and fatty acids compositions. Data were used to discriminate by chemometry almond origin by linear discriminant analysis. With respect to previous PCA profiling studies, this work provides a simpler analytical protocol for the identification of almonds geographical origin. Classification by using mineral contents data only was correct in 77% of the samples, while, by using fatty acid profiles, the percentages of samples correctly classified reached 82%. The coupling of mineral contents and fatty acid profiles lead to an increased efficiency of the classification with 87% of samples correctly classified.
Nikolić, Biljana; Martinović, Jelena; Matić, Milan; Stefanović, Đorđe
2018-05-29
Different variables determine the performance of cyclists, which brings up the question how these parameters may help in their classification by specialty. The aim of the study was to determine differences in cardiorespiratory parameters of male cyclists according to their specialty, flat rider (N=21), hill rider (N=35) and sprinter (N=20) and obtain the multivariate model for further cyclists classification by specialties, based on selected variables. Seventeen variables were measured at submaximal and maximum load on the cycle ergometer Cosmed E 400HK (Cosmed, Rome, Italy) (initial 100W with 25W increase, 90-100 rpm). Multivariate discriminant analysis was used to determine which variables group cyclists within their specialty, and to predict which variables can direct cyclists to a particular specialty. Among nine variables that statistically contribute to the discriminant power of the model, achieved power on the anaerobic threshold and the produced CO2 had the biggest impact. The obtained discriminatory model correctly classified 91.43% of flat riders, 85.71% of hill riders, while sprinters were classified completely correct (100%), i.e. 92.10% of examinees were correctly classified, which point out the strength of the discriminatory model. Respiratory indicators mostly contribute to the discriminant power of the model, which may significantly contribute to training practice and laboratory tests in future.
Controlling protected designation of origin of wine by Raman spectroscopy.
Mandrile, Luisa; Zeppa, Giuseppe; Giovannozzi, Andrea Mario; Rossi, Andrea Mario
2016-11-15
In this paper, a Fourier Transform Raman spectroscopy method, to authenticate the provenience of wine, for food traceability applications was developed. In particular, due to the specific chemical fingerprint of the Raman spectrum, it was possible to discriminate different wines produced in the Piedmont area (North West Italy) in accordance with i) grape varieties, ii) production area and iii) ageing time. In order to create a consistent training set, more than 300 samples from tens of different producers were analyzed, and a chemometric treatment of raw spectra was applied. A discriminant analysis method was employed in the classification procedures, providing a classification capability (percentage of correct answers) of 90% for validation of grape analysis and geographical area provenance, and a classification capability of 84% for ageing time classification. The present methodology was applied successfully to raw materials without any preliminary treatment of the sample, providing a response in a very short time. Copyright © 2016 Elsevier Ltd. All rights reserved.
Carnahan, Brian; Meyer, Gérard; Kuntz, Lois-Ann
2003-01-01
Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches--genetic programming and decision tree induction--were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.
Shan, Ying; Sawhney, Harpreet S; Kumar, Rakesh
2008-04-01
This paper proposes a novel unsupervised algorithm learning discriminative features in the context of matching road vehicles between two non-overlapping cameras. The matching problem is formulated as a same-different classification problem, which aims to compute the probability of vehicle images from two distinct cameras being from the same vehicle or different vehicle(s). We employ a novel measurement vector that consists of three independent edge-based measures and their associated robust measures computed from a pair of aligned vehicle edge maps. The weight of each measure is determined by an unsupervised learning algorithm that optimally separates the same-different classes in the combined measurement space. This is achieved with a weak classification algorithm that automatically collects representative samples from same-different classes, followed by a more discriminative classifier based on Fisher' s Linear Discriminants and Gibbs Sampling. The robustness of the match measures and the use of unsupervised discriminant analysis in the classification ensures that the proposed method performs consistently in the presence of missing/false features, temporally and spatially changing illumination conditions, and systematic misalignment caused by different camera configurations. Extensive experiments based on real data of over 200 vehicles at different times of day demonstrate promising results.
Ballabio, Davide; Consonni, Viviana; Mauri, Andrea; Todeschini, Roberto
2010-01-11
In multivariate regression and classification issues variable selection is an important procedure used to select an optimal subset of variables with the aim of producing more parsimonious and eventually more predictive models. Variable selection is often necessary when dealing with methodologies that produce thousands of variables, such as Quantitative Structure-Activity Relationships (QSARs) and highly dimensional analytical procedures. In this paper a novel method for variable selection for classification purposes is introduced. This method exploits the recently proposed Canonical Measure of Correlation between two sets of variables (CMC index). The CMC index is in this case calculated for two specific sets of variables, the former being comprised of the independent variables and the latter of the unfolded class matrix. The CMC values, calculated by considering one variable at a time, can be sorted and a ranking of the variables on the basis of their class discrimination capabilities results. Alternatively, CMC index can be calculated for all the possible combinations of variables and the variable subset with the maximal CMC can be selected, but this procedure is computationally more demanding and classification performance of the selected subset is not always the best one. The effectiveness of the CMC index in selecting variables with discriminative ability was compared with that of other well-known strategies for variable selection, such as the Wilks' Lambda, the VIP index based on the Partial Least Squares-Discriminant Analysis, and the selection provided by classification trees. A variable Forward Selection based on the CMC index was finally used in conjunction of Linear Discriminant Analysis. This approach was tested on several chemical data sets. Obtained results were encouraging.
NASA Astrophysics Data System (ADS)
Vítková, Gabriela; Prokeš, Lubomír; Novotný, Karel; Pořízka, Pavel; Novotný, Jan; Všianský, Dalibor; Čelko, Ladislav; Kaiser, Jozef
2014-11-01
Focusing on historical aspect, during archeological excavation or restoration works of buildings or different structures built from bricks it is important to determine, preferably in-situ and in real-time, the locality of bricks origin. Fast classification of bricks on the base of Laser-Induced Breakdown Spectroscopy (LIBS) spectra is possible using multivariate statistical methods. Combination of principal component analysis (PCA) and linear discriminant analysis (LDA) was applied in this case. LIBS was used to classify altogether the 29 brick samples from 7 different localities. Realizing comparative study using two different LIBS setups - stand-off and table-top it is shown that stand-off LIBS has a big potential for archeological in-field measurements.
Zakaria, Ammar; Shakaff, Ali Yeon Md; Masnan, Maz Jamilah; Ahmad, Mohd Noor; Adom, Abdul Hamid; Jaafar, Mahmad Nor; Ghani, Supri A.; Abdullah, Abu Hassan; Aziz, Abdul Hallis Abdul; Kamarudin, Latifah Munirah; Subari, Norazian; Fikri, Nazifah Ahmad
2011-01-01
The major compounds in honey are carbohydrates such as monosaccharides and disaccharides. The same compounds are found in cane-sugar concentrates. Unfortunately when sugar concentrate is added to honey, laboratory assessments are found to be ineffective in detecting this adulteration. Unlike tracing heavy metals in honey, sugar adulterated honey is much trickier and harder to detect, and traditionally it has been very challenging to come up with a suitable method to prove the presence of adulterants in honey products. This paper proposes a combination of array sensing and multi-modality sensor fusion that can effectively discriminate the samples not only based on the compounds present in the sample but also mimic the way humans perceive flavours and aromas. Conversely, analytical instruments are based on chemical separations which may alter the properties of the volatiles or flavours of a particular honey. The present work is focused on classifying 18 samples of different honeys, sugar syrups and adulterated samples using data fusion of electronic nose (e-nose) and electronic tongue (e-tongue) measurements. Each group of samples was evaluated separately by the e-nose and e-tongue. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) were able to separately discriminate monofloral honey from sugar syrup, and polyfloral honey from sugar and adulterated samples using the e-nose and e-tongue. The e-nose was observed to give better separation compared to e-tongue assessment, particularly when LDA was applied. However, when all samples were combined in one classification analysis, neither PCA nor LDA were able to discriminate between honeys of different floral origins, sugar syrup and adulterated samples. By applying a sensor fusion technique, the classification for the 18 different samples was improved. Significant improvement was observed using PCA, while LDA not only improved the discrimination but also gave better classification. An improvement in performance was also observed using a Probabilistic Neural Network classifier when the e-nose and e-tongue data were fused. PMID:22164046
Zakaria, Ammar; Shakaff, Ali Yeon Md; Masnan, Maz Jamilah; Ahmad, Mohd Noor; Adom, Abdul Hamid; Jaafar, Mahmad Nor; Ghani, Supri A; Abdullah, Abu Hassan; Aziz, Abdul Hallis Abdul; Kamarudin, Latifah Munirah; Subari, Norazian; Fikri, Nazifah Ahmad
2011-01-01
The major compounds in honey are carbohydrates such as monosaccharides and disaccharides. The same compounds are found in cane-sugar concentrates. Unfortunately when sugar concentrate is added to honey, laboratory assessments are found to be ineffective in detecting this adulteration. Unlike tracing heavy metals in honey, sugar adulterated honey is much trickier and harder to detect, and traditionally it has been very challenging to come up with a suitable method to prove the presence of adulterants in honey products. This paper proposes a combination of array sensing and multi-modality sensor fusion that can effectively discriminate the samples not only based on the compounds present in the sample but also mimic the way humans perceive flavours and aromas. Conversely, analytical instruments are based on chemical separations which may alter the properties of the volatiles or flavours of a particular honey. The present work is focused on classifying 18 samples of different honeys, sugar syrups and adulterated samples using data fusion of electronic nose (e-nose) and electronic tongue (e-tongue) measurements. Each group of samples was evaluated separately by the e-nose and e-tongue. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) were able to separately discriminate monofloral honey from sugar syrup, and polyfloral honey from sugar and adulterated samples using the e-nose and e-tongue. The e-nose was observed to give better separation compared to e-tongue assessment, particularly when LDA was applied. However, when all samples were combined in one classification analysis, neither PCA nor LDA were able to discriminate between honeys of different floral origins, sugar syrup and adulterated samples. By applying a sensor fusion technique, the classification for the 18 different samples was improved. Significant improvement was observed using PCA, while LDA not only improved the discrimination but also gave better classification. An improvement in performance was also observed using a Probabilistic Neural Network classifier when the e-nose and e-tongue data were fused.
Meltzer, H Y; Matsubara, S; Lee, J C
1989-10-01
The pKi values of 13 reference typical and 7 reference atypical antipsychotic drugs (APDs) for rat striatal dopamine D-1 and D-2 receptor binding sites and cortical serotonin (5-HT2) receptor binding sites were determined. The atypical antipsychotics had significantly lower pKi values for the D-2 but not 5-HT2 binding sites. There was a trend for a lower pKi value for the D-1 binding site for the atypical APD. The 5-HT2 and D-1 pKi values were correlated for the typical APD whereas the 5-HT2 and D-2 pKi values were correlated for the atypical APD. A stepwise discriminant function analysis to determine the independent contribution of each pKi value for a given binding site to the classification as a typical or atypical APD entered the D-2 pKi value first, followed by the 5-HT2 pKi value. The D-1 pKi value was not entered. A discriminant function analysis correctly classified 19 of 20 of these compounds plus 14 of 17 additional test compounds as typical or atypical APD for an overall correct classification rate of 89.2%. The major contributors to the discriminant function were the D-2 and 5-HT2 pKi values. A cluster analysis based only on the 5-HT2/D2 ratio grouped 15 of 17 atypical + one typical APD in one cluster and 19 of 20 typical + two atypical APDs in a second cluster, for an overall correct classification rate of 91.9%. When the stepwise discriminant function was repeated for all 37 compounds, only the D-2 and 5-HT2 pKi values were entered into the discriminant function.(ABSTRACT TRUNCATED AT 250 WORDS)
Study on nondestructive discrimination of genuine and counterfeit wild ginsengs using NIRS
NASA Astrophysics Data System (ADS)
Lu, Q.; Fan, Y.; Peng, Z.; Ding, H.; Gao, H.
2012-07-01
A new approach for the nondestructive discrimination between genuine wild ginsengs and the counterfeit ones by near infrared spectroscopy (NIRS) was developed. Both discriminant analysis and back propagation artificial neural network (BP-ANN) were applied to the model establishment for discrimination. Optimal modeling wavelengths were determined based on the anomalous spectral information of counterfeit samples. Through principal component analysis (PCA) of various wild ginseng samples, genuine and counterfeit, the cumulative percentages of variance of the principal components were obtained, serving as a reference for principal component (PC) factor determination. Discriminant analysis achieved an identification ratio of 88.46%. With sample' truth values as its outputs, a three-layer BP-ANN model was built, which yielded a higher discrimination accuracy of 100%. The overall results sufficiently demonstrate that NIRS combined with BP-ANN classification algorithm performs better on ginseng discrimination than discriminant analysis, and can be used as a rapid and nondestructive method for the detection of counterfeit wild ginsengs in food and pharmaceutical industry.
Laurencikas, E; Sävendahl, L; Jorulf, H
2006-06-01
To assess the value of the metacarpophalangeal pattern profile (MCPP) analysis as a diagnostic tool for differentiating between patients with dyschondrosteosis, Turner syndrome, and hypochondroplasia. Radiographic and clinical data from 135 patients between 1 and 51 years of age were collected and analyzed. The study included 25 patients with hypochondroplasia (HCP), 39 with dyschondrosteosis (LWD), and 71 with Turner syndrome (TS). Hand pattern profiles were calculated and compared with those of 110 normal individuals. Pearson correlation coefficient (r) and multivariate discriminant analysis were used for pattern profile analysis. Pattern variability index, a measure of dysmorphogenesis, was calculated for LWD, TS, HCP, and normal controls. Our results demonstrate that patients with LWD, TS, or HCP have distinct pattern profiles that are significantly different from each other and from those of normal controls. Discriminant analysis yielded correct classification of normal versus abnormal individuals in 84% of cases. Classification of the patients into LWD, TS, and HCP groups was successful in 75%. The correct classification rate was higher (85%) when differentiating two pathological groups at a time. Pattern variability index was not helpful for differential diagnosis of LWD, TS, and HCP. Patients with LWD, TS, or HCP have distinct MCPPs and can be successfully differentiated from each other using advanced MCPP analysis. Discriminant analysis is to be preferred over Pearson correlation coefficient because it is a more sensitive and specific technique. MCPP analysis is a helpful tool for differentiating between syndromes with similar clinical and radiological abnormalities.
Deep feature extraction and combination for synthetic aperture radar target classification
NASA Astrophysics Data System (ADS)
Amrani, Moussa; Jiang, Feng
2017-10-01
Feature extraction has always been a difficult problem in the classification performance of synthetic aperture radar automatic target recognition (SAR-ATR). It is very important to select discriminative features to train a classifier, which is a prerequisite. Inspired by the great success of convolutional neural network (CNN), we address the problem of SAR target classification by proposing a feature extraction method, which takes advantage of exploiting the extracted deep features from CNNs on SAR images to introduce more powerful discriminative features and robust representation ability for them. First, the pretrained VGG-S net is fine-tuned on moving and stationary target acquisition and recognition (MSTAR) public release database. Second, after a simple preprocessing is performed, the fine-tuned network is used as a fixed feature extractor to extract deep features from the processed SAR images. Third, the extracted deep features are fused by using a traditional concatenation and a discriminant correlation analysis algorithm. Finally, for target classification, K-nearest neighbors algorithm based on LogDet divergence-based metric learning triplet constraints is adopted as a baseline classifier. Experiments on MSTAR are conducted, and the classification accuracy results demonstrate that the proposed method outperforms the state-of-the-art methods.
Janousova, Eva; Schwarz, Daniel; Kasparek, Tomas
2015-06-30
We investigated a combination of three classification algorithms, namely the modified maximum uncertainty linear discriminant analysis (mMLDA), the centroid method, and the average linkage, with three types of features extracted from three-dimensional T1-weighted magnetic resonance (MR) brain images, specifically MR intensities, grey matter densities, and local deformations for distinguishing 49 first episode schizophrenia male patients from 49 healthy male subjects. The feature sets were reduced using intersubject principal component analysis before classification. By combining the classifiers, we were able to obtain slightly improved results when compared with single classifiers. The best classification performance (81.6% accuracy, 75.5% sensitivity, and 87.8% specificity) was significantly better than classification by chance. We also showed that classifiers based on features calculated using more computation-intensive image preprocessing perform better; mMLDA with classification boundary calculated as weighted mean discriminative scores of the groups had improved sensitivity but similar accuracy compared to the original MLDA; reducing a number of eigenvectors during data reduction did not always lead to higher classification accuracy, since noise as well as the signal important for classification were removed. Our findings provide important information for schizophrenia research and may improve accuracy of computer-aided diagnostics of neuropsychiatric diseases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Thomas, Randall W.; Ustin, Susan L.
1987-01-01
A preliminary assessment was made of Airborne Imaging Spectrometer (AIS) data for discriminating and characterizing vegetation in a semiarid environment. May and October AIS data sets were acquired over a large alluvial fan in eastern California, on which were found Great Basin desert shrub communities. Maximum likelihood classification of a principal components representation of the May AIS data enabled discrimination of subtle spatial detail in images relating to vegetation and soil characteristics. The spatial patterns in the May AIS classification were, however, too detailed for complete interpretation with existing ground data. A similar analysis of the October AIS data yielded poor results. Comparison of AIS results with a similar analysis of May Landsat Thematic Mapper data showed that the May AIS data contained approximately three to four times as much spectrally coherent information. When only two shortwave infrared TM bands were used, results were similar to those from AIS data acquired in October.
Magagna, Federico; Guglielmetti, Alessandro; Liberto, Erica; Reichenbach, Stephen E; Allegrucci, Elena; Gobino, Guido; Bicchi, Carlo; Cordero, Chiara
2017-08-02
This study investigates chemical information of volatile fractions of high-quality cocoa (Theobroma cacao L. Malvaceae) from different origins (Mexico, Ecuador, Venezuela, Columbia, Java, Trinidad, and Sao Tomè) produced for fine chocolate. This study explores the evolution of the entire pattern of volatiles in relation to cocoa processing (raw, roasted, steamed, and ground beans). Advanced chemical fingerprinting (e.g., combined untargeted and targeted fingerprinting) with comprehensive two-dimensional gas chromatography coupled with mass spectrometry allows advanced pattern recognition for classification, discrimination, and sensory-quality characterization. The entire data set is analyzed for 595 reliable two-dimensional peak regions, including 130 known analytes and 13 potent odorants. Multivariate analysis with unsupervised exploration (principal component analysis) and simple supervised discrimination methods (Fisher ratios and linear regression trees) reveal informative patterns of similarities and differences and identify characteristic compounds related to sample origin and manufacturing step.
Ryder, Alan G
2002-03-01
Eighty-five solid samples consisting of illegal narcotics diluted with several different materials were analyzed by near-infrared (785 nm excitation) Raman spectroscopy. Principal Component Analysis (PCA) was employed to classify the samples according to narcotic type. The best sample discrimination was obtained by using the first derivative of the Raman spectra. Furthermore, restricting the spectral variables for PCA to 2 or 3% of the original spectral data according to the most intense peaks in the Raman spectrum of the pure narcotic resulted in a rapid discrimination method for classifying samples according to narcotic type. This method allows for the easy discrimination between cocaine, heroin, and MDMA mixtures even when the Raman spectra are complex or very similar. This approach of restricting the spectral variables also decreases the computational time by a factor of 30 (compared to the complete spectrum), making the methodology attractive for rapid automatic classification and identification of suspect materials.
Vessel Classification in Cosmo-Skymed SAR Data Using Hierarchical Feature Selection
NASA Astrophysics Data System (ADS)
Makedonas, A.; Theoharatos, C.; Tsagaris, V.; Anastasopoulos, V.; Costicoglou, S.
2015-04-01
SAR based ship detection and classification are important elements of maritime monitoring applications. Recently, high-resolution SAR data have opened new possibilities to researchers for achieving improved classification results. In this work, a hierarchical vessel classification procedure is presented based on a robust feature extraction and selection scheme that utilizes scale, shape and texture features in a hierarchical way. Initially, different types of feature extraction algorithms are implemented in order to form the utilized feature pool, able to represent the structure, material, orientation and other vessel type characteristics. A two-stage hierarchical feature selection algorithm is utilized next in order to be able to discriminate effectively civilian vessels into three distinct types, in COSMO-SkyMed SAR images: cargos, small ships and tankers. In our analysis, scale and shape features are utilized in order to discriminate smaller types of vessels present in the available SAR data, or shape specific vessels. Then, the most informative texture and intensity features are incorporated in order to be able to better distinguish the civilian types with high accuracy. A feature selection procedure that utilizes heuristic measures based on features' statistical characteristics, followed by an exhaustive research with feature sets formed by the most qualified features is carried out, in order to discriminate the most appropriate combination of features for the final classification. In our analysis, five COSMO-SkyMed SAR data with 2.2m x 2.2m resolution were used to analyse the detailed characteristics of these types of ships. A total of 111 ships with available AIS data were used in the classification process. The experimental results show that this method has good performance in ship classification, with an overall accuracy reaching 83%. Further investigation of additional features and proper feature selection is currently in progress.
Burneo-Garcés, Carlos; Marín-Morales, Agar; Pérez-García, Miguel
2018-01-01
The ability of a wide range of psychological and actuarial measures to characterize crimes in the prison population has not yet been compared in a single study. Our main objective was to determine if the discriminant capacity of psychological measures (PM) and actuarial data (AD) varies according to the crime. An Ecuadorian sample of 576 men convicted of Robbery, Murder, Rape and Drug Possession crimes was evaluated through an ad hoc questionnaire, prison files and the Spanish adaptation of the Personality Assessment Inventory. Discriminant analysis was used to establish, for each crime, the discriminant capacity and the classification accuracy of a model composed of AD (socio-demographic and judicial measures) and a second model incorporating PM. The AD showed a superior discriminant capacity, whilst the contribution of both types of measures varied according to the crime. The PM generated some increase in the correct classification percentages for Murder, Rape and Drug Possession, but their contribution was zero for the crime of Robbery. Specific profiles of each crime were obtained from the strongest significant correlations between the value of each explanatory variable and the probability of belonging to the crime. The AD model is more robust when these four crimes are characterized. The contribution of AD and PM depends on the crime, and the inclusion of PM in actuarial models moderately optimizes the classification accuracy of Murder, Rape, and Drug Possession crimes. PMID:29874264
A Quantitative Analysis of Pulsed Signals Emitted by Wild Bottlenose Dolphins.
Luís, Ana Rita; Couchinho, Miguel N; Dos Santos, Manuel E
2016-01-01
Common bottlenose dolphins (Tursiops truncatus), produce a wide variety of vocal emissions for communication and echolocation, of which the pulsed repertoire has been the most difficult to categorize. Packets of high repetition, broadband pulses are still largely reported under a general designation of burst-pulses, and traditional attempts to classify these emissions rely mainly in their aural characteristics and in graphical aspects of spectrograms. Here, we present a quantitative analysis of pulsed signals emitted by wild bottlenose dolphins, in the Sado estuary, Portugal (2011-2014), and test the reliability of a traditional classification approach. Acoustic parameters (minimum frequency, maximum frequency, peak frequency, duration, repetition rate and inter-click-interval) were extracted from 930 pulsed signals, previously categorized using a traditional approach. Discriminant function analysis revealed a high reliability of the traditional classification approach (93.5% of pulsed signals were consistently assigned to their aurally based categories). According to the discriminant function analysis (Wilk's Λ = 0.11, F3, 2.41 = 282.75, P < 0.001), repetition rate is the feature that best enables the discrimination of different pulsed signals (structure coefficient = 0.98). Classification using hierarchical cluster analysis led to a similar categorization pattern: two main signal types with distinct magnitudes of repetition rate were clustered into five groups. The pulsed signals, here described, present significant differences in their time-frequency features, especially repetition rate (P < 0.001), inter-click-interval (P < 0.001) and duration (P < 0.001). We document the occurrence of a distinct signal type-short burst-pulses, and highlight the existence of a diverse repertoire of pulsed vocalizations emitted in graded sequences. The use of quantitative analysis of pulsed signals is essential to improve classifications and to better assess the contexts of emission, geographic variation and the functional significance of pulsed signals.
Estuarial fingerprinting through multidimensional fluorescence and multivariate analysis.
Hall, Gregory J; Clow, Kerin E; Kenny, Jonathan E
2005-10-01
As part of a strategy for preventing the introduction of aquatic nuisance species (ANS) to U.S. estuaries, ballast water exchange (BWE) regulations have been imposed. Enforcing these regulations requires a reliable method for determining the port of origin of water in the ballast tanks of ships entering U.S. waters. This study shows that a three-dimensional fluorescence fingerprinting technique, excitation emission matrix (EEM) spectroscopy, holds great promise as a ballast water analysis tool. In our technique, EEMs are analyzed by multivariate classification and curve resolution methods, such as N-way partial least squares Regression-discriminant analysis (NPLS-DA) and parallel factor analysis (PARAFAC). We demonstrate that classification techniques can be used to discriminate among sampling sites less than 10 miles apart, encompassing Boston Harbor and two tributaries in the Mystic River Watershed. To our knowledge, this work is the first to use multivariate analysis to classify water as to location of origin. Furthermore, it is shown that curve resolution can show seasonal features within the multidimensional fluorescence data sets, which correlate with difficulty in classification.
Bisenius, Sandrine; Mueller, Karsten; Diehl-Schmid, Janine; Fassbender, Klaus; Grimmer, Timo; Jessen, Frank; Kassubek, Jan; Kornhuber, Johannes; Landwehrmeyer, Bernhard; Ludolph, Albert; Schneider, Anja; Anderl-Straub, Sarah; Stuke, Katharina; Danek, Adrian; Otto, Markus; Schroeter, Matthias L
2017-01-01
Primary progressive aphasia (PPA) encompasses the three subtypes nonfluent/agrammatic variant PPA, semantic variant PPA, and the logopenic variant PPA, which are characterized by distinct patterns of language difficulties and regional brain atrophy. To validate the potential of structural magnetic resonance imaging data for early individual diagnosis, we used support vector machine classification on grey matter density maps obtained by voxel-based morphometry analysis to discriminate PPA subtypes (44 patients: 16 nonfluent/agrammatic variant PPA, 17 semantic variant PPA, 11 logopenic variant PPA) from 20 healthy controls (matched for sample size, age, and gender) in the cohort of the multi-center study of the German consortium for frontotemporal lobar degeneration. Here, we compared a whole-brain with a meta-analysis-based disease-specific regions-of-interest approach for support vector machine classification. We also used support vector machine classification to discriminate the three PPA subtypes from each other. Whole brain support vector machine classification enabled a very high accuracy between 91 and 97% for identifying specific PPA subtypes vs. healthy controls, and 78/95% for the discrimination between semantic variant vs. nonfluent/agrammatic or logopenic PPA variants. Only for the discrimination between nonfluent/agrammatic and logopenic PPA variants accuracy was low with 55%. Interestingly, the regions that contributed the most to the support vector machine classification of patients corresponded largely to the regions that were atrophic in these patients as revealed by group comparisons. Although the whole brain approach took also into account regions that were not covered in the regions-of-interest approach, both approaches showed similar accuracies due to the disease-specificity of the selected networks. Conclusion, support vector machine classification of multi-center structural magnetic resonance imaging data enables prediction of PPA subtypes with a very high accuracy paving the road for its application in clinical settings.
Detection of Genetically Modified Sugarcane by Using Terahertz Spectroscopy and Chemometrics
NASA Astrophysics Data System (ADS)
Liu, J.; Xie, H.; Zha, B.; Ding, W.; Luo, J.; Hu, C.
2018-03-01
A methodology is proposed to identify genetically modified sugarcane from non-genetically modified sugarcane by using terahertz spectroscopy and chemometrics techniques, including linear discriminant analysis (LDA), support vector machine-discriminant analysis (SVM-DA), and partial least squares-discriminant analysis (PLS-DA). The classification rate of the above mentioned methods is compared, and different types of preprocessing are considered. According to the experimental results, the best option is PLS-DA, with an identification rate of 98%. The results indicated that THz spectroscopy and chemometrics techniques are a powerful tool to identify genetically modified and non-genetically modified sugarcane.
Schönweiler, R; Wübbelt, P; Tolloczko, R; Rose, C; Ptok, M
2000-01-01
Discriminant analysis (DA) and self-organizing feature maps (SOFM) were used to classify passively evoked auditory event-related potentials (ERP) P(1), N(1), P(2) and N(2). Responses from 16 children with severe behavioral auditory perception deficits, 16 children with marked behavioral auditory perception deficits, and 14 controls were examined. Eighteen ERP amplitude parameters were selected for examination of statistical differences between the groups. Different DA methods and SOFM configurations were trained to the values. SOFM had better classification results than DA methods. Subsequently, measures on another 37 subjects that were unknown for the trained SOFM were used to test the reliability of the system. With 10-dimensional vectors, reliable classifications were obtained that matched behavioral auditory perception deficits in 96%, implying central auditory processing disorder (CAPD). The results also support the assumption that CAPD includes a 'non-peripheral' auditory processing deficit. Copyright 2000 S. Karger AG, Basel.
Graphical methods for the sensitivity analysis in discriminant analysis
Kim, Youngil; Anderson-Cook, Christine M.; Dae-Heung, Jang
2015-09-30
Similar to regression, many measures to detect influential data points in discriminant analysis have been developed. Many follow similar principles as the diagnostic measures used in linear regression in the context of discriminant analysis. Here we focus on the impact on the predicted classification posterior probability when a data point is omitted. The new method is intuitive and easily interpretative compared to existing methods. We also propose a graphical display to show the individual movement of the posterior probability of other data points when a specific data point is omitted. This enables the summaries to capture the overall pattern ofmore » the change.« less
Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Oh, Kyoungmin; Yoo, Hyeonchae; Ham, Hyeonheui; Kim, Moon S.
2017-01-01
The purpose of this study is to use near-infrared reflectance (NIR) spectroscopy equipment to nondestructively and rapidly discriminate Fusarium-infected hulled barley. Both normal hulled barley and Fusarium-infected hulled barley were scanned by using a NIR spectrometer with a wavelength range of 1175 to 2170 nm. Multiple mathematical pretreatments were applied to the reflectance spectra obtained for Fusarium discrimination and the multivariate analysis method of partial least squares discriminant analysis (PLS-DA) was used for discriminant prediction. The PLS-DA prediction model developed by applying the second-order derivative pretreatment to the reflectance spectra obtained from the side of hulled barley without crease achieved 100% accuracy in discriminating the normal hulled barley and the Fusarium-infected hulled barley. These results demonstrated the feasibility of rapid discrimination of the Fusarium-infected hulled barley by combining multivariate analysis with the NIR spectroscopic technique, which is utilized as a nondestructive detection method. PMID:28974012
Sabir, Aryani; Rafi, Mohamad; Darusman, Latifah K
2017-04-15
HPLC fingerprint analysis combined with chemometrics was developed to discriminate between the red and the white rice bran grown in Indonesia. The major component in rice bran is γ-oryzanol which consisted of 4 main compounds, namely cycloartenol ferulate, cyclobranol ferulate, campesterol ferulate and β-sitosterol ferulate. Separation of these four compounds along with other compounds was performed using C18 and methanol-acetonitrile with gradient elution system. By using these intensity variations, principal component and discriminant analysis were performed to discriminate the two samples. Discriminant analysis was successfully discriminated the red from the white rice bran with predictive ability of the model showed a satisfactory classification for the test samples. The results of this study indicated that the developed method was suitable as quality control method for rice bran in terms of identification and discrimination of the red and the white rice bran. Copyright © 2016 Elsevier Ltd. All rights reserved.
Hsiung, Chang; Pederson, Christopher G.; Zou, Peng; Smith, Valton; von Gunten, Marc; O’Brien, Nada A.
2016-01-01
Near-infrared spectroscopy as a rapid and non-destructive analytical technique offers great advantages for pharmaceutical raw material identification (RMID) to fulfill the quality and safety requirements in pharmaceutical industry. In this study, we demonstrated the use of portable miniature near-infrared (MicroNIR) spectrometers for NIR-based pharmaceutical RMID and solved two challenges in this area, model transferability and large-scale classification, with the aid of support vector machine (SVM) modeling. We used a set of 19 pharmaceutical compounds including various active pharmaceutical ingredients (APIs) and excipients and six MicroNIR spectrometers to test model transferability. For the test of large-scale classification, we used another set of 253 pharmaceutical compounds comprised of both chemically and physically different APIs and excipients. We compared SVM with conventional chemometric modeling techniques, including soft independent modeling of class analogy, partial least squares discriminant analysis, linear discriminant analysis, and quadratic discriminant analysis. Support vector machine modeling using a linear kernel, especially when combined with a hierarchical scheme, exhibited excellent performance in both model transferability and large-scale classification. Hence, ultra-compact, portable and robust MicroNIR spectrometers coupled with SVM modeling can make on-site and in situ pharmaceutical RMID for large-volume applications highly achievable. PMID:27029624
Polarimetric SAR image classification based on discriminative dictionary learning model
NASA Astrophysics Data System (ADS)
Sang, Cheng Wei; Sun, Hong
2018-03-01
Polarimetric SAR (PolSAR) image classification is one of the important applications of PolSAR remote sensing. It is a difficult high-dimension nonlinear mapping problem, the sparse representations based on learning overcomplete dictionary have shown great potential to solve such problem. The overcomplete dictionary plays an important role in PolSAR image classification, however for PolSAR image complex scenes, features shared by different classes will weaken the discrimination of learned dictionary, so as to degrade classification performance. In this paper, we propose a novel overcomplete dictionary learning model to enhance the discrimination of dictionary. The learned overcomplete dictionary by the proposed model is more discriminative and very suitable for PolSAR classification.
Panneton, Bernard; Guillaume, Serge; Roger, Jean-Michel; Samson, Guy
2010-01-01
Precision weeding by spot spraying in real time requires sensors to discriminate between weeds and crop without contact. Among the optical based solutions, the ultraviolet (UV) induced fluorescence of the plants appears as a promising alternative. In a first paper, the feasibility of discriminating between corn hybrids, monocotyledonous, and dicotyledonous weeds was demonstrated on the basis of the complete spectra. Some considerations about the different sources of fluorescence oriented the focus to the blue-green fluorescence (BGF) part, ignoring the chlorophyll fluorescence that is inherently more variable in time. This paper investigates the potential of performing weed/crop discrimination on the basis of several large spectral bands in the BGF area. A partial least squares discriminant analysis (PLS-DA) was performed on a set of 1908 spectra of corn and weed plants over 3 years and various growing conditions. The discrimination between monocotyledonous and dicotyledonous plants based on the blue-green fluorescence yielded robust models (classification error between 1.3 and 4.6% for between-year validation). On the basis of the analysis of the PLS-DA model, two large bands were chosen in the blue-green fluorescence zone (400-425 nm and 425-490 nm). A linear discriminant analysis based on the signal from these two bands also provided very robust inter-year results (classification error from 1.5% to 5.2%). The same selection process was applied to discriminate between monocotyledonous weeds and maize but yielded no robust models (up to 50% inter-year error). Further work will be required to solve this problem and provide a complete UV fluorescence based sensor for weed-maize discrimination.
Mandelkow, Hendrik; de Zwart, Jacco A.; Duyn, Jeff H.
2016-01-01
Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI). However, conventional fMRI analysis based on statistical parametric mapping (SPM) and the general linear model (GLM) is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA), have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past, this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms known as Nearest Neighbor (NN), Gaussian Naïve Bayes (GNB), and (regularized) Linear Discriminant Analysis (LDA) in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie. Results show that LDA regularized by principal component analysis (PCA) achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2 s apart during a 300 s movie (chance level 0.7% = 2 s/300 s). The largest source of classification errors were autocorrelations in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. In light of these results, the combination of naturalistic movie stimuli and classification analysis in fMRI experiments may prove to be a sensitive tool for the assessment of changes in natural cognitive processes under experimental manipulation. PMID:27065832
Hettick, Justin M; Green, Brett J; Buskirk, Amanda D; Kashon, Michael L; Slaven, James E; Janotka, Erika; Blachere, Francoise M; Schmechel, Detlef; Beezhold, Donald H
2008-09-15
Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) was used to generate highly reproducible mass spectral fingerprints for 12 species of fungi of the genus Aspergillus and 5 different strains of Aspergillus flavus. Prior to MALDI-TOF MS analysis, the fungi were subjected to three 1-min bead beating cycles in an acetonitrile/trifluoroacetic acid solvent. The mass spectra contain abundant peaks in the range of 5 to 20kDa and may be used to discriminate between species unambiguously. A discriminant analysis using all peaks from the MALDI-TOF MS data yielded error rates for classification of 0 and 18.75% for resubstitution and cross-validation methods, respectively. If a subset of 28 significant peaks is chosen, resubstitution and cross-validation error rates are 0%. Discriminant analysis of the MALDI-TOF MS data for 5 strains of A. flavus using all peaks yielded error rates for classification of 0 and 5% for resubstitution and cross-validation methods, respectively. These data indicate that MALDI-TOF MS data may be used for unambiguous identification of members of the genus Aspergillus at both the species and strain levels.
A manual and an automatic TERS based virus discrimination
NASA Astrophysics Data System (ADS)
Olschewski, Konstanze; Kämmer, Evelyn; Stöckel, Stephan; Bocklitz, Thomas; Deckert-Gaudig, Tanja; Zell, Roland; Cialla-May, Dana; Weber, Karina; Deckert, Volker; Popp, Jürgen
2015-02-01
Rapid techniques for virus identification are more relevant today than ever. Conventional virus detection and identification strategies generally rest upon various microbiological methods and genomic approaches, which are not suited for the analysis of single virus particles. In contrast, the highly sensitive spectroscopic technique tip-enhanced Raman spectroscopy (TERS) allows the characterisation of biological nano-structures like virions on a single-particle level. In this study, the feasibility of TERS in combination with chemometrics to discriminate two pathogenic viruses, Varicella-zoster virus (VZV) and Porcine teschovirus (PTV), was investigated. In a first step, chemometric methods transformed the spectral data in such a way that a rapid visual discrimination of the two examined viruses was enabled. In a further step, these methods were utilised to perform an automatic quality rating of the measured spectra. Spectra that passed this test were eventually used to calculate a classification model, through which a successful discrimination of the two viral species based on TERS spectra of single virus particles was also realised with a classification accuracy of 91%.Rapid techniques for virus identification are more relevant today than ever. Conventional virus detection and identification strategies generally rest upon various microbiological methods and genomic approaches, which are not suited for the analysis of single virus particles. In contrast, the highly sensitive spectroscopic technique tip-enhanced Raman spectroscopy (TERS) allows the characterisation of biological nano-structures like virions on a single-particle level. In this study, the feasibility of TERS in combination with chemometrics to discriminate two pathogenic viruses, Varicella-zoster virus (VZV) and Porcine teschovirus (PTV), was investigated. In a first step, chemometric methods transformed the spectral data in such a way that a rapid visual discrimination of the two examined viruses was enabled. In a further step, these methods were utilised to perform an automatic quality rating of the measured spectra. Spectra that passed this test were eventually used to calculate a classification model, through which a successful discrimination of the two viral species based on TERS spectra of single virus particles was also realised with a classification accuracy of 91%. Electronic supplementary information (ESI) available. See DOI: 10.1039/c4nr07033j
Micro-Raman spectroscopy of natural and synthetic indigo samples.
Vandenabeele, Peter; Moens, Luc
2003-02-01
In this work indigo samples from three different sources are studied by using Raman spectroscopy: the synthetic pigment and pigments from the woad (Isatis tinctoria) and the indigo plant (Indigofera tinctoria). 21 samples were obtained from 8 suppliers; for each sample 5 Raman spectra were recorded and used for further chemometrical analysis. Principal components analysis (PCA) was performed as data reduction method before applying hierarchical cluster analysis. Linear discriminant analysis (LDA) was implemented as a non-hierarchical supervised pattern recognition method to build a classification model. In order to avoid broad-shaped interferences from the fluorescence background, the influence of 1st and 2nd derivatives on the classification was studied by using cross-validation. Although chemically identical, it is shown that Raman spectroscopy in combination with suitable chemometric methods has the potential to discriminate between synthetic and natural indigo samples.
Jo, Javier A; Fang, Qiyin; Papaioannou, Thanassis; Baker, J Dennis; Dorafshar, Amir H; Reil, Todd; Qiao, Jian-Hua; Fishbein, Michael C; Freischlag, Julie A; Marcu, Laura
2006-01-01
We report the application of the Laguerre deconvolution technique (LDT) to the analysis of in-vivo time-resolved laser-induced fluorescence spectroscopy (TR-LIFS) data and the diagnosis of atherosclerotic plaques. TR-LIFS measurements were obtained in vivo from normal and atherosclerotic aortas (eight rabbits, 73 areas), and subsequently analyzed using LDT. Spectral and time-resolved features were used to develop four classification algorithms: linear discriminant analysis (LDA), stepwise LDA (SLDA), principal component analysis (PCA), and artificial neural network (ANN). Accurate deconvolution of TR-LIFS in-vivo measurements from normal and atherosclerotic arteries was provided by LDT. The derived Laguerre expansion coefficients reflected changes in the arterial biochemical composition, and provided a means to discriminate lesions rich in macrophages with high sensitivity (>85%) and specificity (>95%). Classification algorithms (SLDA and PCA) using a selected number of features with maximum discriminating power provided the best performance. This study demonstrates the potential of the LDT for in-vivo tissue diagnosis, and specifically for the detection of macrophages infiltration in atherosclerotic lesions, a key marker of plaque vulnerability.
NASA Astrophysics Data System (ADS)
Jo, Javier A.; Fang, Qiyin; Papaioannou, Thanassis; Baker, J. Dennis; Dorafshar, Amir; Reil, Todd; Qiao, Jianhua; Fishbein, Michael C.; Freischlag, Julie A.; Marcu, Laura
2006-03-01
We report the application of the Laguerre deconvolution technique (LDT) to the analysis of in-vivo time-resolved laser-induced fluorescence spectroscopy (TR-LIFS) data and the diagnosis of atherosclerotic plaques. TR-LIFS measurements were obtained in vivo from normal and atherosclerotic aortas (eight rabbits, 73 areas), and subsequently analyzed using LDT. Spectral and time-resolved features were used to develop four classification algorithms: linear discriminant analysis (LDA), stepwise LDA (SLDA), principal component analysis (PCA), and artificial neural network (ANN). Accurate deconvolution of TR-LIFS in-vivo measurements from normal and atherosclerotic arteries was provided by LDT. The derived Laguerre expansion coefficients reflected changes in the arterial biochemical composition, and provided a means to discriminate lesions rich in macrophages with high sensitivity (>85%) and specificity (>95%). Classification algorithms (SLDA and PCA) using a selected number of features with maximum discriminating power provided the best performance. This study demonstrates the potential of the LDT for in-vivo tissue diagnosis, and specifically for the detection of macrophages infiltration in atherosclerotic lesions, a key marker of plaque vulnerability.
Jo, Javier A.; Fang, Qiyin; Papaioannou, Thanassis; Baker, J. Dennis; Dorafshar, Amir H.; Reil, Todd; Qiao, Jian-Hua; Fishbein, Michael C.; Freischlag, Julie A.; Marcu, Laura
2007-01-01
We report the application of the Laguerre deconvolution technique (LDT) to the analysis of in-vivo time-resolved laser-induced fluorescence spectroscopy (TR-LIFS) data and the diagnosis of atherosclerotic plaques. TR-LIFS measurements were obtained in vivo from normal and atherosclerotic aortas (eight rabbits, 73 areas), and subsequently analyzed using LDT. Spectral and time-resolved features were used to develop four classification algorithms: linear discriminant analysis (LDA), stepwise LDA (SLDA), principal component analysis (PCA), and artificial neural network (ANN). Accurate deconvolution of TR-LIFS in-vivo measurements from normal and atherosclerotic arteries was provided by LDT. The derived Laguerre expansion coefficients reflected changes in the arterial biochemical composition, and provided a means to discriminate lesions rich in macrophages with high sensitivity (>85%) and specificity (>95%). Classification algorithms (SLDA and PCA) using a selected number of features with maximum discriminating power provided the best performance. This study demonstrates the potential of the LDT for in-vivo tissue diagnosis, and specifically for the detection of macrophages infiltration in atherosclerotic lesions, a key marker of plaque vulnerability. PMID:16674179
General tensor discriminant analysis and gabor features for gait recognition.
Tao, Dacheng; Li, Xuelong; Wu, Xindong; Maybank, Stephen J
2007-10-01
The traditional image representations are not suited to conventional classification methods, such as the linear discriminant analysis (LDA), because of the under sample problem (USP): the dimensionality of the feature space is much higher than the number of training samples. Motivated by the successes of the two dimensional LDA (2DLDA) for face recognition, we develop a general tensor discriminant analysis (GTDA) as a preprocessing step for LDA. The benefits of GTDA compared with existing preprocessing methods, e.g., principal component analysis (PCA) and 2DLDA, include 1) the USP is reduced in subsequent classification by, for example, LDA; 2) the discriminative information in the training tensors is preserved; and 3) GTDA provides stable recognition rates because the alternating projection optimization algorithm to obtain a solution of GTDA converges, while that of 2DLDA does not. We use human gait recognition to validate the proposed GTDA. The averaged gait images are utilized for gait representation. Given the popularity of Gabor function based image decompositions for image understanding and object recognition, we develop three different Gabor function based image representations: 1) the GaborD representation is the sum of Gabor filter responses over directions, 2) GaborS is the sum of Gabor filter responses over scales, and 3) GaborSD is the sum of Gabor filter responses over scales and directions. The GaborD, GaborS and GaborSD representations are applied to the problem of recognizing people from their averaged gait images.A large number of experiments were carried out to evaluate the effectiveness (recognition rate) of gait recognition based on first obtaining a Gabor, GaborD, GaborS or GaborSD image representation, then using GDTA to extract features and finally using LDA for classification. The proposed methods achieved good performance for gait recognition based on image sequences from the USF HumanID Database. Experimental comparisons are made with nine state of the art classification methods in gait recognition.
Zhou, Fei; Zhao, Yajing; Peng, Jiyu; Jiang, Yirong; Li, Maiquan; Jiang, Yuan; Lu, Baiyi
2017-07-01
Osmanthus fragrans flowers are used as folk medicine and additives for teas, beverages and foods. The metabolites of O. fragrans flowers from different geographical origins were inconsistent in some extent. Chromatography and mass spectrometry combined with multivariable analysis methods provides an approach for discriminating the origin of O. fragrans flowers. To discriminate the Osmanthus fragrans var. thunbergii flowers from different origins with the identified metabolites. GC-MS and UPLC-PDA were conducted to analyse the metabolites in O. fragrans var. thunbergii flowers (in total 150 samples). Principal component analysis (PCA), soft independent modelling of class analogy analysis (SIMCA) and random forest (RF) analysis were applied to group the GC-MS and UPLC-PDA data. GC-MS identified 32 compounds common to all samples while UPLC-PDA/QTOF-MS identified 16 common compounds. PCA of the UPLC-PDA data generated a better clustering than PCA of the GC-MS data. Ten metabolites (six from GC-MS and four from UPLC-PDA) were selected as effective compounds for discrimination by PCA loadings. SIMCA and RF analysis were used to build classification models, and the RF model, based on the four effective compounds (caffeic acid derivative, acteoside, ligustroside and compound 15), yielded better results with the classification rate of 100% in the calibration set and 97.8% in the prediction set. GC-MS and UPLC-PDA combined with multivariable analysis methods can discriminate the origin of Osmanthus fragrans var. thunbergii flowers. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Quantum Ensemble Classification: A Sampling-Based Learning Control Approach.
Chen, Chunlin; Dong, Daoyi; Qi, Bo; Petersen, Ian R; Rabitz, Herschel
2017-06-01
Quantum ensemble classification (QEC) has significant applications in discrimination of atoms (or molecules), separation of isotopes, and quantum information extraction. However, quantum mechanics forbids deterministic discrimination among nonorthogonal states. The classification of inhomogeneous quantum ensembles is very challenging, since there exist variations in the parameters characterizing the members within different classes. In this paper, we recast QEC as a supervised quantum learning problem. A systematic classification methodology is presented by using a sampling-based learning control (SLC) approach for quantum discrimination. The classification task is accomplished via simultaneously steering members belonging to different classes to their corresponding target states (e.g., mutually orthogonal states). First, a new discrimination method is proposed for two similar quantum systems. Then, an SLC method is presented for QEC. Numerical results demonstrate the effectiveness of the proposed approach for the binary classification of two-level quantum ensembles and the multiclass classification of multilevel quantum ensembles.
identification. URE from ten MSP430F5529 16-bit microcontrollers were analyzed using: 1) RF distinct native attributes (RF-DNA) fingerprints paired with multiple...discriminant analysis/maximum likelihood (MDA/ML) classification, 2) RF-DNA fingerprints paired with generalized relevance learning vector quantized
Fang, Chen; Li, Chunfei; Cabrerizo, Mercedes; Barreto, Armando; Andrian, Jean; Rishe, Naphtali; Loewenstein, David; Duara, Ranjan; Adjouadi, Malek
2018-04-12
Over the past few years, several approaches have been proposed to assist in the early diagnosis of Alzheimer's disease (AD) and its prodromal stage of mild cognitive impairment (MCI). Using multimodal biomarkers for this high-dimensional classification problem, the widely used algorithms include Support Vector Machines (SVM), Sparse Representation-based classification (SRC), Deep Belief Networks (DBN) and Random Forest (RF). These widely used algorithms continue to yield unsatisfactory performance for delineating the MCI participants from the cognitively normal control (CN) group. A novel Gaussian discriminant analysis-based algorithm is thus introduced to achieve a more effective and accurate classification performance than the aforementioned state-of-the-art algorithms. This study makes use of magnetic resonance imaging (MRI) data uniquely as input to two separate high-dimensional decision spaces that reflect the structural measures of the two brain hemispheres. The data used include 190 CN, 305 MCI and 133 AD subjects as part of the AD Big Data DREAM Challenge #1. Using 80% data for a 10-fold cross-validation, the proposed algorithm achieved an average F1 score of 95.89% and an accuracy of 96.54% for discriminating AD from CN; and more importantly, an average F1 score of 92.08% and an accuracy of 90.26% for discriminating MCI from CN. Then, a true test was implemented on the remaining 20% held-out test data. For discriminating MCI from CN, an accuracy of 80.61%, a sensitivity of 81.97% and a specificity of 78.38% were obtained. These results show significant improvement over existing algorithms for discriminating the subtle differences between MCI participants and the CN group.
Drivelos, Spiros A; Higgins, Kevin; Kalivas, John H; Haroutounian, Serkos A; Georgiou, Constantinos A
2014-12-15
"Fava Santorinis", is a protected designation of origin (PDO) yellow split pea species growing only in the island of Santorini in Greece. Due to its nutritional quality and taste, it has gained a high monetary value. Thus, it is prone to adulteration with other yellow split peas. In order to discriminate "Fava Santorinis" from other yellow split peas, four classification methods utilising rare earth elements (REEs) measured through inductively coupled plasma-mass spectrometry (ICP-MS) are studied. The four classification processes are orthogonal projection analysis (OPA), Mahalanobis distance (MD), partial least squares discriminant analysis (PLS-DA) and k nearest neighbours (KNN). Since it is known that trace elements are often useful to determine geographical origin of food products, we further quantitated for trace elements using ICP-MS. Presented in this paper are results using the four classification processes based on the fusion of the REEs data with the trace element data. Overall, the OPA method was found to perform best with up to 100% accuracy using the fused data. Copyright © 2014 Elsevier Ltd. All rights reserved.
Random whole metagenomic sequencing for forensic discrimination of soils.
Khodakova, Anastasia S; Smith, Renee J; Burgoyne, Leigh; Abarno, Damien; Linacre, Adrian
2014-01-01
Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations.
Arrigoni, Simone; Turra, Giovanni; Signoroni, Alberto
2017-09-01
With the rapid diffusion of Full Laboratory Automation systems, Clinical Microbiology is currently experiencing a new digital revolution. The ability to capture and process large amounts of visual data from microbiological specimen processing enables the definition of completely new objectives. These include the direct identification of pathogens growing on culturing plates, with expected improvements in rapid definition of the right treatment for patients affected by bacterial infections. In this framework, the synergies between light spectroscopy and image analysis, offered by hyperspectral imaging, are of prominent interest. This leads us to assess the feasibility of a reliable and rapid discrimination of pathogens through the classification of their spectral signatures extracted from hyperspectral image acquisitions of bacteria colonies growing on blood agar plates. We designed and implemented the whole data acquisition and processing pipeline and performed a comprehensive comparison among 40 combinations of different data preprocessing and classification techniques. High discrimination performance has been achieved also thanks to improved colony segmentation and spectral signature extraction. Experimental results reveal the high accuracy and suitability of the proposed approach, driving the selection of most suitable and scalable classification pipelines and stimulating clinical validations. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Hoffbeck, Joseph P.; Landgrebe, David A.
1994-01-01
Many analysis algorithms for high-dimensional remote sensing data require that the remotely sensed radiance spectra be transformed to approximate reflectance to allow comparison with a library of laboratory reflectance spectra. In maximum likelihood classification, however, the remotely sensed spectra are compared to training samples, thus a transformation to reflectance may or may not be helpful. The effect of several radiance-to-reflectance transformations on maximum likelihood classification accuracy is investigated in this paper. We show that the empirical line approach, LOWTRAN7, flat-field correction, single spectrum method, and internal average reflectance are all non-singular affine transformations, and that non-singular affine transformations have no effect on discriminant analysis feature extraction and maximum likelihood classification accuracy. (An affine transformation is a linear transformation with an optional offset.) Since the Atmosphere Removal Program (ATREM) and the log residue method are not affine transformations, experiments with Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data were conducted to determine the effect of these transformations on maximum likelihood classification accuracy. The average classification accuracy of the data transformed by ATREM and the log residue method was slightly less than the accuracy of the original radiance data. Since the radiance-to-reflectance transformations allow direct comparison of remotely sensed spectra with laboratory reflectance spectra, they can be quite useful in labeling the training samples required by maximum likelihood classification, but these transformations have only a slight effect or no effect at all on discriminant analysis and maximum likelihood classification accuracy.
Feasibility of laser-induced breakdown spectroscopy (LIBS) for classification of sea salts.
Tan, Man Minh; Cui, Sheng; Yoo, Jonghyun; Han, Song-Hee; Ham, Kyung-Sik; Nam, Sang-Ho; Lee, Yonghoon
2012-03-01
We have investigated the feasibility of laser-induced breakdown spectroscopy (LIBS) as a fast, reliable classification tool for sea salts. For 11 kinds of sea salts, potassium (K), magnesium (Mg), calcium (Ca), and aluminum (Al), concentrations were measured by inductively coupled plasma-atomic emission spectroscopy (ICP-AES), and the LIBS spectra were recorded in the narrow wavelength region between 760 and 800 nm where K (I), Mg (I), Ca (II), Al (I), and cyanide (CN) band emissions are observed. The ICP-AES measurements revealed that the K, Mg, Ca, and Al concentrations varied significantly with the provenance of each salt. The relative intensities of the K (I), Mg (I), Ca (II), and Al (I) peaks observed in the LIBS spectra are consistent with the results using ICP-AES. The principal component analysis of the LIBS spectra provided the score plot with quite a high degree of clustering. This indicates that classification of sea salts by chemometric analysis of LIBS spectra is very promising. Classification models were developed by partial least squares discriminant analysis (PLS-DA) and evaluated. In addition, the Al (I) peaks enabled us to discriminate between different production methods of the salts. © 2012 Society for Applied Spectroscopy
NASA Astrophysics Data System (ADS)
Navratil, Peter; Wilps, Hans
2013-01-01
Three different object-based image classification techniques are applied to high-resolution satellite data for the mapping of the habitats of Asian migratory locust (Locusta migratoria migratoria) in the southern Aral Sea basin, Uzbekistan. A set of panchromatic and multispectral Système Pour l'Observation de la Terre-5 satellite images was spectrally enhanced by normalized difference vegetation index and tasseled cap transformation and segmented into image objects, which were then classified by three different classification approaches: a rule-based hierarchical fuzzy threshold (HFT) classification method was compared to a supervised nearest neighbor classifier and classification tree analysis by the quick, unbiased, efficient statistical trees algorithm. Special emphasis was laid on the discrimination of locust feeding and breeding habitats due to the significance of this discrimination for practical locust control. Field data on vegetation and land cover, collected at the time of satellite image acquisition, was used to evaluate classification accuracy. The results show that a robust HFT classifier outperformed the two automated procedures by 13% overall accuracy. The classification method allowed a reliable discrimination of locust feeding and breeding habitats, which is of significant importance for the application of the resulting data for an economically and environmentally sound control of locust pests because exact spatial knowledge on the habitat types allows a more effective surveying and use of pesticides.
Integrated Low-Rank-Based Discriminative Feature Learning for Recognition.
Zhou, Pan; Lin, Zhouchen; Zhang, Chao
2016-05-01
Feature learning plays a central role in pattern recognition. In recent years, many representation-based feature learning methods have been proposed and have achieved great success in many applications. However, these methods perform feature learning and subsequent classification in two separate steps, which may not be optimal for recognition tasks. In this paper, we present a supervised low-rank-based approach for learning discriminative features. By integrating latent low-rank representation (LatLRR) with a ridge regression-based classifier, our approach combines feature learning with classification, so that the regulated classification error is minimized. In this way, the extracted features are more discriminative for the recognition tasks. Our approach benefits from a recent discovery on the closed-form solutions to noiseless LatLRR. When there is noise, a robust Principal Component Analysis (PCA)-based denoising step can be added as preprocessing. When the scale of a problem is large, we utilize a fast randomized algorithm to speed up the computation of robust PCA. Extensive experimental results demonstrate the effectiveness and robustness of our method.
NASA Astrophysics Data System (ADS)
Weller, Andrew F.; Harris, Anthony J.; Ware, J. Andrew; Jarvis, Paul S.
2006-11-01
The classification of sedimentary organic matter (OM) images can be improved by determining the saliency of image analysis (IA) features measured from them. Knowing the saliency of IA feature measurements means that only the most significant discriminating features need be used in the classification process. This is an important consideration for classification techniques such as artificial neural networks (ANNs), where too many features can lead to the 'curse of dimensionality'. The classification scheme adopted in this work is a hybrid of morphologically and texturally descriptive features from previous manual classification schemes. Some of these descriptive features are assigned to IA features, along with several others built into the IA software (Halcon) to ensure that a valid cross-section is available. After an image is captured and segmented, a total of 194 features are measured for each particle. To reduce this number to a more manageable magnitude, the SPSS AnswerTree Exhaustive CHAID (χ 2 automatic interaction detector) classification tree algorithm is used to establish each measurement's saliency as a classification discriminator. In the case of continuous data as used here, the F-test is used as opposed to the published algorithm. The F-test checks various statistical hypotheses about the variance of groups of IA feature measurements obtained from the particles to be classified. The aim is to reduce the number of features required to perform the classification without reducing its accuracy. In the best-case scenario, 194 inputs are reduced to 8, with a subsequent multi-layer back-propagation ANN recognition rate of 98.65%. This paper demonstrates the ability of the algorithm to reduce noise, help overcome the curse of dimensionality, and facilitate an understanding of the saliency of IA features as discriminators for sedimentary OM classification.
Geographical classification of apple based on hyperspectral imaging
NASA Astrophysics Data System (ADS)
Guo, Zhiming; Huang, Wenqian; Chen, Liping; Zhao, Chunjiang; Peng, Yankun
2013-05-01
Attribute of apple according to geographical origin is often recognized and appreciated by the consumers. It is usually an important factor to determine the price of a commercial product. Hyperspectral imaging technology and supervised pattern recognition was attempted to discriminate apple according to geographical origins in this work. Hyperspectral images of 207 Fuji apple samples were collected by hyperspectral camera (400-1000nm). Principal component analysis (PCA) was performed on hyperspectral imaging data to determine main efficient wavelength images, and then characteristic variables were extracted by texture analysis based on gray level co-occurrence matrix (GLCM) from dominant waveband image. All characteristic variables were obtained by fusing the data of images in efficient spectra. Support vector machine (SVM) was used to construct the classification model, and showed excellent performance in classification results. The total classification rate had the high classify accuracy of 92.75% in the training set and 89.86% in the prediction sets, respectively. The overall results demonstrated that the hyperspectral imaging technique coupled with SVM classifier can be efficiently utilized to discriminate Fuji apple according to geographical origins.
Ghorai, Santanu; Mukherjee, Anirban; Dutta, Pranab K
2010-06-01
In this brief we have proposed the multiclass data classification by computationally inexpensive discriminant analysis through vector-valued regularized kernel function approximation (VVRKFA). VVRKFA being an extension of fast regularized kernel function approximation (FRKFA), provides the vector-valued response at single step. The VVRKFA finds a linear operator and a bias vector by using a reduced kernel that maps a pattern from feature space into the low dimensional label space. The classification of patterns is carried out in this low dimensional label subspace. A test pattern is classified depending on its proximity to class centroids. The effectiveness of the proposed method is experimentally verified and compared with multiclass support vector machine (SVM) on several benchmark data sets as well as on gene microarray data for multi-category cancer classification. The results indicate the significant improvement in both training and testing time compared to that of multiclass SVM with comparable testing accuracy principally in large data sets. Experiments in this brief also serve as comparison of performance of VVRKFA with stratified random sampling and sub-sampling.
Using complex networks for text classification: Discriminating informative and imaginative documents
NASA Astrophysics Data System (ADS)
de Arruda, Henrique F.; Costa, Luciano da F.; Amancio, Diego R.
2016-01-01
Statistical methods have been widely employed in recent years to grasp many language properties. The application of such techniques have allowed an improvement of several linguistic applications, such as machine translation and document classification. In the latter, many approaches have emphasised the semantical content of texts, as is the case of bag-of-word language models. These approaches have certainly yielded reasonable performance. However, some potential features such as the structural organization of texts have been used only in a few studies. In this context, we probe how features derived from textual structure analysis can be effectively employed in a classification task. More specifically, we performed a supervised classification aiming at discriminating informative from imaginative documents. Using a networked model that describes the local topological/dynamical properties of function words, we achieved an accuracy rate of up to 95%, which is much higher than similar networked approaches. A systematic analysis of feature relevance revealed that symmetry and accessibility measurements are among the most prominent network measurements. Our results suggest that these measurements could be used in related language applications, as they play a complementary role in characterising texts.
Kianmehr, Keivan; Alhajj, Reda
2008-09-01
In this study, we aim at building a classification framework, namely the CARSVM model, which integrates association rule mining and support vector machine (SVM). The goal is to benefit from advantages of both, the discriminative knowledge represented by class association rules and the classification power of the SVM algorithm, to construct an efficient and accurate classifier model that improves the interpretability problem of SVM as a traditional machine learning technique and overcomes the efficiency issues of associative classification algorithms. In our proposed framework: instead of using the original training set, a set of rule-based feature vectors, which are generated based on the discriminative ability of class association rules over the training samples, are presented to the learning component of the SVM algorithm. We show that rule-based feature vectors present a high-qualified source of discrimination knowledge that can impact substantially the prediction power of SVM and associative classification techniques. They provide users with more conveniences in terms of understandability and interpretability as well. We have used four datasets from UCI ML repository to evaluate the performance of the developed system in comparison with five well-known existing classification methods. Because of the importance and popularity of gene expression analysis as real world application of the classification model, we present an extension of CARSVM combined with feature selection to be applied to gene expression data. Then, we describe how this combination will provide biologists with an efficient and understandable classifier model. The reported test results and their biological interpretation demonstrate the applicability, efficiency and effectiveness of the proposed model. From the results, it can be concluded that a considerable increase in classification accuracy can be obtained when the rule-based feature vectors are integrated in the learning process of the SVM algorithm. In the context of applicability, according to the results obtained from gene expression analysis, we can conclude that the CARSVM system can be utilized in a variety of real world applications with some adjustments.
Waldman, John R.; Fabrizio, Mary C.
1994-01-01
Stock contribution studies of mixed-stock fisheries rely on the application of classification algorithms to samples of unknown origin. Although the performance of these algorithms can be assessed, there are no guidelines regarding decisions about including minor stocks, pooling stocks into regional groups, or sampling discrete substocks to adequately characterize a stock. We examined these questions for striped bass Morone saxatilis of the U.S. Atlantic coast by applying linear discriminant functions to meristic and morphometric data from fish collected from spawning areas. Some of our samples were from the Hudson and Roanoke rivers and four tributaries of the Chesapeake Bay. We also collected fish of mixed-stock origin from the Atlantic Ocean near Montauk, New York. Inclusion of the minor stock from the Roanoke River in the classification algorithm decreased the correct-classification rate, whereas grouping of the Roanoke River and Chesapeake Bay stock into a regional (''southern'') group increased the overall resolution. The increased resolution was offset by our inability to obtain separate contribution estimates of the groups that were pooled. Although multivariate analysis of variance indicated significant differences among Chesapeake Bay substocks, increasing the number of substocks in the discriminant analysis decreased the overall correct-classification rate. Although the inclusion of one, two, three, or four substocks in the classification algorithm did not greatly affect the overall correct-classification rates, the specific combination of substocks significantly affected the relative contribution estimates derived from the mixed-stock sample. Future studies of this kind must balance the costs and benefits of including minor stocks and would profit from examination of the variation in discriminant characters among all Chesapeake Bay substocks.
NASA Astrophysics Data System (ADS)
Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying
2018-06-01
In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with SVM classifier using Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique assisted with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues suffering minor clinical changes, thus may advance the diagnosis of early lesions and abnormalities.
Caprihan, A; Pearlson, G D; Calhoun, V D
2008-08-15
Principal component analysis (PCA) is often used to reduce the dimension of data before applying more sophisticated data analysis methods such as non-linear classification algorithms or independent component analysis. This practice is based on selecting components corresponding to the largest eigenvalues. If the ultimate goal is separation of data in two groups, then these set of components need not have the most discriminatory power. We measured the distance between two such populations using Mahalanobis distance and chose the eigenvectors to maximize it, a modified PCA method, which we call the discriminant PCA (DPCA). DPCA was applied to diffusion tensor-based fractional anisotropy images to distinguish age-matched schizophrenia subjects from healthy controls. The performance of the proposed method was evaluated by the one-leave-out method. We show that for this fractional anisotropy data set, the classification error with 60 components was close to the minimum error and that the Mahalanobis distance was twice as large with DPCA, than with PCA. Finally, by masking the discriminant function with the white matter tracts of the Johns Hopkins University atlas, we identified left superior longitudinal fasciculus as the tract which gave the least classification error. In addition, with six optimally chosen tracts the classification error was zero.
Travassos, Claudia; Laguardia, Josué; Marques, Priscilla M; Mota, Jurema C; Szwarcwald, Celia L
2011-08-25
This paper aims to compare the classification of race/skin color based on the discrete categories used by the Demographic Census of the Brazilian Institute of Geography and Statistics (IBGE) and a skin color scale with values ranging from 1 (lighter skin) to 10 (darker skin), examining whether choosing one alternative or the other can influence measures of self-evaluation of health status, health care service utilization and discrimination in the health services. This is a cross-sectional study based on data from the World Health Survey carried out in Brazil in 2003 with a sample of 5000 individuals older than 18 years. Similarities between the two classifications were evaluated by means of correspondence analysis. The effect of the two classifications on health outcomes was tested through logistic regression models for each sex, using age, educational level and ownership of consumer goods as covariables. Both measures of race/skin color represent the same race/skin color construct. The results show a tendency among Brazilians to classify their skin color in shades closer to the center of the color gradient. Women tend to classify their race/skin color as a little lighter than men in the skin color scale, an effect not observed when IBGE categories are used. With regard to health and health care utilization, race/skin color was not relevant in explaining any of them, regardless of the race/skin color classification. Lack of money and social class were the most prevalent reasons for discrimination in healthcare reported in the survey, suggesting that in Brazil the discussion about discrimination in the health care must not be restricted to racial discrimination and should also consider class-based discrimination. The study shows that the differences of the two classifications of race/skin color are small. However, the interval scale measure appeared to increase the freedom of choice of the respondent.
2011-01-01
Background This paper aims to compare the classification of race/skin color based on the discrete categories used by the Demographic Census of the Brazilian Institute of Geography and Statistics (IBGE) and a skin color scale with values ranging from 1 (lighter skin) to 10 (darker skin), examining whether choosing one alternative or the other can influence measures of self-evaluation of health status, health care service utilization and discrimination in the health services. Methods This is a cross-sectional study based on data from the World Health Survey carried out in Brazil in 2003 with a sample of 5000 individuals older than 18 years. Similarities between the two classifications were evaluated by means of correspondence analysis. The effect of the two classifications on health outcomes was tested through logistic regression models for each sex, using age, educational level and ownership of consumer goods as covariables. Results Both measures of race/skin color represent the same race/skin color construct. The results show a tendency among Brazilians to classify their skin color in shades closer to the center of the color gradient. Women tend to classify their race/skin color as a little lighter than men in the skin color scale, an effect not observed when IBGE categories are used. With regard to health and health care utilization, race/skin color was not relevant in explaining any of them, regardless of the race/skin color classification. Lack of money and social class were the most prevalent reasons for discrimination in healthcare reported in the survey, suggesting that in Brazil the discussion about discrimination in the health care must not be restricted to racial discrimination and should also consider class-based discrimination. The study shows that the differences of the two classifications of race/skin color are small. However, the interval scale measure appeared to increase the freedom of choice of the respondent. PMID:21867522
Hosseinifard, Behshad; Moradi, Mohammad Hassan; Rostami, Reza
2013-03-01
Diagnosing depression in the early curable stages is very important and may even save the life of a patient. In this paper, we study nonlinear analysis of EEG signal for discriminating depression patients and normal controls. Forty-five unmedicated depressed patients and 45 normal subjects were participated in this study. Power of four EEG bands and four nonlinear features including detrended fluctuation analysis (DFA), higuchi fractal, correlation dimension and lyapunov exponent were extracted from EEG signal. For discriminating the two groups, k-nearest neighbor, linear discriminant analysis and logistic regression as the classifiers are then used. Highest classification accuracy of 83.3% is obtained by correlation dimension and LR classifier among other nonlinear features. For further improvement, all nonlinear features are combined and applied to classifiers. A classification accuracy of 90% is achieved by all nonlinear features and LR classifier. In all experiments, genetic algorithm is employed to select the most important features. The proposed technique is compared and contrasted with the other reported methods and it is demonstrated that by combining nonlinear features, the performance is enhanced. This study shows that nonlinear analysis of EEG can be a useful method for discriminating depressed patients and normal subjects. It is suggested that this analysis may be a complementary tool to help psychiatrists for diagnosing depressed patients. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Using color histograms and SPA-LDA to classify bacteria.
de Almeida, Valber Elias; da Costa, Gean Bezerra; de Sousa Fernandes, David Douglas; Gonçalves Dias Diniz, Paulo Henrique; Brandão, Deysiane; de Medeiros, Ana Claudia Dantas; Véras, Germano
2014-09-01
In this work, a new approach is proposed to verify the differentiating characteristics of five bacteria (Escherichia coli, Enterococcus faecalis, Streptococcus salivarius, Streptococcus oralis, and Staphylococcus aureus) by using digital images obtained with a simple webcam and variable selection by the Successive Projections Algorithm associated with Linear Discriminant Analysis (SPA-LDA). In this sense, color histograms in the red-green-blue (RGB), hue-saturation-value (HSV), and grayscale channels and their combinations were used as input data, and statistically evaluated by using different multivariate classifiers (Soft Independent Modeling by Class Analogy (SIMCA), Principal Component Analysis-Linear Discriminant Analysis (PCA-LDA), Partial Least Squares Discriminant Analysis (PLS-DA) and Successive Projections Algorithm-Linear Discriminant Analysis (SPA-LDA)). The bacteria strains were cultivated in a nutritive blood agar base layer for 24 h by following the Brazilian Pharmacopoeia, maintaining the status of cell growth and the nature of nutrient solutions under the same conditions. The best result in classification was obtained by using RGB and SPA-LDA, which reached 94 and 100 % of classification accuracy in the training and test sets, respectively. This result is extremely positive from the viewpoint of routine clinical analyses, because it avoids bacterial identification based on phenotypic identification of the causative organism using Gram staining, culture, and biochemical proofs. Therefore, the proposed method presents inherent advantages, promoting a simpler, faster, and low-cost alternative for bacterial identification.
Parasites as biological tags of fish stocks: a meta-analysis of their discriminatory power.
Poulin, Robert; Kamiya, Tsukushi
2015-01-01
The use of parasites as biological tags to discriminate among marine fish stocks has become a widely accepted method in fisheries management. Here, we first link this approach to its unstated ecological foundation, the decay in the similarity of the species composition of assemblages as a function of increasing distance between them, a phenomenon almost universal in nature. We explain how distance decay of similarity can influence the use of parasites as biological tags. Then, we perform a meta-analysis of 61 uses of parasites as tags of marine fish populations in multivariate discriminant analyses, obtained from 29 articles. Our main finding is that across all studies, the observed overall probability of correct classification of fish based on parasite data was about 71%. This corresponds to a two-fold improvement over the rate of correct classification expected by chance alone, and the average effect size (Zr = 0·463) computed from the original values was also indicative of a medium-to-large effect. However, none of the moderator variables included in the meta-analysis had a significant effect on the proportion of correct classification; these moderators included the total number of fish sampled, the number of parasite species used in the discriminant analysis, the number of localities from which fish were sampled, the minimum and maximum distance between any pair of sampling localities, etc. Therefore, there are no clear-cut situations in which the use of parasites as tags is more useful than others. Finally, we provide recommendations for the future usage of parasites as tags for stock discrimination, to ensure that future applications of the method achieve statistical rigour and a high discriminatory power.
Besga, Ariadna; Gonzalez, Itxaso; Echeburua, Enrique; Savio, Alexandre; Ayerdi, Borja; Chyzhyk, Darya; Madrigal, Jose L M; Leza, Juan C; Graña, Manuel; Gonzalez-Pinto, Ana Maria
2015-01-01
Late onset bipolar disorder (LOBD) is often difficult to distinguish from degenerative dementias, such as Alzheimer disease (AD), due to comorbidities and common cognitive symptoms. Moreover, LOBD prevalence in the elder population is not negligible and it is increasing. Both pathologies share pathophysiological neuroinflammation features. Improvements in differential diagnosis of LOBD and AD will help to select the best personalized treatment. The aim of this study is to assess the relative significance of clinical observations, neuropsychological tests, and specific blood plasma biomarkers (inflammatory and neurotrophic), separately and combined, in the differential diagnosis of LOBD versus AD. It was carried out evaluating the accuracy achieved by classification-based computer-aided diagnosis (CAD) systems based on these variables. A sample of healthy controls (HC) (n = 26), AD patients (n = 37), and LOBD patients (n = 32) was recruited at the Alava University Hospital. Clinical observations, neuropsychological tests, and plasma biomarkers were measured at recruitment time. We applied multivariate machine learning classification methods to discriminate subjects from HC, AD, and LOBD populations in the study. We analyzed, for each classification contrast, feature sets combining clinical observations, neuropsychological measures, and biological markers, including inflammation biomarkers. Furthermore, we analyzed reduced feature sets containing variables with significative differences determined by a Welch's t-test. Furthermore, a battery of classifier architectures were applied, encompassing linear and non-linear Support Vector Machines (SVM), Random Forests (RF), Classification and regression trees (CART), and their performance was evaluated in a leave-one-out (LOO) cross-validation scheme. Post hoc analysis of Gini index in CART classifiers provided a measure of each variable importance. Welch's t-test found one biomarker (Malondialdehyde) with significative differences (p < 0.001) in LOBD vs. AD contrast. Classification results with the best features are as follows: discrimination of HC vs. AD patients reaches accuracy 97.21% and AUC 98.17%. Discrimination of LOBD vs. AD patients reaches accuracy 90.26% and AUC 89.57%. Discrimination of HC vs LOBD patients achieves accuracy 95.76% and AUC 88.46%. It is feasible to build CAD systems for differential diagnosis of LOBD and AD on the basis of a reduced set of clinical variables. Clinical observations provide the greatest discrimination. Neuropsychological tests are improved by the addition of biomarkers, and both contribute significantly to improve the overall predictive performance.
Classification and pose estimation of objects using nonlinear features
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.
1998-03-01
A new nonlinear feature extraction method called the maximum representation and discrimination feature (MRDF) method is presented for extraction of features from input image data. It implements transformations similar to the Sigma-Pi neural network. However, the weights of the MRDF are obtained in closed form, and offer advantages compared to nonlinear neural network implementations. The features extracted are useful for both object discrimination (classification) and object representation (pose estimation). We show its use in estimating the class and pose of images of real objects and rendered solid CAD models of machine parts from single views using a feature-space trajectory (FST) neural network classifier. We show more accurate classification and pose estimation results than are achieved by standard principal component analysis (PCA) and Fukunaga-Koontz (FK) feature extraction methods.
NASA Astrophysics Data System (ADS)
Tulip, David F.; Lucas, Keith B.
1991-12-01
At a time when recruitment into preservice teacher education courses in mathematics and science is difficult, one strategy to increase the number of graduates is to minimise the number of students who fail to complete their university courses. This study sought to determine factors which distinguish withdrawers from persisters in the first semester of a B.Ed course. Discriminant analysis was employed; a discriminant function employing seven factors resulted in correct classification in 81% of cases. Further analysis distinguishing between dropouts and transferees resulted in two discriminant functions with some common variables.
[The homogeneity of a population of yeasts from Camembert cheeses].
Schmidt, J L; Daudin, J J
1983-01-01
Yeasts are found to a large extent in cheeses, more particularly in soft cheeses such as Camembert. The proximity between two species previously identified by standard methods was studied using a factorial discriminant analysis on 326 strains. Twenty-three fermentation and assimilation tests (discriminant variables) gave a fairly good discrimination between species. This treatment has allowed us to confirm the present tendencies noticed in yeast classification and has also enabled us to group some of the species.
Hayashi, Hideaki; Shibanoki, Taro; Shima, Keisuke; Kurita, Yuichi; Tsuji, Toshio
2015-12-01
This paper proposes a probabilistic neural network (NN) developed on the basis of time-series discriminant component analysis (TSDCA) that can be used to classify high-dimensional time-series patterns. TSDCA involves the compression of high-dimensional time series into a lower dimensional space using a set of orthogonal transformations and the calculation of posterior probabilities based on a continuous-density hidden Markov model with a Gaussian mixture model expressed in the reduced-dimensional space. The analysis can be incorporated into an NN, which is named a time-series discriminant component network (TSDCN), so that parameters of dimensionality reduction and classification can be obtained simultaneously as network coefficients according to a backpropagation through time-based learning algorithm with the Lagrange multiplier method. The TSDCN is considered to enable high-accuracy classification of high-dimensional time-series patterns and to reduce the computation time taken for network training. The validity of the TSDCN is demonstrated for high-dimensional artificial data and electroencephalogram signals in the experiments conducted during the study.
NASA Technical Reports Server (NTRS)
Quattrochi, D. A.
1985-01-01
The capabilities of TM data for discriminating land covers within three particular cultural and ecological realms was assessed. The agricultural investigation in Poinsett County, Arkansas illustrates that TM data can successfully be used to discriminate a variety of crop cover types within the study area. The single-date TM classification produced results that were significantly better than those developed from multitemporal MSS data. For the Reelfoot Lake area of Tennessee TM data, processed using unsupervised signature development techniques, produced a detailed classification of forested wetlands with excellent accuracy. Even in a small city of approximately 15,000 people (Union City, Tennessee). TM data can successfully be used to spectrally distinguish specific urban classes. Furthermore, the principal components analysis evaluation of the data shows that through photointerpretation, it is possible to distinguish individual buildings and roof responses with the TM.
Villarreal, Diana; Laffargue, Andreina; Posada, Huver; Bertrand, Benoit; Lashermes, Philippe; Dussert, Stephane
2009-12-09
In a previous study, the effectiveness of chlorogenic acids, fatty acids (FA), and elements was compared for the discrimination of Arabica varieties and growing terroirs. Since FA provided the best results, the aim of the present work was to validate their discrimination ability using an extended experimental design, including twice the number of location x variety combinations and 2 years of study. It also aimed at understanding how the environment influences FA composition through correlation analysis using different climatic parameters. Percentages of correct classification of known samples remained very high, independent of the classification criterion. However, cross-validation tests across years indicated that prediction of unknown locations was less efficient than that of unknown genotypes. Environmental temperature during the development of coffee beans had a dramatic influence on their FA composition. Analysis of climate patterns over years enabled us to understand the efficient location discrimination within a single year but only moderate efficiency across years.
NASA Technical Reports Server (NTRS)
Klemas, V.; Bartlett, D.; Rogers, R.; Reed, L.
1974-01-01
Digital analysis of ERTS-1 imagery was used in an attempt to map and inventory the significant ecological communities of Delaware's coastal zone. Eight vegetation and land use discrimination classes were selected: (1) phragmites communis (Giant Reed grass); (2) spartina alterniflora (Salt marsh cord grass); (3) spartina patens (Salt marsh hay); (4) shallow water and exposed mud; (5) deep water (2 meters); (6) forest; (7) agriculture; and (8) exposed sand and concrete. Canonical analysis showed that classification accuracy was quite good with spartina alterniflora, exposed sand-concrete, and forested land - all discriminated with between 94% and 100% accuracy. The shallow water-mud and deep water categories were classified with accuracies of 88% and 93% respectively. Phragmites communis showed a classification accuracy of 83% with all confusion occurring with spartina patens which may be due to use of mixed stands of these species as training sets. Discrimination of spartina patens was very poor (accuracy 52%).
NASA Technical Reports Server (NTRS)
Quattrochi, D. A.; Anderson, J. E.; Brannon, D. P.; Hill, C. L.
1982-01-01
An initial analysis of LANDSAT 4 thematic mapper (TM) data for the delineation and classification of agricultural, forested wetland, and urban land covers was conducted. A study area in Poinsett County, Arkansas was used to evaluate a classification of agricultural lands derived from multitemporal LANDSAT multispectral scanner (MSS) data in comparison with a classification of TM data for the same area. Data over Reelfoot Lake in northwestern Tennessee were utilized to evaluate the TM for delineating forested wetland species. A classification of the study area was assessed for accuracy in discriminating five forested wetland categories. Finally, the TM data were used to identify urban features within a small city. A computer generated classification of Union City, Tennessee was analyzed for accuracy in delineating urban land covers. An evaluation of digitally enhanced TM data using principal components analysis to facilitate photointerpretation of urban features was also performed.
High Dimensional Classification Using Features Annealed Independence Rules.
Fan, Jianqing; Fan, Yingying
2008-01-01
Classification using high-dimensional features arises frequently in many contemporary statistical studies such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classifications is largely poorly understood. In a seminal paper, Bickel and Levina (2004) show that the Fisher discriminant performs poorly due to diverging spectra and they propose to use the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as bad as the random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as bad as the random guessing. Thus, it is paramountly important to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently, the threshold value of the test statistics are proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.
Comparison of cranial sex determination by discriminant analysis and logistic regression.
Amores-Ampuero, Anabel; Alemán, Inmaculada
2016-04-05
Various methods have been proposed for estimating dimorphism. The objective of this study was to compare sex determination results from cranial measurements using discriminant analysis or logistic regression. The study sample comprised 130 individuals (70 males) of known sex, age, and cause of death from San José cemetery in Granada (Spain). Measurements of 19 neurocranial dimensions and 11 splanchnocranial dimensions were subjected to discriminant analysis and logistic regression, and the percentages of correct classification were compared between the sex functions obtained with each method. The discriminant capacity of the selected variables was evaluated with a cross-validation procedure. The percentage accuracy with discriminant analysis was 78.2% for the neurocranium (82.4% in females and 74.6% in males) and 73.7% for the splanchnocranium (79.6% in females and 68.8% in males). These percentages were higher with logistic regression analysis: 85.7% for the neurocranium (in both sexes) and 94.1% for the splanchnocranium (100% in females and 91.7% in males).
Al-Qazzaz, Noor Kamal; Ali, Sawal; Ahmad, Siti Anom; Escudero, Javier
2017-07-01
The aim of the present study was to discriminate the electroencephalogram (EEG) of 5 patients with vascular dementia (VaD), 15 patients with stroke-related mild cognitive impairment (MCI), and 15 control normal subjects during a working memory (WM) task. We used independent component analysis (ICA) and wavelet transform (WT) as a hybrid preprocessing approach for EEG artifact removal. Three different features were extracted from the cleaned EEG signals: spectral entropy (SpecEn), permutation entropy (PerEn) and Tsallis entropy (TsEn). Two classification schemes were applied - support vector machine (SVM) and k-nearest neighbors (kNN) - with fuzzy neighborhood preserving analysis with QR-decomposition (FNPAQR) as a dimensionality reduction technique. The FNPAQR dimensionality reduction technique increased the SVM classification accuracy from 82.22% to 90.37% and from 82.6% to 86.67% for kNN. These results suggest that FNPAQR consistently improves the discrimination of VaD, MCI patients and control normal subjects and it could be a useful feature selection to help the identification of patients with VaD and MCI.
Wang, Kun-Ching
2015-01-14
The classification of emotional speech is mostly considered in speech-related research on human-computer interaction (HCI). In this paper, the purpose is to present a novel feature extraction based on multi-resolutions texture image information (MRTII). The MRTII feature set is derived from multi-resolution texture analysis for characterization and classification of different emotions in a speech signal. The motivation is that we have to consider emotions have different intensity values in different frequency bands. In terms of human visual perceptual, the texture property on multi-resolution of emotional speech spectrogram should be a good feature set for emotion classification in speech. Furthermore, the multi-resolution analysis on texture can give a clearer discrimination between each emotion than uniform-resolution analysis on texture. In order to provide high accuracy of emotional discrimination especially in real-life, an acoustic activity detection (AAD) algorithm must be applied into the MRTII-based feature extraction. Considering the presence of many blended emotions in real life, in this paper make use of two corpora of naturally-occurring dialogs recorded in real-life call centers. Compared with the traditional Mel-scale Frequency Cepstral Coefficients (MFCC) and the state-of-the-art features, the MRTII features also can improve the correct classification rates of proposed systems among different language databases. Experimental results show that the proposed MRTII-based feature information inspired by human visual perception of the spectrogram image can provide significant classification for real-life emotional recognition in speech.
Medical image classification based on multi-scale non-negative sparse coding.
Zhang, Ruijie; Shen, Jian; Wei, Fushan; Li, Xiong; Sangaiah, Arun Kumar
2017-11-01
With the rapid development of modern medical imaging technology, medical image classification has become more and more important in medical diagnosis and clinical practice. Conventional medical image classification algorithms usually neglect the semantic gap problem between low-level features and high-level image semantic, which will largely degrade the classification performance. To solve this problem, we propose a multi-scale non-negative sparse coding based medical image classification algorithm. Firstly, Medical images are decomposed into multiple scale layers, thus diverse visual details can be extracted from different scale layers. Secondly, for each scale layer, the non-negative sparse coding model with fisher discriminative analysis is constructed to obtain the discriminative sparse representation of medical images. Then, the obtained multi-scale non-negative sparse coding features are combined to form a multi-scale feature histogram as the final representation for a medical image. Finally, SVM classifier is combined to conduct medical image classification. The experimental results demonstrate that our proposed algorithm can effectively utilize multi-scale and contextual spatial information of medical images, reduce the semantic gap in a large degree and improve medical image classification performance. Copyright © 2017 Elsevier B.V. All rights reserved.
Prat, Chantal; Besalú, Emili; Bañeras, Lluís; Anticó, Enriqueta
2011-06-15
The volatile fraction of aqueous cork macerates of tainted and non-tainted agglomerate cork stoppers was analysed by headspace solid-phase microextraction (HS-SPME)/gas chromatography. Twenty compounds containing terpenoids, aliphatic alcohols, lignin-related compounds and others were selected and analysed in individual corks. Cork stoppers were previously classified in six different classes according to sensory descriptions including, 2,4,6-trichloroanisole taint and other frequent, non-characteristic odours found in cork. A multivariate analysis of the chromatographic data of 20 selected chemical compounds using linear discriminant analysis models helped in the differentiation of the a priori made groups. The discriminant model selected five compounds as the best combination. Selected compounds appear in the model in the following order; 2,4,6 TCA, fenchyl alcohol, 1-octen-3-ol, benzyl alcohol and benzothiazole. Unfortunately, not all six a priori differentiated sensory classes were clearly discriminated in the model, probably indicating that no measurable differences exist in the chromatographic data for some categories. The predictive analyses of a refined model in which two sensory classes were fused together resulted in a good classification. Prediction rates of control (non-tainted), TCA, musty-earthy-vegetative, vegetative and chemical descriptions were 100%, 100%, 85%, 67.3% and 100%, respectively, when the modified model was used. The multivariate analysis of chromatographic data will help in the classification of stoppers and provide a perfect complement to sensory analyses. Copyright © 2010 Elsevier Ltd. All rights reserved.
Quantifying tolerance indicator values for common stream fish species of the United States
Meador, M.R.; Carlisle, D.M.
2007-01-01
The classification of fish species tolerance to environmental disturbance is often used as a means to assess ecosystem conditions. Its use, however, may be problematic because the approach to tolerance classification is based on subjective judgment. We analyzed fish and physicochemical data from 773 stream sites collected as part of the U.S. Geological Survey's National Water-Quality Assessment Program to calculate tolerance indicator values for 10 physicochemical variables using weighted averaging. Tolerance indicator values (TIVs) for ammonia, chloride, dissolved oxygen, nitrite plus nitrate, pH, phosphorus, specific conductance, sulfate, suspended sediment, and water temperature were calculated for 105 common fish species of the United States. Tolerance indicator values for specific conductance and sulfate were correlated (rho = 0.87), and thus, fish species may be co-tolerant to these water-quality variables. We integrated TIVs for each species into an overall tolerance classification for comparisons with judgment-based tolerance classifications. Principal components analysis indicated that the distinction between tolerant and intolerant classifications was determined largely by tolerance to suspended sediment, specific conductance, chloride, and total phosphorus. Factors such as water temperature, dissolved oxygen, and pH may not be as important in distinguishing between tolerant and intolerant classifications, but may help to segregate species classified as moderate. Empirically derived tolerance classifications were 58.8% in agreement with judgment-derived tolerance classifications. Canonical discriminant analysis revealed that few TIVs, primarily chloride, could discriminate among judgment-derived tolerance classifications of tolerant, moderate, and intolerant. To our knowledge, this is the first empirically based understanding of fish species tolerance for stream fishes in the United States.
Choi, Young Hae; Sertic, Sarah; Kim, Hye Kyong; Wilson, Erica G; Michopoulos, Filippos; Lefeber, Alfons W M; Erkelens, Cornelis; Prat Kricun, Sergio D; Verpoorte, Robert
2005-02-23
The metabolomic analysis of 11 Ilex species, I. argentina, I. brasiliensis, I. brevicuspis, I. dumosavar. dumosa, I. dumosa var. guaranina, I. integerrima, I. microdonta, I. paraguariensis var. paraguariensis, I. pseudobuxus, I. taubertiana, and I. theezans, was carried out by NMR spectroscopy and multivariate data analysis. The analysis using principal component analysis and classification of the (1)H NMR spectra showed a clear discrimination of those samples based on the metabolites present in the organic and aqueous fractions. The major metabolites that contribute to the discrimination are arbutin, caffeine, phenylpropanoids, and theobromine. Among those metabolites, arbutin, which has not been reported yet as a constituent of Ilex species, was found to be a biomarker for I. argentina,I. brasiliensis, I. brevicuspis, I. integerrima, I. microdonta, I. pseudobuxus, I. taubertiana, and I. theezans. This reliable method based on the determination of a large number of metabolites makes the chemotaxonomical analysis of Ilex species possible.
NASA Astrophysics Data System (ADS)
Wiśniewska, Paulina; Boqué, Ricard; Borràs, Eva; Busto, Olga; Wardencki, Waldemar; Namieśnik, Jacek; Dymerski, Tomasz
2017-02-01
Headspace mass-spectrometry (HS-MS), mid infrared (MIR) and UV-vis spectroscopy were used to authenticate whisky samples from different origins and ways of production ((Irish, Spanish, Bourbon, Tennessee Whisky and Scotch). The collected spectra were processed with partial least-squares discriminant analysis (PLS-DA) to build the classification models. In all cases the five groups of whiskies were distinguished, but the best results were obtained by HS-MS, which indicates that the biggest differences between different types of whisky are due to their aroma. Differences were also found inside groups, showing that not only raw material is important to discriminate samples but also the way of their production. The methodology is quick, easy and does not require sample preparation.
NASA Astrophysics Data System (ADS)
Legara, Erika Fille; Monterola, Christopher; Abundo, Cheryl
2011-01-01
We demonstrate an accurate procedure based on linear discriminant analysis that allows automatic authorship classification of opinion column articles. First, we extract the following stylometric features of 157 column articles from four authors: statistics on high frequency words, number of words per sentence, and number of sentences per paragraph. Then, by systematically ranking these features based on an effect size criterion, we show that we can achieve an average classification accuracy of 93% for the test set. In comparison, frequency size based ranking has an average accuracy of 80%. The highest possible average classification accuracy of our data merely relying on chance is ∼31%. By carrying out sensitivity analysis, we show that the effect size criterion is superior than frequency ranking because there exist low frequency words that significantly contribute to successful author discrimination. Consistent results are seen when the procedure is applied in classifying the undisputed Federalist papers of Alexander Hamilton and James Madison. To the best of our knowledge, the work is the first attempt in classifying opinion column articles, that by virtue of being shorter in length (as compared to novels or short stories), are more prone to over-fitting issues. The near perfect classification for the longer papers supports this claim. Our results provide an important insight on authorship attribution that has been overlooked in previous studies: that ranking discriminant variables based on word frequency counts is not necessarily an optimal procedure.
A Comparative Study of Land Cover Classification by Using Multispectral and Texture Data
Qadri, Salman; Khan, Dost Muhammad; Ahmad, Farooq; Qadri, Syed Furqan; Babar, Masroor Ellahi; Shahid, Muhammad; Ul-Rehman, Muzammil; Razzaq, Abdul; Shah Muhammad, Syed; Fahad, Muhammad; Ahmad, Sarfraz; Pervez, Muhammad Tariq; Naveed, Nasir; Aslam, Naeem; Jamil, Mutiullah; Rehmani, Ejaz Ahmad; Ahmad, Nazir; Akhtar Khan, Naeem
2016-01-01
The main objective of this study is to find out the importance of machine vision approach for the classification of five types of land cover data such as bare land, desert rangeland, green pasture, fertile cultivated land, and Sutlej river land. A novel spectra-statistical framework is designed to classify the subjective land cover data types accurately. Multispectral data of these land covers were acquired by using a handheld device named multispectral radiometer in the form of five spectral bands (blue, green, red, near infrared, and shortwave infrared) while texture data were acquired with a digital camera by the transformation of acquired images into 229 texture features for each image. The most discriminant 30 features of each image were obtained by integrating the three statistical features selection techniques such as Fisher, Probability of Error plus Average Correlation, and Mutual Information (F + PA + MI). Selected texture data clustering was verified by nonlinear discriminant analysis while linear discriminant analysis approach was applied for multispectral data. For classification, the texture and multispectral data were deployed to artificial neural network (ANN: n-class). By implementing a cross validation method (80-20), we received an accuracy of 91.332% for texture data and 96.40% for multispectral data, respectively. PMID:27376088
de Castro, Ana-Isabel; Jurado-Expósito, Montserrat; Gómez-Casero, María-Teresa; López-Granados, Francisca
2012-01-01
In the context of detection of weeds in crops for site-specific weed control, on-ground spectral reflectance measurements are the first step to determine the potential of remote spectral data to classify weeds and crops. Field studies were conducted for four years at different locations in Spain. We aimed to distinguish cruciferous weeds in wheat and broad bean crops, using hyperspectral and multispectral readings in the visible and near-infrared spectrum. To identify differences in reflectance between cruciferous weeds, we applied three classification methods: stepwise discriminant (STEPDISC) analysis and two neural networks, specifically, multilayer perceptron (MLP) and radial basis function (RBF). Hyperspectral and multispectral signatures of cruciferous weeds, and wheat and broad bean crops can be classified using STEPDISC analysis, and MLP and RBF neural networks with different success, being the MLP model the most accurate with 100%, or higher than 98.1%, of classification performance for all the years. Classification accuracy from hyperspectral signatures was similar to that from multispectral and spectral indices, suggesting that little advantage would be obtained by using more expensive airborne hyperspectral imagery. Therefore, for next investigations, we recommend using multispectral remote imagery to explore whether they can potentially discriminate these weeds and crops. PMID:22629171
de Castro, Ana-Isabel; Jurado-Expósito, Montserrat; Gómez-Casero, María-Teresa; López-Granados, Francisca
2012-01-01
In the context of detection of weeds in crops for site-specific weed control, on-ground spectral reflectance measurements are the first step to determine the potential of remote spectral data to classify weeds and crops. Field studies were conducted for four years at different locations in Spain. We aimed to distinguish cruciferous weeds in wheat and broad bean crops, using hyperspectral and multispectral readings in the visible and near-infrared spectrum. To identify differences in reflectance between cruciferous weeds, we applied three classification methods: stepwise discriminant (STEPDISC) analysis and two neural networks, specifically, multilayer perceptron (MLP) and radial basis function (RBF). Hyperspectral and multispectral signatures of cruciferous weeds, and wheat and broad bean crops can be classified using STEPDISC analysis, and MLP and RBF neural networks with different success, being the MLP model the most accurate with 100%, or higher than 98.1%, of classification performance for all the years. Classification accuracy from hyperspectral signatures was similar to that from multispectral and spectral indices, suggesting that little advantage would be obtained by using more expensive airborne hyperspectral imagery. Therefore, for next investigations, we recommend using multispectral remote imagery to explore whether they can potentially discriminate these weeds and crops.
Empirical Testing of an Algorithm for Defining Somatization in Children
Eisman, Howard D.; Fogel, Joshua; Lazarovich, Regina; Pustilnik, Inna
2007-01-01
Introduction A previous article proposed an algorithm for defining somatization in children by classifying them into three categories: well, medically ill, and somatizer; the authors suggested further empirical validation of the algorithm (Postilnik et al., 2006). We use the Child Behavior Checklist (CBCL) to provide this empirical validation. Method Parents of children seen in pediatric clinics completed the CBCL (n=126). The physicians of these children completed specially-designed questionnaires. The sample comprised of 62 boys and 64 girls (age range 2 to 15 years). Classification categories included: well (n=53), medically ill (n=55), and somatizer (n=18). Analysis of variance (ANOVA) was used for statistical comparisons. Discriminant function analysis was conducted with the CBCL subscales. Results There were significant differences between the classification categories for the somatic complaints (p=<0.001), social problems (p=0.004), thought problems (p=0.01), attention problems (0.006), and internalizing (p=0.003) subscales and also total (p=0.001), and total-t (p=0.001) scales of the CBCL. Discriminant function analysis showed that 78% of somatizers and 66% of well were accurately classified, while only 35% of medically ill were accurately classified. Conclusion The somatization classification algorithm proposed by Postilnik et al. (2006) shows promise for classification of children and adolescents with somatic symptoms. PMID:18421368
Martins, Lucia Regina Rocha; Pereira-Filho, Edenir Rodrigues; Cass, Quezia Bezerra
2011-04-01
Taking in consideration the global analysis of complex samples, proposed by the metabolomic approach, the chromatographic fingerprint encompasses an attractive chemical characterization of herbal medicines. Thus, it can be used as a tool in quality control analysis of phytomedicines. The generated multivariate data are better evaluated by chemometric analyses, and they can be modeled by classification methods. "Stone breaker" is a popular Brazilian plant of Phyllanthus genus, used worldwide to treat renal calculus, hepatitis, and many other diseases. In this study, gradient elution at reversed-phase conditions with detection at ultraviolet region were used to obtain chemical profiles (fingerprints) of botanically identified samples of six Phyllanthus species. The obtained chromatograms, at 275 nm, were organized in data matrices, and the time shifts of peaks were adjusted using the Correlation Optimized Warping algorithm. Principal Component Analyses were performed to evaluate similarities among cultivated and uncultivated samples and the discrimination among the species and, after that, the samples were used to compose three classification models using Soft Independent Modeling of Class analogy, K-Nearest Neighbor, and Partial Least Squares for Discriminant Analysis. The ability of classification models were discussed after their successful application for authenticity evaluation of 25 commercial samples of "stone breaker."
Automatic classification of spectral units in the Aristarchus plateau
NASA Astrophysics Data System (ADS)
Erard, S.; Le Mouelic, S.; Langevin, Y.
1999-09-01
A reduction scheme has been recently proposed for the NIR images of Clementine (Le Mouelic et al, JGR 1999). This reduction has been used to build an integrated UVvis-NIR image cube of the Aristarchus region, from which compositional and maturity variations can be studied (Pinet et al, LPSC 1999). We will present an analysis of this image cube, providing a classification in spectral types and spectral units. The image cube is processed with Gmode analysis using three different data sets: Normalized spectra provide a classification based mainly on spectral slope variations (ie. maturity and volcanic glasses). This analysis discriminates between craters plus ejecta, mare basalts, and DMD. Olivine-rich areas and Aristarchus central peak are also recognized. Continuum-removed spectra provide a classification more related to compositional variations, which correctly identifies olivine and pyroxenes-rich areas (in Aristarchus, Krieger, Schiaparelli\\ldots). A third analysis uses spectral parameters related to maturity and Fe composition (reflectance, 1 mu m band depth, and spectral slope) rather than intensities. It provides the most spatially consistent picture, but fails in detecting Vallis Schroeteri and DMDs. A supplementary unit, younger and rich in pyroxene, is found on Aristarchus south rim. In conclusion, Gmode analysis can discriminate between different spectral types already identified with more classic methods (PCA, linear mixing\\ldots). No previous assumption is made on the data structure, such as endmembers number and nature, or linear relationship between input variables. The variability of the spectral types is intrinsically accounted for, so that the level of analysis is always restricted to meaningful limits. A complete classification should integrate several analyses based on different sets of parameters. Gmode is therefore a powerful light toll to perform first look analysis of spectral imaging data. This research has been partly founded by the French Programme National de Planetologie.
Parsons, Helen M; Ludwig, Christian; Günther, Ulrich L; Viant, Mark R
2007-01-01
Background Classifying nuclear magnetic resonance (NMR) spectra is a crucial step in many metabolomics experiments. Since several multivariate classification techniques depend upon the variance of the data, it is important to first minimise any contribution from unwanted technical variance arising from sample preparation and analytical measurements, and thereby maximise any contribution from wanted biological variance between different classes. The generalised logarithm (glog) transform was developed to stabilise the variance in DNA microarray datasets, but has rarely been applied to metabolomics data. In particular, it has not been rigorously evaluated against other scaling techniques used in metabolomics, nor tested on all forms of NMR spectra including 1-dimensional (1D) 1H, projections of 2D 1H, 1H J-resolved (pJRES), and intact 2D J-resolved (JRES). Results Here, the effects of the glog transform are compared against two commonly used variance stabilising techniques, autoscaling and Pareto scaling, as well as unscaled data. The four methods are evaluated in terms of the effects on the variance of NMR metabolomics data and on the classification accuracy following multivariate analysis, the latter achieved using principal component analysis followed by linear discriminant analysis. For two of three datasets analysed, classification accuracies were highest following glog transformation: 100% accuracy for discriminating 1D NMR spectra of hypoxic and normoxic invertebrate muscle, and 100% accuracy for discriminating 2D JRES spectra of fish livers sampled from two rivers. For the third dataset, pJRES spectra of urine from two breeds of dog, the glog transform and autoscaling achieved equal highest accuracies. Additionally we extended the glog algorithm to effectively suppress noise, which proved critical for the analysis of 2D JRES spectra. Conclusion We have demonstrated that the glog and extended glog transforms stabilise the technical variance in NMR metabolomics datasets. This significantly improves the discrimination between sample classes and has resulted in higher classification accuracies compared to unscaled, autoscaled or Pareto scaled data. Additionally we have confirmed the broad applicability of the glog approach using three disparate datasets from different biological samples using 1D NMR spectra, 1D projections of 2D JRES spectra, and intact 2D JRES spectra. PMID:17605789
Shape classification of wear particles by image boundary analysis using machine learning algorithms
NASA Astrophysics Data System (ADS)
Yuan, Wei; Chin, K. S.; Hua, Meng; Dong, Guangneng; Wang, Chunhui
2016-05-01
The shape features of wear particles generated from wear track usually contain plenty of information about the wear states of a machinery operational condition. Techniques to quickly identify types of wear particles quickly to respond to the machine operation and prolong the machine's life appear to be lacking and are yet to be established. To bridge rapid off-line feature recognition with on-line wear mode identification, this paper presents a new radial concave deviation (RCD) method that mainly involves the use of the particle boundary signal to analyze wear particle features. Signal output from the RCDs subsequently facilitates the determination of several other feature parameters, typically relevant to the shape and size of the wear particle. Debris feature and type are identified through the use of various classification methods, such as linear discriminant analysis, quadratic discriminant analysis, naïve Bayesian method, and classification and regression tree method (CART). The average errors of the training and test via ten-fold cross validation suggest CART is a highly suitable approach for classifying and analyzing particle features. Furthermore, the results of the wear debris analysis enable the maintenance team to diagnose faults appropriately.
Liu, Chao; Gu, Jinwei
2014-01-01
Classifying raw, unpainted materials--metal, plastic, ceramic, fabric, and so on--is an important yet challenging task for computer vision. Previous works measure subsets of surface spectral reflectance as features for classification. However, acquiring the full spectral reflectance is time consuming and error-prone. In this paper, we propose to use coded illumination to directly measure discriminative features for material classification. Optimal illumination patterns--which we call "discriminative illumination"--are learned from training samples, after projecting to which the spectral reflectance of different materials are maximally separated. This projection is automatically realized by the integration of incident light for surface reflection. While a single discriminative illumination is capable of linear, two-class classification, we show that multiple discriminative illuminations can be used for nonlinear and multiclass classification. We also show theoretically that the proposed method has higher signal-to-noise ratio than previous methods due to light multiplexing. Finally, we construct an LED-based multispectral dome and use the discriminative illumination method for classifying a variety of raw materials, including metal (aluminum, alloy, steel, stainless steel, brass, and copper), plastic, ceramic, fabric, and wood. Experimental results demonstrate its effectiveness.
Sex determination of the Acadian Flycatcher using discriminant analysis
Wilson, R.R.
1999-01-01
I used five morphometric variables from 114 individuals captured in Arkansas to develop a discriminant model to predict the sex of Acadian Flycatchers (Empidonax virescens). Stepwise discriminant function analyses selected wing chord and tail length as the most parsimonious subset of variables for discriminating sex. This two-variable model correctly classified 80% of females and 97% of males used to develop the model. Validation of the model using 19 individuals from Louisiana and Virginia resulted in 100% correct classification of males and females. This model provides criteria for sexing monomorphic Acadian Flycatchers during the breeding season and possibly during the winter.
Local kernel nonparametric discriminant analysis for adaptive extraction of complex structures
NASA Astrophysics Data System (ADS)
Li, Quanbao; Wei, Fajie; Zhou, Shenghan
2017-05-01
The linear discriminant analysis (LDA) is one of popular means for linear feature extraction. It usually performs well when the global data structure is consistent with the local data structure. Other frequently-used approaches of feature extraction usually require linear, independence, or large sample condition. However, in real world applications, these assumptions are not always satisfied or cannot be tested. In this paper, we introduce an adaptive method, local kernel nonparametric discriminant analysis (LKNDA), which integrates conventional discriminant analysis with nonparametric statistics. LKNDA is adept in identifying both complex nonlinear structures and the ad hoc rule. Six simulation cases demonstrate that LKNDA have both parametric and nonparametric algorithm advantages and higher classification accuracy. Quartic unilateral kernel function may provide better robustness of prediction than other functions. LKNDA gives an alternative solution for discriminant cases of complex nonlinear feature extraction or unknown feature extraction. At last, the application of LKNDA in the complex feature extraction of financial market activities is proposed.
An electronic nose for reliable measurement and correct classification of beverages.
Mamat, Mazlina; Samad, Salina Abdul; Hannan, Mahammad A
2011-01-01
This paper reports the design of an electronic nose (E-nose) prototype for reliable measurement and correct classification of beverages. The prototype was developed and fabricated in the laboratory using commercially available metal oxide gas sensors and a temperature sensor. The repeatability, reproducibility and discriminative ability of the developed E-nose prototype were tested on odors emanating from different beverages such as blackcurrant juice, mango juice and orange juice, respectively. Repeated measurements of three beverages showed very high correlation (r > 0.97) between the same beverages to verify the repeatability. The prototype also produced highly correlated patterns (r > 0.97) in the measurement of beverages using different sensor batches to verify its reproducibility. The E-nose prototype also possessed good discriminative ability whereby it was able to produce different patterns for different beverages, different milk heat treatments (ultra high temperature, pasteurization) and fresh and spoiled milks. The discriminative ability of the E-nose was evaluated using Principal Component Analysis and a Multi Layer Perception Neural Network, with both methods showing good classification results.
An Electronic Nose for Reliable Measurement and Correct Classification of Beverages
Mamat, Mazlina; Samad, Salina Abdul; Hannan, Mahammad A.
2011-01-01
This paper reports the design of an electronic nose (E-nose) prototype for reliable measurement and correct classification of beverages. The prototype was developed and fabricated in the laboratory using commercially available metal oxide gas sensors and a temperature sensor. The repeatability, reproducibility and discriminative ability of the developed E-nose prototype were tested on odors emanating from different beverages such as blackcurrant juice, mango juice and orange juice, respectively. Repeated measurements of three beverages showed very high correlation (r > 0.97) between the same beverages to verify the repeatability. The prototype also produced highly correlated patterns (r > 0.97) in the measurement of beverages using different sensor batches to verify its reproducibility. The E-nose prototype also possessed good discriminative ability whereby it was able to produce different patterns for different beverages, different milk heat treatments (ultra high temperature, pasteurization) and fresh and spoiled milks. The discriminative ability of the E-nose was evaluated using Principal Component Analysis and a Multi Layer Perception Neural Network, with both methods showing good classification results. PMID:22163964
Peake, Barrie M; Tong, Alfred Y C; Wells, William J; Harraway, John A; Niven, Brian E; Weege, Butch; LaFollette, Douglas J
2015-06-01
The trace metal content of roots of samples of the American ginseng natural herbal plant species (Panax quinquefolius) was investigated as a means of differentiating between this species grown on Wisconsin and New Zealand farms, and from Canadian and Chinese sources. ICP-MS measurements were undertaken by ashing samples of the roots and then digestion with conc. HNO3 and H2O2. There was considerable variation in the concentrations of 28 detectable elements along the length of a root, between different roots, between different farms/sources and between different countries. Statistical processing of the log-transformed concentration data was undertaken using principal component analysis (PCA) and discriminant function analysis (DFA). Although PCA showed some differentiation between samples, a much clearer discrimination of the Panax quinquefolius species of ginseng from the four countries was observed using DFA. 88% of the variation between countries could be accounted for by only using discriminant function 1 while 80% of the remaining 12% of the variation between countries is accounted for by discriminant function 2. The Fisher Classification Functions classify 98% of the 87 samples to the correct country of origin with 97% of the cross-validated cases correctly classified. The predictive ability of this DFA model was further tested by constructing 100 discriminant models each using a random selection of the data for two thirds of the 87 sampled ginseng root tops, and then using the resulting classification functions to determine correctly the country of origin of the remaining third of the cases. The mean success rate of the 100 classifications was 92%. These results suggest that measurement and statistical analysis of just the trace metal content of the roots of Panax quinquefolius promises to be an excellent predictor of the country of origin of this ginseng species. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Zakaria, Ammar; Shakaff, Ali Yeon Md; Masnan, Maz Jamilah; Saad, Fathinul Syahir Ahmad; Adom, Abdul Hamid; Ahmad, Mohd Noor; Jaafar, Mahmad Nor; Abdullah, Abu Hassan; Kamarudin, Latifah Munirah
2012-01-01
In recent years, there have been a number of reported studies on the use of non-destructive techniques to evaluate and determine mango maturity and ripeness levels. However, most of these reported works were conducted using single-modality sensing systems, either using an electronic nose, acoustics or other non-destructive measurements. This paper presents the work on the classification of mangoes (Magnifera Indica cv. Harumanis) maturity and ripeness levels using fusion of the data of an electronic nose and an acoustic sensor. Three groups of samples each from two different harvesting times (week 7 and week 8) were evaluated by the e-nose and then followed by the acoustic sensor. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) were able to discriminate the mango harvested at week 7 and week 8 based solely on the aroma and volatile gases released from the mangoes. However, when six different groups of different maturity and ripeness levels were combined in one classification analysis, both PCA and LDA were unable to discriminate the age difference of the Harumanis mangoes. Instead of six different groups, only four were observed using the LDA, while PCA showed only two distinct groups. By applying a low level data fusion technique on the e-nose and acoustic data, the classification for maturity and ripeness levels using LDA was improved. However, no significant improvement was observed using PCA with data fusion technique. Further work using a hybrid LDA-Competitive Learning Neural Network was performed to validate the fusion technique and classify the samples. It was found that the LDA-CLNN was also improved significantly when data fusion was applied. PMID:22778629
Zakaria, Ammar; Shakaff, Ali Yeon Md; Masnan, Maz Jamilah; Saad, Fathinul Syahir Ahmad; Adom, Abdul Hamid; Ahmad, Mohd Noor; Jaafar, Mahmad Nor; Abdullah, Abu Hassan; Kamarudin, Latifah Munirah
2012-01-01
In recent years, there have been a number of reported studies on the use of non-destructive techniques to evaluate and determine mango maturity and ripeness levels. However, most of these reported works were conducted using single-modality sensing systems, either using an electronic nose, acoustics or other non-destructive measurements. This paper presents the work on the classification of mangoes (Magnifera Indica cv. Harumanis) maturity and ripeness levels using fusion of the data of an electronic nose and an acoustic sensor. Three groups of samples each from two different harvesting times (week 7 and week 8) were evaluated by the e-nose and then followed by the acoustic sensor. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) were able to discriminate the mango harvested at week 7 and week 8 based solely on the aroma and volatile gases released from the mangoes. However, when six different groups of different maturity and ripeness levels were combined in one classification analysis, both PCA and LDA were unable to discriminate the age difference of the Harumanis mangoes. Instead of six different groups, only four were observed using the LDA, while PCA showed only two distinct groups. By applying a low level data fusion technique on the e-nose and acoustic data, the classification for maturity and ripeness levels using LDA was improved. However, no significant improvement was observed using PCA with data fusion technique. Further work using a hybrid LDA-Competitive Learning Neural Network was performed to validate the fusion technique and classify the samples. It was found that the LDA-CLNN was also improved significantly when data fusion was applied.
A simple randomisation procedure for validating discriminant analysis: a methodological note.
Wastell, D G
1987-04-01
Because the goal of discriminant analysis (DA) is to optimise classification, it designedly exaggerates between-group differences. This bias complicates validation of DA. Jack-knifing has been used for validation but is inappropriate when stepwise selection (SWDA) is employed. A simple randomisation test is presented which is shown to give correct decisions for SWDA. The general superiority of randomisation tests over orthodox significance tests is discussed. Current work on non-parametric methods of estimating the error rates of prediction rules is briefly reviewed.
Some observations on the use of discriminant analysis in ecology
Williams, B.K.
1983-01-01
The application of discriminant analysis in ecological investigations is discussed. The appropriate statistical assumptions for discriminant analysis are illustrated, and both classification and group separation approaches are outlined. Three assumptions that are crucial in ecological studies are discussed at length, and the consequences of their violation are developed. These assumptions are: equality of dispersions, identifiability of prior probabilities, and precise and accurate estimation of means and dispersions. The use of discriminant functions for purposes of interpreting ecological relationships is also discussed. It is suggested that the common practice of imputing ecological 'meaning' to the signs and magnitudes of coefficients be replaced by an assessment of 'structure coefficients.' Finally, the potential and limitations of representation of data in canonical space are considered, and some cautionary points are made concerning ecological interpretation of patterns in canonical space.
Assessment of sexual orientation using the hemodynamic brain response to visual sexual stimuli.
Ponseti, Jorge; Granert, Oliver; Jansen, Olav; Wolff, Stephan; Mehdorn, Hubertus; Bosinski, Hartmut; Siebner, Hartwig
2009-06-01
The assessment of sexual orientation is of importance to the diagnosis and treatment of sex offenders and paraphilic disorders. Phallometry is considered gold standard in objectifying sexual orientation, yet this measurement has been criticized because of its intrusiveness and limited reliability. To evaluate whether the spatial response pattern to sexual stimuli as revealed by a change in blood oxygen level-dependent (BOLD) signal can be used for individual classification of sexual orientation. We used a preexisting functional MRI (fMRI) data set that had been acquired in a nonclinical sample of 12 heterosexual men and 14 homosexual men. During fMRI, participants were briefly exposed to pictures of same-sex and opposite-sex genitals. Data analysis involved four steps: (i) differences in the BOLD response to female and male sexual stimuli were calculated for each subject; (ii) these contrast images were entered into a group analysis to calculate whole-brain difference maps between homosexual and heterosexual participants; (iii) a single expression value was computed for each subject expressing its correspondence to the group result; and (iv) based on these expression values, Fisher's linear discriminant analysis and the kappa-nearest neighbor classification method were used to predict the sexual orientation of each subject. Sensitivity and specificity of the two classification methods in predicting individual sexual orientation. Both classification methods performed well in predicting individual sexual orientation with a mean accuracy of >85% (Fisher's linear discriminant analysis: 92% sensitivity, 85% specificity; kappa-nearest neighbor classification: 88% sensitivity, 92% specificity). Despite the small sample size, the functional response patterns of the brain to sexual stimuli contained sufficient information to predict individual sexual orientation with high accuracy. These results suggest that fMRI-based classification methods hold promise for the diagnosis of paraphilic disorders (e.g., pedophilia).
Nonlinear features for classification and pose estimation of machined parts from single views
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.
1998-10-01
A new nonlinear feature extraction method is presented for classification and pose estimation of objects from single views. The feature extraction method is called the maximum representation and discrimination feature (MRDF) method. The nonlinear MRDF transformations to use are obtained in closed form, and offer significant advantages compared to nonlinear neural network implementations. The features extracted are useful for both object discrimination (classification) and object representation (pose estimation). We consider MRDFs on image data, provide a new 2-stage nonlinear MRDF solution, and show it specializes to well-known linear and nonlinear image processing transforms under certain conditions. We show the use of MRDF in estimating the class and pose of images of rendered solid CAD models of machine parts from single views using a feature-space trajectory neural network classifier. We show new results with better classification and pose estimation accuracy than are achieved by standard principal component analysis and Fukunaga-Koontz feature extraction methods.
NASA Astrophysics Data System (ADS)
Postadjian, T.; Le Bris, A.; Sahbi, H.; Mallet, C.
2017-05-01
Semantic classification is a core remote sensing task as it provides the fundamental input for land-cover map generation. The very recent literature has shown the superior performance of deep convolutional neural networks (DCNN) for many classification tasks including the automatic analysis of Very High Spatial Resolution (VHR) geospatial images. Most of the recent initiatives have focused on very high discrimination capacity combined with accurate object boundary retrieval. Therefore, current architectures are perfectly tailored for urban areas over restricted areas but not designed for large-scale purposes. This paper presents an end-to-end automatic processing chain, based on DCNNs, that aims at performing large-scale classification of VHR satellite images (here SPOT 6/7). Since this work assesses, through various experiments, the potential of DCNNs for country-scale VHR land-cover map generation, a simple yet effective architecture is proposed, efficiently discriminating the main classes of interest (namely buildings, roads, water, crops, vegetated areas) by exploiting existing VHR land-cover maps for training.
Vigli, Georgia; Philippidis, Angelos; Spyros, Apostolos; Dais, Photis
2003-09-10
A combination of (1)H NMR and (31)P NMR spectroscopy and multivariate statistical analysis was used to classify 192 samples from 13 types of vegetable oils, namely, hazelnut, sunflower, corn, soybean, sesame, walnut, rapeseed, almond, palm, groundnut, safflower, coconut, and virgin olive oils from various regions of Greece. 1,2-Diglycerides, 1,3-diglycerides, the ratio of 1,2-diglycerides to total diglycerides, acidity, iodine value, and fatty acid composition determined upon analysis of the respective (1)H NMR and (31)P NMR spectra were selected as variables to establish a classification/prediction model by employing discriminant analysis. This model, obtained from the training set of 128 samples, resulted in a significant discrimination among the different classes of oils, whereas 100% of correct validated assignments for 64 samples were obtained. Different artificial mixtures of olive-hazelnut, olive-corn, olive-sunflower, and olive-soybean oils were prepared and analyzed by (1)H NMR and (31)P NMR spectroscopy. Subsequent discriminant analysis of the data allowed detection of adulteration as low as 5% w/w, provided that fresh virgin olive oil samples were used, as reflected by their high 1,2-diglycerides to total diglycerides ratio (D > or = 0.90).
NASA Astrophysics Data System (ADS)
Liu, Yue; Zhang, Ying; Zhang, Jing; Fan, Gang; Tu, Ya; Sun, Suqin; Shen, Xudong; Li, Qingzhu; Zhang, Yi
2018-03-01
As an important ethnic medicine, sea buckthorn was widely used to prevent and treat various diseases due to its nutritional and medicinal properties. According to the Chinese Pharmacopoeia, sea buckthorn was originated from H. rhamnoides, which includes five subspecies distributed in China. Confusion and misidentification usually occurred due to their similar morphology, especially in dried and powdered forms. Additionally, these five subspecies have vital differences in quality and physiological efficacy. This paper focused on the quick classification and identification method of sea buckthorn berry powders from five H. rhamnoides subspecies using multi-step IR spectroscopy coupled with multivariate data analysis. The holistic chemical compositions revealed by the FT-IR spectra demonstrated that flavonoids, fatty acids and sugars were the main chemical components. Further, the differences in FT-IR spectra regarding their peaks, positions and intensities were used to identify H. rhamnoides subspecies samples. The discrimination was achieved using principal component analysis (PCA) and partial least square-discriminant analysis (PLS-DA). The results showed that the combination of multi-step IR spectroscopy and chemometric analysis offered a simple, fast and reliable method for the classification and identification of the sea buckthorn berry powders from different H. rhamnoides subspecies.
Morphometric classification of Spanish thoroughbred stallion sperm heads.
Hidalgo, Manuel; Rodríguez, Inmaculada; Dorado, Jesús; Soler, Carles
2008-01-30
This work used semen samples collected from 12 stallions and assessed for sperm morphometry by the Sperm Class Analyzer (SCA) computer-assisted system. A discriminant analysis was performed on the morphometric data from that sperm to obtain a classification matrix for sperm head shape. Thereafter, we defined six types of sperm head shape. Classification of sperm head by this method obtained a globally correct assignment of 90.1%. Moreover, significant differences (p<0.05) were found between animals for all the sperm head morphometric parameters assessed.
The ERTS-1 investigation (ER-600). Volume 4: ERTS-1 range analysis
NASA Technical Reports Server (NTRS)
Erb, R. B.
1974-01-01
The Range Analysis Team conducted an investigation to determine the utility of using LANDSAT 1 data for mapping vegetation-type information on range and related grazing lands. Two study areas within the Houston Area Test Site (HATS) were mapped to the highest classification level possible using manual image interpretation and computer aided classification techniques. Rangeland was distinguished from nonrangeland (water, urban area, and cropland) and was further classified as woodland versus nonwoodland. Finer classification of coastal features was attempted with some success in differentiating the lowland zone from the drier upland zone. Computer aided temporal analysis techniques enhanced discrimination among nearly all the vegetation types found in this investigation.
Maraschin, Marcelo; Somensi-Zeggio, Amélia; Oliveira, Simone K; Kuhnen, Shirley; Tomazzoli, Maíra M; Raguzzoni, Josiane C; Zeri, Ana C M; Carreira, Rafael; Correia, Sara; Costa, Christopher; Rocha, Miguel
2016-01-22
The chemical composition of propolis is affected by environmental factors and harvest season, making it difficult to standardize its extracts for medicinal usage. By detecting a typical chemical profile associated with propolis from a specific production region or season, certain types of propolis may be used to obtain a specific pharmacological activity. In this study, propolis from three agroecological regions (plain, plateau, and highlands) from southern Brazil, collected over the four seasons of 2010, were investigated through a novel NMR-based metabolomics data analysis workflow. Chemometrics and machine learning algorithms (PLS-DA and RF), including methods to estimate variable importance in classification, were used in this study. The machine learning and feature selection methods permitted construction of models for propolis sample classification with high accuracy (>75%, reaching ∼90% in the best case), better discriminating samples regarding their collection seasons comparatively to the harvest regions. PLS-DA and RF allowed the identification of biomarkers for sample discrimination, expanding the set of discriminating features and adding relevant information for the identification of the class-determining metabolites. The NMR-based metabolomics analytical platform, coupled to bioinformatic tools, allowed characterization and classification of Brazilian propolis samples regarding the metabolite signature of important compounds, i.e., chemical fingerprint, harvest seasons, and production regions.
NASA Astrophysics Data System (ADS)
Bocsi, Jozsef; Mittag, Anja; Pierzchalski, Arkadiusz; Osmancik, Pavel; Dähnert, Ingo; Tárnok, Attila
2011-02-01
Introduction: Methylprednisolone (MP) is frequently preoperatively administered in children undergoing open heart surgery. The aim of this medication is to inhibit overshooting immune responses. Earlier studies demonstrated cellular and humoral immunological changes in pediatric patients undergoing heart surgeries with and without MP administration. Here in a retrospective study we investigated the modulation of the cellular immune response by MP. The aim was to identify suitable parameters characterizing MP effects by cluster analysis. Methods: Blood samples were analysed from two aged matched groups with surgical correction of septum defects. Group without MP treatment consisted of 10 patients; MP was administered on 21 patients (median dose: 11mg/kg) before cardiopulmonary bypass (CPB). EDTA anticoagulated blood was obtained 24 h preoperatively, after anesthesia, at CPB begin and end (CPB2), 4h, 24h, 48h after surgery, at discharge and at out-patient followup (8.2; 3.3-12.2 month after surgery; median and IQR). Flow cytometry showed the biggest MP relevant changes at CPB2 and 4h postoperatively. They were used for clustering analysis. Classification was made by discriminant analysis and cluster analysis by means of Genes@work software. Results & conclusion: 146 parameters were obtained from analysis. Cross-validation revealed several parameters being able to discriminate between MP groups and to identify immune modulation. MP administration resulted in a delayed activation of monocytes, increased ratio of neutrophils, reduced T-lymphocytes counts. Cluster analysis demonstrated that classification of patients is possible based on the identified cytomics parameters. Further investigation of these parameters might help to understand the MP effects in pediatric open heart surgery.
Sex estimation standards for medieval and contemporary Croats
Bašić, Željana; Kružić, Ivana; Jerković, Ivan; Anđelinović, Deny; Anđelinović, Šimun
2017-01-01
Aim To develop discriminant functions for sex estimation on medieval Croatian population and test their application on contemporary Croatian population. Methods From a total of 519 skeletons, we chose 84 adult excellently preserved skeletons free of antemortem and postmortem changes and took all standard measurements. Sex was estimated/determined using standard anthropological procedures and ancient DNA (amelogenin analysis) where pelvis was insufficiently preserved or where sex morphological indicators were not consistent. We explored which measurements showed sexual dimorphism and used them for developing univariate and multivariate discriminant functions for sex estimation. We included only those functions that reached accuracy rate ≥80%. We tested the applicability of developed functions on modern Croatian sample (n = 37). Results From 69 standard skeletal measurements used in this study, 56 of them showed statistically significant sexual dimorphism (74.7%). We developed five univariate discriminant functions with classification rate 80.6%-85.2% and seven multivariate discriminant functions with an accuracy rate of 81.8%-93.0%. When tested on the modern population functions showed classification rates 74.1%-100%, and ten of them reached aimed accuracy rate. Females showed higher classification rates in the medieval populations, whereas males were better classified in the modern populations. Conclusion Developed discriminant functions are sufficiently accurate for reliable sex estimation in both medieval Croatian population and modern Croatian samples and may be used in forensic settings. The methodological issues that emerged regarding the importance of considering external factors in development and application of discriminant functions for sex estimation should be further explored. PMID:28613039
Speech Music Discrimination Using Class-Specific Features
2004-08-01
Speech Music Discrimination Using Class-Specific Features Thomas Beierholm...between speech and music . Feature extraction is class-specific and can therefore be tailored to each class meaning that segment size, model orders...interest. Some of the applications of audio signal classification are speech/ music classification [1], acoustical environmental classification [2][3
Neural net applied to anthropological material: a methodical study on the human nasal skeleton.
Prescher, Andreas; Meyers, Anne; Gerf von Keyserlingk, Diedrich
2005-07-01
A new information processing method, an artificial neural net, was applied to characterise the variability of anthropological features of the human nasal skeleton. The aim was to find different types of nasal skeletons. A neural net with 15*15 nodes was trained by 17 standard anthropological parameters taken from 184 skulls of the Aachen collection. The trained neural net delivers its classification in a two-dimensional map. Different types of noses were locally separated within the map. Rare and frequent types may be distinguished after one passage of the complete collection through the net. Statistical descriptive analysis, hierarchical cluster analysis, and discriminant analysis were applied to the same data set. These parallel applications allowed comparison of the new approach to the more traditional ones. In general the classification by the neural net is in correspondence with cluster analysis and discriminant analysis. However, it goes beyond these classifications because of the possibility of differentiating the types in multi-dimensional dependencies. Furthermore, places in the map are kept blank for intermediate forms, which may be theoretically expected, but were not included in the training set. In conclusion, the application of a neural network is a suitable method for investigating large collections of biological material. The gained classification may be helpful in anatomy and anthropology as well as in forensic medicine. It may be used to characterise the peculiarity of a whole set as well as to find particular cases within the set.
Wang, Kun-Ching
2015-01-01
The classification of emotional speech is mostly considered in speech-related research on human-computer interaction (HCI). In this paper, the purpose is to present a novel feature extraction based on multi-resolutions texture image information (MRTII). The MRTII feature set is derived from multi-resolution texture analysis for characterization and classification of different emotions in a speech signal. The motivation is that we have to consider emotions have different intensity values in different frequency bands. In terms of human visual perceptual, the texture property on multi-resolution of emotional speech spectrogram should be a good feature set for emotion classification in speech. Furthermore, the multi-resolution analysis on texture can give a clearer discrimination between each emotion than uniform-resolution analysis on texture. In order to provide high accuracy of emotional discrimination especially in real-life, an acoustic activity detection (AAD) algorithm must be applied into the MRTII-based feature extraction. Considering the presence of many blended emotions in real life, in this paper make use of two corpora of naturally-occurring dialogs recorded in real-life call centers. Compared with the traditional Mel-scale Frequency Cepstral Coefficients (MFCC) and the state-of-the-art features, the MRTII features also can improve the correct classification rates of proposed systems among different language databases. Experimental results show that the proposed MRTII-based feature information inspired by human visual perception of the spectrogram image can provide significant classification for real-life emotional recognition in speech. PMID:25594590
NASA Astrophysics Data System (ADS)
Khansari, Maziyar M.; O'Neill, William; Penn, Richard; Blair, Norman P.; Chau, Felix; Shahidi, Mahnaz
2017-03-01
The conjunctiva is a densely vascularized tissue of the eye that provides an opportunity for imaging of human microcirculation. In the current study, automated fine structure analysis of conjunctival microvasculature images was performed to discriminate stages of diabetic retinopathy (DR). The study population consisted of one group of nondiabetic control subjects (NC) and 3 groups of diabetic subjects, with no clinical DR (NDR), non-proliferative DR (NPDR), or proliferative DR (PDR). Ordinary least square regression and Fisher linear discriminant analyses were performed to automatically discriminate images between group pairs of subjects. Human observers who were masked to the grouping of subjects performed image discrimination between group pairs. Over 80% and 70% of images of subjects with clinical and non-clinical DR were correctly discriminated by the automated method, respectively. The discrimination rates of the automated method were higher than human observers. The fine structure analysis of conjunctival microvasculature images provided discrimination of DR stages and can be potentially useful for DR screening and monitoring.
Wiśniewska, Paulina; Boqué, Ricard; Borràs, Eva; Busto, Olga; Wardencki, Waldemar; Namieśnik, Jacek; Dymerski, Tomasz
2017-02-15
Headspace mass-spectrometry (HS-MS), mid infrared (MIR) and UV-vis spectroscopy were used to authenticate whisky samples from different origins and ways of production ((Irish, Spanish, Bourbon, Tennessee Whisky and Scotch). The collected spectra were processed with partial least-squares discriminant analysis (PLS-DA) to build the classification models. In all cases the five groups of whiskies were distinguished, but the best results were obtained by HS-MS, which indicates that the biggest differences between different types of whisky are due to their aroma. Differences were also found inside groups, showing that not only raw material is important to discriminate samples but also the way of their production. The methodology is quick, easy and does not require sample preparation. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Xu, Ye; Sonka, Milan; McLennan, Geoffrey; Guo, Junfeng; Hoffman, Eric
2005-04-01
Lung parenchyma evaluation via multidetector-row CT (MDCT), has significantly altered clinical practice in the early detection of lung disease. Our goal is to enhance our texture-based tissue classification ability to differentiate early pathologic processes by extending our 2-D Adaptive Multiple Feature Method (AMFM) to 3-D AMFM. We performed MDCT on 34 human volunteers in five categories: emphysema in severe Chronic Obstructive Pulmonary Disease (COPD) as EC, emphysema in mild COPD (MC), normal appearing lung in COPD (NC), non-smokers with normal lung function (NN), smokers with normal function (NS). We volumetrically excluded the airway and vessel regions, calculated 24 volumetric texture features for each Volume of Interest (VOI); and used Bayesian rules for discrimination. Leave-one-out and half-half methods were used for testing. Sensitivity, specificity and accuracy were calculated. The accuracy of the leave-one-out method for the four-class classification in the form of 3-D/2-D is: EC: 84.9%/70.7%, MC: 89.8%/82.7%; NC: 87.5.0%/49.6%; NN: 100.0%/60.0%. The accuracy of the leave-one-out method for the two-class classification in the form of 3-D/2-D is: NN: 99.3%/71.6%; NS: 99.7%/74.5%. We conclude that 3-D AMFM analysis of the lung parenchyma improves discrimination compared to 2-D analysis of the same images.
Ghasemi-Varnamkhasti, Mahdi; Amiri, Zahra Safari; Tohidi, Mojtaba; Dowlati, Majid; Mohtasebi, Seyed Saeid; Silva, Adenilton C; Fernandes, David D S; Araujo, Mário C U
2018-01-01
Cumin is a plant of the Apiaceae family (umbelliferae) which has been used since ancient times as a medicinal plant and as a spice. The difference in the percentage of aromatic compounds in cumin obtained from different locations has led to differentiation of some species of cumin from other species. The quality and price of cumin vary according to the specie and may be an incentive for the adulteration of high value samples with low quality cultivars. An electronic nose simulates the human olfactory sense by using an array of sensors to distinguish complex smells. This makes it an alternative for the identification and classification of cumin species. The data, however, may have a complex structure, difficult to interpret. Given this, chemometric tools can be used to manipulate data with two-dimensional structure (sensor responses in time) obtained by using electronic nose sensors. In this study, an electronic nose based on eight metal oxide semiconductor sensors (MOS) and 2D-LDA (two-dimensional linear discriminant analysis), U-PLS-DA (Partial least square discriminant analysis applied to the unfolded data) and PARAFAC-LDA (Parallel factor analysis with linear discriminant analysis) algorithms were used in order to identify and classify different varieties of both cultivated and wild black caraway and cumin. The proposed methodology presented a correct classification rate of 87.1% for PARAFAC-LDA and 100% for 2D-LDA and U-PLS-DA, indicating a promising strategy for the classification different varieties of cumin, caraway and other seeds. Copyright © 2017 Elsevier B.V. All rights reserved.
Structural vibration-based damage classification of delaminated smart composite laminates
NASA Astrophysics Data System (ADS)
Khan, Asif; Kim, Heung Soo; Sohn, Jung Woo
2018-03-01
Separation along the interfaces of layers (delamination) is a principal mode of failure in laminated composites and its detection is of prime importance for structural integrity of composite materials. In this work, structural vibration response is employed to detect and classify delaminations in piezo-bonded laminated composites. Improved layerwise theory and finite element method are adopted to develop the electromechanically coupled governing equation of a smart composite laminate with and without delaminations. Transient responses of the healthy and damaged structures are obtained through a surface bonded piezoelectric sensor by solving the governing equation in the time domain. Wavelet packet transform (WPT) and linear discriminant analysis (LDA) are employed to extract discriminative features from the structural vibration response of the healthy and delaminated structures. Dendrogram-based support vector machine (DSVM) is used to classify the discriminative features. The confusion matrix of the classification algorithm provided physically consistent results.
Detection of Lettuce Discoloration Using Hyperspectral Reflectance Imaging
Mo, Changyeun; Kim, Giyoung; Lim, Jongguk; Kim, Moon S.; Cho, Hyunjeong; Cho, Byoung-Kwan
2015-01-01
Rapid visible/near-infrared (VNIR) hyperspectral imaging methods, employing both a single waveband algorithm and multi-spectral algorithms, were developed in order to discrimination between sound and discolored lettuce. Reflectance spectra for sound and discolored lettuce surfaces were extracted from hyperspectral reflectance images obtained in the 400–1000 nm wavelength range. The optimal wavebands for discriminating between discolored and sound lettuce surfaces were determined using one-way analysis of variance. Multi-spectral imaging algorithms developed using ratio and subtraction functions resulted in enhanced classification accuracy of above 99.9% for discolored and sound areas on both adaxial and abaxial lettuce surfaces. Ratio imaging (RI) and subtraction imaging (SI) algorithms at wavelengths of 552/701 nm and 557–701 nm, respectively, exhibited better classification performances compared to results obtained for all possible two-waveband combinations. These results suggest that hyperspectral reflectance imaging techniques can potentially be used to discriminate between discolored and sound fresh-cut lettuce. PMID:26610510
Detection of Lettuce Discoloration Using Hyperspectral Reflectance Imaging.
Mo, Changyeun; Kim, Giyoung; Lim, Jongguk; Kim, Moon S; Cho, Hyunjeong; Cho, Byoung-Kwan
2015-11-20
Rapid visible/near-infrared (VNIR) hyperspectral imaging methods, employing both a single waveband algorithm and multi-spectral algorithms, were developed in order to discrimination between sound and discolored lettuce. Reflectance spectra for sound and discolored lettuce surfaces were extracted from hyperspectral reflectance images obtained in the 400-1000 nm wavelength range. The optimal wavebands for discriminating between discolored and sound lettuce surfaces were determined using one-way analysis of variance. Multi-spectral imaging algorithms developed using ratio and subtraction functions resulted in enhanced classification accuracy of above 99.9% for discolored and sound areas on both adaxial and abaxial lettuce surfaces. Ratio imaging (RI) and subtraction imaging (SI) algorithms at wavelengths of 552/701 nm and 557-701 nm, respectively, exhibited better classification performances compared to results obtained for all possible two-waveband combinations. These results suggest that hyperspectral reflectance imaging techniques can potentially be used to discriminate between discolored and sound fresh-cut lettuce.
Analysis of a multisensor image data set of south San Rafael Swell, Utah
NASA Technical Reports Server (NTRS)
Evans, D. L.
1982-01-01
A Shuttle Imaging Radar (SIR-A) image of the southern portion of the San Rafael Swell in Utah has been digitized and registered to coregistered Landsat, Seasat, and HCMM thermal inertia images. The addition of the SIR-A image to the registered data set improves rock type discrimination in both qualitative and quantitative analyses. Sedimentary units can be separated in a combined SIR-A/Seasat image that cannot be seen in either image alone. Discriminant Analyses show that the classification accuracy is improved with addition of the SIR-A image to Landsat images. Classification accuracy is further improved when texture information from the Seasat and SIR-A images is included.
NASA Astrophysics Data System (ADS)
Larter, Jarod Lee
Stephens Lake, Manitoba is an example of a peatland reservoir that has undergone physical changes related to mineral erosion and peatland disintegration processes since its initial impoundment. In this thesis I focused on the processes of peatland upheaval, transport, and disintegration as the primary drivers of dynamic change within the reservoir. The changes related to these processes are most frequent after initial reservoir impoundment and decline over time. They continue to occur over 35 years after initial flooding. I developed a remote sensing approach that employs both optical and microwave sensors for discriminating land (Le. floating peatlands, forested land, and barren land) from open water within the reservoir. High spatial resolution visible and near-infrared (VNIR) optical data obtained from the QuickBird satellite, and synthetic aperture radar (SAR) microwave data obtained from the RADARSAT-1 satellite were implemented. The approach was facilitated with a Geographic Information System (GIS) based validation map for the extraction of optical and SAR pixel data. Each sensor's extracted data set was first analyzed separately using univariate and multivariate statistical methods to determine the discriminant ability of each sensor. The initial analyses were followed by an integrated sensor approach; the development of an image classification model; and a change detection analysis. Results showed excellent (> 95%) classification accuracy using QuickBird satellite image data. Discrimination and classification of studied land cover classes using SAR image texture data resulted in lower overall classification accuracies (˜ 60%). SAR data classification accuracy improved to > 90% when classifying only land and water, demonstrating SAR's utility as a land and water mapping tool. An integrated sensor data approach showed no considerable improvement over the use of optical satellite image data alone. An image classification model was developed that could be used to map both detailed land cover classes and the land and water interface within the reservoir. Change detection analysis over a seven year period indicated that physical changes related to mineral erosion, peatland upheaval, transport, and disintegration, and operational water level variation continue to take place in the reservoir some 35 years after initial flooding. This thesis demonstrates the ability of optical and SAR satellite image remote sensing data sets to be used in an operational context for the routine discrimination of the land and water boundaries within a dynamic peatland reservoir. Future monitoring programs would benefit most from a complementary image acquisition program in which SAR images, known for their acquisition reliability under cloud cover, are acquired along with optical images given their ability to discriminate land cover classes in greater detail.
Lo Bianco, M; Grillo, O; Cañadas, E; Venora, G; Bacchetta, G
2017-03-01
This work aims to discriminate among different species of the genus Cistus, using seed parameters and following the scientific plant names included as accepted in The Plant List. Also, the intraspecific phenotypic differentiation of C. creticus, through comparison with three subspecies (C. creticus subsp. creticus, C. c. subsp. eriocephalus and C. c. subsp. corsicus), as well as the interpopulation variability among five C. creticus subsp. eriocephalus populations was evaluated. Seed mean weight and 137 morphocolorimetric quantitative variables, describing shape, size, colour and textural seed traits, were measured using image analysis techniques. Measured data were analysed applying step-wise linear discriminant analysis. An overall cross-validated classification performance of 80.6% was recorded at species level. With regard to C. creticus, as case study, percentages of correct discrimination of 96.7% and 99.6% were achieved at intraspecific and interpopulation levels, respectively. In this classification model, the relevance of the colorimetric and textural descriptive features was highlighted, as well as the seed mean weight, which was the most discriminant feature at specific and intraspecific level. These achievements proved the ability of the image analysis system as highly diagnostic for systematic purposes and confirm that seeds in the genus Cistus have important diagnostic value. © 2016 German Botanical Society and The Royal Botanical Society of the Netherlands.
NASA Astrophysics Data System (ADS)
Verma, Surendra P.; Rivera-Gómez, M. Abdelaly; Díaz-González, Lorena; Quiroz-Ruiz, Alfredo
2016-12-01
A new multidimensional classification scheme consistent with the chemical classification of the International Union of Geological Sciences (IUGS) is proposed for the nomenclature of High-Mg altered rocks. Our procedure is based on an extensive database of major element (SiO2, TiO2, Al2O3, Fe2O3t, MnO, MgO, CaO, Na2O, K2O, and P2O5) compositions of a total of 33,868 (920 High-Mg and 32,948 "Common") relatively fresh igneous rock samples. The database consisting of these multinormally distributed samples in terms of their isometric log-ratios was used to propose a set of 11 discriminant functions and 6 diagrams to facilitate High-Mg rock classification. The multinormality required by linear discriminant and canonical analysis was ascertained by a new computer program DOMuDaF. One multidimensional function can distinguish the High-Mg and Common igneous rocks with high percent success values of about 86.4% and 98.9%, respectively. Similarly, from 10 discriminant functions the High-Mg rocks can also be classified as one of the four rock types (komatiite, meimechite, picrite, and boninite), with high success values of about 88%-100%. Satisfactory functioning of this new classification scheme was confirmed by seven independent tests. Five further case studies involving application to highly altered rocks illustrate the usefulness of our proposal. A computer program HMgClaMSys was written to efficiently apply the proposed classification scheme, which will be available for online processing of igneous rock compositional data. Monte Carlo simulation modeling and mass-balance computations confirmed the robustness of our classification with respect to analytical errors and postemplacement compositional changes.
Gil Solsona, R; Boix, C; Ibáñez, M; Sancho, J V
2018-03-01
The aim of this study was to use an untargeted UHPLC-HRMS-based metabolomics approach allowing discrimination between almonds based on their origin and variety. Samples were homogenised, extracted with ACN:H 2 O (80:20) containing 0.1% HCOOH and injected in a UHPLC-QTOF instrument in both positive and negative ionisation modes. Principal component analysis (PCA) was performed to ensure the absence of outliers. Partial least squares - discriminant analysis (PLS-DA) was employed to create and validate the models for country (with five different compounds) and variety (with 20 features), showing more than 95% accuracy. Additional samples were injected and the model was evaluated with blind samples, with more than 95% of samples being correctly classified using both models. MS/MS experiments were carried out to tentatively elucidate the highlighted marker compounds (pyranosides, peptides or amino acids, among others). This study has shown the potential of high-resolution mass spectrometry to perform and validate classification models, also providing information concerning the identification of the unexpected biomarkers which showed the highest discriminant power.
Qiu, Shanshan; Wang, Jun; Gao, Liping
2014-07-09
An electronic nose (E-nose) and an electronic tongue (E-tongue) have been used to characterize five types of strawberry juices based on processing approaches (i.e., microwave pasteurization, steam blanching, high temperature short time pasteurization, frozen-thawed, and freshly squeezed). Juice quality parameters (vitamin C, pH, total soluble solid, total acid, and sugar/acid ratio) were detected by traditional measuring methods. Multivariate statistical methods (linear discriminant analysis (LDA) and partial least squares regression (PLSR)) and neural networks (Random Forest (RF) and Support Vector Machines) were employed to qualitative classification and quantitative regression. E-tongue system reached higher accuracy rates than E-nose did, and the simultaneous utilization did have an advantage in LDA classification and PLSR regression. According to cross-validation, RF has shown outstanding and indisputable performances in the qualitative and quantitative analysis. This work indicates that the simultaneous utilization of E-nose and E-tongue can discriminate processed fruit juices and predict quality parameters successfully for the beverage industry.
Estimating the concordance probability in a survival analysis with a discrete number of risk groups.
Heller, Glenn; Mo, Qianxing
2016-04-01
A clinical risk classification system is an important component of a treatment decision algorithm. A measure used to assess the strength of a risk classification system is discrimination, and when the outcome is survival time, the most commonly applied global measure of discrimination is the concordance probability. The concordance probability represents the pairwise probability of lower patient risk given longer survival time. The c-index and the concordance probability estimate have been used to estimate the concordance probability when patient-specific risk scores are continuous. In the current paper, the concordance probability estimate and an inverse probability censoring weighted c-index are modified to account for discrete risk scores. Simulations are generated to assess the finite sample properties of the concordance probability estimate and the weighted c-index. An application of these measures of discriminatory power to a metastatic prostate cancer risk classification system is examined.
NASA Astrophysics Data System (ADS)
Giana, Fabián Eduardo; Bonetto, Fabián José; Bellotti, Mariela Inés
2018-03-01
In this work we present an assay to discriminate between normal and cancerous cells. The method is based on the measurement of electrical impedance spectra of in vitro cell cultures. We developed a protocol consisting on four consecutive measurement phases, each of them designed to obtain different information about the cell cultures. Through the analysis of the measured data, 26 characteristic features were obtained for both cell types. From the complete set of features, we selected the most relevant in terms of their discriminant capacity by means of conventional statistical tests. A linear discriminant analysis was then carried out on the selected features, allowing the classification of the samples in normal or cancerous with 4.5% of false positives and no false negatives.
Giana, Fabián Eduardo; Bonetto, Fabián José; Bellotti, Mariela Inés
2018-03-01
In this work we present an assay to discriminate between normal and cancerous cells. The method is based on the measurement of electrical impedance spectra of in vitro cell cultures. We developed a protocol consisting on four consecutive measurement phases, each of them designed to obtain different information about the cell cultures. Through the analysis of the measured data, 26 characteristic features were obtained for both cell types. From the complete set of features, we selected the most relevant in terms of their discriminant capacity by means of conventional statistical tests. A linear discriminant analysis was then carried out on the selected features, allowing the classification of the samples in normal or cancerous with 4.5% of false positives and no false negatives.
Liang, Wenyi; Chen, Wenjing; Wu, Lingfang; Li, Shi; Qi, Qi; Cui, Yaping; Liang, Linjin; Ye, Ting; Zhang, Lanzhen
2017-03-17
Danshen, the dried root of Salvia miltiorrhiza Bge., is a widely used commercially available herbal drug, and unstable quality of different samples is a current issue. This study focused on a comprehensive and systematic method combining fingerprints and chemical identification with chemometrics for discrimination and quality assessment of Danshen samples. Twenty-five samples were analyzed by HPLC-PAD and HPLC-MS n . Forty-nine components were identified and characteristic fragmentation regularities were summarized for further interpretation of bioactive components. Chemometric analysis was employed to differentiate samples and clarify the quality differences of Danshen including hierarchical cluster analysis, principal component analysis, and partial least squares discriminant analysis. Consistent results were that the samples were divided into three categories which reflected the difference in quality of Danshen samples. By analyzing the reasons for sample classification, it was revealed that the processing method had a more obvious impact on sample classification than the geographical origin, it induced the different content of bioactive compounds and finally lead to different qualities. Cryptotanshinone, trijuganone B, and 15,16-dihydrotanshinone I were screened out as markers to distinguish samples by different processing methods. The developed strategy could provide a reference for evaluation and discrimination of other traditional herbal medicines.
Zhang, Mengliang; Zhao, Yang; Harrington, Peter de B; Chen, Pei
2016-03-01
Two simple fingerprinting methods, flow-injection coupled to ultraviolet spectroscopy and proton nuclear magnetic resonance, were used for discriminating between Aurantii fructus immaturus and Fructus poniciri trifoliatae immaturus . Both methods were combined with partial least-squares discriminant analysis. In the flow-injection method, four data representations were evaluated: total ultraviolet absorbance chromatograms, averaged ultraviolet spectra, absorbance at 193, 205, 225, and 283 nm, and absorbance at 225 and 283 nm. Prediction rates of 100% were achieved for all data representations by partial least-squares discriminant analysis using leave-one-sample-out cross-validation. The prediction rate for the proton nuclear magnetic resonance data by partial least-squares discriminant analysis with leave-one-sample-out cross-validation was also 100%. A new validation set of data was collected by flow-injection with ultraviolet spectroscopic detection two weeks later and predicted by partial least-squares discriminant analysis models constructed by the initial data representations with no parameter changes. The classification rates were 95% with the total ultraviolet absorbance chromatograms datasets and 100% with the other three datasets. Flow-injection with ultraviolet detection and proton nuclear magnetic resonance are simple, high throughput, and low-cost methods for discrimination studies.
Benchmark data on the separability among crops in the southern San Joaquin Valley of California
NASA Technical Reports Server (NTRS)
Morse, A.; Card, D. H.
1984-01-01
Landsat MSS data were input to a discriminant analysis of 21 crops on each of eight dates in 1979 using a total of 4,142 fields in southern Fresno County, California. The 21 crops, which together account for over 70 percent of the agricultural acreage in the southern San Joaquin Valley, were analyzed to quantify the spectral separability, defined as omission error, between all pairs of crops. On each date the fields were segregated into six groups based on the mean value of the MSS7/MSS5 ratio, which is correlated with green biomass. Discriminant analysis was run on each group on each date. The resulting contingency tables offer information that can be profitably used in conjunction with crop calendars to pick the best dates for a classification. The tables show expected percent correct classification and error rates for all the crops. The patterns in the contingency tables show that the percent correct classification for crops generally increases with the amount of greenness in the fields being classified. However, there are exceptions to this general rule, notably grain.
NASA Astrophysics Data System (ADS)
Yao, Sen; Li, Tao; Li, JieQing; Liu, HongGao; Wang, YuanZhong
2018-06-01
Boletus griseus and Boletus edulis are two well-known wild-grown edible mushrooms which have high nutrition, delicious flavor and high economic value distributing in Yunnan Province. In this study, a rapid method using Fourier transform infrared (FT-IR) and ultraviolet (UV) spectroscopies coupled with data fusion was established for the discrimination of Boletus mushrooms from seven different geographical origins with pattern recognition method. Initially, the spectra of 332 mushroom samples obtained from the two spectroscopic techniques were analyzed individually and then the classification performance based on data fusion strategy was investigated. Meanwhile, the latent variables (LVs) of FT-IR and UV spectra were extracted by partial least square discriminant analysis (PLS-DA) and two datasets were concatenated into a new matrix for data fusion. Then, the fusion matrix was further analyzed by support vector machine (SVM). Compared with single spectroscopic technique, data fusion strategy can improve the classification performance effectively. In particular, the accuracy of correct classification of SVM model in training and test sets were 99.10% and 100.00%, respectively. The results demonstrated that data fusion of FT-IR and UV spectra can provide higher synergic effect for the discrimination of different geographical origins of Boletus mushrooms, which may be benefit for further authentication and quality assessment of edible mushrooms.
Yao, Sen; Li, Tao; Li, JieQing; Liu, HongGao; Wang, YuanZhong
2018-06-05
Boletus griseus and Boletus edulis are two well-known wild-grown edible mushrooms which have high nutrition, delicious flavor and high economic value distributing in Yunnan Province. In this study, a rapid method using Fourier transform infrared (FT-IR) and ultraviolet (UV) spectroscopies coupled with data fusion was established for the discrimination of Boletus mushrooms from seven different geographical origins with pattern recognition method. Initially, the spectra of 332 mushroom samples obtained from the two spectroscopic techniques were analyzed individually and then the classification performance based on data fusion strategy was investigated. Meanwhile, the latent variables (LVs) of FT-IR and UV spectra were extracted by partial least square discriminant analysis (PLS-DA) and two datasets were concatenated into a new matrix for data fusion. Then, the fusion matrix was further analyzed by support vector machine (SVM). Compared with single spectroscopic technique, data fusion strategy can improve the classification performance effectively. In particular, the accuracy of correct classification of SVM model in training and test sets were 99.10% and 100.00%, respectively. The results demonstrated that data fusion of FT-IR and UV spectra can provide higher synergic effect for the discrimination of different geographical origins of Boletus mushrooms, which may be benefit for further authentication and quality assessment of edible mushrooms. Copyright © 2018 Elsevier B.V. All rights reserved.
Martins, Angélica Rocha; Talhavini, Márcio; Vieira, Maurício Leite; Zacca, Jorge Jardim; Braga, Jez Willian Batista
2017-08-15
The discrimination of whisky brands and counterfeit identification were performed by UV-Vis spectroscopy combined with partial least squares for discriminant analysis (PLS-DA). In the proposed method all spectra were obtained with no sample preparation. The discrimination models were built with the employment of seven whisky brands: Red Label, Black Label, White Horse, Chivas Regal (12years), Ballantine's Finest, Old Parr and Natu Nobilis. The method was validated with an independent test set of authentic samples belonging to the seven selected brands and another eleven brands not included in the training samples. Furthermore, seventy-three counterfeit samples were also used to validate the method. Results showed correct classification rates for genuine and false samples over 98.6% and 93.1%, respectively, indicating that the method can be helpful for the forensic analysis of whisky samples. Copyright © 2017 Elsevier Ltd. All rights reserved.
Bavykin, Sergei G.; Mirzabekov, Andrei D.
2007-10-30
The present invention is directed to a novel method of discriminating a highly infectious bacterium Bacillus anthracis from a group of closely related microorganisms. Sequence variations in the 16S and 23S rRNA of the B. cereus subgroup including B. anthracis are utilized to construct an array that can detect these sequence variations through selective hybridizations. The identification and analysis of these sequence variations enables positive discrimination of isolates of the B. cereus group that includes B. anthracis. Discrimination of single base differences in rRNA was achieved with a microchip during analysis of B. cereus group isolates from both single and in mixed probes, as well as identification of polymorphic sites. Successful use of a microchip to determine the appropriate subgroup classification using eight reference microorganisms from the B. cereus group as a study set, was demonstrated.
Taxonomic discrimination of higher plants by pyrolysis mass spectrometry.
Kim, S W; Ban, S H; Chung, H J; Choi, D W; Choi, P S; Yoo, O J; Liu, J R
2004-02-01
Pyrolysis mass spectrometry (PyMS) is a rapid, simple, high-resolution analytical method based on thermal degradation of complex material in a vacuum and has been widely applied to the discrimination of closely related microbial strains. Leaf samples of six species and one variety of higher plants (Rosa multiflora, R. multiflora var. platyphylla, Sedum kamtschaticum, S. takesimense, S. sarmentosum, Hepatica insularis, and H. asiatica) were subjected to PyMS for spectral fingerprinting. Principal component analysis of PyMS data was not able to discriminate these plants in discrete clusters. However, canonical variate analysis of PyMS data separated these plants from one another. A hierarchical dendrogram based on canonical variate analysis was in agreement with the known taxonomy of the plants at the variety level. These results indicate that PyMS is able to discriminate higher plants based on taxonomic classification at the family, genus, species, and variety level.
Truzzi, Cristina; Illuminati, Silvia; Annibaldia, Anna; Finale, Carolina; Rossetti, Monica; Scarponi, Giuseppe
2014-11-01
The purpose of this study was the physicochemical characterization and classification of Italian honey from Marche Region with a chemometric approach. A total of 135 honeys of different botanical origins [acacia (Robinia pseudoacacia L.), chestnut (Castanea sativa), coriander (Coriandrum sativum L.), lime (Tilia spp.), sunflower (Helianthus annuus L.), Metcalfa honeydew and multifloral honey] were considered. The average results of electrical conductivity (0.14-1.45 mS cm(-1)), pH (3.89-5.42), free acidity (10.9-39.0 meq(NaOH) kg(-1)), lactones (2.4-4.5 meq(NaOH) kg(-1)), total acidity (14.5-40.9 meq(NaOH) kg(-1)), proline (229-665 mg kg(-1)) and 5-(hydroxy-methyl)-2-furaldehyde (0.6-3.9 mg kg(-1)) content show wide variability among the analysed honey types, with statistically significant differences between the different honey types. Pattern recognition methods such as principal component analysis and discriminant analysis were performed in order to find a relationship between variables and types of honey and to classify honey on the basis of its physicochemical properties. The variables of electrical conductivity, acidity (free, lactones), pH and proline content exhibited higher discriminant power and provided enough information for the classification and distinction of unifloral honey types, but not for the classification of multifloral honey (100% and 85% of samples correctly classified, respectively).
Hohmann, Monika; Monakhova, Yulia; Erich, Sarah; Christoph, Norbert; Wachter, Helmut; Holzgrabe, Ulrike
2015-11-04
Because the basic suitability of proton nuclear magnetic resonance spectroscopy ((1)H NMR) to differentiate organic versus conventional tomatoes was recently proven, the approach to optimize (1)H NMR classification models (comprising overall 205 authentic tomato samples) by including additional data of isotope ratio mass spectrometry (IRMS, δ(13)C, δ(15)N, and δ(18)O) and mid-infrared (MIR) spectroscopy was assessed. Both individual and combined analytical methods ((1)H NMR + MIR, (1)H NMR + IRMS, MIR + IRMS, and (1)H NMR + MIR + IRMS) were examined using principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), linear discriminant analysis (LDA), and common components and specific weight analysis (ComDim). With regard to classification abilities, fused data of (1)H NMR + MIR + IRMS yielded better validation results (ranging between 95.0 and 100.0%) than individual methods ((1)H NMR, 91.3-100%; MIR, 75.6-91.7%), suggesting that the combined examination of analytical profiles enhances authentication of organically produced tomatoes.
Wang, Jie; Feng, Zuren; Lu, Na; Luo, Jing
2018-06-01
Feature selection plays an important role in the field of EEG signals based motor imagery pattern classification. It is a process that aims to select an optimal feature subset from the original set. Two significant advantages involved are: lowering the computational burden so as to speed up the learning procedure and removing redundant and irrelevant features so as to improve the classification performance. Therefore, feature selection is widely employed in the classification of EEG signals in practical brain-computer interface systems. In this paper, we present a novel statistical model to select the optimal feature subset based on the Kullback-Leibler divergence measure, and automatically select the optimal subject-specific time segment. The proposed method comprises four successive stages: a broad frequency band filtering and common spatial pattern enhancement as preprocessing, features extraction by autoregressive model and log-variance, the Kullback-Leibler divergence based optimal feature and time segment selection and linear discriminate analysis classification. More importantly, this paper provides a potential framework for combining other feature extraction models and classification algorithms with the proposed method for EEG signals classification. Experiments on single-trial EEG signals from two public competition datasets not only demonstrate that the proposed method is effective in selecting discriminative features and time segment, but also show that the proposed method yields relatively better classification results in comparison with other competitive methods. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Vasefi, Fartash; Kittle, David S.; Nie, Zhaojun; Falcone, Christina; Patil, Chirag G.; Chu, Ray M.; Mamelak, Adam N.; Black, Keith L.; Butte, Pramod V.
2016-04-01
We have developed and tested a system for real-time intra-operative optical identification and classification of brain tissues using time-resolved fluorescence spectroscopy (TRFS). A supervised learning algorithm using linear discriminant analysis (LDA) employing selected intrinsic fluorescence decay temporal points in 6 spectral bands was employed to maximize statistical significance difference between training groups. The linear discriminant analysis on in vivo human tissues obtained by TRFS measurements (N = 35) were validated by histopathologic analysis and neuronavigation correlation to pre-operative MRI images. These results demonstrate that TRFS can differentiate between normal cortex, white matter and glioma.
Feature extraction with deep neural networks by a generalized discriminant analysis.
Stuhlsatz, André; Lippel, Jens; Zielke, Thomas
2012-04-01
We present an approach to feature extraction that is a generalization of the classical linear discriminant analysis (LDA) on the basis of deep neural networks (DNNs). As for LDA, discriminative features generated from independent Gaussian class conditionals are assumed. This modeling has the advantages that the intrinsic dimensionality of the feature space is bounded by the number of classes and that the optimal discriminant function is linear. Unfortunately, linear transformations are insufficient to extract optimal discriminative features from arbitrarily distributed raw measurements. The generalized discriminant analysis (GerDA) proposed in this paper uses nonlinear transformations that are learnt by DNNs in a semisupervised fashion. We show that the feature extraction based on our approach displays excellent performance on real-world recognition and detection tasks, such as handwritten digit recognition and face detection. In a series of experiments, we evaluate GerDA features with respect to dimensionality reduction, visualization, classification, and detection. Moreover, we show that GerDA DNNs can preprocess truly high-dimensional input data to low-dimensional representations that facilitate accurate predictions even if simple linear predictors or measures of similarity are used.
Na, X D; Zang, S Y; Wu, C S; Li, W L
2015-11-01
Knowledge of the spatial extent of forested wetlands is essential to many studies including wetland functioning assessment, greenhouse gas flux estimation, and wildlife suitable habitat identification. For discriminating forested wetlands from their adjacent land cover types, researchers have resorted to image analysis techniques applied to numerous remotely sensed data. While with some success, there is still no consensus on the optimal approaches for mapping forested wetlands. To address this problem, we examined two machine learning approaches, random forest (RF) and K-nearest neighbor (KNN) algorithms, and applied these two approaches to the framework of pixel-based and object-based classifications. The RF and KNN algorithms were constructed using predictors derived from Landsat 8 imagery, Radarsat-2 advanced synthetic aperture radar (SAR), and topographical indices. The results show that the objected-based classifications performed better than per-pixel classifications using the same algorithm (RF) in terms of overall accuracy and the difference of their kappa coefficients are statistically significant (p<0.01). There were noticeably omissions for forested and herbaceous wetlands based on the per-pixel classifications using the RF algorithm. As for the object-based image analysis, there were also statistically significant differences (p<0.01) of Kappa coefficient between results performed based on RF and KNN algorithms. The object-based classification using RF provided a more visually adequate distribution of interested land cover types, while the object classifications based on the KNN algorithm showed noticeably commissions for forested wetlands and omissions for agriculture land. This research proves that the object-based classification with RF using optical, radar, and topographical data improved the mapping accuracy of land covers and provided a feasible approach to discriminate the forested wetlands from the other land cover types in forestry area.
Song, Weiran; Wang, Hui; Maguire, Paul; Nibouche, Omar
2018-06-07
Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time. Copyright © 2018 Elsevier B.V. All rights reserved.
Crop Identification Technology Assessment for Remote Sensing (CITARS)
NASA Technical Reports Server (NTRS)
Bauer, M. E.; Cary, T. K.; Davis, B. J.; Swain, P. H.
1975-01-01
The results of classifications and experiments performed for the Crop Identification Technology Assessment for Remote Sensing (CITARS) project are summarized. Fifteen data sets were classified using two analysis procedures. One procedure used class weights while the other assumed equal probabilities of occurrence for all classes. In addition, 20 data sets were classified using training statistics from another segment or date. The results of both the local and non-local classifications in terms of classification and proportion estimation are presented. Several additional experiments are described which were performed to provide additional understanding of the CITARS results. These experiments investigated alternative analysis procedures, training set selection and size, effects of multitemporal registration, the spectral discriminability of corn, soybeans, and other, and analysis of aircraft multispectral data.
Large-scale optimization-based classification models in medicine and biology.
Lee, Eva K
2007-06-01
We present novel optimization-based classification models that are general purpose and suitable for developing predictive rules for large heterogeneous biological and medical data sets. Our predictive model simultaneously incorporates (1) the ability to classify any number of distinct groups; (2) the ability to incorporate heterogeneous types of attributes as input; (3) a high-dimensional data transformation that eliminates noise and errors in biological data; (4) the ability to incorporate constraints to limit the rate of misclassification, and a reserved-judgment region that provides a safeguard against over-training (which tends to lead to high misclassification rates from the resulting predictive rule); and (5) successive multi-stage classification capability to handle data points placed in the reserved-judgment region. To illustrate the power and flexibility of the classification model and solution engine, and its multi-group prediction capability, application of the predictive model to a broad class of biological and medical problems is described. Applications include: the differential diagnosis of the type of erythemato-squamous diseases; predicting presence/absence of heart disease; genomic analysis and prediction of aberrant CpG island meythlation in human cancer; discriminant analysis of motility and morphology data in human lung carcinoma; prediction of ultrasonic cell disruption for drug delivery; identification of tumor shape and volume in treatment of sarcoma; discriminant analysis of biomarkers for prediction of early atherosclerois; fingerprinting of native and angiogenic microvascular networks for early diagnosis of diabetes, aging, macular degeneracy and tumor metastasis; prediction of protein localization sites; and pattern recognition of satellite images in classification of soil types. In all these applications, the predictive model yields correct classification rates ranging from 80 to 100%. This provides motivation for pursuing its use as a medical diagnostic, monitoring and decision-making tool.
2011-01-01
Background For brain computer interfaces (BCIs), which may be valuable in neurorehabilitation, brain signals derived from mental activation can be monitored by non-invasive methods, such as functional near-infrared spectroscopy (fNIRS). Single-trial classification is important for this purpose and this was the aim of the presented study. In particular, we aimed to investigate a combined approach: 1) offline single-trial classification of brain signals derived from a novel wireless fNIRS instrument; 2) to use motor imagery (MI) as mental task thereby discriminating between MI signals in response to different tasks complexities, i.e. simple and complex MI tasks. Methods 12 subjects were asked to imagine either a simple finger-tapping task using their right thumb or a complex sequential finger-tapping task using all fingers of their right hand. fNIRS was recorded over secondary motor areas of the contralateral hemisphere. Using Fisher's linear discriminant analysis (FLDA) and cross validation, we selected for each subject a best-performing feature combination consisting of 1) one out of three channel, 2) an analysis time interval ranging from 5-15 s after stimulation onset and 3) up to four Δ[O2Hb] signal features (Δ[O2Hb] mean signal amplitudes, variance, skewness and kurtosis). Results The results of our single-trial classification showed that using the simple combination set of channels, time intervals and up to four Δ[O2Hb] signal features comprising Δ[O2Hb] mean signal amplitudes, variance, skewness and kurtosis, it was possible to discriminate single-trials of MI tasks differing in complexity, i.e. simple versus complex tasks (inter-task paired t-test p ≤ 0.001), over secondary motor areas with an average classification accuracy of 81%. Conclusions Although the classification accuracies look promising they are nevertheless subject of considerable subject-to-subject variability. In the discussion we address each of these aspects, their limitations for future approaches in single-trial classification and their relevance for neurorehabilitation. PMID:21682906
NASA Astrophysics Data System (ADS)
Díaz-Ayil, Gilberto; Amouroux, Marine; Clanché, Fabien; Granjon, Yves; Blondel, Walter C. P. M.
2009-07-01
Spatially-resolved bimodal spectroscopy (multiple AutoFluorescence AF excitation and Diffuse Reflectance DR), was used in vivo to discriminate various healthy and precancerous skin stages in a pre-clinical model (UV-irradiated mouse): Compensatory Hyperplasia CH, Atypical Hyperplasia AH and Dysplasia D. A specific data preprocessing scheme was applied to intensity spectra (filtering, spectral correction and intensity normalization), and several sets of spectral characteristics were automatically extracted and selected based on their discrimination power, statistically tested for every pair-wise comparison of histological classes. Data reduction with Principal Components Analysis (PCA) was performed and 3 classification methods were implemented (k-NN, LDA and SVM), in order to compare diagnostic performance of each method. Diagnostic performance was studied and assessed in terms of Sensibility (Se) and Specificity (Sp) as a function of the selected features, of the combinations of 3 different inter-fibres distances and of the numbers of principal components, such that: Se and Sp ~ 100% when discriminating CH vs. others; Sp ~ 100% and Se > 95% when discriminating Healthy vs. AH or D; Sp ~ 74% and Se ~ 63% for AH vs. D.
Discriminant analysis of resting-state functional connectivity patterns on the Grassmann manifold
NASA Astrophysics Data System (ADS)
Fan, Yong; Liu, Yong; Jiang, Tianzi; Liu, Zhening; Hao, Yihui; Liu, Haihong
2010-03-01
The functional networks, extracted from fMRI images using independent component analysis, have been demonstrated informative for distinguishing brain states of cognitive functions and neurological diseases. In this paper, we propose a novel algorithm for discriminant analysis of functional networks encoded by spatial independent components. The functional networks of each individual are used as bases for a linear subspace, referred to as a functional connectivity pattern, which facilitates a comprehensive characterization of temporal signals of fMRI data. The functional connectivity patterns of different individuals are analyzed on the Grassmann manifold by adopting a principal angle based subspace distance. In conjunction with a support vector machine classifier, a forward component selection technique is proposed to select independent components for constructing the most discriminative functional connectivity pattern. The discriminant analysis method has been applied to an fMRI based schizophrenia study with 31 schizophrenia patients and 31 healthy individuals. The experimental results demonstrate that the proposed method not only achieves a promising classification performance for distinguishing schizophrenia patients from healthy controls, but also identifies discriminative functional networks that are informative for schizophrenia diagnosis.
Fast detection of tobacco mosaic virus infected tobacco using laser-induced breakdown spectroscopy
NASA Astrophysics Data System (ADS)
Peng, Jiyu; Song, Kunlin; Zhu, Hongyan; Kong, Wenwen; Liu, Fei; Shen, Tingting; He, Yong
2017-03-01
Tobacco mosaic virus (TMV) is one of the most devastating viruses to crops, which can cause severe production loss and affect the quality of products. In this study, we have proposed a novel approach to discriminate TMV-infected tobacco based on laser-induced breakdown spectroscopy (LIBS). Two different kinds of tobacco samples (fresh leaves and dried leaf pellets) were collected for spectral acquisition, and partial least squared discrimination analysis (PLS-DA) was used to establish classification models based on full spectrum and observed emission lines. The influences of moisture content on spectral profile, signal stability and plasma parameters (temperature and electron density) were also analysed. The results revealed that moisture content in fresh tobacco leaves would worsen the stability of analysis, and have a detrimental effect on the classification results. Good classification results were achieved based on the data from both full spectrum and observed emission lines of dried leaves, approaching 97.2% and 88.9% in the prediction set, respectively. In addition, support vector machine (SVM) could improve the classification results and eliminate influences of moisture content. The preliminary results indicate that LIBS coupled with chemometrics could provide a fast, efficient and low-cost approach for TMV-infected disease detection in tobacco leaves.
Leontidis, Georgios
2017-11-01
Human retina is a diverse and important tissue, vastly studied for various retinal and other diseases. Diabetic retinopathy (DR), a leading cause of blindness, is one of them. This work proposes a novel and complete framework for the accurate and robust extraction and analysis of a series of retinal vascular geometric features. It focuses on studying the registered bifurcations in successive years of progression from diabetes (no DR) to DR, in order to identify the vascular alterations. Retinal fundus images are utilised, and multiple experimental designs are employed. The framework includes various steps, such as image registration and segmentation, extraction of features, statistical analysis and classification models. Linear mixed models are utilised for making the statistical inferences, alongside the elastic-net logistic regression, boruta algorithm, and regularised random forests for the feature selection and classification phases, in order to evaluate the discriminative potential of the investigated features and also build classification models. A number of geometric features, such as the central retinal artery and vein equivalents, are found to differ significantly across the experiments and also have good discriminative potential. The classification systems yield promising results with the area under the curve values ranging from 0.821 to 0.968, across the four different investigated combinations. Copyright © 2017 Elsevier Ltd. All rights reserved.
Fast detection of tobacco mosaic virus infected tobacco using laser-induced breakdown spectroscopy
Peng, Jiyu; Song, Kunlin; Zhu, Hongyan; Kong, Wenwen; Liu, Fei; Shen, Tingting; He, Yong
2017-01-01
Tobacco mosaic virus (TMV) is one of the most devastating viruses to crops, which can cause severe production loss and affect the quality of products. In this study, we have proposed a novel approach to discriminate TMV-infected tobacco based on laser-induced breakdown spectroscopy (LIBS). Two different kinds of tobacco samples (fresh leaves and dried leaf pellets) were collected for spectral acquisition, and partial least squared discrimination analysis (PLS-DA) was used to establish classification models based on full spectrum and observed emission lines. The influences of moisture content on spectral profile, signal stability and plasma parameters (temperature and electron density) were also analysed. The results revealed that moisture content in fresh tobacco leaves would worsen the stability of analysis, and have a detrimental effect on the classification results. Good classification results were achieved based on the data from both full spectrum and observed emission lines of dried leaves, approaching 97.2% and 88.9% in the prediction set, respectively. In addition, support vector machine (SVM) could improve the classification results and eliminate influences of moisture content. The preliminary results indicate that LIBS coupled with chemometrics could provide a fast, efficient and low-cost approach for TMV-infected disease detection in tobacco leaves. PMID:28300144
Mathematical and Statistical Software Index.
1986-08-01
geometric) mean HMEAN - harmonic mean MEDIAN - median MODE - mode QUANT - quantiles OGIVE - distribution curve IQRNG - interpercentile range RANGE ... range mutliphase pivoting algorithm cross-classification multiple discriminant analysis cross-tabul ation mul tipl e-objecti ve model curve fitting...Statistics). .. .. .... ...... ..... ...... ..... .. 21 *RANGEX (Correct Correlations for Curtailment of Range ). .. .. .... ...... ... 21 *RUMMAGE II (Analysis
Zakaria, Ammar; Shakaff, Ali Yeon Md.; Adom, Abdul Hamid; Ahmad, Mohd Noor; Masnan, Maz Jamilah; Aziz, Abdul Hallis Abdul; Fikri, Nazifah Ahmad; Abdullah, Abu Hassan; Kamarudin, Latifah Munirah
2010-01-01
An improved classification of Orthosiphon stamineus using a data fusion technique is presented. Five different commercial sources along with freshly prepared samples were discriminated using an electronic nose (e-nose) and an electronic tongue (e-tongue). Samples from the different commercial brands were evaluated by the e-tongue and then followed by the e-nose. Applying Principal Component Analysis (PCA) separately on the respective e-tongue and e-nose data, only five distinct groups were projected. However, by employing a low level data fusion technique, six distinct groupings were achieved. Hence, this technique can enhance the ability of PCA to analyze the complex samples of Orthosiphon stamineus. Linear Discriminant Analysis (LDA) was then used to further validate and classify the samples. It was found that the LDA performance was also improved when the responses from the e-nose and e-tongue were fused together. PMID:22163381
Zakaria, Ammar; Shakaff, Ali Yeon Md; Adom, Abdul Hamid; Ahmad, Mohd Noor; Masnan, Maz Jamilah; Aziz, Abdul Hallis Abdul; Fikri, Nazifah Ahmad; Abdullah, Abu Hassan; Kamarudin, Latifah Munirah
2010-01-01
An improved classification of Orthosiphon stamineus using a data fusion technique is presented. Five different commercial sources along with freshly prepared samples were discriminated using an electronic nose (e-nose) and an electronic tongue (e-tongue). Samples from the different commercial brands were evaluated by the e-tongue and then followed by the e-nose. Applying Principal Component Analysis (PCA) separately on the respective e-tongue and e-nose data, only five distinct groups were projected. However, by employing a low level data fusion technique, six distinct groupings were achieved. Hence, this technique can enhance the ability of PCA to analyze the complex samples of Orthosiphon stamineus. Linear Discriminant Analysis (LDA) was then used to further validate and classify the samples. It was found that the LDA performance was also improved when the responses from the e-nose and e-tongue were fused together.
Hamit, Murat; Yun, Weikang; Yan, Chuanbo; Kutluk, Abdugheni; Fang, Yang; Alip, Elzat
2015-06-01
Image feature extraction is an important part of image processing and it is an important field of research and application of image processing technology. Uygur medicine is one of Chinese traditional medicine and researchers pay more attention to it. But large amounts of Uygur medicine data have not been fully utilized. In this study, we extracted the image color histogram feature of herbal and zooid medicine of Xinjiang Uygur. First, we did preprocessing, including image color enhancement, size normalizition and color space transformation. Then we extracted color histogram feature and analyzed them with statistical method. And finally, we evaluated the classification ability of features by Bayes discriminant analysis. Experimental results showed that high accuracy for Uygur medicine image classification was obtained by using color histogram feature. This study would have a certain help for the content-based medical image retrieval for Xinjiang Uygur medicine.
Jochumsen, Mads; Rovsing, Cecilie; Rovsing, Helene; Niazi, Imran Khan; Dremstrup, Kim; Kamavuako, Ernest Nlandu
2017-01-01
Detection of single-trial movement intentions from EEG is paramount for brain-computer interfacing in neurorehabilitation. These movement intentions contain task-related information and if this is decoded, the neurorehabilitation could potentially be optimized. The aim of this study was to classify single-trial movement intentions associated with two levels of force and speed and three different grasp types using EEG rhythms and components of the movement-related cortical potential (MRCP) as features. The feature importance was used to estimate encoding of discriminative information. Two data sets were used. 29 healthy subjects executed and imagined different hand movements, while EEG was recorded over the contralateral sensorimotor cortex. The following features were extracted: delta, theta, mu/alpha, beta, and gamma rhythms, readiness potential, negative slope, and motor potential of the MRCP. Sequential forward selection was performed, and classification was performed using linear discriminant analysis and support vector machines. Limited classification accuracies were obtained from the EEG rhythms and MRCP-components: 0.48 ± 0.05 (grasp types), 0.41 ± 0.07 (kinetic profiles, motor execution), and 0.39 ± 0.08 (kinetic profiles, motor imagination). Delta activity contributed the most but all features provided discriminative information. These findings suggest that information from the entire EEG spectrum is needed to discriminate between task-related parameters from single-trial movement intentions.
de Rijke, E; Schoorl, J C; Cerli, C; Vonhof, H B; Verdegaal, S J A; Vivó-Truyols, G; Lopatka, M; Dekter, R; Bakker, D; Sjerps, M J; Ebskamp, M; de Koster, C G
2016-08-01
Two approaches were investigated to discriminate between bell peppers of different geographic origins. Firstly, δ(18)O fruit water and corresponding source water were analyzed and correlated to the regional GNIP (Global Network of Isotopes in Precipitation) values. The water and GNIP data showed good correlation with the pepper data, with constant isotope fractionation of about -4. Secondly, compound-specific stable hydrogen isotope data was used for classification. Using n-alkane fingerprinting data, both linear discriminant analysis (LDA) and a likelihood-based classification, using the kernel-density smoothed data, were developed to discriminate between peppers from different origins. Both methods were evaluated using the δ(2)H values and n-alkanes relative composition as variables. Misclassification rates were calculated using a Monte-Carlo 5-fold cross-validation procedure. Comparable overall classification performance was achieved, however, the two methods showed sensitivity to different samples. The combined values of δ(2)H IRMS, and complimentary information regarding the relative abundance of four main alkanes in bell pepper fruit water, has proven effective for geographic origin discrimination. Evaluation of the rarity of observing particular ranges for these characteristics could be used to make quantitative assertions regarding geographic origin of bell peppers and, therefore, have a role in verifying compliance with labeling of geographical origin. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Ballew, G.
1977-01-01
The ability of Landsat multispectral digital data to differentiate among 62 combinations of rock and alteration types at the Goldfield mining district of Western Nevada was investigated by using statistical techniques of cluster and discriminant analysis. Multivariate discriminant analysis was not effective in classifying each of the 62 groups, with classification results essentially the same whether data of four channels alone or combined with six ratios of channels were used. Bivariate plots of group means revealed a cluster of three groups including mill tailings, basalt and all other rock and alteration types. Automatic hierarchical clustering based on the fourth dimensional Mahalanobis distance between group means of 30 groups having five or more samples was performed. The results of the cluster analysis revealed hierarchies of mill tailings vs. natural materials, basalt vs. non-basalt, highly reflectant rocks vs. other rocks and exclusively unaltered rocks vs. predominantly altered rocks. The hierarchies were used to determine the order in which sets of multiple discriminant analyses were to be performed and the resulting discriminant functions were used to produce a map of geology and alteration which has an overall accuracy of 70 percent for discriminating exclusively altered rocks from predominantly altered rocks.
NASA Astrophysics Data System (ADS)
Rohaeti, Eti; Rafi, Mohamad; Syafitri, Utami Dyah; Heryanto, Rudi
2015-02-01
Turmeric (Curcuma longa), java turmeric (Curcuma xanthorrhiza) and cassumunar ginger (Zingiber cassumunar) are widely used in traditional Indonesian medicines (jamu). They have similar color for their rhizome and possess some similar uses, so it is possible to substitute one for the other. The identification and discrimination of these closely-related plants is a crucial task to ensure the quality of the raw materials. Therefore, an analytical method which is rapid, simple and accurate for discriminating these species using Fourier transform infrared spectroscopy (FTIR) combined with some chemometrics methods was developed. FTIR spectra were acquired in the mid-IR region (4000-400 cm-1). Standard normal variate, first and second order derivative spectra were compared for the spectral data. Principal component analysis (PCA) and canonical variate analysis (CVA) were used for the classification of the three species. Samples could be discriminated by visual analysis of the FTIR spectra by using their marker bands. Discrimination of the three species was also possible through the combination of the pre-processed FTIR spectra with PCA and CVA, in which CVA gave clearer discrimination. Subsequently, the developed method could be used for the identification and discrimination of the three closely-related plant species.
A Discriminative Approach to EEG Seizure Detection
Johnson, Ashley N.; Sow, Daby; Biem, Alain
2011-01-01
Seizures are abnormal sudden discharges in the brain with signatures represented in electroencephalograms (EEG). The efficacy of the application of speech processing techniques to discriminate between seizure and non-seizure states in EEGs is reported. The approach accounts for the challenges of unbalanced datasets (seizure and non-seizure), while also showing a system capable of real-time seizure detection. The Minimum Classification Error (MCE) algorithm, which is a discriminative learning algorithm with wide-use in speech processing, is applied and compared with conventional classification techniques that have already been applied to the discrimination between seizure and non-seizure states in the literature. The system is evaluated on 22 pediatric patients multi-channel EEG recordings. Experimental results show that the application of speech processing techniques and MCE compare favorably with conventional classification techniques in terms of classification performance, while requiring less computational overhead. The results strongly suggests the possibility of deploying the designed system at the bedside. PMID:22195192
Comparative decision models for anticipating shortage of food grain production in India
NASA Astrophysics Data System (ADS)
Chattopadhyay, Manojit; Mitra, Subrata Kumar
2018-01-01
This paper attempts to predict food shortages in advance from the analysis of rainfall during the monsoon months along with other inputs used for crop production, such as land used for cereal production, percentage of area covered under irrigation and fertiliser use. We used six binary classification data mining models viz., logistic regression, Multilayer Perceptron, kernel lab-Support Vector Machines, linear discriminant analysis, quadratic discriminant analysis and k-Nearest Neighbors Network, and found that linear discriminant analysis and kernel lab-Support Vector Machines are equally suitable for predicting per capita food shortage with 89.69 % accuracy in overall prediction and 92.06 % accuracy in predicting food shortage ( true negative rate). Advance information of food shortage can help policy makers to take remedial measures in order to prevent devastating consequences arising out of food non-availability.
Discrimination of rectal cancer through human serum using surface-enhanced Raman spectroscopy
NASA Astrophysics Data System (ADS)
Li, Xiaozhou; Yang, Tianyue; Li, Siqi; Zhang, Su; Jin, Lili
2015-05-01
In this paper, surface-enhanced Raman spectroscopy (SERS) was used to detect the changes in blood serum components that accompany rectal cancer. The differences in serum SERS data between rectal cancer patients and healthy controls were examined. Postoperative rectal cancer patients also participated in the comparison to monitor the effects of cancer treatments. The results show that there are significant variations at certain wavenumbers which indicates alteration of corresponding biological substances. Principal component analysis (PCA) and parameters of intensity ratios were used on the original SERS spectra for the extraction of featured variables. These featured variables then underwent linear discriminant analysis (LDA) and classification and regression tree (CART) for the discrimination analysis. Accuracies of 93.5 and 92.4 % were obtained for PCA-LDA and parameter-CART, respectively.
EEG-based classification of imaginary left and right foot movements using beta rebound.
Hashimoto, Yasunari; Ushiba, Junichi
2013-11-01
The purpose of this study was to investigate cortical lateralization of event-related (de)synchronization during left and right foot motor imagery tasks and to determine classification accuracy of the two imaginary movements in a brain-computer interface (BCI) paradigm. We recorded 31-channel scalp electroencephalograms (EEGs) from nine healthy subjects during brisk imagery tasks of left and right foot movements. EEG was analyzed with time-frequency maps and topographies, and the accuracy rate of classification between left and right foot movements was calculated. Beta rebound at the end of imagination (increase of EEG beta rhythm amplitude) was identified from the two EEGs derived from the right-shift and left-shift bipolar pairs at the vertex. This process enabled discrimination between right or left foot imagery at a high accuracy rate (maximum 81.6% in single trial analysis). These data suggest that foot motor imagery has potential to elicit left-right differences in EEG, while BCI using the unilateral foot imagery can achieve high classification accuracy, similar to ordinary BCI, based on hand motor imagery. By combining conventional discrimination techniques, the left-right discrimination of unilateral foot motor imagery provides a novel BCI system that could control a foot neuroprosthesis or a robotic foot. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
Assessment of pedophilia using hemodynamic brain response to sexual stimuli.
Ponseti, Jorge; Granert, Oliver; Jansen, Olav; Wolff, Stephan; Beier, Klaus; Neutze, Janina; Deuschl, Günther; Mehdorn, Hubertus; Siebner, Hartwig; Bosinski, Hartmut
2012-02-01
Accurately assessing sexual preference is important in the treatment of child sex offenders. Phallometry is the standard method to identify sexual preference; however, this measure has been criticized for its intrusiveness and limited reliability. To evaluate whether spatial response pattern to sexual stimuli as revealed by a change in the blood oxygen level-dependent signal facilitates the identification of pedophiles. During functional magnetic resonance imaging, pedophilic and nonpedophilic participants were briefly exposed to same- and opposite-sex images of nude children and adults. We calculated differences in blood oxygen level-dependent signals to child and adult sexual stimuli for each participant. The corresponding contrast images were entered into a group analysis to calculate whole-brain difference maps between groups. We calculated an expression value that corresponded to the group result for each participant. These expression values were submitted to 2 different classification algorithms: Fisher linear discriminant analysis and κ -nearest neighbor analysis. This classification procedure was cross-validated using the leave-one-out method. Section of Sexual Medicine, Medical School, Christian Albrechts University of Kiel, Kiel, Germany. We recruited 24 participants with pedophilia who were sexually attracted to either prepubescent girls (n = 11) or prepubescent boys (n = 13) and 32 healthy male controls who were sexually attracted to either adult women (n = 18) or adult men (n = 14). Sensitivity and specificity scores of the 2 classification algorithms. The highest classification accuracy was achieved by Fisher linear discriminant analysis, which showed a mean accuracy of 95% (100% specificity, 88% sensitivity). Functional brain response patterns to sexual stimuli contain sufficient information to identify pedophiles with high accuracy. The automatic classification of these patterns is a promising objective tool to clinically diagnose pedophilia.
Analysis of digitized cervical images to detect cervical neoplasia
NASA Astrophysics Data System (ADS)
Ferris, Daron G.
2004-05-01
Cervical cancer is the second most common malignancy in women worldwide. If diagnosed in the premalignant stage, cure is invariably assured. Although the Papanicolaou (Pap) smear has significantly reduced the incidence of cervical cancer where implemented, the test is only moderately sensitive, highly subjective and skilled-labor intensive. Newer optical screening tests (cervicography, direct visual inspection and speculoscopy), including fluorescent and reflective spectroscopy, are fraught with certain weaknesses. Yet, the integration of optical probes for the detection and discrimination of cervical neoplasia with automated image analysis methods may provide an effective screening tool for early detection of cervical cancer, particularly in resource poor nations. Investigative studies are needed to validate the potential for automated classification and recognition algorithms. By applying image analysis techniques for registration, segmentation, pattern recognition, and classification, cervical neoplasia may be reliably discriminated from normal epithelium. The National Cancer Institute (NCI), in cooperation with the National Library of Medicine (NLM), has embarked on a program to begin this and other similar investigative studies.
Dyer, Betsey D.; Kahn, Michael J.; LeBlanc, Mark D.
2008-01-01
Classification and regression tree (CART) analysis was applied to genome-wide tetranucleotide frequencies (genomic signatures) of 195 archaea and bacteria. Although genomic signatures have typically been used to classify evolutionary divergence, in this study, convergent evolution was the focus. Temperature optima for most of the organisms examined could be distinguished by CART analyses of tetranucleotide frequencies. This suggests that pervasive (nonlinear) qualities of genomes may reflect certain environmental conditions (such as temperature) in which those genomes evolved. The predominant use of GAGA and AGGA as the discriminating tetramers in CART models suggests that purine-loading and codon biases of thermophiles may explain some of the results. PMID:19054742
Discriminant function analysis as tool for subsurface geologist
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chesser, K.
1987-05-01
Sedimentary structures such as cross-bedding control porosity, permeability, and other petrophysical properties in sandstone reservoirs. Understanding the distribution of such structures in the subsurface not only aids in the prediction of reservoir properties but also provides information about depositional environments. Discriminant function analysis (DFA) is a simple yet powerful method incorporating petrophysical data from wireline logs, core analyses, or other sources into groups that have been previously defined through direct observation of sedimentary structures in cores. Once data have been classified into meaningful groups, the geologist can predict the distribution of specific sedimentary structures or important reservoir properties in areasmore » where cores are unavailable. DFA is efficient. Given several variables, DFA will choose the best combination to discriminate among groups. The initial classification function can be computed from relatively few observations, and additional data may be included as necessary. Furthermore, DFA provides quantitative goodness-of-fit estimates for each observation. Such estimates can be used as mapping parameters or to assess risk in petroleum ventures. Petrophysical data from the Skinner sandstone of Strauss field in southeastern Kansas tested the ability of DFA to discriminate between cross-bedded and ripple-bedded sandstones. Petroleum production in Strauss field is largely restricted to the more permeable cross-bedded sandstones. DFA based on permeability correctly placed 80% of samples into cross-bedded or ripple-bedded groups. Addition of formation factor to the discriminant function increased correct classifications to 83% - a small but statistically significant gain.« less
Borràs, Eva; Ferré, Joan; Boqué, Ricard; Mestres, Montserrat; Aceña, Laura; Calvo, Angels; Busto, Olga
2016-07-15
Three instrumental techniques, headspace-mass spectrometry (HS-MS), mid-infrared spectroscopy (MIR) and UV-visible spectrophotometry (UV-vis), have been combined to classify virgin olive oil samples based on the presence or absence of sensory defects. The reference sensory values were provided by an official taste panel. Different data fusion strategies were studied to improve the discrimination capability compared to using each instrumental technique individually. A general model was applied to discriminate high-quality non-defective olive oils (extra-virgin) and the lowest-quality olive oils considered non-edible (lampante). A specific identification of key off-flavours, such as musty, winey, fusty and rancid, was also studied. The data fusion of the three techniques improved the classification results in most of the cases. Low-level data fusion was the best strategy to discriminate musty, winey and fusty defects, using HS-MS, MIR and UV-vis, and the rancid defect using only HS-MS and MIR. The mid-level data fusion approach using partial least squares-discriminant analysis (PLS-DA) scores was found to be the best strategy for defective vs non-defective and edible vs non-edible oil discrimination. However, the data fusion did not sufficiently improve the results obtained by a single technique (HS-MS) to classify non-defective classes. These results indicate that instrumental data fusion can be useful for the identification of sensory defects in virgin olive oils. Copyright © 2016 Elsevier Ltd. All rights reserved.
Speaker gender identification based on majority vote classifiers
NASA Astrophysics Data System (ADS)
Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri
2017-03-01
Speaker gender identification is considered among the most important tools in several multimedia applications namely in automatic speech recognition, interactive voice response systems and audio browsing systems. Gender identification systems performance is closely linked to the selected feature set and the employed classification model. Typical techniques are based on selecting the best performing classification method or searching optimum tuning of one classifier parameters through experimentation. In this paper, we consider a relevant and rich set of features involving pitch, MFCCs as well as other temporal and frequency-domain descriptors. Five classification models including decision tree, discriminant analysis, nave Bayes, support vector machine and k-nearest neighbor was experimented. The three best perming classifiers among the five ones will contribute by majority voting between their scores. Experimentations were performed on three different datasets spoken in three languages: English, German and Arabic in order to validate language independency of the proposed scheme. Results confirm that the presented system has reached a satisfying accuracy rate and promising classification performance thanks to the discriminating abilities and diversity of the used features combined with mid-level statistics.
NASA Astrophysics Data System (ADS)
Iyatomi, Hitoshi; Hashimoto, Jun; Yoshii, Fumuhito; Kazama, Toshiki; Kawada, Shuichi; Imai, Yutaka
2014-03-01
Discrimination between Alzheimer's disease and other dementia is clinically significant, however it is often difficult. In this study, we developed classification models among Alzheimer's disease (AD), other dementia (OD) and/or normal subjects (NC) using patient factors and indices obtained by brain perfusion SPECT. SPECT is commonly used to assess cerebral blood flow (CBF) and allows the evaluation of the severity of hypoperfusion by introducing statistical parametric mapping (SPM). We investigated a total of 150 cases (50 cases each for AD, OD, and NC) from Tokai University Hospital, Japan. In each case, we obtained a total of 127 candidate parameters from: (A) 2 patient factors (age and sex), (B) 12 CBF parameters and 113 SPM parameters including (C) 3 from specific volume analysis (SVA), and (D) 110 from voxel-based analysis stereotactic extraction estimation (vbSEE). We built linear classifiers with a statistical stepwise feature selection and evaluated the performance with the leave-one-out cross validation strategy. Our classifiers achieved very high classification performances with reasonable number of selected parameters. In the most significant discrimination in clinical, namely those of AD from OD, our classifier achieved both sensitivity (SE) and specificity (SP) of 96%. In a similar way, our classifiers achieved a SE of 90% and a SP of 98% in AD from NC, as well as a SE of 88% and a SP of 86% in AD from OD and NC cases. Introducing SPM indices such as SVA and vbSEE, classification performances improved around 7-15%. We confirmed that these SPM factors are quite important for diagnosing Alzheimer's disease.
NASA Astrophysics Data System (ADS)
Aleardi, Mattia; Ciabarri, Fabio
2017-10-01
In this work we test four classification methods for litho-fluid facies identification in a clastic reservoir located in the offshore Nile Delta. The ultimate goal of this study is to find an optimal classification method for the area under examination. The geologic context of the investigated area allows us to consider three different facies in the classification: shales, brine sands and gas sands. The depth at which the reservoir zone is located (2300-2700 m) produces a significant overlap of the P- and S-wave impedances of brine sands and gas sands that makes discrimination between these two litho-fluid classes particularly problematic. The classification is performed on the feature space defined by the elastic properties that are derived from recorded reflection seismic data by means of amplitude versus angle Bayesian inversion. As classification methods we test both deterministic and probabilistic approaches: the quadratic discriminant analysis and the neural network methods belong to the first group, whereas the standard Bayesian approach and the Bayesian approach that includes a 1D Markov chain a priori model to constrain the vertical continuity of litho-fluid facies belong to the second group. The ability of each method to discriminate the different facies is evaluated both on synthetic seismic data (computed on the basis of available borehole information) and on field seismic data. The outcomes of each classification method are compared with the known facies profile derived from well log data and the goodness of the results is quantitatively evaluated using the so-called confusion matrix. The results show that all methods return vertical facies profiles in which the main reservoir zone is correctly identified. However, the consideration of as much prior information as possible in the classification process is the winning choice for deriving a reliable and physically plausible predicted facies profile.
Fournet, Michelle E; Szabo, Andy; Mellinger, David K
2015-01-01
On low-latitude breeding grounds, humpback whales produce complex and highly stereotyped songs as well as a range of non-song sounds associated with breeding behaviors. While on their Southeast Alaskan foraging grounds, humpback whales produce a range of previously unclassified non-song vocalizations. This study investigates the vocal repertoire of Southeast Alaskan humpback whales from a sample of 299 non-song vocalizations collected over a 3-month period on foraging grounds in Frederick Sound, Southeast Alaska. Three classification systems were used, including aural spectrogram analysis, statistical cluster analysis, and discriminant function analysis, to describe and classify vocalizations. A hierarchical acoustic structure was identified; vocalizations were classified into 16 individual call types nested within four vocal classes. The combined classification method shows promise for identifying variability in call stereotypy between vocal groupings and is recommended for future classification of broad vocal repertoires.
Custers, Deborah; Krakowska, Barbara; De Beer, Jacques O; Courselle, Patricia; Daszykowski, Michal; Apers, Sandra; Deconinck, Eric
2016-02-01
Counterfeit medicines are a global threat to public health. High amounts enter the European market, which is why characterization of these products is a very important issue. In this study, a high-performance liquid chromatography-photodiode array (HPLC-PDA) and high-performance liquid chromatography-mass spectrometry (HPLC-MS) method were developed for the analysis of genuine Viagra®, generic products of Viagra®, and counterfeit samples in order to obtain different types of fingerprints. These data were included in the chemometric data analysis, aiming to test whether PDA and MS are complementary detection techniques. The MS data comprise both MS1 and MS2 fingerprints; the PDA data consist of fingerprints measured at three different wavelengths, i.e., 254, 270, and 290 nm, and all possible combinations of these wavelengths. First, it was verified if both groups of fingerprints can discriminate between genuine, generic, and counterfeit medicines separately; next, it was studied if the obtained results could be ameliorated by combining both fingerprint types. This data analysis showed that MS1 does not provide suitable classification models since several genuines and generics are classified as counterfeits and vice versa. However, when analyzing the MS1_MS2 data in combination with partial least squares-discriminant analysis (PLS-DA), a perfect discrimination was obtained. When only using data measured at 254 nm, good classification models can be obtained by k nearest neighbors (kNN) and soft independent modelling of class analogy (SIMCA), which might be interesting for the characterization of counterfeit drugs in developing countries. However, in general, the combination of PDA and MS data (254 nm_MS1) is preferred due to less classification errors between the genuines/generics and counterfeits compared to PDA and MS data separately.
NASA Astrophysics Data System (ADS)
Garcia-Allende, P. Beatriz; Amygdalos, Iakovos; Dhanapala, Hiruni; Goldin, Robert D.; Hanna, George B.; Elson, Daniel S.
2012-01-01
Computer-aided diagnosis of ophthalmic diseases using optical coherence tomography (OCT) relies on the extraction of thickness and size measures from the OCT images, but such defined layers are usually not observed in emerging OCT applications aimed at "optical biopsy" such as pulmonology or gastroenterology. Mathematical methods such as Principal Component Analysis (PCA) or textural analyses including both spatial textural analysis derived from the two-dimensional discrete Fourier transform (DFT) and statistical texture analysis obtained independently from center-symmetric auto-correlation (CSAC) and spatial grey-level dependency matrices (SGLDM), as well as, quantitative measurements of the attenuation coefficient have been previously proposed to overcome this problem. We recently proposed an alternative approach consisting of a region segmentation according to the intensity variation along the vertical axis and a pure statistical technology for feature quantification. OCT images were first segmented in the axial direction in an automated manner according to intensity. Afterwards, a morphological analysis of the segmented OCT images was employed for quantifying the features that served for tissue classification. In this study, a PCA processing of the extracted features is accomplished to combine their discriminative power in a lower number of dimensions. Ready discrimination of gastrointestinal surgical specimens is attained demonstrating that the approach further surpasses the algorithms previously reported and is feasible for tissue classification in the clinical setting.
Classification of Stellar Spectra with Fuzzy Minimum Within-Class Support Vector Machine
NASA Astrophysics Data System (ADS)
Zhong-bao, Liu; Wen-ai, Song; Jing, Zhang; Wen-juan, Zhao
2017-06-01
Classification is one of the important tasks in astronomy, especially in spectra analysis. Support Vector Machine (SVM) is a typical classification method, which is widely used in spectra classification. Although it performs well in practice, its classification accuracies can not be greatly improved because of two limitations. One is it does not take the distribution of the classes into consideration. The other is it is sensitive to noise. In order to solve the above problems, inspired by the maximization of the Fisher's Discriminant Analysis (FDA) and the SVM separability constraints, fuzzy minimum within-class support vector machine (FMWSVM) is proposed in this paper. In FMWSVM, the distribution of the classes is reflected by the within-class scatter in FDA and the fuzzy membership function is introduced to decrease the influence of the noise. The comparative experiments with SVM on the SDSS datasets verify the effectiveness of the proposed classifier FMWSVM.
Yu, Guan; Liu, Yufeng; Thung, Kim-Han; Shen, Dinggang
2014-01-01
Accurately identifying mild cognitive impairment (MCI) individuals who will progress to Alzheimer's disease (AD) is very important for making early interventions. Many classification methods focus on integrating multiple imaging modalities such as magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG-PET). However, the main challenge for MCI classification using multiple imaging modalities is the existence of a lot of missing data in many subjects. For example, in the Alzheimer's Disease Neuroimaging Initiative (ADNI) study, almost half of the subjects do not have PET images. In this paper, we propose a new and flexible binary classification method, namely Multi-task Linear Programming Discriminant (MLPD) analysis, for the incomplete multi-source feature learning. Specifically, we decompose the classification problem into different classification tasks, i.e., one for each combination of available data sources. To solve all different classification tasks jointly, our proposed MLPD method links them together by constraining them to achieve the similar estimated mean difference between the two classes (under classification) for those shared features. Compared with the state-of-the-art incomplete Multi-Source Feature (iMSF) learning method, instead of constraining different classification tasks to choose a common feature subset for those shared features, MLPD can flexibly and adaptively choose different feature subsets for different classification tasks. Furthermore, our proposed MLPD method can be efficiently implemented by linear programming. To validate our MLPD method, we perform experiments on the ADNI baseline dataset with the incomplete MRI and PET images from 167 progressive MCI (pMCI) subjects and 226 stable MCI (sMCI) subjects. We further compared our method with the iMSF method (using incomplete MRI and PET images) and also the single-task classification method (using only MRI or only subjects with both MRI and PET images). Experimental results show very promising performance of our proposed MLPD method.
Yu, Guan; Liu, Yufeng; Thung, Kim-Han; Shen, Dinggang
2014-01-01
Accurately identifying mild cognitive impairment (MCI) individuals who will progress to Alzheimer's disease (AD) is very important for making early interventions. Many classification methods focus on integrating multiple imaging modalities such as magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG-PET). However, the main challenge for MCI classification using multiple imaging modalities is the existence of a lot of missing data in many subjects. For example, in the Alzheimer's Disease Neuroimaging Initiative (ADNI) study, almost half of the subjects do not have PET images. In this paper, we propose a new and flexible binary classification method, namely Multi-task Linear Programming Discriminant (MLPD) analysis, for the incomplete multi-source feature learning. Specifically, we decompose the classification problem into different classification tasks, i.e., one for each combination of available data sources. To solve all different classification tasks jointly, our proposed MLPD method links them together by constraining them to achieve the similar estimated mean difference between the two classes (under classification) for those shared features. Compared with the state-of-the-art incomplete Multi-Source Feature (iMSF) learning method, instead of constraining different classification tasks to choose a common feature subset for those shared features, MLPD can flexibly and adaptively choose different feature subsets for different classification tasks. Furthermore, our proposed MLPD method can be efficiently implemented by linear programming. To validate our MLPD method, we perform experiments on the ADNI baseline dataset with the incomplete MRI and PET images from 167 progressive MCI (pMCI) subjects and 226 stable MCI (sMCI) subjects. We further compared our method with the iMSF method (using incomplete MRI and PET images) and also the single-task classification method (using only MRI or only subjects with both MRI and PET images). Experimental results show very promising performance of our proposed MLPD method. PMID:24820966
Simões, Rita; van Cappellen van Walsum, Anne-Marie; Slump, Cornelis H
2014-09-01
Classification methods have been proposed to detect Alzheimer’s disease (AD) using magnetic resonance images. Most rely on features such as the shape/volume of brain structures that need to be defined a priori. In this work, we propose a method that does not require either the segmentation of specific brain regions or the nonlinear alignment to a template. Besides classification, we also analyze which brain regions are discriminative between a group of normal controls and a group of AD patients. We perform 3D texture analysis using Local Binary Patterns computed at local image patches in the whole brain, combined in a classifier ensemble.We evaluate our method in a publicly available database including very mild-to-mild AD subjects and healthy elderly controls. For the subject cohort including only mild AD subjects, the best results are obtained using a combination of large (30×30×30 and 40×40×40 voxels) patches. A spatial analysis on the best performing patches shows that these are located in the medial-temporal lobe and in the periventricular regions. When very mild AD subjects are included in the dataset, the small (10×10×10 voxels) patches perform best, with the most discriminative ones being located near the left hippocampus. We show that our method is able not only to perform accurate classification, but also to localize dis-criminative brain regions, which are in accordance with the medical literature. This is achieved without the need to segment-specific brain structures and without performing nonlinear registration to a template, indicating that the method may be suitable for a clinical implementation that can help to diagnose AD at an earlier stage.
The ITE Land classification: Providing an environmental stratification of Great Britain.
Bunce, R G; Barr, C J; Gillespie, M K; Howard, D C
1996-01-01
The surface of Great Britain (GB) varies continuously in land cover from one area to another. The objective of any environmentally based land classification is to produce classes that match the patterns that are present by helping to define clear boundaries. The more appropriate the analysis and data used, the better the classes will fit the natural patterns. The observation of inter-correlations between ecological factors is the basis for interpreting ecological patterns in the field, and the Institute of Terrestrial Ecology (ITE) Land Classification formalises such subjective ideas. The data inevitably comprise a large number of factors in order to describe the environment adequately. Single factors, such as altitude, would only be useful on a national basis if they were the only dominant causative agent of ecological variation.The ITE Land Classification has defined 32 environmental categories called 'land classes', initially based on a sample of 1-km squares in Great Britain but subsequently extended to all 240 000 1-km squares. The original classification was produced using multivariate analysis of 75 environmental variables. The extension to all squares in GB was performed using a combination of logistic discrimination and discriminant functions. The classes have provided a stratification for successive ecological surveys, the results of which have characterised the classes in terms of botanical, zoological and landscape features.The classification has also been applied to integrate diverse datasets including satellite imagery, soils and socio-economic information. A variety of models have used the structure of the classification, for example to show potential land use change under different economic conditions. The principal data sets relevant for planning purposes have been incorporated into a user-friendly computer package, called the 'Countryside Information System'.
Three-dimensional passive sensing photon counting for object classification
NASA Astrophysics Data System (ADS)
Yeom, Seokwon; Javidi, Bahram; Watson, Edward
2007-04-01
In this keynote address, we address three-dimensional (3D) distortion-tolerant object recognition using photon-counting integral imaging (II). A photon-counting linear discriminant analysis (LDA) is discussed for classification of photon-limited images. We develop a compact distortion-tolerant recognition system based on the multiple-perspective imaging of II. Experimental and simulation results have shown that a low level of photons is sufficient to classify out-of-plane rotated objects.
Classification of Complex Nonspeech Sounds. Panel on Classification of Complex Nonspeech Sounds
1989-04-14
learning of the discrimination task. Since reports on many of these studies have not yet been published, brief summaries of the studies are included below...tonal signal with a noise- producing auditory induction and introduced an intensity ramp that increased the intensity of the tone just before the onset... recorded hand clap signals . The physical properties of the hand claps can be altered (along the lines suggested by the multidimensional analysis
[Research on Rapid Discrimination of Edible Oil by ATR Infrared Spectroscopy].
Ma, Xiao; Yuan, Hong-fu; Song, Chun-feng; Hu, Ai-qin; Li, Xiao-yu; Zhao, Zhong; Li, Xiu-qin; Guo Zhen; Zhu, Zhi-qiang
2015-07-01
A rapid discrimination method of edible oils, KL-BP model, was proposed by attenuated total reflectance infrared spectroscopy. The model extracts the characteristic of classification from source data by KL and reduces data dimension at the same time. Then the neural network model is constructed by the new data which as the input of the model. 84 edible oil samples which include sesame oil, corn oil, canola oil, blend oil, sunflower oil, peanut oil, olive oil, soybean oil and tea seed oil, were collected and their infrared spectra determined using an ATR FT-IR spectrometer. In order to compare the method performance, principal component analysis (PCA) direct-classification model, KL direct-classification model, PLS-DA model, PCA-BP model and KL-BP model are constructed in this paper. The results show that the recognition rates of PCA, PCA-BP, KL, PLS-DA and KL-BP are 59.1%, 68.2%, 77.3%, 77.3% and 90.9% for discriminating the 9 kinds of edible oils, respectively. KL extracts the eigenvector which make the distance between different class and distance of every class ratio is the largest. So the method can get much more classify information than PCA. BP neural network can effectively enhance the classification ability and accuracy. Taking full of the advantages of KL in extracting more category information in dimension reducing and the features of BP neural network in self-learning, adaptive, nonlinear, the KL-BP method has the best classification ability and recognition accuracy and great importance for rapidly recognizing edible oil in practice.
Tree species classification in subtropical forests using small-footprint full-waveform LiDAR data
NASA Astrophysics Data System (ADS)
Cao, Lin; Coops, Nicholas C.; Innes, John L.; Dai, Jinsong; Ruan, Honghua; She, Guanghui
2016-07-01
The accurate classification of tree species is critical for the management of forest ecosystems, particularly subtropical forests, which are highly diverse and complex ecosystems. While airborne Light Detection and Ranging (LiDAR) technology offers significant potential to estimate forest structural attributes, the capacity of this new tool to classify species is less well known. In this research, full-waveform metrics were extracted by a voxel-based composite waveform approach and examined with a Random Forests classifier to discriminate six subtropical tree species (i.e., Masson pine (Pinus massoniana Lamb.)), Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.), Slash pines (Pinus elliottii Engelm.), Sawtooth oak (Quercus acutissima Carruth.) and Chinese holly (Ilex chinensis Sims.) at three levels of discrimination. As part of the analysis, the optimal voxel size for modelling the composite waveforms was investigated, the most important predictor metrics for species classification assessed and the effect of scan angle on species discrimination examined. Results demonstrate that all tree species were classified with relatively high accuracy (68.6% for six classes, 75.8% for four main species and 86.2% for conifers and broadleaved trees). Full-waveform metrics (based on height of median energy, waveform distance and number of waveform peaks) demonstrated high classification importance and were stable among various voxel sizes. The results also suggest that the voxel based approach can alleviate some of the issues associated with large scan angles. In summary, the results indicate that full-waveform LIDAR data have significant potential for tree species classification in the subtropical forests.
[Electroencephalogram Feature Selection Based on Correlation Coefficient Analysis].
Zhou, Jinzhi; Tang, Xiaofang
2015-08-01
In order to improve the accuracy of classification with small amount of motor imagery training data on the development of brain-computer interface (BCD systems, we proposed an analyzing method to automatically select the characteristic parameters based on correlation coefficient analysis. Throughout the five sample data of dataset IV a from 2005 BCI Competition, we utilized short-time Fourier transform (STFT) and correlation coefficient calculation to reduce the number of primitive electroencephalogram dimension, then introduced feature extraction based on common spatial pattern (CSP) and classified by linear discriminant analysis (LDA). Simulation results showed that the average rate of classification accuracy could be improved by using correlation coefficient feature selection method than those without using this algorithm. Comparing with support vector machine (SVM) optimization features algorithm, the correlation coefficient analysis can lead better selection parameters to improve the accuracy of classification.
Slabbinck, Bram; Waegeman, Willem; Dawyndt, Peter; De Vos, Paul; De Baets, Bernard
2010-01-30
Machine learning techniques have shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure allows to easily evaluate and visualize the resolution of FAME data for the discrimination of bacterial species. Summarized, by phylogenetic learning we are able to situate and evaluate FAME-based bacterial species classification in a more informative context.
2010-01-01
Background Machine learning techniques have shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. Results In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. Conclusions FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure allows to easily evaluate and visualize the resolution of FAME data for the discrimination of bacterial species. Summarized, by phylogenetic learning we are able to situate and evaluate FAME-based bacterial species classification in a more informative context. PMID:20113515
Raouafi, Sana; Achiche, Sofiane; Begon, Mickael; Sarcher, Aurélie; Raison, Maxime
2018-01-01
Treatment for cerebral palsy depends upon the severity of the child's condition and requires knowledge about upper limb disability. The aim of this study was to develop a systematic quantitative classification method of the upper limb disability levels for children with spastic unilateral cerebral palsy based on upper limb movements and muscle activation. Thirteen children with spastic unilateral cerebral palsy and six typically developing children participated in this study. Patients were matched on age and manual ability classification system levels I to III. Twenty-three kinematic and electromyographic variables were collected from two tasks. Discriminative analysis and K-means clustering algorithm were applied using 23 kinematic and EMG variables of each participant. Among the 23 kinematic and electromyographic variables, only two variables containing the most relevant information for the prediction of the four levels of severity of spastic unilateral cerebral palsy, which are fixed by manual ability classification system, were identified by discriminant analysis: (1) the Falconer index (CAI E ) which represents the ratio of biceps to triceps brachii activity during extension and (2) the maximal angle extension (θ Extension,max ). A good correlation (Kendall Rank correlation coefficient = -0.53, p = 0.01) was found between levels fixed by manual ability classification system and the obtained classes. These findings suggest that the cost and effort needed to assess and characterize the disability level of a child can be further reduced.
Ivanov, Iliya V; Leitritz, Martin A; Norrenberg, Lars A; Völker, Michael; Dynowski, Marek; Ueffing, Marius; Dietter, Johannes
2016-02-01
Abnormalities of blood vessel anatomy, morphology, and ratio can serve as important diagnostic markers for retinal diseases such as AMD or diabetic retinopathy. Large cohort studies demand automated and quantitative image analysis of vascular abnormalities. Therefore, we developed an analytical software tool to enable automated standardized classification of blood vessels supporting clinical reading. A dataset of 61 images was collected from a total of 33 women and 8 men with a median age of 38 years. The pupils were not dilated, and images were taken after dark adaption. In contrast to current methods in which classification is based on vessel profile intensity averages, and similar to human vision, local color contrast was chosen as a discriminator to allow artery vein discrimination and arterial-venous ratio (AVR) calculation without vessel tracking. With 83% ± 1 standard error of the mean for our dataset, we achieved best classification for weighted lightness information from a combination of the red, green, and blue channels. Tested on an independent dataset, our method reached 89% correct classification, which, when benchmarked against conventional ophthalmologic classification, shows significantly improved classification scores. Our study demonstrates that vessel classification based on local color contrast can cope with inter- or intraimage lightness variability and allows consistent AVR calculation. We offer an open-source implementation of this method upon request, which can be integrated into existing tool sets and applied to general diagnostic exams.
A Qualitative Organic Analysis that Exploits the Senses of Smell, Touch, and Sound
ERIC Educational Resources Information Center
Bromfield-Lee, Deborah C.; Oliver-Hoyo, Maria T.
2007-01-01
This laboratory experiment utilizes the characteristic aromas of some functional groups to exploit the sense of smell as a discriminating tool in an organic qualitative analysis scheme. Students differentiate a variety of compounds by their aromas and based on their olfactory classification identify an unknown functional group. Students then…
Bai, Ou; Lin, Peter; Vorbach, Sherry; Li, Jiang; Furlani, Steve; Hallett, Mark
2007-12-01
To explore effective combinations of computational methods for the prediction of movement intention preceding the production of self-paced right and left hand movements from single trial scalp electroencephalogram (EEG). Twelve naïve subjects performed self-paced movements consisting of three key strokes with either hand. EEG was recorded from 128 channels. The exploration was performed offline on single trial EEG data. We proposed that a successful computational procedure for classification would consist of spatial filtering, temporal filtering, feature selection, and pattern classification. A systematic investigation was performed with combinations of spatial filtering using principal component analysis (PCA), independent component analysis (ICA), common spatial patterns analysis (CSP), and surface Laplacian derivation (SLD); temporal filtering using power spectral density estimation (PSD) and discrete wavelet transform (DWT); pattern classification using linear Mahalanobis distance classifier (LMD), quadratic Mahalanobis distance classifier (QMD), Bayesian classifier (BSC), multi-layer perceptron neural network (MLP), probabilistic neural network (PNN), and support vector machine (SVM). A robust multivariate feature selection strategy using a genetic algorithm was employed. The combinations of spatial filtering using ICA and SLD, temporal filtering using PSD and DWT, and classification methods using LMD, QMD, BSC and SVM provided higher performance than those of other combinations. Utilizing one of the better combinations of ICA, PSD and SVM, the discrimination accuracy was as high as 75%. Further feature analysis showed that beta band EEG activity of the channels over right sensorimotor cortex was most appropriate for discrimination of right and left hand movement intention. Effective combinations of computational methods provide possible classification of human movement intention from single trial EEG. Such a method could be the basis for a potential brain-computer interface based on human natural movement, which might reduce the requirement of long-term training. Effective combinations of computational methods can classify human movement intention from single trial EEG with reasonable accuracy.
NASA Astrophysics Data System (ADS)
Rabidas, Rinku; Midya, Abhishek; Chakraborty, Jayasree; Sadhu, Anup; Arif, Wasim
2018-02-01
In this paper, Curvelet based local attributes, Curvelet-Local configuration pattern (C-LCP), is introduced for the characterization of mammographic masses as benign or malignant. Amid different anomalies such as micro- calcification, bilateral asymmetry, architectural distortion, and masses, the reason for targeting the mass lesions is due to their variation in shape, size, and margin which makes the diagnosis a challenging task. Being efficient in classification, multi-resolution property of the Curvelet transform is exploited and local information is extracted from the coefficients of each subband using Local configuration pattern (LCP). The microscopic measures in concatenation with the local textural information provide more discriminating capability than individual. The measures embody the magnitude information along with the pixel-wise relationships among the neighboring pixels. The performance analysis is conducted with 200 mammograms of the DDSM database containing 100 mass cases of each benign and malignant. The optimal set of features is acquired via stepwise logistic regression method and the classification is carried out with Fisher linear discriminant analysis. The best area under the receiver operating characteristic curve and accuracy of 0.95 and 87.55% are achieved with the proposed method, which is further compared with some of the state-of-the-art competing methods.
NASA Astrophysics Data System (ADS)
Montejo, Ludguier D.; Kim, Hyun K.; Häme, Yrjö; Jia, Jingfei; Montejo, Julio D.; Netz, Uwe J.; Blaschke, Sabine; Zwaka, Paul; Müeller, Gerhard A.; Beuthan, Jürgen; Hielscher, Andreas H.
2011-03-01
We present a study on the effectiveness of computer-aided diagnosis (CAD) of rheumatoid arthritis (RA) from frequency-domain diffuse optical tomographic (FDOT) images. FDOT is used to obtain the distribution of tissue optical properties. Subsequently, the non-parametric Kruskal-Wallis ANOVA test is employed to verify statistically significant differences between the optical parameters of patients affected by RA and healthy volunteers. Furthermore, quadratic discriminate analysis (QDA) of the absorption (μa) and scattering (μa or μ's) distributions is used to classify subjects as affected or not affected by RA. We evaluate the classification efficiency by determining the sensitivity (Se), specificity (Sp), and the Youden index (Y). We find that combining features extracted from μa and μa or μ's images allows for more accurate classification than when μa or μa or μ's features are considered individually on their own. Combining μa and μa or μ's features yields values of up to Y = 0.75 (Se = 0.84 and Sp = 0.91). The best results when μa or μ's features are considered individually are Y = 0.65 (Se = 0.85 and Sp = 0.80) and Y = 0.70 (Se = 0.80 and Sp = 0.90), respectively.
Liu, Wei; Wang, Dongmei; Liu, Jianjun; Li, Dengwu; Yin, Dongxue
2016-01-01
The present study was performed to assess the quality of Potentilla fruticosa L. sampled from distinct regions of China using high performance liquid chromatography (HPLC) fingerprinting coupled with a suite of chemometric methods. For this quantitative analysis, the main active phytochemical compositions and the antioxidant activity in P. fruticosa were also investigated. Considering the high percentages and antioxidant activities of phytochemicals, P. fruticosa samples from Kangding, Sichuan were selected as the most valuable raw materials. Similarity analysis (SA) of HPLC fingerprints, hierarchical cluster analysis (HCA), principle component analysis (PCA), and discriminant analysis (DA) were further employed to provide accurate classification and quality estimates of P. fruticosa. Two principal components (PCs) were collected by PCA. PC1 separated samples from Kangding, Sichuan, capturing 57.64% of the variance, whereas PC2 contributed to further separation, capturing 18.97% of the variance. Two kinds of discriminant functions with a 100% discrimination ratio were constructed. The results strongly supported the conclusion that the eight samples from different regions were clustered into three major groups, corresponding with their morphological classification, for which HPLC analysis confirmed the considerable variation in phytochemical compositions and that P. fruticosa samples from Kangding, Sichuan were of high quality. The results of SA, HCA, PCA, and DA were in agreement and performed well for the quality assessment of P. fruticosa. Consequently, HPLC fingerprinting coupled with chemometric techniques provides a highly flexible and reliable method for the quality evaluation of traditional Chinese medicines.
Liu, Wei; Wang, Dongmei; Liu, Jianjun; Li, Dengwu; Yin, Dongxue
2016-01-01
The present study was performed to assess the quality of Potentilla fruticosa L. sampled from distinct regions of China using high performance liquid chromatography (HPLC) fingerprinting coupled with a suite of chemometric methods. For this quantitative analysis, the main active phytochemical compositions and the antioxidant activity in P. fruticosa were also investigated. Considering the high percentages and antioxidant activities of phytochemicals, P. fruticosa samples from Kangding, Sichuan were selected as the most valuable raw materials. Similarity analysis (SA) of HPLC fingerprints, hierarchical cluster analysis (HCA), principle component analysis (PCA), and discriminant analysis (DA) were further employed to provide accurate classification and quality estimates of P. fruticosa. Two principal components (PCs) were collected by PCA. PC1 separated samples from Kangding, Sichuan, capturing 57.64% of the variance, whereas PC2 contributed to further separation, capturing 18.97% of the variance. Two kinds of discriminant functions with a 100% discrimination ratio were constructed. The results strongly supported the conclusion that the eight samples from different regions were clustered into three major groups, corresponding with their morphological classification, for which HPLC analysis confirmed the considerable variation in phytochemical compositions and that P. fruticosa samples from Kangding, Sichuan were of high quality. The results of SA, HCA, PCA, and DA were in agreement and performed well for the quality assessment of P. fruticosa. Consequently, HPLC fingerprinting coupled with chemometric techniques provides a highly flexible and reliable method for the quality evaluation of traditional Chinese medicines. PMID:26890416
Jiang, Shun-Yuan; Sun, Hong-Bing; Sun, Hui; Ma, Yu-Ying; Chen, Hong-Yu; Zhu, Wen-Tao; Zhou, Yi
2016-03-01
This paper aims to explore a comprehensive assessment method combined traditional Chinese medicinal material specifications with quantitative quality indicators. Seventy-six samples of Notopterygii Rhizoma et Radix were collected on market and at producing areas. Traditional commercial specifications were described and assigned, and 10 chemical components and volatile oils were determined for each sample. Cluster analysis, Fisher discriminant analysis and correspondence analysis were used to establish the relationship between the traditional qualitative commercial specifications and quantitative chemical indices for comprehensive evaluating quality of medicinal materials, and quantitative classification of commercial grade and quality grade. A herb quality index (HQI) including traditional commercial specifications and chemical components for quantitative grade classification were established, and corresponding discriminant function were figured out for precise determination of quality grade and sub-grade of Notopterygii Rhizoma et Radix. The result showed that notopterol, isoimperatorin and volatile oil were the major components for determination of chemical quality, and their dividing values were specified for every grade and sub-grade of the commercial materials of Notopterygii Rhizoma et Radix. According to the result, essential relationship between traditional medicinal indicators, qualitative commercial specifications, and quantitative chemical composition indicators can be examined by K-mean cluster, Fisher discriminant analysis and correspondence analysis, which provide a new method for comprehensive quantitative evaluation of traditional Chinese medicine quality integrated traditional commodity specifications and quantitative modern chemical index. Copyright© by the Chinese Pharmaceutical Association.
Classification and disease prediction via mathematical programming
NASA Astrophysics Data System (ADS)
Lee, Eva K.; Wu, Tsung-Lin
2007-11-01
In this chapter, we present classification models based on mathematical programming approaches. We first provide an overview on various mathematical programming approaches, including linear programming, mixed integer programming, nonlinear programming and support vector machines. Next, we present our effort of novel optimization-based classification models that are general purpose and suitable for developing predictive rules for large heterogeneous biological and medical data sets. Our predictive model simultaneously incorporates (1) the ability to classify any number of distinct groups; (2) the ability to incorporate heterogeneous types of attributes as input; (3) a high-dimensional data transformation that eliminates noise and errors in biological data; (4) the ability to incorporate constraints to limit the rate of misclassification, and a reserved-judgment region that provides a safeguard against over-training (which tends to lead to high misclassification rates from the resulting predictive rule) and (5) successive multi-stage classification capability to handle data points placed in the reserved judgment region. To illustrate the power and flexibility of the classification model and solution engine, and its multigroup prediction capability, application of the predictive model to a broad class of biological and medical problems is described. Applications include: the differential diagnosis of the type of erythemato-squamous diseases; predicting presence/absence of heart disease; genomic analysis and prediction of aberrant CpG island meythlation in human cancer; discriminant analysis of motility and morphology data in human lung carcinoma; prediction of ultrasonic cell disruption for drug delivery; identification of tumor shape and volume in treatment of sarcoma; multistage discriminant analysis of biomarkers for prediction of early atherosclerois; fingerprinting of native and angiogenic microvascular networks for early diagnosis of diabetes, aging, macular degeneracy and tumor metastasis; prediction of protein localization sites; and pattern recognition of satellite images in classification of soil types. In all these applications, the predictive model yields correct classification rates ranging from 80% to 100%. This provides motivation for pursuing its use as a medical diagnostic, monitoring and decision-making tool.
McEntire, John E.; Kuo, Kenneth C.; Smith, Mark E.; Stalling, David L.; Richens, Jack W.; Zumwalt, Robert W.; Gehrke, Charles W.; Papermaster, Ben W.
1989-01-01
A wide spectrum of modified nucleosides has been quantified by high-performance liquid chromatography in serum of 49 male lung cancer patients, 35 patients with other cancers, and 48 patients hospitalized for nonneoplastic diseases. Data for 29 modified nucleoside peaks were normalized to an internal standard and analyzed by discriminant analysis and stepwise discriminant analysis. A model based on peaks selected by a stepwise discriminant procedure correctly classified 79% of the cancer and 75% of the noncancer subjects. It also demonstrated 84% sensitivity and 79% specificity when comparing lung cancer to noncancer subjects, and 80% sensitivity and 55% specificity in comparing lung cancer to other cancers. The nucleoside peaks having the greatest influence on the models varied dependent on the subgroups compared, confirming the importance of quantifying a wide array of nucleosides. These data support and expand previous studies which reported the utility of measuring modified nucleoside levels in serum and show that precise measurement of an array of 29 modified nucleosides in serum by high-performance liquid chromatography with UV scanning with subsequent data modeling may provide a clinically useful approach to patient classification in diagnosis and subsequent therapeutic monitoring.
Bonney, Heather
2014-08-01
Analysis of cut marks in bone is largely limited to two dimensional qualitative description. Development of morphological classification methods using measurements from cut mark cross sections could have multiple uses across palaeoanthropological and archaeological disciplines, where cutting edge types are used to investigate and reconstruct behavioral patterns. An experimental study was undertaken, using porcine bone, to determine the usefulness of discriminant function analysis in classifying cut marks by blade edge type, from a number of measurements taken from their cross-sectional profile. The discriminant analysis correctly classified 86.7% of the experimental cut marks into serrated, non-serrated and bamboo blade types. The technique was then used to investigate a series of cut marks of unknown origin from a collection of trophy skulls from the Torres Strait Islands, to investigate whether they were made by bamboo or metal blades. Nineteen out of twenty of the cut marks investigated were classified as bamboo which supports the non-contemporaneous ethnographic accounts of the knives used for trophy taking and defleshing remains. With further investigation across a variety of blade types, this technique could prove a valuable tool in the interpretation of cut mark evidence from a wide variety of contexts, particularly in forensic anthropology where the requirement for presentation of evidence in a statistical format is becoming increasingly important. © 2014 Wiley Periodicals, Inc.
Harwood, Valerie J.; Whitlock, John; Withington, Victoria
2000-01-01
The antibiotic resistance patterns of fecal streptococci and fecal coliforms isolated from domestic wastewater and animal feces were determined using a battery of antibiotics (amoxicillin, ampicillin, cephalothin, chlortetracycline, oxytetracycline, tetracycline, erythromycin, streptomycin, and vancomycin) at four concentrations each. The sources of animal feces included wild birds, cattle, chickens, dogs, pigs, and raccoons. Antibiotic resistance patterns of fecal streptococci and fecal coliforms from known sources were grouped into two separate databases, and discriminant analysis of these patterns was used to establish the relationship between the antibiotic resistance patterns and the bacterial source. The fecal streptococcus and fecal coliform databases classified isolates from known sources with similar accuracies. The average rate of correct classification for the fecal streptococcus database was 62.3%, and that for the fecal coliform database was 63.9%. The sources of fecal streptococci and fecal coliforms isolated from surface waters were identified by discriminant analysis of their antibiotic resistance patterns. Both databases identified the source of indicator bacteria isolated from surface waters directly impacted by septic tank discharges as human. At sample sites selected for relatively low anthropogenic impact, the dominant sources of indicator bacteria were identified as various animals. The antibiotic resistance analysis technique promises to be a useful tool in assessing sources of fecal contamination in subtropical waters, such as those in Florida. PMID:10966379
Application of Hyperspectral Imaging to Detect Sclerotinia sclerotiorum on Oilseed Rape Stems
Kong, Wenwen; Zhang, Chu; Huang, Weihao
2018-01-01
Hyperspectral imaging covering the spectral range of 384–1034 nm combined with chemometric methods was used to detect Sclerotinia sclerotiorum (SS) on oilseed rape stems by two sample sets (60 healthy and 60 infected stems for each set). Second derivative spectra and PCA loadings were used to select the optimal wavelengths. Discriminant models were built and compared to detect SS on oilseed rape stems, including partial least squares-discriminant analysis, radial basis function neural network, support vector machine and extreme learning machine. The discriminant models using full spectra and optimal wavelengths showed good performance with classification accuracies of over 80% for the calibration and prediction set. Comparing all developed models, the optimal classification accuracies of the calibration and prediction set were over 90%. The similarity of selected optimal wavelengths also indicated the feasibility of using hyperspectral imaging to detect SS on oilseed rape stems. The results indicated that hyperspectral imaging could be used as a fast, non-destructive and reliable technique to detect plant diseases on stems. PMID:29300315
Dong, D; Zheng, W; Jiao, L; Lang, Y; Zhao, X
2016-03-01
Different brands of Chinese vinegar are similar in appearance, color and aroma, making their discrimination difficult. The compositions and concentrations of the volatiles released from different vinegars vary by raw material and brewing process and thus offer a means to discriminate vinegars. In this study, we enhanced the detection sensitivity of the infrared spectrometer by extending its optical path. We measured the infrared spectra of the volatiles from 5 brands of Chinese vinegar and observed the spectral characteristics corresponding to alcohols, esters, acids, furfural, etc. Different brands of Chinese vinegar had obviously different infrared spectra and could be classified through chemometrics analysis. Furthermore, we established classification models and demonstrated their effectiveness for classifying different brands of vinegar. This study demonstrates that long-optical-path infrared spectroscopy has the ability to discriminate Chinese vinegars with the advantages that it is fast and non-destructive and eliminates the need for sampling. Copyright © 2015 Elsevier Ltd. All rights reserved.
Liu, Fei; Wang, Yuan-zhong; Yang, Chun-yan; Jin, Hang
2015-01-01
The genuineness and producing area of Panax notoginseng were studied based on infrared spectroscopy combined with discriminant analysis. The infrared spectra of 136 taproots of P. notoginseng from 13 planting point in 11 counties were collected and the second derivate spectra were calculated by Omnic 8. 0 software. The infrared spectra and their second derivate spectra in the range 1 800 - 700 cm-1 were used to build model by stepwise discriminant analysis, which was in order to distinguish study on the genuineness of P. notoginseng. The model built based on the second derivate spectra showed the better recognition effect for the genuineness of P. notoginseng. The correct rate of returned classification reached to 100%, and the prediction accuracy was 93. 4%. The stability of model was tested by cross validation and the method was performed extrapolation validation. The second derivate spectra combined with the same discriminant analysis method were used to distinguish the producing area of P. notoginseng. The recognition effect of models built based on different range of spectrum and different numbers of samples were compared and found that when the model was built by collecting 8 samples from each planting point as training sample and the spectrum in the range 1 500 - 1 200 cm-1 , the recognition effect was better, with the correct rate of returned classification reached to 99. 0%, and the prediction accuracy was 76. 5%. The results indicated that infrared spectroscopy combined with discriminant analysis showed good recognition effect for the genuineness of P. notoginseng. The method might be a hopeful new method for identification of genuineness of P. notoginseng in practice. The method could recognize the producing area of P. notoginseng to some extent and could be a new thought for identification of the producing area of P. natoginseng.
Semi-supervised vibration-based classification and condition monitoring of compressors
NASA Astrophysics Data System (ADS)
Potočnik, Primož; Govekar, Edvard
2017-09-01
Semi-supervised vibration-based classification and condition monitoring of the reciprocating compressors installed in refrigeration appliances is proposed in this paper. The method addresses the problem of industrial condition monitoring where prior class definitions are often not available or difficult to obtain from local experts. The proposed method combines feature extraction, principal component analysis, and statistical analysis for the extraction of initial class representatives, and compares the capability of various classification methods, including discriminant analysis (DA), neural networks (NN), support vector machines (SVM), and extreme learning machines (ELM). The use of the method is demonstrated on a case study which was based on industrially acquired vibration measurements of reciprocating compressors during the production of refrigeration appliances. The paper presents a comparative qualitative analysis of the applied classifiers, confirming the good performance of several nonlinear classifiers. If the model parameters are properly selected, then very good classification performance can be obtained from NN trained by Bayesian regularization, SVM and ELM classifiers. The method can be effectively applied for the industrial condition monitoring of compressors.
Discrimination of Oil Slicks and Lookalikes in Polarimetric SAR Images Using CNN.
Guo, Hao; Wu, Danni; An, Jubai
2017-08-09
Oil slicks and lookalikes (e.g., plant oil and oil emulsion) all appear as dark areas in polarimetric Synthetic Aperture Radar (SAR) images and are highly heterogeneous, so it is very difficult to use a single feature that can allow classification of dark objects in polarimetric SAR images as oil slicks or lookalikes. We established multi-feature fusion to support the discrimination of oil slicks and lookalikes. In the paper, simple discrimination analysis is used to rationalize a preferred features subset. The features analyzed include entropy, alpha, and Single-bounce Eigenvalue Relative Difference (SERD) in the C-band polarimetric mode. We also propose a novel SAR image discrimination method for oil slicks and lookalikes based on Convolutional Neural Network (CNN). The regions of interest are selected as the training and testing samples for CNN on the three kinds of polarimetric feature images. The proposed method is applied to a training data set of 5400 samples, including 1800 crude oil, 1800 plant oil, and 1800 oil emulsion samples. In the end, the effectiveness of the method is demonstrated through the analysis of some experimental results. The classification accuracy obtained using 900 samples of test data is 91.33%. It is here observed that the proposed method not only can accurately identify the dark spots on SAR images but also verify the ability of the proposed algorithm to classify unstructured features.
Discrimination of Oil Slicks and Lookalikes in Polarimetric SAR Images Using CNN
An, Jubai
2017-01-01
Oil slicks and lookalikes (e.g., plant oil and oil emulsion) all appear as dark areas in polarimetric Synthetic Aperture Radar (SAR) images and are highly heterogeneous, so it is very difficult to use a single feature that can allow classification of dark objects in polarimetric SAR images as oil slicks or lookalikes. We established multi-feature fusion to support the discrimination of oil slicks and lookalikes. In the paper, simple discrimination analysis is used to rationalize a preferred features subset. The features analyzed include entropy, alpha, and Single-bounce Eigenvalue Relative Difference (SERD) in the C-band polarimetric mode. We also propose a novel SAR image discrimination method for oil slicks and lookalikes based on Convolutional Neural Network (CNN). The regions of interest are selected as the training and testing samples for CNN on the three kinds of polarimetric feature images. The proposed method is applied to a training data set of 5400 samples, including 1800 crude oil, 1800 plant oil, and 1800 oil emulsion samples. In the end, the effectiveness of the method is demonstrated through the analysis of some experimental results. The classification accuracy obtained using 900 samples of test data is 91.33%. It is here observed that the proposed method not only can accurately identify the dark spots on SAR images but also verify the ability of the proposed algorithm to classify unstructured features. PMID:28792477
Full-motion video analysis for improved gender classification
NASA Astrophysics Data System (ADS)
Flora, Jeffrey B.; Lochtefeld, Darrell F.; Iftekharuddin, Khan M.
2014-06-01
The ability of computer systems to perform gender classification using the dynamic motion of the human subject has important applications in medicine, human factors, and human-computer interface systems. Previous works in motion analysis have used data from sensors (including gyroscopes, accelerometers, and force plates), radar signatures, and video. However, full-motion video, motion capture, range data provides a higher resolution time and spatial dataset for the analysis of dynamic motion. Works using motion capture data have been limited by small datasets in a controlled environment. In this paper, we explore machine learning techniques to a new dataset that has a larger number of subjects. Additionally, these subjects move unrestricted through a capture volume, representing a more realistic, less controlled environment. We conclude that existing linear classification methods are insufficient for the gender classification for larger dataset captured in relatively uncontrolled environment. A method based on a nonlinear support vector machine classifier is proposed to obtain gender classification for the larger dataset. In experimental testing with a dataset consisting of 98 trials (49 subjects, 2 trials per subject), classification rates using leave-one-out cross-validation are improved from 73% using linear discriminant analysis to 88% using the nonlinear support vector machine classifier.
New feature extraction method for classification of agricultural products from x-ray images
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.; Lee, Ha-Woon; Keagy, Pamela M.; Schatzki, Thomas F.
1999-01-01
Classification of real-time x-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a system for automated non- invasive detection of defective product items on a conveyor belt. We discuss the extraction of new features that allow better discrimination between damaged and clean items. This feature extraction and classification stage is the new aspect of this paper; our new maximum representation and discrimination between damaged and clean items. This feature extraction and classification stage is the new aspect of this paper; our new maximum representation and discriminating feature (MRDF) extraction method computes nonlinear features that are used as inputs to a new modified k nearest neighbor classifier. In this work the MRDF is applied to standard features. The MRDF is robust to various probability distributions of the input class and is shown to provide good classification and new ROC data.
Pan, Rui; Wang, Hansheng; Li, Runze
2016-01-01
This paper is concerned with the problem of feature screening for multi-class linear discriminant analysis under ultrahigh dimensional setting. We allow the number of classes to be relatively large. As a result, the total number of relevant features is larger than usual. This makes the related classification problem much more challenging than the conventional one, where the number of classes is small (very often two). To solve the problem, we propose a novel pairwise sure independence screening method for linear discriminant analysis with an ultrahigh dimensional predictor. The proposed procedure is directly applicable to the situation with many classes. We further prove that the proposed method is screening consistent. Simulation studies are conducted to assess the finite sample performance of the new procedure. We also demonstrate the proposed methodology via an empirical analysis of a real life example on handwritten Chinese character recognition. PMID:28127109
Shen, Fei; Wu, Jian; Ying, Yibin; Li, Bobin; Jiang, Tao
2013-12-15
Discrimination of Chinese rice wines from three well-known wineries ("Guyuelongshan", "Kuaijishan", and "Pagoda") in China has been carried out according to mineral element contents in this study. Nineteen macro and trace mineral elements (Na, Mg, Al, K, Ca, Mn, Fe, Cu, Zn, V, Cr, Co, Ni, As, Se, Mo, Cd, Ba and Pb) were determined by inductively coupled plasma mass spectrometry (ICP-MS) in 117 samples. Then the experimental data were subjected to analysis of variance (ANOVA) and principal component analysis (PCA) to reveal significant differences and potential patterns between samples. Stepwise linear discriminant analysis (LDA) and partial least square discriminant analysis (PLS-DA) were applied to develop classification models and achieved correct classified rates of 100% and 97.4% for the prediction sample set, respectively. The discrimination could be attributed to different raw materials (mainly water) and elaboration processes employed. The results indicate that the element compositions combined with multivariate analysis can be used as fingerprinting techniques to protect prestigious wineries and enable the authenticity of Chinese rice wine. Copyright © 2013 Elsevier Ltd. All rights reserved.
Statistical classification techniques for engineering and climatic data samples
NASA Technical Reports Server (NTRS)
Temple, E. C.; Shipman, J. R.
1981-01-01
Fisher's sample linear discriminant function is modified through an appropriate alteration of the common sample variance-covariance matrix. The alteration consists of adding nonnegative values to the eigenvalues of the sample variance covariance matrix. The desired results of this modification is to increase the number of correct classifications by the new linear discriminant function over Fisher's function. This study is limited to the two-group discriminant problem.
Abbasian Ardakani, Ali; Gharbali, Akbar; Mohammadi, Afshin
2015-01-01
The aim of this study was to evaluate computer aided diagnosis (CAD) system with texture analysis (TA) to improve radiologists' accuracy in identification of thyroid nodules as malignant or benign. A total of 70 cases (26 benign and 44 malignant) were analyzed in this study. We extracted up to 270 statistical texture features as a descriptor for each selected region of interests (ROIs) in three normalization schemes (default, 3s and 1%-99%). Then features by the lowest probability of classification error and average correlation coefficients (POE+ACC), and Fisher coefficient (Fisher) eliminated to 10 best and most effective features. These features were analyzed under standard and nonstandard states. For TA of the thyroid nodules, Principle Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Non-Linear Discriminant Analysis (NDA) were applied. First Nearest-Neighbour (1-NN) classifier was performed for the features resulting from PCA and LDA. NDA features were classified by artificial neural network (A-NN). Receiver operating characteristic (ROC) curve analysis was used for examining the performance of TA methods. The best results were driven in 1-99% normalization with features extracted by POE+ACC algorithm and analyzed by NDA with the area under the ROC curve ( Az) of 0.9722 which correspond to sensitivity of 94.45%, specificity of 100%, and accuracy of 97.14%. Our results indicate that TA is a reliable method, can provide useful information help radiologist in detection and classification of benign and malignant thyroid nodules.
Wang, Shunfang; Nie, Bing; Yue, Kun; Fei, Yu; Li, Wenjia; Xu, Dongshu
2017-12-15
Kernel discriminant analysis (KDA) is a dimension reduction and classification algorithm based on nonlinear kernel trick, which can be novelly used to treat high-dimensional and complex biological data before undergoing classification processes such as protein subcellular localization. Kernel parameters make a great impact on the performance of the KDA model. Specifically, for KDA with the popular Gaussian kernel, to select the scale parameter is still a challenging problem. Thus, this paper introduces the KDA method and proposes a new method for Gaussian kernel parameter selection depending on the fact that the differences between reconstruction errors of edge normal samples and those of interior normal samples should be maximized for certain suitable kernel parameters. Experiments with various standard data sets of protein subcellular localization show that the overall accuracy of protein classification prediction with KDA is much higher than that without KDA. Meanwhile, the kernel parameter of KDA has a great impact on the efficiency, and the proposed method can produce an optimum parameter, which makes the new algorithm not only perform as effectively as the traditional ones, but also reduce the computational time and thus improve efficiency.
NASA Astrophysics Data System (ADS)
Koma, Zsófia; Deák, Márton; Kovács, József; Székely, Balázs; Kelemen, Kristóf; Standovár, Tibor
2016-04-01
Airborne Laser Scanning (ALS) is a widely used technology for forestry classification applications. However, single tree detection and species classification from low density ALS point cloud is limited in a dense forest region. In this study we investigate the division of a forest into homogenous groups at stand level. The study area is located in the Aggtelek karst region (Northeast Hungary) with a complex relief topography. The ALS dataset contained only 4 discrete echoes (at 2-4 pt/m2 density) from the study area during leaf-on season. Ground-truth measurements about canopy closure and proportion of tree species cover are available for every 70 meter in 500 square meter circular plots. In the first step, ALS data were processed and geometrical and intensity based features were calculated into a 5×5 meter raster based grid. The derived features contained: basic statistics of relative height, canopy RMS, echo ratio, openness, pulse penetration ratio, basic statistics of radiometric feature. In the second step the data were investigated using Combined Cluster and Discriminant Analysis (CCDA, Kovács et al., 2014). The CCDA method first determines a basic grouping for the multiple circle shaped sampling locations using hierarchical clustering and then for the arising grouping possibilities a core cycle is executed comparing the goodness of the investigated groupings with random ones. Out of these comparisons difference values arise, yielding information about the optimal grouping out of the investigated ones. If sub-groups are then further investigated, one might even find homogeneous groups. We found that low density ALS data classification into homogeneous groups are highly dependent on canopy closure, and the proportion of the dominant tree species. The presented results show high potential using CCDA for determination of homogenous separable groups in LiDAR based tree species classification. Aggtelek Karst/Slovakian Karst Caves" (HUSK/1101/221/0180, Aggtelek NP), data evaluation: 'Multipurpose assessment serving forest biodiversity conservation in the Carpathian region of Hungary', Swiss-Hungarian Cooperation Programme (SH/4/13 Project). BS contributed as an Alexander von Humboldt Research Fellow. J. Kovács, S. Kovács, N. Magyar, P. Tanos, I. G. Hatvani, and A. Anda (2014), Classification into homogeneous groups using combined cluster and discriminant analysis, Environmental Modelling & Software, 57, 52-59.
Quantification of Reflection Patterns in Ground-Penetrating Radar Data
NASA Astrophysics Data System (ADS)
Moysey, S.; Knight, R. J.; Jol, H. M.; Allen-King, R. M.; Gaylord, D. R.
2005-12-01
Radar facies analysis provides a way of interpreting the large-scale structure of the subsurface from ground-penetrating radar (GPR) data. Radar facies are often distinguished from each other by the presence of patterns, such as flat-lying, dipping, or chaotic reflections, in different regions of a radar image. When these patterns can be associated with radar facies in a repeated and predictable manner we refer to them as `radar textures'. While it is often possible to qualitatively differentiate between radar textures visually, pattern recognition tools, like neural networks, require a quantitative measure to discriminate between them. We investigate whether currently available tools, such as instantaneous attributes or metrics adapted from standard texture analysis techniques, can be used to improve the classification of radar facies. To this end, we use a neural network to perform cross-validation tests that assess the efficacy of different textural measures for classifying radar facies in GPR data collected from the William River delta, Saskatchewan, Canada. We found that the highest classification accuracies (>93%) were obtained for measures of texture that preserve information about the spatial arrangement of reflections in the radar image, e.g., spatial covariance. Lower accuracy (87%) was obtained for classifications based directly on windows of amplitude data extracted from the radar image. Measures that did not account for the spatial arrangement of reflections in the image, e.g., instantaneous attributes and amplitude variance, yielded classification accuracies of less than 65%. Optimal classifications were obtained for textural measures that extracted sufficient information from the radar data to discriminate between radar facies but were insensitive to other facies specific characteristics. For example, the rotationally invariant Fourier-Mellin transform delivered better classification results than the spatial covariance because dip angle of the reflections, but not dip direction, was an important discriminator between radar facies at the William River delta. To extend the use of radar texture beyond the identification of radar facies to sedimentary facies we are investigating how sedimentary features are encoded in GPR data at Borden, Ontario, Canada. At this site, we have collected extensive sedimentary and hydrologic data over the area imaged by GPR. Analysis of this data coupled with synthetic modeling of the radar signal has allowed us to develop insight into the generation of radar texture in complex geologic environments.
You Can't Think and Hit at the Same Time: Neural Correlates of Baseball Pitch Classification.
Sherwin, Jason; Muraskin, Jordan; Sajda, Paul
2012-01-01
Hitting a baseball is often described as the most difficult thing to do in sports. A key aptitude of a good hitter is the ability to determine which pitch is coming. This rapid decision requires the batter to make a judgment in a fraction of a second based largely on the trajectory and spin of the ball. When does this decision occur relative to the ball's trajectory and is it possible to identify neural correlates that represent how the decision evolves over a split second? Using single-trial analysis of electroencephalography (EEG) we address this question within the context of subjects discriminating three types of pitches (fastball, curveball, slider) based on pitch trajectories. We find clear neural signatures of pitch classification and, using signal detection theory, we identify the times of discrimination on a trial-to-trial basis. Based on these neural signatures we estimate neural discrimination distributions as a function of the distance the ball is from the plate. We find all three pitches yield unique distributions, namely the timing of the discriminating neural signatures relative to the position of the ball in its trajectory. For instance, fastballs are discriminated at the earliest points in their trajectory, relative to the two other pitches, which is consistent with the need for some constant time to generate and execute the motor plan for the swing (or inhibition of the swing). We also find incorrect discrimination of a pitch (errors) yields neural sources in Brodmann Area 10, which has been implicated in prospective memory, recall, and task difficulty. In summary, we show that single-trial analysis of EEG yields informative distributions of the relative point in a baseball's trajectory when the batter makes a decision on which pitch is coming.
Bavykin, Sergei G.; Mirzabekova, legal representative, Natalia V.; Mirzabekov, deceased, Andrei D.
2007-12-04
The present invention relates to methods and compositions for using nucleotide sequence variations of 16S and 23S rRNA within the B. cereus group to discriminate a highly infectious bacterium B. anthracis from closely related microorganisms. Sequence variations in the 16S and 23S rRNA of the B. cereus subgroup including B. anthracis are utilized to construct an array that can detect these sequence variations through selective hybridizations and discriminate B. cereus group that includes B. anthracis. Discrimination of single base differences in rRNA was achieved with a microchip during analysis of B. cereus group isolates from both single and in mixed samples, as well as identification of polymorphic sites. Successful use of a microchip to determine the appropriate subgroup classification using eight reference microorganisms from the B. cereus group as a study set, was demonstrated.
NASA Astrophysics Data System (ADS)
Zhao, Bo; Liu, Jinhu; Song, Junjie; Cao, Liang; Dou, Shuozeng
2017-08-01
The otolith morphology of two croaker species (Collichthys lucidus and Collichthys niveatus) from three areas (Liaodong Bay, LD; Huanghe (Yellow) River estuary, HRE; Jiaozhou Bay, JZ) along the northern Chinese coast were investigated for species identification and stock discrimination. The otolith contour shape described by elliptic Fourier coefficients (EFC) were analysed using principal components analysis (PCA) and stepwise canonical discriminant analysis (CDA) to identify species and stocks. The two species were well differentiated, with an overall classification success rate of 97.8%. And variations in the otolith shapes were significant enough to discriminate among the three geographical samples of C. lucidus (67.7%) or C. niveatus (65.2%). Relatively high mis-assignment occurred between the geographically adjacent LD and HRE samples, which implied that individual mixing may exist between the two samples. This study yielded information complementary to that derived from genetic studies and provided information for assessing the stock structure of C. lucidus and C. niveatus in the Bohai Sea and the Yellow Sea.
Real-Time Speech/Music Classification With a Hierarchical Oblique Decision Tree
2008-04-01
REAL-TIME SPEECH/ MUSIC CLASSIFICATION WITH A HIERARCHICAL OBLIQUE DECISION TREE Jun Wang, Qiong Wu, Haojiang Deng, Qin Yan Institute of Acoustics...time speech/ music classification with a hierarchical oblique decision tree. A set of discrimination features in frequency domain are selected...handle signals without discrimination and can not work properly in the existence of multimedia signals. This paper proposes a real-time speech/ music
AVHRR channel selection for land cover classification
Maxwell, S.K.; Hoffer, R.M.; Chapman, P.L.
2002-01-01
Mapping land cover of large regions often requires processing of satellite images collected from several time periods at many spectral wavelength channels. However, manipulating and processing large amounts of image data increases the complexity and time, and hence the cost, that it takes to produce a land cover map. Very few studies have evaluated the importance of individual Advanced Very High Resolution Radiometer (AVHRR) channels for discriminating cover types, especially the thermal channels (channels 3, 4 and 5). Studies rarely perform a multi-year analysis to determine the impact of inter-annual variability on the classification results. We evaluated 5 years of AVHRR data using combinations of the original AVHRR spectral channels (1-5) to determine which channels are most important for cover type discrimination, yet stabilize inter-annual variability. Particular attention was placed on the channels in the thermal portion of the spectrum. Fourteen cover types over the entire state of Colorado were evaluated using a supervised classification approach on all two-, three-, four- and five-channel combinations for seven AVHRR biweekly composite datasets covering the entire growing season for each of 5 years. Results show that all three of the major portions of the electromagnetic spectrum represented by the AVHRR sensor are required to discriminate cover types effectively and stabilize inter-annual variability. Of the two-channel combinations, channels 1 (red visible) and 2 (near-infrared) had, by far, the highest average overall accuracy (72.2%), yet the inter-annual classification accuracies were highly variable. Including a thermal channel (channel 4) significantly increased the average overall classification accuracy by 5.5% and stabilized interannual variability. Each of the thermal channels gave similar classification accuracies; however, because of the problems in consistently interpreting channel 3 data, either channel 4 or 5 was found to be a more appropriate choice. Substituting the thermal channel with a single elevation layer resulted in equivalent classification accuracies and inter-annual variability.
Kumar, Shiu; Sharma, Alok; Tsunoda, Tatsuhiko
2017-12-28
Common spatial pattern (CSP) has been an effective technique for feature extraction in electroencephalography (EEG) based brain computer interfaces (BCIs). However, motor imagery EEG signal feature extraction using CSP generally depends on the selection of the frequency bands to a great extent. In this study, we propose a mutual information based frequency band selection approach. The idea of the proposed method is to utilize the information from all the available channels for effectively selecting the most discriminative filter banks. CSP features are extracted from multiple overlapping sub-bands. An additional sub-band has been introduced that cover the wide frequency band (7-30 Hz) and two different types of features are extracted using CSP and common spatio-spectral pattern techniques, respectively. Mutual information is then computed from the extracted features of each of these bands and the top filter banks are selected for further processing. Linear discriminant analysis is applied to the features extracted from each of the filter banks. The scores are fused together, and classification is done using support vector machine. The proposed method is evaluated using BCI Competition III dataset IVa, BCI Competition IV dataset I and BCI Competition IV dataset IIb, and it outperformed all other competing methods achieving the lowest misclassification rate and the highest kappa coefficient on all three datasets. Introducing a wide sub-band and using mutual information for selecting the most discriminative sub-bands, the proposed method shows improvement in motor imagery EEG signal classification.
NASA Technical Reports Server (NTRS)
Ballew, G.
1977-01-01
The ability of Landsat multispectral digital data to differentiate among 62 combinations of rock and alteration types at the Goldfield mining district of Western Nevada was investigated by using statistical techniques of cluster and discriminant analysis. Multivariate discriminant analysis was not effective in classifying each of the 62 groups, with classification results essentially the same whether data of four channels alone or combined with six ratios of channels were used. Bivariate plots of group means revealed a cluster of three groups including mill tailings, basalt and all other rock and alteration types. Automatic hierarchical clustering based on the fourth dimensional Mahalanobis distance between group means of 30 groups having five or more samples was performed using Johnson's HICLUS program. The results of the cluster analysis revealed hierarchies of mill tailings vs. natural materials, basalt vs. non-basalt, highly reflectant rocks vs. other rocks and exclusively unaltered rocks vs. predominantly altered rocks. The hierarchies were used to determine the order in which sets of multiple discriminant analyses were to be performed and the resulting discriminant functions were used to produce a map of geology and alteration which has an overall accuracy of 70 percent for discriminating exclusively altered rocks from predominantly altered rocks.
Rohaeti, Eti; Rafi, Mohamad; Syafitri, Utami Dyah; Heryanto, Rudi
2015-02-25
Turmeric (Curcuma longa), java turmeric (Curcuma xanthorrhiza) and cassumunar ginger (Zingiber cassumunar) are widely used in traditional Indonesian medicines (jamu). They have similar color for their rhizome and possess some similar uses, so it is possible to substitute one for the other. The identification and discrimination of these closely-related plants is a crucial task to ensure the quality of the raw materials. Therefore, an analytical method which is rapid, simple and accurate for discriminating these species using Fourier transform infrared spectroscopy (FTIR) combined with some chemometrics methods was developed. FTIR spectra were acquired in the mid-IR region (4000-400 cm(-1)). Standard normal variate, first and second order derivative spectra were compared for the spectral data. Principal component analysis (PCA) and canonical variate analysis (CVA) were used for the classification of the three species. Samples could be discriminated by visual analysis of the FTIR spectra by using their marker bands. Discrimination of the three species was also possible through the combination of the pre-processed FTIR spectra with PCA and CVA, in which CVA gave clearer discrimination. Subsequently, the developed method could be used for the identification and discrimination of the three closely-related plant species. Copyright © 2014 Elsevier B.V. All rights reserved.
Reboiro-Jato, Miguel; Arrais, Joel P; Oliveira, José Luis; Fdez-Riverola, Florentino
2014-01-30
The diagnosis and prognosis of several diseases can be shortened through the use of different large-scale genome experiments. In this context, microarrays can generate expression data for a huge set of genes. However, to obtain solid statistical evidence from the resulting data, it is necessary to train and to validate many classification techniques in order to find the best discriminative method. This is a time-consuming process that normally depends on intricate statistical tools. geneCommittee is a web-based interactive tool for routinely evaluating the discriminative classification power of custom hypothesis in the form of biologically relevant gene sets. While the user can work with different gene set collections and several microarray data files to configure specific classification experiments, the tool is able to run several tests in parallel. Provided with a straightforward and intuitive interface, geneCommittee is able to render valuable information for diagnostic analyses and clinical management decisions based on systematically evaluating custom hypothesis over different data sets using complementary classifiers, a key aspect in clinical research. geneCommittee allows the enrichment of microarrays raw data with gene functional annotations, producing integrated datasets that simplify the construction of better discriminative hypothesis, and allows the creation of a set of complementary classifiers. The trained committees can then be used for clinical research and diagnosis. Full documentation including common use cases and guided analysis workflows is freely available at http://sing.ei.uvigo.es/GC/.
[Discrimination of varieties of brake fluid using visual-near infrared spectra].
Jiang, Lu-lu; Tan, Li-hong; Qiu, Zheng-jun; Lu, Jiang-feng; He, Yong
2008-06-01
A new method was developed to fast discriminate brands of brake fluid by means of visual-near infrared spectroscopy. Five different brands of brake fluid were analyzed using a handheld near infrared spectrograph, manufactured by ASD Company, and 60 samples were gotten from each brand of brake fluid. The samples data were pretreated using average smoothing and standard normal variable method, and then analyzed using principal component analysis (PCA). A 2-dimensional plot was drawn based on the first and the second principal components, and the plot indicated that the clustering characteristic of different brake fluid is distinct. The foregoing 6 principal components were taken as input variable, and the band of brake fluid as output variable to build the discriminate model by stepwise discriminant analysis method. Two hundred twenty five samples selected randomly were used to create the model, and the rest 75 samples to verify the model. The result showed that the distinguishing rate was 94.67%, indicating that the method proposed in this paper has good performance in classification and discrimination. It provides a new way to fast discriminate different brands of brake fluid.
Classification of collected trot, passage and piaffe based on temporal variables.
Clayton, H M
1997-05-01
The objective was to determine whether collected trot, passage and piaffe could be distinguished as separate gaits on the basis of temporal variables. Sagittal plane, 60 Hz videotapes of 10 finalists in the dressage competitions at the 1992 Olympic Games were analysed to measure the temporal variables in absolute terms and as percentages of stride duration. Classification was based on analysis of variance, a graphical method and discriminant analysis. Stride duration was sufficient to distinguish collected trot from passage and piaffe in all horses. The analysis of variance showed that the mean values of most variables differed significantly between passage and piaffe. When hindlimb stance percentage was plotted against diagonal advanced placement percentage, some overlap was found between all 3 movements indicating that individual horses could not be classified reliably in this manner. Using hindlimb stance percentage and diagonal advanced placement percentage as input in a discriminant analysis, 80% of the cases were classified correctly, but at least one horse was misclassified in each movement. When the absolute, rather than percentage, values of the 2 variables were used as input in the discriminant analysis, 90% of the cases were correctly classified and the only misclassifications were between passage and piaffe. However, the 2 horses in which piaffe was misclassified as passage were the gold and silver medallists. In general, higher placed horses tended toward longer diagonal advanced placements, especially in collected trot and passage, and shorter hindlimb stance percentages in passage and piaffe.
Macaluso, P J
2011-02-01
Digital photogrammetric methods were used to collect diameter, area, and perimeter data of the acetabulum for a twentieth-century skeletal sample from France (Georges Olivier Collection, Musée de l'Homme, Paris) consisting of 46 males and 36 females. The measurements were then subjected to both discriminant function and logistic regression analyses in order to develop osteometric standards for sex assessment. Univariate discriminant functions and logistic regression equations yielded overall correct classification accuracy rates for both the left and the right acetabula ranging from 84.1% to 89.6%. The multivariate models developed in this study did not provide increased accuracy over those using only a single variable. Classification sex bias ratios ranged between 1.1% and 7.3% for the majority of models. The results of this study, therefore, demonstrate that metric analysis of acetabular size provides a highly accurate, and easily replicable, method of discriminating sex in this documented skeletal collection. The results further suggest that the addition of area and perimeter data derived from digital images may provide a more effective method of sex assessment than that offered by traditional linear measurements alone. Copyright © 2010 Elsevier GmbH. All rights reserved.
Centered Kernel Alignment Enhancing Neural Network Pretraining for MRI-Based Dementia Diagnosis
Cárdenas-Peña, David; Collazos-Huertas, Diego; Castellanos-Dominguez, German
2016-01-01
Dementia is a growing problem that affects elderly people worldwide. More accurate evaluation of dementia diagnosis can help during the medical examination. Several methods for computer-aided dementia diagnosis have been proposed using resonance imaging scans to discriminate between patients with Alzheimer's disease (AD) or mild cognitive impairment (MCI) and healthy controls (NC). Nonetheless, the computer-aided diagnosis is especially challenging because of the heterogeneous and intermediate nature of MCI. We address the automated dementia diagnosis by introducing a novel supervised pretraining approach that takes advantage of the artificial neural network (ANN) for complex classification tasks. The proposal initializes an ANN based on linear projections to achieve more discriminating spaces. Such projections are estimated by maximizing the centered kernel alignment criterion that assesses the affinity between the resonance imaging data kernel matrix and the label target matrix. As a result, the performed linear embedding allows accounting for features that contribute the most to the MCI class discrimination. We compare the supervised pretraining approach to two unsupervised initialization methods (autoencoders and Principal Component Analysis) and against the best four performing classification methods of the 2014 CADDementia challenge. As a result, our proposal outperforms all the baselines (7% of classification accuracy and area under the receiver-operating-characteristic curve) at the time it reduces the class biasing. PMID:27148392
Arif, Muhammad
2012-06-01
In pattern classification problems, feature extraction is an important step. Quality of features in discriminating different classes plays an important role in pattern classification problems. In real life, pattern classification may require high dimensional feature space and it is impossible to visualize the feature space if the dimension of feature space is greater than four. In this paper, we have proposed a Similarity-Dissimilarity plot which can project high dimensional space to a two dimensional space while retaining important characteristics required to assess the discrimination quality of the features. Similarity-dissimilarity plot can reveal information about the amount of overlap of features of different classes. Separable data points of different classes will also be visible on the plot which can be classified correctly using appropriate classifier. Hence, approximate classification accuracy can be predicted. Moreover, it is possible to know about whom class the misclassified data points will be confused by the classifier. Outlier data points can also be located on the similarity-dissimilarity plot. Various examples of synthetic data are used to highlight important characteristics of the proposed plot. Some real life examples from biomedical data are also used for the analysis. The proposed plot is independent of number of dimensions of the feature space.
Guillette, Lauren M; Farrell, Tara M; Hoeschele, Marisa; Sturdy, Christopher B
2010-01-01
Previous perceptual research with black-capped and mountain chickadees has demonstrated that these species treat each other's namesake chick-a-dee calls as belonging to separate, open-ended categories. Further, the terminal dee portion of the call has been implicated as the most prominent species marker. However, statistical classification using acoustic summary features suggests that all note-types contained within the chick-a-dee call should be sufficient for species classification. The current study seeks to better understand the note-type based mechanisms underlying species-based classification of the chick-a-dee call by black-capped and mountain chickadees. In two, complementary, operant discrimination experiments, both species were trained to discriminate the species of the signaler using either entire chick-a-dee calls, or individual note-types from chick-a-dee calls. In agreement with previous perceptual work we find that the D note had significant stimulus control over species-based discrimination. However, in line with statistical classifications, we find that all note-types carry species information. We discuss reasons why the most easily discriminated note-types are likely candidates to carry species-based cues.
Purcaro, Giorgia; Cordero, Chiara; Liberto, Erica; Bicchi, Carlo; Conte, Lanfranco S
2014-03-21
This study investigates the applicability of an iterative approach aimed at defining a chemical blueprint of virgin olive oil volatiles to be correlated to the product sensory quality. The investigation strategy proposed allows to fully exploit the informative content of a comprehensive multidimensional gas chromatography (GC×GC) coupled to a mass spectrometry (MS) data set. Olive oil samples (19), including 5 reference standards, obtained from the International Olive Oil Council, and commercial samples, were submitted to a sensory evaluation by a Panel test, before being analyzed in two laboratories using different instrumentation, column set, and software elaboration packages in view of a cross-validation of the entire methodology. A first classification of samples based on untargeted peak features information, was obtained on raw data from two different column combinations (apolar×polar and polar×apolar) by applying unsupervised multivariate analysis (i.e., principal component analysis-PCA). However, to improve effectiveness and specificity of this classification, peak features were reliably identified (261 compounds), on the basis of the MS spectrum and linear retention index matching, and subjected to successive pair-wise comparisons based on 2D patterns, which revealed peculiar distribution of chemicals correlated with samples sensory classification. The most informative compounds were thus identified and collected in a "blueprint" of specific defects (or combination of defects) successively adopted to discriminate Extra Virgin from defected oils (i.e., lampante oil) with the aid of a supervised approach, i.e., partial least squares-discriminant analysis (PLS-DA). In this last step, the principles of sensomics, which assigns higher information potential to analytes with lower odor threshold proved to be successful, and a much more powerful discrimination of samples was obtained in view of a sensory quality assessment. Copyright © 2014 Elsevier B.V. All rights reserved.
T.C. Knight; A.W. Ezell; D.R. Shaw; J.D. Byrd; D.L. Evans
2004-01-01
Multispectral reflectance data were collected in midrotation loblolly pine plantations during spring, summer, and fall seasons with a hand-held spectroradiometer. All data were analyzed by discriminant analysis. Analyses resulted in species classifications with accuracies of 83 percent during the spring season, 54 percent during summer, and 82 percent during fall....
Dess, Brian W; Cardarelli, John; Thomas, Mark J; Stapleton, Jeff; Kroutil, Robert T; Miller, David; Curry, Timothy; Small, Gary W
2018-03-08
A generalized methodology was developed for automating the detection of radioisotopes from gamma-ray spectra collected from an aircraft platform using sodium-iodide detectors. Employing data provided by the U.S Environmental Protection Agency Airborne Spectral Photometric Environmental Collection Technology (ASPECT) program, multivariate classification models based on nonparametric linear discriminant analysis were developed for application to spectra that were preprocessed through a combination of altitude-based scaling and digital filtering. Training sets of spectra for use in building classification models were assembled from a combination of background spectra collected in the field and synthesized spectra obtained by superimposing laboratory-collected spectra of target radioisotopes onto field backgrounds. This approach eliminated the need for field experimentation with radioactive sources for use in building classification models. Through a bi-Gaussian modeling procedure, the discriminant scores that served as the outputs from the classification models were related to associated confidence levels. This provided an easily interpreted result regarding the presence or absence of the signature of a specific radioisotope in each collected spectrum. Through the use of this approach, classifiers were built for cesium-137 ( 137 Cs) and cobalt-60 ( 60 Co), two radioisotopes that are of interest in airborne radiological monitoring applications. The optimized classifiers were tested with field data collected from a set of six geographically diverse sites, three of which contained either 137 Cs, 60 Co, or both. When the optimized classification models were applied, the overall percentages of correct classifications for spectra collected at these sites were 99.9 and 97.9% for the 60 Co and 137 Cs classifiers, respectively. Copyright © 2018 Elsevier Ltd. All rights reserved.
Zeng, Ling-Li; Wang, Huaning; Hu, Panpan; Yang, Bo; Pu, Weidan; Shen, Hui; Chen, Xingui; Liu, Zhening; Yin, Hong; Tan, Qingrong; Wang, Kai; Hu, Dewen
2018-04-01
A lack of a sufficiently large sample at single sites causes poor generalizability in automatic diagnosis classification of heterogeneous psychiatric disorders such as schizophrenia based on brain imaging scans. Advanced deep learning methods may be capable of learning subtle hidden patterns from high dimensional imaging data, overcome potential site-related variation, and achieve reproducible cross-site classification. However, deep learning-based cross-site transfer classification, despite less imaging site-specificity and more generalizability of diagnostic models, has not been investigated in schizophrenia. A large multi-site functional MRI sample (n = 734, including 357 schizophrenic patients from seven imaging resources) was collected, and a deep discriminant autoencoder network, aimed at learning imaging site-shared functional connectivity features, was developed to discriminate schizophrenic individuals from healthy controls. Accuracies of approximately 85·0% and 81·0% were obtained in multi-site pooling classification and leave-site-out transfer classification, respectively. The learned functional connectivity features revealed dysregulation of the cortical-striatal-cerebellar circuit in schizophrenia, and the most discriminating functional connections were primarily located within and across the default, salience, and control networks. The findings imply that dysfunctional integration of the cortical-striatal-cerebellar circuit across the default, salience, and control networks may play an important role in the "disconnectivity" model underlying the pathophysiology of schizophrenia. The proposed discriminant deep learning method may be capable of learning reliable connectome patterns and help in understanding the pathophysiology and achieving accurate prediction of schizophrenia across multiple independent imaging sites. Copyright © 2018 German Center for Neurodegenerative Diseases (DZNE). Published by Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Abbey, Craig K.; Eckstein, Miguel P.
2002-01-01
We consider estimation and statistical hypothesis testing on classification images obtained from the two-alternative forced-choice experimental paradigm. We begin with a probabilistic model of task performance for simple forced-choice detection and discrimination tasks. Particular attention is paid to general linear filter models because these models lead to a direct interpretation of the classification image as an estimate of the filter weights. We then describe an estimation procedure for obtaining classification images from observer data. A number of statistical tests are presented for testing various hypotheses from classification images based on some more compact set of features derived from them. As an example of how the methods we describe can be used, we present a case study investigating detection of a Gaussian bump profile.
A comprehensive simulation study on classification of RNA-Seq data.
Zararsız, Gökmen; Goksuluk, Dincer; Korkmaz, Selcuk; Eldem, Vahap; Zararsiz, Gozde Erturk; Duru, Izzet Parug; Ozturk, Ahmet
2017-01-01
RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging powerful method for diagnosis, disease classification and monitoring at molecular level, as well as providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (eg. microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data since they violate both data structure and distributional assumptions. However, it is possible to apply these algorithms with appropriate modifications to RNA-Seq data. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis and negative binomial linear discriminant analysis. Another way is to bring the data closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method on model performances. A comprehensive simulation study is conducted and the results are compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size, differential-expression rate and decreasing the dispersion parameter and number of groups lead to an increase in classification accuracy. Similar with differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion. We conclude that, as a count-based classifier, the power transformed PLDA and, as a microarray-based classifier, vst or rlog transformed RF and SVM classifiers may be a good choice for classification. An R/BIOCONDUCTOR package, MLSeq, is freely available at https://www.bioconductor.org/packages/release/bioc/html/MLSeq.html.
NASA Astrophysics Data System (ADS)
Javidnia, Katayoun; Parish, Maryam; Karimi, Sadegh; Hemmateenejad, Bahram
2013-03-01
By using FT-IR spectroscopy, many researchers from different disciplines enrich the experimental complexity of their research for obtaining more precise information. Moreover chemometrics techniques have boosted the use of IR instruments. In the present study we aimed to emphasize on the power of FT-IR spectroscopy for discrimination between different oil samples (especially fat from vegetable oils). Also our data were used to compare the performance of different classification methods. FT-IR transmittance spectra of oil samples (Corn, Colona, Sunflower, Soya, Olive, and Butter) were measured in the wave-number interval of 450-4000 cm-1. Classification analysis was performed utilizing PLS-DA, interval PLS-DA, extended canonical variate analysis (ECVA) and interval ECVA methods. The effect of data preprocessing by extended multiplicative signal correction was investigated. Whilst all employed method could distinguish butter from vegetable oils, iECVA resulted in the best performances for calibration and external test set with 100% sensitivity and specificity.
Fang, Guihua; Goh, Jing Yeen; Tay, Manjun; Lau, Hiu Fung; Li, Sam Fong Yau
2013-06-01
The correct identification of oils and fats is important to consumers from both commercial and health perspectives. Proton nuclear magnetic resonance ((1)H NMR) spectroscopy, gas chromatography-mass spectrometry (GC/MS) fingerprinting and chemometrics were employed successfully for the quality control of oils and fats. Principal component analysis (PCA) of both techniques showed group clustering of 14 types of oils and fats. Partial least squares discriminant analysis (PLS-DA) and orthogonal projections to latent structures discriminant analysis (OPLS-DA) using GC/MS data had excellent classification sensitivity and specificity compared to models using NMR data. Depending on the availability of the instruments, data from either technique can effectively be applied for the establishment of an oils and fats database to identify unknown samples. Partial least squares (PLS) models were successfully established for the detection of as low as 5% of lard and beef tallow spiked into canola oil, thus illustrating possible applications in Islamic and Jewish countries. Copyright © 2012 Elsevier Ltd. All rights reserved.
Application of quantum-behaved particle swarm optimization to motor imagery EEG classification.
Hsu, Wei-Yen
2013-12-01
In this study, we propose a recognition system for single-trial analysis of motor imagery (MI) electroencephalogram (EEG) data. Applying event-related brain potential (ERP) data acquired from the sensorimotor cortices, the system chiefly consists of automatic artifact elimination, feature extraction, feature selection and classification. In addition to the use of independent component analysis, a similarity measure is proposed to further remove the electrooculographic (EOG) artifacts automatically. Several potential features, such as wavelet-fractal features, are then extracted for subsequent classification. Next, quantum-behaved particle swarm optimization (QPSO) is used to select features from the feature combination. Finally, selected sub-features are classified by support vector machine (SVM). Compared with without artifact elimination, feature selection using a genetic algorithm (GA) and feature classification with Fisher's linear discriminant (FLD) on MI data from two data sets for eight subjects, the results indicate that the proposed method is promising in brain-computer interface (BCI) applications.
Tan, Jin; Li, Rong; Jiang, Zi-Tao
2015-10-01
We report an application of data fusion for chemometric classification of 135 canned samples of Chinese lager beers by manufacturer based on the combination of fluorescence, UV and visible spectroscopies. Right-angle synchronous fluorescence spectra (SFS) at three wavelength difference Δλ=30, 60 and 80 nm and visible spectra in the range 380-700 nm of undiluted beers were recorded. UV spectra in the range 240-400 nm of diluted beers were measured. A classification model was built using principal component analysis (PCA) and linear discriminant analysis (LDA). LDA with cross-validation showed that the data fusion could achieve 78.5-86.7% correct classification (sensitivity), while those rates using individual spectroscopies ranged from 42.2% to 70.4%. The results demonstrated that the fluorescence, UV and visible spectroscopies complemented each other, yielding higher synergic effect. Copyright © 2015 Elsevier Ltd. All rights reserved.
The Raman spectrum character of skin tumor induced by UVB
NASA Astrophysics Data System (ADS)
Wu, Shulian; Hu, Liangjun; Wang, Yunxia; Li, Yongzeng
2016-03-01
In our study, the skin canceration processes induced by UVB were analyzed from the perspective of tissue spectrum. A home-made Raman spectral system with a millimeter order excitation laser spot size combined with a multivariate statistical analysis for monitoring the skin changed irradiated by UVB was studied and the discrimination were evaluated. Raman scattering signals of the SCC and normal skin were acquired. Spectral differences in Raman spectra were revealed. Linear discriminant analysis (LDA) based on principal component analysis (PCA) were employed to generate diagnostic algorithms for the classification of skin SCC and normal. The results indicated that Raman spectroscopy combined with PCA-LDA demonstrated good potential for improving the diagnosis of skin cancers.
Shift-invariant discrete wavelet transform analysis for retinal image classification.
Khademi, April; Krishnan, Sridhar
2007-12-01
This work involves retinal image classification and a novel analysis system was developed. From the compressed domain, the proposed scheme extracts textural features from wavelet coefficients, which describe the relative homogeneity of localized areas of the retinal images. Since the discrete wavelet transform (DWT) is shift-variant, a shift-invariant DWT was explored to ensure that a robust feature set was extracted. To combat the small database size, linear discriminant analysis classification was used with the leave one out method. 38 normal and 48 abnormal (exudates, large drusens, fine drusens, choroidal neovascularization, central vein and artery occlusion, histoplasmosis, arteriosclerotic retinopathy, hemi-central retinal vein occlusion and more) were used and a specificity of 79% and sensitivity of 85.4% were achieved (the average classification rate is 82.2%). The success of the system can be accounted to the highly robust feature set which included translation, scale and semi-rotational, features. Additionally, this technique is database independent since the features were specifically tuned to the pathologies of the human eye.
An Extended Spectral-Spatial Classification Approach for Hyperspectral Data
NASA Astrophysics Data System (ADS)
Akbari, D.
2017-11-01
In this paper an extended classification approach for hyperspectral imagery based on both spectral and spatial information is proposed. The spatial information is obtained by an enhanced marker-based minimum spanning forest (MSF) algorithm. Three different methods of dimension reduction are first used to obtain the subspace of hyperspectral data: (1) unsupervised feature extraction methods including principal component analysis (PCA), independent component analysis (ICA), and minimum noise fraction (MNF); (2) supervised feature extraction including decision boundary feature extraction (DBFE), discriminate analysis feature extraction (DAFE), and nonparametric weighted feature extraction (NWFE); (3) genetic algorithm (GA). The spectral features obtained are then fed into the enhanced marker-based MSF classification algorithm. In the enhanced MSF algorithm, the markers are extracted from the classification maps obtained by both SVM and watershed segmentation algorithm. To evaluate the proposed approach, the Pavia University hyperspectral data is tested. Experimental results show that the proposed approach using GA achieves an approximately 8 % overall accuracy higher than the original MSF-based algorithm.
Typification of cider brandy on the basis of cider used in its manufacture.
Rodríguez Madrera, Roberto; Mangas Alonso, Juan J
2005-04-20
A study of typification of cider brandies on the basis of the origin of the raw material used in their manufacture was conducted using chemometric techniques (principal component analysis, linear discriminant analysis, and Bayesian analysis) together with their composition in volatile compounds, as analyzed by gas chromatography with flame ionization to detect the major volatiles and by mass spectrometric to detect the minor ones. Significant principal components computed by a double cross-validation procedure allowed the structure of the database to be visualized as a function of the raw material, that is, cider made from fresh apple juice versus cider made from apple juice concentrate. Feasible and robust discriminant rules were computed and validated by a cross-validation procedure that allowed the authors to classify fresh and concentrate cider brandies, obtaining classification hits of >92%. The most discriminating variables for typifying cider brandies according to their raw material were 1-butanol and ethyl hexanoate.
NMR-based metabolomic analysis of spatial variation in soft corals.
He, Qing; Sun, Ruiqi; Liu, Huijuan; Geng, Zhufeng; Chen, Dawei; Li, Yinping; Han, Jiao; Lin, Wenhan; Du, Shushan; Deng, Zhiwei
2014-03-28
Soft corals are common marine organisms that inhabit tropical and subtropical oceans. They are shown to be rich source of secondary metabolites with biological activities. In this work, soft corals from two geographical locations were investigated using ¹H-NMR spectroscopy coupled with multivariate statistical analysis at the metabolic level. A partial least-squares discriminant analysis showed clear separation among extracts of soft corals grown in Sanya Bay and Weizhou Island. The specific markers that contributed to discrimination between soft corals in two origins belonged to terpenes, sterols and N-containing compounds. The satisfied precision of classification obtained indicates this approach using combined ¹H-NMR and chemometrics is effective to discriminate soft corals collected in different geographical locations. The results revealed that metabolites of soft corals evidently depended on living environmental condition, which would provide valuable information for further relevant coastal marine environment evaluation.
Kim, Keun Ho; Ku, Boncho; Kang, Namsik; Kim, Young-Su; Jang, Jun-Su; Kim, Jong Yeol
2012-01-01
The voice has been used to classify the four constitution types, and to recognize a subject's health condition by extracting meaningful physical quantities, in traditional Korean medicine. In this paper, we propose a method of selecting the reliable variables from various voice features, such as frequency derivative features, frequency band ratios, and intensity, from vowels and a sentence. Further, we suggest a process to extract independent variables by eliminating explanatory variables and reducing their correlation and remove outlying data to enable reliable discriminant analysis. Moreover, the suitable division of data for analysis, according to the gender and age of subjects, is discussed. Finally, the vocal features are applied to a discriminant analysis to classify each constitution type. This method of voice classification can be widely used in the u-Healthcare system of personalized medicine and for improving diagnostic accuracy. PMID:22529874
Karabagias, Ioannis K; Karabournioti, Sofia
2018-05-03
Twenty-two honey samples, namely clover and citrus honeys, were collected from the greater Cairo area during the harvesting year 2014⁻2015. The main purpose of the present study was to characterize the aforementioned honey types and to investigate whether the use of easily assessable physicochemical parameters, including color attributes in combination with chemometrics, could differentiate honey floral origin. Parameters taken into account were: pH, electrical conductivity, ash, free acidity, lactonic acidity, total acidity, moisture content, total sugars (degrees Brix-°Bx), total dissolved solids and their ratio to total acidity, salinity, CIELAB color parameters, along with browning index values. Results showed that all honey samples analyzed met the European quality standards set for honey and had variations in the aforementioned physicochemical parameters depending on floral origin. Application of linear discriminant analysis showed that eight physicochemical parameters, including color, could classify Egyptian honeys according to floral origin ( p < 0.05). Correct classification rate was 95.5% using the original method and 90.9% using the cross validation method. The discriminatory ability of the developed model was further validated using unknown honey samples. The overall correct classification rate was not affected. Specific physicochemical parameter analysis in combination with chemometrics has the potential to enhance the differences in floral honeys produced in a given geographical zone.
Karabournioti, Sofia
2018-01-01
Twenty-two honey samples, namely clover and citrus honeys, were collected from the greater Cairo area during the harvesting year 2014–2015. The main purpose of the present study was to characterize the aforementioned honey types and to investigate whether the use of easily assessable physicochemical parameters, including color attributes in combination with chemometrics, could differentiate honey floral origin. Parameters taken into account were: pH, electrical conductivity, ash, free acidity, lactonic acidity, total acidity, moisture content, total sugars (degrees Brix-°Bx), total dissolved solids and their ratio to total acidity, salinity, CIELAB color parameters, along with browning index values. Results showed that all honey samples analyzed met the European quality standards set for honey and had variations in the aforementioned physicochemical parameters depending on floral origin. Application of linear discriminant analysis showed that eight physicochemical parameters, including color, could classify Egyptian honeys according to floral origin (p < 0.05). Correct classification rate was 95.5% using the original method and 90.9% using the cross validation method. The discriminatory ability of the developed model was further validated using unknown honey samples. The overall correct classification rate was not affected. Specific physicochemical parameter analysis in combination with chemometrics has the potential to enhance the differences in floral honeys produced in a given geographical zone. PMID:29751543
Automotive System for Remote Surface Classification.
Bystrov, Aleksandr; Hoare, Edward; Tran, Thuy-Yung; Clarke, Nigel; Gashinova, Marina; Cherniakov, Mikhail
2017-04-01
In this paper we shall discuss a novel approach to road surface recognition, based on the analysis of backscattered microwave and ultrasonic signals. The novelty of our method is sonar and polarimetric radar data fusion, extraction of features for separate swathes of illuminated surface (segmentation), and using of multi-stage artificial neural network for surface classification. The developed system consists of 24 GHz radar and 40 kHz ultrasonic sensor. The features are extracted from backscattered signals and then the procedures of principal component analysis and supervised classification are applied to feature data. The special attention is paid to multi-stage artificial neural network which allows an overall increase in classification accuracy. The proposed technique was tested for recognition of a large number of real surfaces in different weather conditions with the average accuracy of correct classification of 95%. The obtained results thereby demonstrate that the use of proposed system architecture and statistical methods allow for reliable discrimination of various road surfaces in real conditions.
Automotive System for Remote Surface Classification
Bystrov, Aleksandr; Hoare, Edward; Tran, Thuy-Yung; Clarke, Nigel; Gashinova, Marina; Cherniakov, Mikhail
2017-01-01
In this paper we shall discuss a novel approach to road surface recognition, based on the analysis of backscattered microwave and ultrasonic signals. The novelty of our method is sonar and polarimetric radar data fusion, extraction of features for separate swathes of illuminated surface (segmentation), and using of multi-stage artificial neural network for surface classification. The developed system consists of 24 GHz radar and 40 kHz ultrasonic sensor. The features are extracted from backscattered signals and then the procedures of principal component analysis and supervised classification are applied to feature data. The special attention is paid to multi-stage artificial neural network which allows an overall increase in classification accuracy. The proposed technique was tested for recognition of a large number of real surfaces in different weather conditions with the average accuracy of correct classification of 95%. The obtained results thereby demonstrate that the use of proposed system architecture and statistical methods allow for reliable discrimination of various road surfaces in real conditions. PMID:28368297
On the Discriminant Analysis in the 2-Populations Case
NASA Astrophysics Data System (ADS)
Rublík, František
2008-01-01
The empirical Bayes Gaussian rule, which in the normal case yields good values of the probability of total error, may yield high values of the maximum probability error. From this point of view the presented modified version of the classification rule of Broffitt, Randles and Hogg appears to be superior. The modification included in this paper is termed as a WR method, and the choice of its weights is discussed. The mentioned methods are also compared with the K nearest neighbours classification rule.
Retinal vasculature classification using novel multifractal features
NASA Astrophysics Data System (ADS)
Ding, Y.; Ward, W. O. C.; Duan, Jinming; Auer, D. P.; Gowland, Penny; Bai, L.
2015-11-01
Retinal blood vessels have been implicated in a large number of diseases including diabetic retinopathy and cardiovascular diseases, which cause damages to retinal blood vessels. The availability of retinal vessel imaging provides an excellent opportunity for monitoring and diagnosis of retinal diseases, and automatic analysis of retinal vessels will help with the processes. However, state of the art vascular analysis methods such as counting the number of branches or measuring the curvature and diameter of individual vessels are unsuitable for the microvasculature. There has been published research using fractal analysis to calculate fractal dimensions of retinal blood vessels, but so far there has been no systematic research extracting discriminant features from retinal vessels for classifications. This paper introduces new methods for feature extraction from multifractal spectra of retinal vessels for classification. Two publicly available retinal vascular image databases are used for the experiments, and the proposed methods have produced accuracies of 85.5% and 77% for classification of healthy and diabetic retinal vasculatures. Experiments show that classification with multiple fractal features produces better rates compared with methods using a single fractal dimension value. In addition to this, experiments also show that classification accuracy can be affected by the accuracy of vessel segmentation algorithms.
ChariDingari, Narahara; Barman, Ishan; Myakalwar, Ashwin Kumar; Tewari, Surya P.; Kumar, G. Manoj
2012-01-01
Despite the intrinsic elemental analysis capability and lack of sample preparation requirements, laser-induced breakdown spectroscopy (LIBS) has not been extensively used for real world applications, e.g. quality assurance and process monitoring. Specifically, variability in sample, system and experimental parameters in LIBS studies present a substantive hurdle for robust classification, even when standard multivariate chemometric techniques are used for analysis. Considering pharmaceutical sample investigation as an example, we propose the use of support vector machines (SVM) as a non-linear classification method over conventional linear techniques such as soft independent modeling of class analogy (SIMCA) and partial least-squares discriminant analysis (PLS-DA) for discrimination based on LIBS measurements. Using over-the-counter pharmaceutical samples, we demonstrate that application of SVM enables statistically significant improvements in prospective classification accuracy (sensitivity), due to its ability to address variability in LIBS sample ablation and plasma self-absorption behavior. Furthermore, our results reveal that SVM provides nearly 10% improvement in correct allocation rate and a concomitant reduction in misclassification rates of 75% (cf. PLS-DA) and 80% (cf. SIMCA)-when measurements from samples not included in the training set are incorporated in the test data – highlighting its robustness. While further studies on a wider matrix of sample types performed using different LIBS systems is needed to fully characterize the capability of SVM to provide superior predictions, we anticipate that the improved sensitivity and robustness observed here will facilitate application of the proposed LIBS-SVM toolbox for screening drugs and detecting counterfeit samples as well as in related areas of forensic and biological sample analysis. PMID:22292496
Dingari, Narahara Chari; Barman, Ishan; Myakalwar, Ashwin Kumar; Tewari, Surya P; Kumar Gundawar, Manoj
2012-03-20
Despite the intrinsic elemental analysis capability and lack of sample preparation requirements, laser-induced breakdown spectroscopy (LIBS) has not been extensively used for real-world applications, e.g., quality assurance and process monitoring. Specifically, variability in sample, system, and experimental parameters in LIBS studies present a substantive hurdle for robust classification, even when standard multivariate chemometric techniques are used for analysis. Considering pharmaceutical sample investigation as an example, we propose the use of support vector machines (SVM) as a nonlinear classification method over conventional linear techniques such as soft independent modeling of class analogy (SIMCA) and partial least-squares discriminant analysis (PLS-DA) for discrimination based on LIBS measurements. Using over-the-counter pharmaceutical samples, we demonstrate that the application of SVM enables statistically significant improvements in prospective classification accuracy (sensitivity), because of its ability to address variability in LIBS sample ablation and plasma self-absorption behavior. Furthermore, our results reveal that SVM provides nearly 10% improvement in correct allocation rate and a concomitant reduction in misclassification rates of 75% (cf. PLS-DA) and 80% (cf. SIMCA)-when measurements from samples not included in the training set are incorporated in the test data-highlighting its robustness. While further studies on a wider matrix of sample types performed using different LIBS systems is needed to fully characterize the capability of SVM to provide superior predictions, we anticipate that the improved sensitivity and robustness observed here will facilitate application of the proposed LIBS-SVM toolbox for screening drugs and detecting counterfeit samples, as well as in related areas of forensic and biological sample analysis.
Predictors of Early Termination in a University Counseling Training Clinic
ERIC Educational Resources Information Center
Lampropoulos, Georgios K.; Schneider, Mercedes K.; Spengler, Paul M.
2009-01-01
Despite the existence of counseling dropout research, there are limited predictive data for counseling in training clinics. Potential predictor variables were investigated in this archival study of 380 client files in a university counseling training clinic. Multinomial logistic regression, predictive discriminant analysis, and classification and…
Using near infrared spectroscopy to classify soybean oil according to expiration date.
da Costa, Gean Bezerra; Fernandes, David Douglas Sousa; Gomes, Adriano A; de Almeida, Valber Elias; Veras, Germano
2016-04-01
A rapid and non-destructive methodology is proposed for the screening of edible vegetable oils according to conservation state expiration date employing near infrared (NIR) spectroscopy and chemometric tools. A total of fifty samples of soybean vegetable oil, of different brands andlots, were used in this study; these included thirty expired and twenty non-expired samples. The oil oxidation was measured by peroxide index. NIR spectra were employed in raw form and preprocessed by offset baseline correction and Savitzky-Golay derivative procedure, followed by PCA exploratory analysis, which showed that NIR spectra would be suitable for the classification task of soybean oil samples. The classification models were based in SPA-LDA (Linear Discriminant Analysis coupled with Successive Projection Algorithm) and PLS-DA (Discriminant Analysis by Partial Least Squares). The set of samples (50) was partitioned into two groups of training (35 samples: 15 non-expired and 20 expired) and test samples (15 samples 5 non-expired and 10 expired) using sample-selection approaches: (i) Kennard-Stone, (ii) Duplex, and (iii) Random, in order to evaluate the robustness of the models. The obtained results for the independent test set (in terms of correct classification rate) were 96% and 98% for SPA-LDA and PLS-DA, respectively, indicating that the NIR spectra can be used as an alternative to evaluate the degree of oxidation of soybean oil samples. Copyright © 2015 Elsevier Ltd. All rights reserved.
Escalante, Yolanda; Saavedra, Jose M; Tella, Victor; Mansilla, Mirella; García-Hermoso, Antonio; Domínguez, Ana M
2013-04-01
The aims of this study were (a) to compare water polo game-related statistics by context (winning and losing teams) and phase (preliminary, classification, and semifinal/bronze medal/gold medal), and (b) identify characteristics that discriminate performances for each phase. The game-related statistics of the 230 men's matches played in World Championships (2007, 2009, and 2011) and European Championships (2008 and 2010) were analyzed. Differences between contexts (winning or losing teams) in each phase (preliminary, classification, and semifinal/bronze medal/gold medal) were determined using the chi-squared statistic, also calculating the effect sizes of the differences. A discriminant analysis was then performed after the sample-splitting method according to context (winning and losing teams) in each of the 3 phases. It was found that the game-related statistics differentiate the winning from the losing teams in each phase of an international championship. The differentiating variables are both offensive and defensive, including action shots, sprints, goalkeeper-blocked shots, and goalkeeper-blocked action shots. However, the number of discriminatory variables decreases as the phase becomes more demanding and the teams become more equally matched. The discriminant analysis showed the game-related statistics to discriminate performance in all phases (preliminary, classificatory, and semifinal/bronze medal/gold medal phase) with high percentages (91, 90, and 73%, respectively). Again, the model selected both defensive and offensive variables.
Silva, Luís; Vaz, João Rocha; Castro, Maria António; Serranho, Pedro; Cabri, Jan; Pezarat-Correia, Pedro
2015-08-01
The quantification of non-linear characteristics of electromyography (EMG) must contain information allowing to discriminate neuromuscular strategies during dynamic skills. There are a lack of studies about muscle coordination under motor constrains during dynamic contractions. In golf, both handicap (Hc) and low back pain (LBP) are the main factors associated with the occurrence of injuries. The aim of this study was to analyze the accuracy of support vector machines SVM on EMG-based classification to discriminate Hc (low and high handicap) and LBP (with and without LPB) in the main phases of golf swing. For this purpose recurrence quantification analysis (RQA) features of the trunk and the lower limb muscles were used to feed a SVM classifier. Recurrence rate (RR) and the ratio between determinism (DET) and RR showed a high discriminant power. The Hc accuracy for the swing, backswing, and downswing were 94.4±2.7%, 97.1±2.3%, and 95.3±2.6%, respectively. For LBP, the accuracy was 96.9±3.8% for the swing, and 99.7±0.4% in the backswing. External oblique (EO), biceps femoris (BF), semitendinosus (ST) and rectus femoris (RF) showed high accuracy depending on the laterality within the phase. RQA features and SVM showed a high muscle discriminant capacity within swing phases by Hc and by LBP. Low back pain golfers showed different neuromuscular coordination strategies when compared with asymptomatic. Copyright © 2015 Elsevier Ltd. All rights reserved.
Aided diagnosis methods of breast cancer based on machine learning
NASA Astrophysics Data System (ADS)
Zhao, Yue; Wang, Nian; Cui, Xiaoyu
2017-08-01
In the field of medicine, quickly and accurately determining whether the patient is malignant or benign is the key to treatment. In this paper, K-Nearest Neighbor, Linear Discriminant Analysis, Logistic Regression were applied to predict the classification of thyroid,Her-2,PR,ER,Ki67,metastasis and lymph nodes in breast cancer, in order to recognize the benign and malignant breast tumors and achieve the purpose of aided diagnosis of breast cancer. The results showed that the highest classification accuracy of LDA was 88.56%, while the classification effect of KNN and Logistic Regression were better than that of LDA, the best accuracy reached 96.30%.
NASA Technical Reports Server (NTRS)
Bauer, M. E.; Cary, T. K.; Davis, B. J.; Swain, P. H.
1975-01-01
The results of classifications and experiments for the crop identification technology assessment for remote sensing are summarized. Using two analysis procedures, 15 data sets were classified. One procedure used class weights while the other assumed equal probabilities of occurrence for all classes. Additionally, 20 data sets were classified using training statistics from another segment or date. The classification and proportion estimation results of the local and nonlocal classifications are reported. Data also describe several other experiments to provide additional understanding of the results of the crop identification technology assessment for remote sensing. These experiments investigated alternative analysis procedures, training set selection and size, effects of multitemporal registration, spectral discriminability of corn, soybeans, and other, and analyses of aircraft multispectral data.
A software tool for automatic classification and segmentation of 2D/3D medical images
NASA Astrophysics Data System (ADS)
Strzelecki, Michal; Szczypinski, Piotr; Materka, Andrzej; Klepaczko, Artur
2013-02-01
Modern medical diagnosis utilizes techniques of visualization of human internal organs (CT, MRI) or of its metabolism (PET). However, evaluation of acquired images made by human experts is usually subjective and qualitative only. Quantitative analysis of MR data, including tissue classification and segmentation, is necessary to perform e.g. attenuation compensation, motion detection, and correction of partial volume effect in PET images, acquired with PET/MR scanners. This article presents briefly a MaZda software package, which supports 2D and 3D medical image analysis aiming at quantification of image texture. MaZda implements procedures for evaluation, selection and extraction of highly discriminative texture attributes combined with various classification, visualization and segmentation tools. Examples of MaZda application in medical studies are also provided.
Wiederoder, Michael S; Nallon, Eric C; Weiss, Matt; McGraw, Shannon K; Schnee, Vincent P; Bright, Collin J; Polcha, Michael P; Paffenroth, Randy; Uzarski, Joshua R
2017-11-22
A cross-reactive array of semiselective chemiresistive sensors made of polymer-graphene nanoplatelet (GNP) composite coated electrodes was examined for detection and discrimination of chemical warfare agents (CWA). The arrays employ a set of chemically diverse polymers to generate a unique response signature for multiple CWA simulants and background interferents. The developed sensors' signal remains consistent after repeated exposures to multiple analytes for up to 5 days with a similar signal magnitude across different replicate sensors with the same polymer-GNP coating. An array of 12 sensors each coated with a different polymer-GNP mixture was exposed 100 times to a cycle of single analyte vapors consisting of 5 chemically similar CWA simulants and 8 common background interferents. The collected data was vector normalized to reduce concentration dependency, z-scored to account for baseline drift and signal-to-noise ratio, and Kalman filtered to reduce noise. The processed data was dimensionally reduced with principal component analysis and analyzed with four different machine learning algorithms to evaluate discrimination capabilities. For 5 similarly structured CWA simulants alone 100% classification accuracy was achieved. For all analytes tested 99% classification accuracy was achieved demonstrating the CWA discrimination capabilities of the developed system. The novel sensor fabrication methods and data processing techniques are attractive for development of sensor platforms for discrimination of CWA and other classes of chemical vapors.
Cerruela García, G; García-Pedrajas, N; Luque Ruiz, I; Gómez-Nieto, M Á
2018-03-01
This paper proposes a method for molecular activity prediction in QSAR studies using ensembles of classifiers constructed by means of two supervised subspace projection methods, namely nonparametric discriminant analysis (NDA) and hybrid discriminant analysis (HDA). We studied the performance of the proposed ensembles compared to classical ensemble methods using four molecular datasets and eight different models for the representation of the molecular structure. Using several measures and statistical tests for classifier comparison, we observe that our proposal improves the classification results with respect to classical ensemble methods. Therefore, we show that ensembles constructed using supervised subspace projections offer an effective way of creating classifiers in cheminformatics.
Benson, Sarah J; Lennard, Christopher J; Maynard, Philip; Hill, David M; Andrew, Anita S; Roux, Claude
2009-06-01
An evaluation was undertaken to determine if isotope ratio mass spectrometry (IRMS) could assist in the investigation of complex forensic cases by providing a level of discrimination not achievable utilising traditional forensic techniques. The focus of the research was on ammonium nitrate (AN), a common oxidiser used in improvised explosive mixtures. The potential value of IRMS to attribute Australian AN samples to the manufacturing source was demonstrated through the development of a preliminary AN classification scheme based on nitrogen isotopes. Although the discrimination utilising nitrogen isotopes alone was limited and only relevant to samples from the three Australian manufacturers during the evaluated time period, the classification scheme has potential as an investigative aid. Combining oxygen and hydrogen stable isotope values permitted the differentiation of AN prills from three different Australian manufacturers. Samples from five different overseas sources could be differentiated utilising a combination of the nitrogen, oxygen and hydrogen isotope values. Limited differentiation between Australian and overseas prills was achieved for the samples analysed. The comparison of nitrogen isotope values from intact AN prill samples with those from post-blast AN prill residues highlighted that the nitrogen isotopic composition of the prills was not maintained post-blast; hence, limiting the technique to analysis of un-reacted explosive material.
König, Caroline; Alquézar, René; Vellido, Alfredo; Giraldo, Jesús
2018-03-01
G-protein-coupled receptors (GPCRs) are a large and diverse super-family of eukaryotic cell membrane proteins that play an important physiological role as transmitters of extracellular signal. In this paper, we investigate Class C, a member of this super-family that has attracted much attention in pharmacology. The limited knowledge about the complete 3D crystal structure of Class C receptors makes necessary the use of their primary amino acid sequences for analytical purposes. Here, we provide a systematic analysis of distinct receptor sequence segments with regard to their ability to differentiate between seven class C GPCR subtypes according to their topological location in the extracellular, transmembrane, or intracellular domains. We build on the results from the previous research that provided preliminary evidence of the potential use of separated domains of complete class C GPCR sequences as the basis for subtype classification. The use of the extracellular N-terminus domain alone was shown to result in a minor decrease in subtype discrimination in comparison with the complete sequence, despite discarding much of the sequence information. In this paper, we describe the use of Support Vector Machine-based classification models to evaluate the subtype-discriminating capacity of the specific topological sequence segments.
Gan, Heng-Hui; Soukoulis, Christos; Fisk, Ian
2014-03-01
In the present work, we have evaluated for first time the feasibility of APCI-MS volatile compound fingerprinting in conjunction with chemometrics (PLS-DA) as a new strategy for rapid and non-destructive food classification. For this purpose 202 clarified monovarietal juices extracted from apples differing in their botanical and geographical origin were used for evaluation of the performance of APCI-MS as a classification tool. For an independent test set PLS-DA analyses of pre-treated spectral data gave 100% and 94.2% correct classification rate for the classification by cultivar and geographical origin, respectively. Moreover, PLS-DA analysis of APCI-MS in conjunction with GC-MS data revealed that masses within the spectral ACPI-MS data set were related with parent ions or fragments of alkyesters, carbonyl compounds (hexanal, trans-2-hexenal) and alcohols (1-hexanol, 1-butanol, cis-3-hexenol) and had significant discriminating power both in terms of cultivar and geographical origin. Copyright © 2013 The Authors. Published by Elsevier Ltd.. All rights reserved.
Voice based gender classification using machine learning
NASA Astrophysics Data System (ADS)
Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.
2017-11-01
Gender identification is one of the major problem speech analysis today. Tracing the gender from acoustic data i.e., pitch, median, frequency etc. Machine learning gives promising results for classification problem in all the research domains. There are several performance metrics to evaluate algorithms of an area. Our Comparative model algorithm for evaluating 5 different machine learning algorithms based on eight different metrics in gender classification from acoustic data. Agenda is to identify gender, with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM) on basis of eight different metrics. The main parameter in evaluating any algorithms is its performance. Misclassification rate must be less in classification problems, which says that the accuracy rate must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. Here with this comparative model algorithm, we are trying to assess the different ML algorithms and find the best fit for gender classification of acoustic data.
Superiority of artificial neural networks for a genetic classification procedure.
Sant'Anna, I C; Tomaz, R S; Silva, G N; Nascimento, M; Bhering, L L; Cruz, C D
2015-08-19
The correct classification of individuals is extremely important for the preservation of genetic variability and for maximization of yield in breeding programs using phenotypic traits and genetic markers. The Fisher and Anderson discriminant functions are commonly used multivariate statistical techniques for these situations, which allow for the allocation of an initially unknown individual to predefined groups. However, for higher levels of similarity, such as those found in backcrossed populations, these methods have proven to be inefficient. Recently, much research has been devoted to developing a new paradigm of computing known as artificial neural networks (ANNs), which can be used to solve many statistical problems, including classification problems. The aim of this study was to evaluate the feasibility of ANNs as an evaluation technique of genetic diversity by comparing their performance with that of traditional methods. The discriminant functions were equally ineffective in discriminating the populations, with error rates of 23-82%, thereby preventing the correct discrimination of individuals between populations. The ANN was effective in classifying populations with low and high differentiation, such as those derived from a genetic design established from backcrosses, even in cases of low differentiation of the data sets. The ANN appears to be a promising technique to solve classification problems, since the number of individuals classified incorrectly by the ANN was always lower than that of the discriminant functions. We envisage the potential relevant application of this improved procedure in the genomic classification of markers to distinguish between breeds and accessions.
ERIC Educational Resources Information Center
Laracy, Seth D.; Hojnoski, Robin L.; Dever, Bridget V.
2016-01-01
Receiver operating characteristic curve (ROC) analysis was used to investigate the ability of early numeracy curriculum-based measures (EN-CBM) administered in preschool to predict performance below the 25th and 40th percentiles on a quantity discrimination measure in kindergarten. Areas under the curve derived from a sample of 279 students ranged…
Discriminative clustering on manifold for adaptive transductive classification.
Zhang, Zhao; Jia, Lei; Zhang, Min; Li, Bing; Zhang, Li; Li, Fanzhang
2017-10-01
In this paper, we mainly propose a novel adaptive transductive label propagation approach by joint discriminative clustering on manifolds for representing and classifying high-dimensional data. Our framework seamlessly combines the unsupervised manifold learning, discriminative clustering and adaptive classification into a unified model. Also, our method incorporates the adaptive graph weight construction with label propagation. Specifically, our method is capable of propagating label information using adaptive weights over low-dimensional manifold features, which is different from most existing studies that usually predict the labels and construct the weights in the original Euclidean space. For transductive classification by our formulation, we first perform the joint discriminative K-means clustering and manifold learning to capture the low-dimensional nonlinear manifolds. Then, we construct the adaptive weights over the learnt manifold features, where the adaptive weights are calculated through performing the joint minimization of the reconstruction errors over features and soft labels so that the graph weights can be joint-optimal for data representation and classification. Using the adaptive weights, we can easily estimate the unknown labels of samples. After that, our method returns the updated weights for further updating the manifold features. Extensive simulations on image classification and segmentation show that our proposed algorithm can deliver the state-of-the-art performance on several public datasets. Copyright © 2017 Elsevier Ltd. All rights reserved.
Forest tree species discrimination in western Himalaya using EO-1 Hyperion
NASA Astrophysics Data System (ADS)
George, Rajee; Padalia, Hitendra; Kushwaha, S. P. S.
2014-05-01
The information acquired in the narrow bands of hyperspectral remote sensing data has potential to capture plant species spectral variability, thereby improving forest tree species mapping. This study assessed the utility of spaceborne EO-1 Hyperion data in discrimination and classification of broadleaved evergreen and conifer forest tree species in western Himalaya. The pre-processing of 242 bands of Hyperion data resulted into 160 noise-free and vertical stripe corrected reflectance bands. Of these, 29 bands were selected through step-wise exclusion of bands (Wilk's Lambda). Spectral Angle Mapper (SAM) and Support Vector Machine (SVM) algorithms were applied to the selected bands to assess their effectiveness in classification. SVM was also applied to broadband data (Landsat TM) to compare the variation in classification accuracy. All commonly occurring six gregarious tree species, viz., white oak, brown oak, chir pine, blue pine, cedar and fir in western Himalaya could be effectively discriminated. SVM produced a better species classification (overall accuracy 82.27%, kappa statistic 0.79) than SAM (overall accuracy 74.68%, kappa statistic 0.70). It was noticed that classification accuracy achieved with Hyperion bands was significantly higher than Landsat TM bands (overall accuracy 69.62%, kappa statistic 0.65). Study demonstrated the potential utility of narrow spectral bands of Hyperion data in discriminating tree species in a hilly terrain.
NASA Astrophysics Data System (ADS)
Seo, Young Wook; Yoon, Seung Chul; Park, Bosoon; Hinton, Arthur; Windham, William R.; Lawrence, Kurt C.
2013-05-01
Salmonella is a major cause of foodborne disease outbreaks resulting from the consumption of contaminated food products in the United States. This paper reports the development of a hyperspectral imaging technique for detecting and differentiating two of the most common Salmonella serotypes, Salmonella Enteritidis (SE) and Salmonella Typhimurium (ST), from background microflora that are often found in poultry carcass rinse. Presumptive positive screening of colonies with a traditional direct plating method is a labor intensive and time consuming task. Thus, this paper is concerned with the detection of differences in spectral characteristics among the pure SE, ST, and background microflora grown on brilliant green sulfa (BGS) and xylose lysine tergitol 4 (XLT4) agar media with a spread plating technique. Visible near-infrared hyperspectral imaging, providing the spectral and spatial information unique to each microorganism, was utilized to differentiate SE and ST from the background microflora. A total of 10 classification models, including five machine learning algorithms, each without and with principal component analysis (PCA), were validated and compared to find the best model in classification accuracy. The five machine learning (classification) algorithms used in this study were Mahalanobis distance (MD), k-nearest neighbor (kNN), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM). The average classification accuracy of all 10 models on a calibration (or training) set of the pure cultures on BGS agar plates was 98% (Kappa coefficient = 0.95) in determining the presence of SE and/or ST although it was difficult to differentiate between SE and ST. The average classification accuracy of all 10 models on a training set for ST detection on XLT4 agar was over 99% (Kappa coefficient = 0.99) although SE colonies on XLT4 agar were difficult to differentiate from background microflora. The average classification accuracy of all 10 models on a validation set of chicken carcass rinses spiked with SE or ST and incubated on BGS agar plates was 94.45% and 83.73%, without and with PCA for classification, respectively. The best performing classification model on the validation set was QDA without PCA by achieving the classification accuracy of 98.65% (Kappa coefficient=0.98). The overall best performing classification model regardless of using PCA was MD with the classification accuracy of 94.84% (Kappa coefficient=0.88) on the validation set.
Classification Techniques for Multivariate Data Analysis.
1980-03-28
analysis among biologists, botanists, and ecologists, while some social scientists may refer "typology". Other frequently encountered terms are pattern...the determinantal equation: lB -XW 0 (42) 49 The solutions X. are the eigenvalues of the matrix W-1 B 1 as in discriminant analysis. There are t non...Statistical Package for Social Sciences (SPSS) (14) subprogram FACTOR was used for the principal components analysis. It is designed both for the factor
Domínguez, Rocio Berenice; Moreno-Barón, Laura; Muñoz, Roberto; Gutiérrez, Juan Manuel
2014-01-01
This paper describes a new method based on a voltammetric electronic tongue (ET) for the recognition of distinctive features in coffee samples. An ET was directly applied to different samples from the main Mexican coffee regions without any pretreatment before the analysis. The resulting electrochemical information was modeled with two different mathematical tools, namely Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM). Growing conditions (i.e., organic or non-organic practices and altitude of crops) were considered for a first classification. LDA results showed an average discrimination rate of 88% ± 6.53% while SVM successfully accomplished an overall accuracy of 96.4% ± 3.50% for the same task. A second classification based on geographical origin of samples was carried out. Results showed an overall accuracy of 87.5% ± 7.79% for LDA and a superior performance of 97.5% ± 3.22% for SVM. Given the complexity of coffee samples, the high accuracy percentages achieved by ET coupled with SVM in both classification problems suggested a potential applicability of ET in the assessment of selected coffee features with a simpler and faster methodology along with a null sample pretreatment. In addition, the proposed method can be applied to authentication assessment while improving cost, time and accuracy of the general procedure. PMID:25254303
Domínguez, Rocio Berenice; Moreno-Barón, Laura; Muñoz, Roberto; Gutiérrez, Juan Manuel
2014-09-24
This paper describes a new method based on a voltammetric electronic tongue (ET) for the recognition of distinctive features in coffee samples. An ET was directly applied to different samples from the main Mexican coffee regions without any pretreatment before the analysis. The resulting electrochemical information was modeled with two different mathematical tools, namely Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM). Growing conditions (i.e., organic or non-organic practices and altitude of crops) were considered for a first classification. LDA results showed an average discrimination rate of 88% ± 6.53% while SVM successfully accomplished an overall accuracy of 96.4% ± 3.50% for the same task. A second classification based on geographical origin of samples was carried out. Results showed an overall accuracy of 87.5% ± 7.79% for LDA and a superior performance of 97.5% ± 3.22% for SVM. Given the complexity of coffee samples, the high accuracy percentages achieved by ET coupled with SVM in both classification problems suggested a potential applicability of ET in the assessment of selected coffee features with a simpler and faster methodology along with a null sample pretreatment. In addition, the proposed method can be applied to authentication assessment while improving cost, time and accuracy of the general procedure.
Longobardi, F; Ventrella, A; Bianco, A; Catucci, L; Cafagna, I; Gallo, V; Mastrorilli, P; Agostiano, A
2013-12-01
In this study, non-targeted (1)H NMR fingerprinting was used in combination with multivariate statistical techniques for the classification of Italian sweet cherries based on their different geographical origins (Emilia Romagna and Puglia). As classification techniques, Soft Independent Modelling of Class Analogy (SIMCA), Partial Least Squares Discriminant Analysis (PLS-DA), and Linear Discriminant Analysis (LDA) were carried out and the results were compared. For LDA, before performing a refined selection of the number/combination of variables, two different strategies for a preliminary reduction of the variable number were tested. The best average recognition and CV prediction abilities (both 100.0%) were obtained for all the LDA models, although PLS-DA also showed remarkable performances (94.6%). All the statistical models were validated by observing the prediction abilities with respect to an external set of cherry samples. The best result (94.9%) was obtained with LDA by performing a best subset selection procedure on a set of 30 principal components previously selected by a stepwise decorrelation. The metabolites that mostly contributed to the classification performances of such LDA model, were found to be malate, glucose, fructose, glutamine and succinate. Copyright © 2013 Elsevier Ltd. All rights reserved.
Kwon, Yong-Kook; Bong, Yeon-Sik; Lee, Kwang-Sik; Hwang, Geum-Sook
2014-10-15
ICP-MS and (1)H NMR are commonly used to determine the geographical origin of food and crops. In this study, data from multielemental analysis performed by ICP-AES/ICP-MS and metabolomic data obtained from (1)H NMR were integrated to improve the reliability of determining the geographical origin of medicinal herbs. Astragalus membranaceus and Paeonia albiflora with different origins in Korea and China were analysed by (1)H NMR and ICP-AES/ICP-MS, and an integrated multivariate analysis was performed to characterise the differences between their origins. Four classification methods were applied: linear discriminant analysis (LDA), k-nearest neighbour classification (KNN), support vector machines (SVM), and partial least squares-discriminant analysis (PLS-DA). Results were compared using leave-one-out cross-validation and external validation. The integration of multielemental and metabolomic data was more suitable for determining geographical origin than the use of each individual data set alone. The integration of the two analytical techniques allowed diverse environmental factors such as climate and geology, to be considered. Our study suggests that an appropriate integration of different types of analytical data is useful for determining the geographical origin of food and crops with a high degree of reliability. Copyright © 2014 Elsevier Ltd. All rights reserved.
Besga, Ariadna; Chyzhyk, Darya; González-Ortega, Itxaso; Savio, Alexandre; Ayerdi, Borja; Echeveste, Jon; Graña, Manuel; González-Pinto, Ana
2016-01-01
Late Onset Bipolar Disorder (LOBD) is the arousal of Bipolar Disorder (BD) at old age (>60) without any previous history of disorders. LOBD is often difficult to distinguish from degenerative dementias, such as Alzheimer Disease (AD), due to comorbidities and common cognitive symptoms. Moreover, LOBD prevalence is increasing due to population aging. Biomarkers extracted from blood plasma are not discriminant because both pathologies share pathophysiological features related to neuroinflammation, therefore we look for anatomical features highly correlated with blood biomarkers that allow accurate diagnosis prediction. This may shed some light on the basic biological mechanisms leading to one or another disease. Moreover, accurate diagnosis is needed to select the best personalized treatment. We look for white matter features which are correlated with blood plasma biomarkers (inflammatory and neurotrophic) discriminating LOBD from AD. A sample of healthy controls (HC) (n=19), AD patients (n=35), and BD patients (n=24) has been recruited at the Alava University Hospital. Plasma biomarkers have been obtained at recruitment time. Diffusion weighted (DWI) magnetic resonance imaging (MRI) are obtained for each subject. DWI is preprocessed to obtain diffusion tensor imaging (DTI) data, which is reduced to fractional anisotropy (FA) data. In the selection phase, eigenanatomy finds FA eigenvolumes maximally correlated with plasma biomarkers by partial sparse canonical correlation analysis (PSCCAN). In the analysis phase, we take the eigenvolume projection coefficients as the classification features, carrying out cross-validation of support vector machine (SVM) to obtain discrimination power of each biomarker effects. The John Hopkins Universtiy white matter atlas is used to provide anatomical localizations of the detected feature clusters. Classification results show that one specific biomarker of oxidative stress (malondialdehyde MDA) gives the best classification performance ( accuracy 85%, F-score 86%, sensitivity, and specificity 87%, ) in the discrimination of AD and LOBD. Discriminating features appear to be localized in the posterior limb of the internal capsule and superior corona radiata. It is feasible to support contrast diagnosis among LOBD and AD by means of predictive classifiers based on eigenanatomy features computed from FA imaging correlated to plasma biomarkers. In addition, white matter eigenanatomy localizations offer some new avenues to assess the differential pathophysiology of LOBD and AD.
Collected Notes on the Workshop for Pattern Discovery in Large Databases
NASA Technical Reports Server (NTRS)
Buntine, Wray (Editor); Delalto, Martha (Editor)
1991-01-01
These collected notes are a record of material presented at the Workshop. The core data analysis is addressed that have traditionally required statistical or pattern recognition techniques. Some of the core tasks include classification, discrimination, clustering, supervised and unsupervised learning, discovery and diagnosis, i.e., general pattern discovery.
Mining for class-specific motifs in protein sequence classification
2013-01-01
Background In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features indeed contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function, initially, harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution has resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weightage to the n-grams that appear in fewer classes and vice-versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences. Our method fared well compared to an existing motif finding method known as Wordspy. We have validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is very generic; thus can be widely applied to detect class-specific motifs in many protein sequence classification tasks. Conclusion The proposed scoring function and methodology is able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The implementation of amino acid substitution scores for similarity detection, and the dampening factor to normalize the unbalanced datasets have significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes with a potential application to detecting proteome-specific motifs of different organisms. PMID:23496846
Detection of non-milk fat in milk fat by gas chromatography and linear discriminant analysis.
Gutiérrez, R; Vega, S; Díaz, G; Sánchez, J; Coronado, M; Ramírez, A; Pérez, J; González, M; Schettino, B
2009-05-01
Gas chromatography was utilized to determine triacylglycerol profiles in milk and non-milk fat. The values of triacylglycerol were subjected to linear discriminant analysis to detect and quantify non-milk fat in milk fat. Two groups of milk fat were analyzed: A) raw milk fat from the central region of Mexico (n = 216) and B) ultrapasteurized milk fat from 3 industries (n = 36), as well as pork lard (n = 2), bovine tallow (n = 2), fish oil (n = 2), peanut (n = 2), corn (n = 2), olive (n = 2), and soy (n = 2). The samples of raw milk fat were adulterated with non-milk fats in proportions of 0, 5, 10, 15, and 20% to form 5 groups. The first function obtained from the linear discriminant analysis allowed the correct classification of 94.4% of the samples with levels <10% of adulteration. The triacylglycerol values of the ultrapasteurized milk fats were evaluated with the discriminant function, demonstrating that one industry added non-milk fat to its product in 80% of the samples analyzed.
NASA Astrophysics Data System (ADS)
Gutierrez-Velez, V. H.; DeFries, R. S.
2011-12-01
Oil palm expansion has led to clearing of extensive forest areas in the tropics. However quantitative assessments of the magnitude of oil palm expansion to deforestation have been challenging due in large part to the limitations presented by conventional optical data sets for discriminating plantations from forests and other tree cover vegetations. Recently available information from active remote sensors has opened the possibility of using these data sources to overcome these limitations. The purpose of this analysis is to evaluate the accuracy of oil palm classification when using ALOS/PALSAR active satellite data in conjunction with Landsat information, compared to the use of Landsat data only. The analysis takes place in a focused region around the city of Pucallpa in the Ucayali province of the Peruvian Amazon for the year 2010. Oil palm plantations were separated in five categories consisting of four age classes (0-3, 3-5, 5-10 and > 10 yrs) and an additional class accounting for degraded plantations older than 15 yr. Other land covers were water bodies, unvegetated land, short and tall grass, fallow, secondary vegetation, and forest. Classifications were performed using random forests. Training points for calibration and validation consisted of 411 polygons measured in areas representative of the land covers of interest and totaled 6,367 ha. Overall classification accuracy increased from 89.9% using only Landsat data sets to 94.3% using both Landast and ALOS/PALSAR. Both user's and producer's accuracy increased in all classes when using both data sets except for producer's accuracy in short grass which decreased by 1%. The largest increase in user's accuracy was obtained in oil palm plantations older than 10 years from 62 to 80% while producer's accuracy improved the most in plantations in age class 3-5 from 63 to 80%. Results demonstrate the suitability of data from ALOS/PALSAR and other active remote sensors to improve classification of oil palm plantations in age classes and discriminate them from other land covers. Results suggest a potential for improving discrimination of other tree cover types using a combination of active and conventional optical remote sensors.
Improved EEG Event Classification Using Differential Energy.
Harati, A; Golmohammadi, M; Lopez, S; Obeid, I; Picone, J
2015-12-01
Feature extraction for automatic classification of EEG signals typically relies on time frequency representations of the signal. Techniques such as cepstral-based filter banks or wavelets are popular analysis techniques in many signal processing applications including EEG classification. In this paper, we present a comparison of a variety of approaches to estimating and postprocessing features. To further aid in discrimination of periodic signals from aperiodic signals, we add a differential energy term. We evaluate our approaches on the TUH EEG Corpus, which is the largest publicly available EEG corpus and an exceedingly challenging task due to the clinical nature of the data. We demonstrate that a variant of a standard filter bank-based approach, coupled with first and second derivatives, provides a substantial reduction in the overall error rate. The combination of differential energy and derivatives produces a 24 % absolute reduction in the error rate and improves our ability to discriminate between signal events and background noise. This relatively simple approach proves to be comparable to other popular feature extraction approaches such as wavelets, but is much more computationally efficient.
NASA Astrophysics Data System (ADS)
Karahaliou, A.; Vassiou, K.; Skiadopoulos, S.; Kanavou, T.; Yiakoumelos, A.; Costaridou, L.
2009-07-01
The current study investigates whether texture features extracted from lesion kinetics feature maps can be used for breast cancer diagnosis. Fifty five women with 57 breast lesions (27 benign, 30 malignant) were subjected to dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) on 1.5T system. A linear-slope model was fitted pixel-wise to a representative lesion slice time series and fitted parameters were used to create three kinetic maps (wash out, time to peak enhancement and peak enhancement). 28 grey level co-occurrence matrices features were extracted from each lesion kinetic map. The ability of texture features per map in discriminating malignant from benign lesions was investigated using a Probabilistic Neural Network classifier. Additional classification was performed by combining classification outputs of most discriminating feature subsets from the three maps, via majority voting. The combined scheme outperformed classification based on individual maps achieving area under Receiver Operating Characteristics curve 0.960±0.029. Results suggest that heterogeneity of breast lesion kinetics, as quantified by texture analysis, may contribute to computer assisted tissue characterization in DCE-MRI.
Towards exaggerated emphysema stereotypes
NASA Astrophysics Data System (ADS)
Chen, C.; Sørensen, L.; Lauze, F.; Igel, C.; Loog, M.; Feragen, A.; de Bruijne, M.; Nielsen, M.
2012-03-01
Classification is widely used in the context of medical image analysis and in order to illustrate the mechanism of a classifier, we introduce the notion of an exaggerated image stereotype based on training data and trained classifier. The stereotype of some image class of interest should emphasize/exaggerate the characteristic patterns in an image class and visualize the information the employed classifier relies on. This is useful for gaining insight into the classification and serves for comparison with the biological models of disease. In this work, we build exaggerated image stereotypes by optimizing an objective function which consists of a discriminative term based on the classification accuracy, and a generative term based on the class distributions. A gradient descent method based on iterated conditional modes (ICM) is employed for optimization. We use this idea with Fisher's linear discriminant rule and assume a multivariate normal distribution for samples within a class. The proposed framework is applied to computed tomography (CT) images of lung tissue with emphysema. The synthesized stereotypes illustrate the exaggerated patterns of lung tissue with emphysema, which is underpinned by three different quantitative evaluation methods.
A Discriminant Distance Based Composite Vector Selection Method for Odor Classification
Choi, Sang-Il; Jeong, Gu-Min
2014-01-01
We present a composite vector selection method for an effective electronic nose system that performs well even in noisy environments. Each composite vector generated from a electronic nose data sample is evaluated by computing the discriminant distance. By quantitatively measuring the amount of discriminative information in each composite vector, composite vectors containing informative variables can be distinguished and the final composite features for odor classification are extracted using the selected composite vectors. Using the only informative composite vectors can be also helpful to extract better composite features instead of using all the generated composite vectors. Experimental results with different volatile organic compound data show that the proposed system has good classification performance even in a noisy environment compared to other methods. PMID:24747735
Basati, Zahra; Jamshidi, Bahareh; Rasekh, Mansour; Abbaspour-Gilandeh, Yousef
2018-05-30
The presence of sunn pest-damaged grains in wheat mass reduces the quality of flour and bread produced from it. Therefore, it is essential to assess the quality of the samples in collecting and storage centers of wheat and flour mills. In this research, the capability of visible/near-infrared (Vis/NIR) spectroscopy combined with pattern recognition methods was investigated for discrimination of wheat samples with different percentages of sunn pest-damaged. To this end, various samples belonging to five classes (healthy and 5%, 10%, 15% and 20% unhealthy) were analyzed using Vis/NIR spectroscopy (wavelength range of 350-1000 nm) based on both supervised and unsupervised pattern recognition methods. Principal component analysis (PCA) and hierarchical cluster analysis (HCA) as the unsupervised techniques and soft independent modeling of class analogies (SIMCA) and partial least squares-discriminant analysis (PLS-DA) as supervised methods were used. The results showed that Vis/NIR spectra of healthy samples were correctly clustered using both PCA and HCA. Due to the high overlapping between the four unhealthy classes (5%, 10%, 15% and 20%), it was not possible to discriminate all the unhealthy samples in individual classes. However, when considering only the two main categories of healthy and unhealthy, an acceptable degree of separation between the classes can be obtained after classification with supervised pattern recognition methods of SIMCA and PLS-DA. SIMCA based on PCA modeling correctly classified samples in two classes of healthy and unhealthy with classification accuracy of 100%. Moreover, the power of the wavelengths of 839 nm, 918 nm and 995 nm were more than other wavelengths to discriminate two classes of healthy and unhealthy. It was also concluded that PLS-DA provides excellent classification results of healthy and unhealthy samples (R 2 = 0.973 and RMSECV = 0.057). Therefore, Vis/NIR spectroscopy based on pattern recognition techniques can be useful for rapid distinguishing the healthy wheat samples from those damaged by sunn pest in the maintenance and processing centers. Copyright © 2018 Elsevier B.V. All rights reserved.
Gagné, Mathieu; Moore, Lynne; Beaudoin, Claudia; Batomen Kuimi, Brice Lionel; Sirois, Marie-Josée
2016-03-01
The International Classification of Diseases (ICD) is the main classification system used for population-based injury surveillance activities but does not contain information on injury severity. ICD-based injury severity measures can be empirically derived or mapped, but no single approach has been formally recommended. This study aimed to compare the performance of ICD-based injury severity measures to predict in-hospital mortality among injury-related admissions. A systematic review and a meta-analysis were conducted. MEDLINE, EMBASE, and Global Health databases were searched from their inception through September 2014. Observational studies that assessed the performance of ICD-based injury severity measures to predict in-hospital mortality and reported discriminative ability using the area under a receiver operating characteristic curve (AUC) were included. Metrics of model performance were extracted. Pooled AUC were estimated under random-effects models. Twenty-two eligible studies reported 72 assessments of discrimination on ICD-based injury severity measures. Reported AUC ranged from 0.681 to 0.958. Of the 72 assessments, 46 showed excellent (0.80 ≤ AUC < 0.90) and 6 outstanding (AUC ≥ 0.90) discriminative ability. Pooled AUC for ICD-based Injury Severity Score (ICISS) based on the product of traditional survival proportions was significantly higher than measures based on ICD mapped to Abbreviated Injury Scale (AIS) scores (0.863 vs. 0.825 for ICDMAP-ISS [p = 0.005] and ICDMAP-NISS [p = 0.016]). Similar results were observed when studies were stratified by the type of data used (trauma registry or hospital discharge) or the provenance of survival proportions (internally or externally derived). However, among studies published after 2003 the Trauma Mortality Prediction Model based on ICD-9 codes (TMPM-9) demonstrated superior discriminative ability than ICISS using the product of traditional survival proportions (0.850 vs. 0.802, p = 0.002). Models generally showed poor calibration. ICISS using the product of traditional survival proportions and TMPM-9 predict mortality more accurately than those mapped to AIS codes and should be preferred for describing injury severity when ICD is used to record injury diagnoses. Systematic review and meta-analysis, level III.
A hybrid sensing approach for pure and adulterated honey classification.
Subari, Norazian; Mohamad Saleh, Junita; Md Shakaff, Ali Yeon; Zakaria, Ammar
2012-10-17
This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach able to distinct pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data.
Toward improving fine needle aspiration cytology by applying Raman microspectroscopy
NASA Astrophysics Data System (ADS)
Becker-Putsche, Melanie; Bocklitz, Thomas; Clement, Joachim; Rösch, Petra; Popp, Jürgen
2013-04-01
Medical diagnosis of biopsies performed by fine needle aspiration has to be very reliable. Therefore, pathologists/cytologists need additional biochemical information on single cancer cells for an accurate diagnosis. Accordingly, we applied three different classification models for discriminating various features of six breast cancer cell lines by analyzing Raman microspectroscopic data. The statistical evaluations are implemented by linear discriminant analysis (LDA) and support vector machines (SVM). For the first model, a total of 61,580 Raman spectra from 110 single cells are discriminated at the cell-line level with an accuracy of 99.52% using an SVM. The LDA classification based on Raman data achieved an accuracy of 94.04% by discriminating cell lines by their origin (solid tumor versus pleural effusion). In the third model, Raman cell spectra are classified by their cancer subtypes. LDA results show an accuracy of 97.45% and specificities of 97.78%, 99.11%, and 98.97% for the subtypes basal-like, HER2+/ER-, and luminal, respectively. These subtypes are confirmed by gene expression patterns, which are important prognostic features in diagnosis. This work shows the applicability of Raman spectroscopy and statistical data handling in analyzing cancer-relevant biochemical information for advanced medical diagnosis on the single-cell level.
Abdolali, Fatemeh; Zoroofi, Reza Aghaeizadeh; Otake, Yoshito; Sato, Yoshinobu
2017-02-01
Accurate detection of maxillofacial cysts is an essential step for diagnosis, monitoring and planning therapeutic intervention. Cysts can be of various sizes and shapes and existing detection methods lead to poor results. Customizing automatic detection systems to gain sufficient accuracy in clinical practice is highly challenging. For this purpose, integrating the engineering knowledge in efficient feature extraction is essential. This paper presents a novel framework for maxillofacial cysts detection. A hybrid methodology based on surface and texture information is introduced. The proposed approach consists of three main steps as follows: At first, each cystic lesion is segmented with high accuracy. Then, in the second and third steps, feature extraction and classification are performed. Contourlet and SPHARM coefficients are utilized as texture and shape features which are fed into the classifier. Two different classifiers are used in this study, i.e. support vector machine and sparse discriminant analysis. Generally SPHARM coefficients are estimated by the iterative residual fitting (IRF) algorithm which is based on stepwise regression method. In order to improve the accuracy of IRF estimation, a method based on extra orthogonalization is employed to reduce linear dependency. We have utilized a ground-truth dataset consisting of cone beam CT images of 96 patients, belonging to three maxillofacial cyst categories: radicular cyst, dentigerous cyst and keratocystic odontogenic tumor. Using orthogonalized SPHARM, residual sum of squares is decreased which leads to a more accurate estimation. Analysis of the results based on statistical measures such as specificity, sensitivity, positive predictive value and negative predictive value is reported. The classification rate of 96.48% is achieved using sparse discriminant analysis and orthogonalized SPHARM features. Classification accuracy at least improved by 8.94% with respect to conventional features. This study demonstrated that our proposed methodology can improve the computer assisted diagnosis (CAD) performance by incorporating more discriminative features. Using orthogonalized SPHARM is promising in computerized cyst detection and may have a significant impact in future CAD systems. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Liu, Xiaona; Zhang, Qiao; Wu, Zhisheng; Shi, Xinyuan; Zhao, Na; Qiao, Yanjiang
2015-01-01
Laser-induced breakdown spectroscopy (LIBS) was applied to perform a rapid elemental analysis and provenance study of Blumea balsamifera DC. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were implemented to exploit the multivariate nature of the LIBS data. Scores and loadings of computed principal components visually illustrated the differing spectral data. The PLS-DA algorithm showed good classification performance. The PLS-DA model using complete spectra as input variables had similar discrimination performance to using selected spectral lines as input variables. The down-selection of spectral lines was specifically focused on the major elements of B. balsamifera samples. Results indicated that LIBS could be used to rapidly analyze elements and to perform provenance study of B. balsamifera. PMID:25558999
Eric R. Scholl; Thomas A. Waldrop
1999-01-01
Although prescribed burning is common in the Southeastern United States, most fuel models apply to only western forests. This paper documents a fuel classification system that was developed for plantations of loblolly and longleaf pines for the Upper Coastal Plain region. Multivariate analysis of variance and discriminant function analysis were used to confirm eight...
Dantas, Hebertty V; Barbosa, Mayara F; Nascimento, Elaine C L; Moreira, Pablo N T; Galvão, Roberto K H; Araújo, Mário C U
2013-03-15
This paper proposes a NIR spectrometric method for screening analysis of liquefied petroleum gas (LPG) samples. The proposed method is aimed at discriminating samples with low and high propane content, which can be useful for the adjustment of burn settings in industrial applications. A gas flow system was developed to introduce the LPG sample into a NIR flow cell at constant pressure. In addition, a gas chromatographer was employed to determine the propane content of the sample for reference purposes. The results of a principal component analysis, as well as a classification study using SIMCA (soft independent modeling of class analogies), revealed that the samples can be successfully discriminated with respect to propane content by using the NIR spectrum in the range 8100-8800 cm(-1). In addition, by using SPA-LDA (linear discriminant analysis with variables selected by the successive projections algorithm), it was found that perfect discrimination can also be achieved by using only two wavenumbers (8215 and 8324 cm(-1)). This finding may be of value for the design of a dedicated, low-cost instrument for routine analyses. Copyright © 2012 Elsevier B.V. All rights reserved.
Variations in the Intragene Methylation Profiles Hallmark Induced Pluripotency
Druzhkov, Pavel; Zolotykh, Nikolay; Meyerov, Iosif; Alsaedi, Ahmed; Shutova, Maria; Ivanchenko, Mikhail; Zaikin, Alexey
2015-01-01
We demonstrate the potential of differentiating embryonic and induced pluripotent stem cells by the regularized linear and decision tree machine learning classification algorithms, based on a number of intragene methylation measures. The resulting average accuracy of classification has been proven to be above 95%, which overcomes the earlier achievements. We propose a constructive and transparent method of feature selection based on classifier accuracy. Enrichment analysis reveals statistically meaningful presence of stemness group and cancer discriminating genes among the selected best classifying features. These findings stimulate the further research on the functional consequences of these differences in methylation patterns. The presented approach can be broadly used to discriminate the cells of different phenotype or in different state by their methylation profiles, identify groups of genes constituting multifeature classifiers, and assess enrichment of these groups by the sets of genes with a functionality of interest. PMID:26618180
Classification of lung cancer histology by gold nanoparticle sensors
Barash, Orna; Peled, Nir; Tisch, Ulrike; Bunn, Paul A.; Hirsch, Fred R.; Haick, Hossam
2016-01-01
We propose a nanomedical device for the classification of lung cancer (LC) histology. The device profiles volatile organic compounds (VOCs) in the headspace of (subtypes of) LC cells, using gold nanoparticle (GNP) sensors that are suitable for detecting LC-specific patterns of VOC profiles, as determined by gas chromatography–mass spectrometry analysis. Analyzing the GNP sensing signals by support vector machine allowed significant discrimination between (i) LC and healthy cells; (ii) small cell LC and non–small cell LC; and between (iii) two subtypes of non–small cell LC: adenocarcinoma and squamous cell carcinoma. The discriminative power of the GNP sensors was then linked with the chemical nature and composition of the headspace VOCs of each LC state. These proof-of-concept findings could totally revolutionize LC screening and diagnosis, and might eventually allow early and differential diagnosis of LC subtypes with detectable or unreachable lung nodules. PMID:22033081
Challenges in discriminating profanity from hate speech
NASA Astrophysics Data System (ADS)
Malmasi, Shervin; Zampieri, Marcos
2018-03-01
In this study, we approach the problem of distinguishing general profanity from hate speech in social media, something which has not been widely considered. Using a new dataset annotated specifically for this task, we employ supervised classification along with a set of features that includes ?-grams, skip-grams and clustering-based word representations. We apply approaches based on single classifiers as well as more advanced ensemble classifiers and stacked generalisation, achieving the best result of ? accuracy for this 3-class classification task. Analysis of the results reveals that discriminating hate speech and profanity is not a simple task, which may require features that capture a deeper understanding of the text not always possible with surface ?-grams. The variability of gold labels in the annotated data, due to differences in the subjective adjudications of the annotators, is also an issue. Other directions for future work are discussed.
NASA Astrophysics Data System (ADS)
Chen, Xue; Li, Xiaohui; Yu, Xin; Chen, Deying; Liu, Aichun
2018-01-01
Diagnosis of malignancies is a challenging clinical issue. In this work, we present quick and robust diagnosis and discrimination of lymphoma and multiple myeloma (MM) using laser-induced breakdown spectroscopy (LIBS) conducted on human serum samples, in combination with chemometric methods. The serum samples collected from lymphoma and MM cancer patients and healthy controls were deposited on filter papers and ablated with a pulsed 1064 nm Nd:YAG laser. 24 atomic lines of Ca, Na, K, H, O, and N were selected for malignancy diagnosis. Principal component analysis (PCA), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and k nearest neighbors (kNN) classification were applied to build the malignancy diagnosis and discrimination models. The performances of the models were evaluated using 10-fold cross validation. The discrimination accuracy, confusion matrix and receiver operating characteristic (ROC) curves were obtained. The values of area under the ROC curve (AUC), sensitivity and specificity at the cut-points were determined. The kNN model exhibits the best performances with overall discrimination accuracy of 96.0%. Distinct discrimination between malignancies and healthy controls has been achieved with AUC, sensitivity and specificity for healthy controls all approaching 1. For lymphoma, the best discrimination performance values are AUC = 0.990, sensitivity = 0.970 and specificity = 0.956. For MM, the corresponding values are AUC = 0.986, sensitivity = 0.892 and specificity = 0.994. The results show that the serum-LIBS technique can serve as a quick, less invasive and robust method for diagnosis and discrimination of human malignancies.
McCabe, Ciara; Rocha-Rego, Vanessa
2016-01-01
Dysfunctional neural responses to appetitive and aversive stimuli have been investigated as possible biomarkers for psychiatric disorders. However it is not clear to what degree these are separate processes across the brain or in fact overlapping systems. To help clarify this issue we used Gaussian process classifier (GPC) analysis to examine appetitive and aversive processing in the brain. 25 healthy controls underwent functional MRI whilst seeing pictures and receiving tastes of pleasant and unpleasant food. We applied GPCs to discriminate between the appetitive and aversive sights and tastes using functional activity patterns. The diagnostic accuracy of the GPC for the accuracy to discriminate appetitive taste from neutral condition was 86.5% (specificity = 81%, sensitivity = 92%, p = 0.001). If a participant experienced neutral taste stimuli the probability of correct classification was 92. The accuracy to discriminate aversive from neutral taste stimuli was 82.5% (specificity = 73%, sensitivity = 92%, p = 0.001) and appetitive from aversive taste stimuli was 73% (specificity = 77%, sensitivity = 69%, p = 0.001). In the sight modality, the accuracy to discriminate appetitive from neutral condition was 88.5% (specificity = 85%, sensitivity = 92%, p = 0.001), to discriminate aversive from neutral sight stimuli was 92% (specificity = 92%, sensitivity = 92%, p = 0.001), and to discriminate aversive from appetitive sight stimuli was 63.5% (specificity = 73%, sensitivity = 54%, p = 0.009). Our results demonstrate the predictive value of neurofunctional data in discriminating emotional and neutral networks of activity in the healthy human brain. It would be of interest to use pattern recognition techniques and fMRI to examine network dysfunction in the processing of appetitive, aversive and neutral stimuli in psychiatric disorders. Especially where problems with reward and punishment processing have been implicated in the pathophysiology of the disorder.
Histopathological Image Classification using Discriminative Feature-oriented Dictionary Learning
Vu, Tiep Huu; Mousavi, Hojjat Seyed; Monga, Vishal; Rao, Ganesh; Rao, UK Arvind
2016-01-01
In histopathological image analysis, feature extraction for classification is a challenging task due to the diversity of histology features suitable for each problem as well as presence of rich geometrical structures. In this paper, we propose an automatic feature discovery framework via learning class-specific dictionaries and present a low-complexity method for classification and disease grading in histopathology. Essentially, our Discriminative Feature-oriented Dictionary Learning (DFDL) method learns class-specific dictionaries such that under a sparsity constraint, the learned dictionaries allow representing a new image sample parsimoniously via the dictionary corresponding to the class identity of the sample. At the same time, the dictionary is designed to be poorly capable of representing samples from other classes. Experiments on three challenging real-world image databases: 1) histopathological images of intraductal breast lesions, 2) mammalian kidney, lung and spleen images provided by the Animal Diagnostics Lab (ADL) at Pennsylvania State University, and 3) brain tumor images from The Cancer Genome Atlas (TCGA) database, reveal the merits of our proposal over state-of-the-art alternatives. Moreover, we demonstrate that DFDL exhibits a more graceful decay in classification accuracy against the number of training images which is highly desirable in practice where generous training is often not available. PMID:26513781
Hu, Fei; Cheng, Yayun; Gui, Liangqi; Wu, Liang; Zhang, Xinyi; Peng, Xiaohui; Su, Jinlong
2016-11-01
The polarization properties of thermal millimeter-wave emission capture inherent information of objects, e.g., material composition, shape, and surface features. In this paper, a polarization-based material-classification technique using passive millimeter-wave polarimetric imagery is presented. Linear polarization ratio (LPR) is created to be a new feature discriminator that is sensitive to material type and to remove the reflected ambient radiation effect. The LPR characteristics of several common natural and artificial materials are investigated by theoretical and experimental analysis. Based on a priori information about LPR characteristics, the optimal range of incident angle and the classification criterion are discussed. Simulation and measurement results indicate that the presented classification technique is effective for distinguishing between metals and dielectrics. This technique suggests possible applications for outdoor metal target detection in open scenes.
Gromski, Piotr S; Correa, Elon; Vaughan, Andrew A; Wedge, David C; Turner, Michael L; Goodacre, Royston
2014-11-01
Accurate detection of certain chemical vapours is important, as these may be diagnostic for the presence of weapons, drugs of misuse or disease. In order to achieve this, chemical sensors could be deployed remotely. However, the readout from such sensors is a multivariate pattern, and this needs to be interpreted robustly using powerful supervised learning methods. Therefore, in this study, we compared the classification accuracy of four pattern recognition algorithms which include linear discriminant analysis (LDA), partial least squares-discriminant analysis (PLS-DA), random forests (RF) and support vector machines (SVM) which employed four different kernels. For this purpose, we have used electronic nose (e-nose) sensor data (Wedge et al., Sensors Actuators B Chem 143:365-372, 2009). In order to allow direct comparison between our four different algorithms, we employed two model validation procedures based on either 10-fold cross-validation or bootstrapping. The results show that LDA (91.56% accuracy) and SVM with a polynomial kernel (91.66% accuracy) were very effective at analysing these e-nose data. These two models gave superior prediction accuracy, sensitivity and specificity in comparison to the other techniques employed. With respect to the e-nose sensor data studied here, our findings recommend that SVM with a polynomial kernel should be favoured as a classification method over the other statistical models that we assessed. SVM with non-linear kernels have the advantage that they can be used for classifying non-linear as well as linear mapping from analytical data space to multi-group classifications and would thus be a suitable algorithm for the analysis of most e-nose sensor data.
NASA Astrophysics Data System (ADS)
Benaouda, D.; Wadge, G.; Whitmarsh, R. B.; Rothwell, R. G.; MacLeod, C.
1999-02-01
In boreholes with partial or no core recovery, interpretations of lithology in the remainder of the hole are routinely attempted using data from downhole geophysical sensors. We present a practical neural net-based technique that greatly enhances lithological interpretation in holes with partial core recovery by using downhole data to train classifiers to give a global classification scheme for those parts of the borehole for which no core was retrieved. We describe the system and its underlying methods of data exploration, selection and classification, and present a typical example of the system in use. Although the technique is equally applicable to oil industry boreholes, we apply it here to an Ocean Drilling Program (ODP) borehole (Hole 792E, Izu-Bonin forearc, a mixture of volcaniclastic sandstones, conglomerates and claystones). The quantitative benefits of quality-control measures and different subsampling strategies are shown. Direct comparisons between a number of discriminant analysis methods and the use of neural networks with back-propagation of error are presented. The neural networks perform better than the discriminant analysis techniques both in terms of performance rates with test data sets (2-3 per cent better) and in qualitative correlation with non-depth-matched core. We illustrate with the Hole 792E data how vital it is to have a system that permits the number and membership of training classes to be changed as analysis proceeds. The initial classification for Hole 792E evolved from a five-class to a three-class and then to a four-class scheme with resultant classification performance rates for the back-propagation neural network method of 83, 84 and 93 per cent respectively.
Classifying smoking urges via machine learning
Dumortier, Antoine; Beckjord, Ellen; Shiffman, Saul; Sejdić, Ervin
2016-01-01
Background and objective Smoking is the largest preventable cause of death and diseases in the developed world, and advances in modern electronics and machine learning can help us deliver real-time intervention to smokers in novel ways. In this paper, we examine different machine learning approaches to use situational features associated with having or not having urges to smoke during a quit attempt in order to accurately classify high-urge states. Methods To test our machine learning approaches, specifically, Bayes, discriminant analysis and decision tree learning methods, we used a dataset collected from over 300 participants who had initiated a quit attempt. The three classification approaches are evaluated observing sensitivity, specificity, accuracy and precision. Results The outcome of the analysis showed that algorithms based on feature selection make it possible to obtain high classification rates with only a few features selected from the entire dataset. The classification tree method outperformed the naive Bayes and discriminant analysis methods, with an accuracy of the classifications up to 86%. These numbers suggest that machine learning may be a suitable approach to deal with smoking cessation matters, and to predict smoking urges, outlining a potential use for mobile health applications. Conclusions In conclusion, machine learning classifiers can help identify smoking situations, and the search for the best features and classifier parameters significantly improves the algorithms’ performance. In addition, this study also supports the usefulness of new technologies in improving the effect of smoking cessation interventions, the management of time and patients by therapists, and thus the optimization of available health care resources. Future studies should focus on providing more adaptive and personalized support to people who really need it, in a minimum amount of time by developing novel expert systems capable of delivering real-time interventions. PMID:28110725
Classifying smoking urges via machine learning.
Dumortier, Antoine; Beckjord, Ellen; Shiffman, Saul; Sejdić, Ervin
2016-12-01
Smoking is the largest preventable cause of death and diseases in the developed world, and advances in modern electronics and machine learning can help us deliver real-time intervention to smokers in novel ways. In this paper, we examine different machine learning approaches to use situational features associated with having or not having urges to smoke during a quit attempt in order to accurately classify high-urge states. To test our machine learning approaches, specifically, Bayes, discriminant analysis and decision tree learning methods, we used a dataset collected from over 300 participants who had initiated a quit attempt. The three classification approaches are evaluated observing sensitivity, specificity, accuracy and precision. The outcome of the analysis showed that algorithms based on feature selection make it possible to obtain high classification rates with only a few features selected from the entire dataset. The classification tree method outperformed the naive Bayes and discriminant analysis methods, with an accuracy of the classifications up to 86%. These numbers suggest that machine learning may be a suitable approach to deal with smoking cessation matters, and to predict smoking urges, outlining a potential use for mobile health applications. In conclusion, machine learning classifiers can help identify smoking situations, and the search for the best features and classifier parameters significantly improves the algorithms' performance. In addition, this study also supports the usefulness of new technologies in improving the effect of smoking cessation interventions, the management of time and patients by therapists, and thus the optimization of available health care resources. Future studies should focus on providing more adaptive and personalized support to people who really need it, in a minimum amount of time by developing novel expert systems capable of delivering real-time interventions. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
On-Line Pattern Analysis and Recognition System. OLPARS VI. Software Reference Manual,
1982-06-18
Discriminant Analysis Data Transformation, Feature Extraction, Feature Evaluation Cluster Analysis, Classification Computer Software 20Z. ABSTRACT... cluster /scatter cut-off value, (2) change the one-space bin factor, (3) change from long prompts to short prompts or vice versa, (4) change the...value, a cluster plot is displayed, otherwise a scatter plot is shown. if option 1 is selected, the program requests that a new value be input
AlMasoud, Najla; Xu, Yun; Trivedi, Drupad K; Salivo, Simona; Abban, Tom; Rattray, Nicholas J W; Szula, Ewa; AlRabiah, Haitham; Sayqal, Ali; Goodacre, Royston
2016-11-01
Bacillus are aerobic spore-forming bacteria that are known to lead to specific diseases, such as anthrax and food poisoning. This study focuses on the characterization of these bacteria by the detection of lipids extracted from 33 well-characterized strains from the Bacillus and Brevibacillus genera, with the aim to discriminate between the different species. For the purpose of analysing the lipids extracted from these bacterial samples, two rapid physicochemical techniques were used: matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF-MS) and liquid chromatography in conjunction with mass spectrometry (LC-MS). The findings of this investigation confirmed that MALDI-TOF-MS could be used to identify different bacterial lipids and, in combination with appropriate chemometrics, allowed for the discrimination between these different bacterial species, which was supported by LC-MS. The average correct classification rates for the seven species of bacteria were 62.23 and 77.03 % based on MALDI-TOF-MS and LC-MS data, respectively. The Procrustes distance for the two datasets was 0.0699, indicating that the results from the two techniques were very similar. In addition, we also compared these bacterial lipid MALDI-TOF-MS profiles to protein profiles also collected by MALDI-TOF-MS on the same bacteria (Procrustes distance, 0.1006). The level of discrimination between lipids and proteins was equivalent, and this further indicated the potential of MALDI-TOF-MS analysis as a rapid, robust and reliable method for the classification of bacteria based on different bacterial chemical components. Graphical abstract MALDI-MS has been successfully developed for the characterization of bacteria at the subspecies level using lipids and benchmarked against HPLC.
Colliver, Jessica; Wang, Allan; Joss, Brendan; Ebert, Jay; Koh, Eamon; Breidahl, William; Ackland, Timothy
2016-04-01
This study investigated if patients with an intact tendon repair or partial-thickness retear early after rotator cuff repair display differences in clinical evaluations and whether early tendon healing can be predicted using these assessments. We prospectively evaluated 60 patients at 16 weeks after arthroscopic supraspinatus repair. Evaluation included the Oxford Shoulder Score, 11-item version of the Disabilities of the Arm, Shoulder and Hand, visual analog scale for pain, 12-item Short Form Health Survey, isokinetic strength, and magnetic resonance imaging (MRI). Independent t tests investigated clinical differences in patients based on the Sugaya MRI rotator cuff classification system (grades 1, 2, or 3). Discriminant analysis determined whether intact repairs (Sugaya grade 1) and partial-thickness retears (Sugaya grades 2 and 3) could be predicted. No differences (P < .05) existed in the clinical or strength measures. Although discriminant analysis revealed the 11-item version of the Disabilities of the Arm, Shoulder and Hand produced a 97% true-positive rate for predicting partial thickness retears, it also produced a 90% false-positive rate whereby it incorrectly predicted a retear in 90% of patients whose repair was intact. The ability to discriminate between groups was enhanced with up to 5 variables entered; however, only 87% of the partial-retear group and 36% of the intact-repair group were correctly classified. No differences in clinical scores existed between patients stratified by the Sugaya MRI classification system at 16 weeks. An intact repair or partial-thickness retear could not be accurately predicted. Our results suggest that correct classification of healing in the early postoperative stages should involve imaging. Copyright © 2016 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
Shimizu, Hideaki; Akamatsu, Fumikazu; Kamada, Aya; Koyama, Kazuya; Okuda, Masaki; Fukuda, Hisashi; Iwashita, Kazuhiro; Goto-Yamamoto, Nami
2018-04-01
Differences in mineral concentrations were examined among three types of wine in the Japanese market place: Japan wine, imported wine, and domestically produced wine mainly from foreign ingredients (DWF), where Japan wine has been recently defined by the National Tax Agency as domestically produced wine from grapes cultivated in Japan. The main objective of this study was to examine the possibility of controlling the authenticity of Japan wine. The concentrations of 18 minerals (Li, B, Na, Mg, Si, P, S, K, Ca, Mn, Co, Ni, Ga, Rb, Sr, Mo, Ba, and Pb) in 214 wine samples were determined by inductively coupled-plasma mass spectrometry (ICP-MS) and ICP-atomic emission spectrometry (ICP-AES). In general, Japan wine had a higher concentration of potassium and lower concentrations of eight elements (Li, B, Na, Si, S, Co, Sr, and Pb) as compared with the other two groups of wine. Linear discriminant analysis (LDA) models based on concentrations of the 18 minerals facilitated the identification of three wine groups: Japan wine, imported wine, and DWF with a 91.1% classification score and 87.9% prediction score. In addition, an LDA model for discrimination of wine from four domestic geographic origins (Yamanashi, Nagano, Hokkaido, and Yamagata Prefectures) using 18 elements gave a classification score of 93.1% and a prediction score of 76.4%. In summary, we have shown that an LDA model based on mineral concentrations is useful for distinguishing Japan wine from other wine groups, and can contribute to classification of the four main domestic wine-producing regions of Japan. Copyright © 2017 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Chiarucci, Riccardo; Madeo, Dario; Loffredo, Maria I.; Castellani, Eleonora; Santarcangelo, Enrica L.; Mocenni, Chiara
2014-07-01
Assessment of hypnotic susceptibility is usually obtained through the application of psychological instruments. A satisfying classification obtained through quantitative measures is still missing, although it would be very useful for both diagnostic and clinical purposes. Aiming at investigating the relationship between the cortical brain activity and the hypnotic susceptibility level, we propose the combined use of two methodologies - Recurrence Quantification Analysis and Detrended Fluctuation Analysis - both inherited from nonlinear dynamics. Indicators obtained through the application of these techniques to EEG signals of individuals in their ordinary state of consciousness allowed us to obtain a clear discrimination between subjects with high and low susceptibility to hypnosis. Finally a neural network approach was used to perform classification analysis.
Adams, Michelle M; Anslyn, Eric V
2009-12-02
There has been a growing interest in the use of differential sensing for analyte classification. In an effort to mimic the mammalian senses of taste and smell, which utilize protein-based receptors, we have introduced serum albumins as nonselective receptors for recognition of small hydrophobic molecules. Herein, we employ a sensing ensemble consisting of serum albumins, a hydrophobic fluorescent indicator (PRODAN), and a hydrophobic additive (deoxycholate) to detect terpenes. With the aid of linear discriminant analysis, we successfully applied our system to differentiate five terpenes. We then extended our terpene analysis and utilized our sensing ensemble for terpene discrimination within the complex mixtures found in perfume.
Guisande, Cástor; Vari, Richard P; Heine, Jürgen; García-Roselló, Emilio; González-Dacosta, Jacinto; Perez-Schofield, Baltasar J García; González-Vilas, Luis; Pelayo-Villamil, Patricia
2016-09-12
We present and discuss VARSEDIG, an algorithm which identifies the morphometric features that significantly discriminate two taxa and validates the morphological distinctness between them via a Monte-Carlo test. VARSEDIG is freely available as a function of the RWizard application PlotsR (http://www.ipez.es/RWizard) and as R package on CRAN. The variables selected by VARSEDIG with the overlap method were very similar to those selected by logistic regression and discriminant analysis, but overcomes some shortcomings of these methods. VARSEDIG is, therefore, a good alternative by comparison to current classical classification methods for identifying morphometric features that significantly discriminate a taxon and for validating its morphological distinctness from other taxa. As a demonstration of the potential of VARSEDIG for this purpose, we analyze morphological discrimination among some species of the Neotropical freshwater family Characidae.
Observation versus classification in supervised category learning.
Levering, Kimery R; Kurtz, Kenneth J
2015-02-01
The traditional supervised classification paradigm encourages learners to acquire only the knowledge needed to predict category membership (a discriminative approach). An alternative that aligns with important aspects of real-world concept formation is learning with a broader focus to acquire knowledge of the internal structure of each category (a generative approach). Our work addresses the impact of a particular component of the traditional classification task: the guess-and-correct cycle. We compare classification learning to a supervised observational learning task in which learners are shown labeled examples but make no classification response. The goals of this work sit at two levels: (1) testing for differences in the nature of the category representations that arise from two basic learning modes; and (2) evaluating the generative/discriminative continuum as a theoretical tool for understand learning modes and their outcomes. Specifically, we view the guess-and-correct cycle as consistent with a more discriminative approach and therefore expected it to lead to narrower category knowledge. Across two experiments, the observational mode led to greater sensitivity to distributional properties of features and correlations between features. We conclude that a relatively subtle procedural difference in supervised category learning substantially impacts what learners come to know about the categories. The results demonstrate the value of the generative/discriminative continuum as a tool for advancing the psychology of category learning and also provide a valuable constraint for formal models and associated theories.
Pan, Yu; Zhang, Ji; Li, Hong; Wang, Yuan-Zhong; Li, Wan-Yi
2016-10-01
Macamides with a benzylalkylamide nucleus are characteristic and major bioactive compounds in the functional food maca (Lepidium meyenii Walp). The aim of this study was to explore variations in macamide content among maca from China and Peru. Twenty-seven batches of maca hypocotyls with different phenotypes, sampled from different geographical origins, were extracted and profiled by liquid chromatography with ultraviolet detection/tandem mass spectrometry (LC-UV/MS/MS). Twelve macamides were identified by MS operated in multiple scanning modes. Similarity analysis showed that maca samples differed significantly in their macamide fingerprinting. Partial least squares discriminant analysis (PLS-DA) was used to differentiate samples according to their geographical origin and to identify the most relevant variables in the classification model. The prediction accuracy for raw maca was 91% and five macamides were selected and considered as chemical markers for sample classification. When combined with a PLS-DA model, characteristic fingerprinting based on macamides could be recommended for labelling for the authentication of maca from different geographical origins. The results provided potential evidence for the relationships between environmental or other factors and distribution of macamides. © 2016 Society of Chemical Industry. © 2016 Society of Chemical Industry.
Delineation of sympatric morphotypes of lake trout in Lake Superior
Moore, Seth A.; Bronte, Charles R.
2001-01-01
Three morphotypes of lake trout Salvelinus namaycush are recognized in Lake Superior: lean, siscowet, and humper. Absolute morphotype assignment can be difficult. We used a size-free, whole-body morphometric analysis (truss protocol) to determine whether differences in body shape existed among lake trout morphotypes. Our results showed discrimination where traditional morphometric characters and meristic measurements failed to detect differences. Principal components analysis revealed some separation of all three morphotypes based on head and caudal peduncle shape, but it also indicated considerable overlap in score values. Humper lake trout have smaller caudal peduncle widths to head length and depth characters than do lean or siscowet lake trout. Lean lake trout had larger head measures to caudal widths, whereas siscowet had higher caudal peduncle to head measures. Backward stepwise discriminant function analysis retained two head measures, three midbody measures, and four caudal peduncle measures; correct classification rates when using these variables were 83% for leans, 80% for siscowets, and 83% for humpers, which suggests the measures we used for initial classification were consistent. Although clear ecological reasons for these differences are not readily apparent, patterns in misclassification rates may be consistent with evolutionary hypotheses for lake trout within the Laurentian Great Lakes.
Differentiation of tea varieties using UV-Vis spectra and pattern recognition techniques
NASA Astrophysics Data System (ADS)
Palacios-Morillo, Ana; Alcázar, Ángela.; de Pablos, Fernando; Jurado, José Marcos
2013-02-01
Tea, one of the most consumed beverages all over the world, is of great importance in the economies of a number of countries. Several methods have been developed to classify tea varieties or origins based in pattern recognition techniques applied to chemical data, such as metal profile, amino acids, catechins and volatile compounds. Some of these analytical methods become tedious and expensive to be applied in routine works. The use of UV-Vis spectral data as discriminant variables, highly influenced by the chemical composition, can be an alternative to these methods. UV-Vis spectra of methanol-water extracts of tea have been obtained in the interval 250-800 nm. Absorbances have been used as input variables. Principal component analysis was used to reduce the number of variables and several pattern recognition methods, such as linear discriminant analysis, support vector machines and artificial neural networks, have been applied in order to differentiate the most common tea varieties. A successful classification model was built by combining principal component analysis and multilayer perceptron artificial neural networks, allowing the differentiation between tea varieties. This rapid and simple methodology can be applied to solve classification problems in food industry saving economic resources.
Lê Cao, Kim-Anh; Boitard, Simon; Besse, Philippe
2011-06-22
Variable selection on high throughput biological data, such as gene expression or single nucleotide polymorphisms (SNPs), becomes inevitable to select relevant information and, therefore, to better characterize diseases or assess genetic structure. There are different ways to perform variable selection in large data sets. Statistical tests are commonly used to identify differentially expressed features for explanatory purposes, whereas Machine Learning wrapper approaches can be used for predictive purposes. In the case of multiple highly correlated variables, another option is to use multivariate exploratory approaches to give more insight into cell biology, biological pathways or complex traits. A simple extension of a sparse PLS exploratory approach is proposed to perform variable selection in a multiclass classification framework. sPLS-DA has a classification performance similar to other wrapper or sparse discriminant analysis approaches on public microarray and SNP data sets. More importantly, sPLS-DA is clearly competitive in terms of computational efficiency and superior in terms of interpretability of the results via valuable graphical outputs. sPLS-DA is available in the R package mixOmics, which is dedicated to the analysis of large biological data sets.
Time resolved fluorescence of cow and goat milk powder
NASA Astrophysics Data System (ADS)
Brandao, Mariana P.; de Carvalho dos Anjos, Virgílio; Bell., Maria José V.
2017-01-01
Milk powder is an international dairy commodity. Goat and cow milk powders are significant sources of nutrients and the investigation of the authenticity and classification of milk powder is particularly important. The use of time-resolved fluorescence techniques to distinguish chemical composition and structure modifications could assist develop a portable and non-destructive methodology to perform milk powder classification and determine composition. This study goal is to differentiate milk powder samples from cows and goats using fluorescence lifetimes. The samples were excited at 315 nm and the fluorescence intensity decay registered at 468 nm. We observed fluorescence lifetimes of 1.5 ± 0.3, 6.4 ± 0.4 and 18.7 ± 2.5 ns for goat milk powder; and 1.7 ± 0.3, 6.9 ± 0.2 and 29.9 ± 1.6 ns for cow's milk powder. We discriminate goat and cow powder milk by analysis of variance using Fisher's method. In addition, we employed quadratic discriminant analysis to differentiate the milk samples with accuracy of 100%. Our results suggest that time-resolved fluorescence can provide a new method to the analysis of powder milk and its composition.
Fossil Signatures Using Elemental Abundance Distributions and Bayesian Probabilistic Classification
NASA Technical Reports Server (NTRS)
Hoover, Richard B.; Storrie-Lombardi, Michael C.
2004-01-01
Elemental abundances (C6, N7, O8, Na11, Mg12, Al3, P15, S16, Cl17, K19, Ca20, Ti22, Mn25, Fe26, and Ni28) were obtained for a set of terrestrial fossils and the rock matrix surrounding them. Principal Component Analysis extracted five factors accounting for the 92.5% of the data variance, i.e. information content, of the elemental abundance data. Hierarchical Cluster Analysis provided unsupervised sample classification distinguishing fossil from matrix samples on the basis of either raw abundances or PCA input that agreed strongly with visual classification. A stochastic, non-linear Artificial Neural Network produced a Bayesian probability of correct sample classification. The results provide a quantitative probabilistic methodology for discriminating terrestrial fossils from the surrounding rock matrix using chemical information. To demonstrate the applicability of these techniques to the assessment of meteoritic samples or in situ extraterrestrial exploration, we present preliminary data on samples of the Orgueil meteorite. In both systems an elemental signature produces target classification decisions remarkably consistent with morphological classification by a human expert using only structural (visual) information. We discuss the possibility of implementing a complexity analysis metric capable of automating certain image analysis and pattern recognition abilities of the human eye using low magnification optical microscopy images and discuss the extension of this technique across multiple scales.
Aural Classification and Temporal Robustness
2010-11-01
Canada – Atlantique ; novembre 2010. Contexte : Le présent projet vise le développement d’un classificateur robuste qui utilise des...10 4.2.2.2 Discriminant score . . . . . . . . . . . . . . . . . . . 11 4.2.3 Principal component analysis . . . . . . . . . . . . . . . . . . . 13 ...allows class separation. . . . . . . . . . . . 13 Figure 7: Hypothetical clutter and target pdfs and posterior probabilties shown as surfaces
Forest/non-forest stratification in Georgia with Landsat Thematic Mapper data
William H. Cooke
2000-01-01
Geographically accurate Forest Inventory and Analysis (FIA) data may be useful for training, classification, and accuracy assessment of Landsat Thematic Mapper (TM) data. Minimum expectation for maps derived from Landsat data is accurate discrimination of several land cover classes. Landsat TM costs have decreased dramatically, but acquiring cloud-free scenes at...
Narváez-Rivas, M; Pablos, F; Jurado, J M; León-Camacho, M
2011-02-01
The composition of volatile components of subcutaneous fat from Iberian pig has been studied. Purge and trap gas chromatography-mass spectrometry has been used. The composition of the volatile fraction of subcutaneous fat has been used for authentication purposes of different types of Iberian pig fat. Three types of this product have been considered, montanera, extensive cebo and intensive cebo. With classification purposes, several pattern recognition techniques have been applied. In order to find out possible tendencies in the sample distribution as well as the discriminant power of the variables, principal component analysis was applied as visualisation technique. Linear discriminant analysis (LDA) and soft independent modelling by class analogy (SIMCA) were used to obtain suitable classification models. LDA and SIMCA allowed the differentiation of three fattening diets by using the contents in 2,2,4,6,6-pentamethyl-heptane, m-xylene, 2,4-dimethyl-heptane, 6-methyl-tridecane, 1-methoxy-2-propanol, isopropyl alcohol, o-xylene, 3-ethyl-2,2-dimethyl-oxirane, 2,6-dimethyl-undecane, 3-methyl-3-pentanol and limonene.
Javidnia, Katayoun; Parish, Maryam; Karimi, Sadegh; Hemmateenejad, Bahram
2013-03-01
By using FT-IR spectroscopy, many researchers from different disciplines enrich the experimental complexity of their research for obtaining more precise information. Moreover chemometrics techniques have boosted the use of IR instruments. In the present study we aimed to emphasize on the power of FT-IR spectroscopy for discrimination between different oil samples (especially fat from vegetable oils). Also our data were used to compare the performance of different classification methods. FT-IR transmittance spectra of oil samples (Corn, Colona, Sunflower, Soya, Olive, and Butter) were measured in the wave-number interval of 450-4000 cm(-1). Classification analysis was performed utilizing PLS-DA, interval PLS-DA, extended canonical variate analysis (ECVA) and interval ECVA methods. The effect of data preprocessing by extended multiplicative signal correction was investigated. Whilst all employed method could distinguish butter from vegetable oils, iECVA resulted in the best performances for calibration and external test set with 100% sensitivity and specificity. Copyright © 2012 Elsevier B.V. All rights reserved.
Whole brain white matter connectivity analysis using machine learning: An application to autism.
Zhang, Fan; Savadjiev, Peter; Cai, Weidong; Song, Yang; Rathi, Yogesh; Tunç, Birkan; Parker, Drew; Kapur, Tina; Schultz, Robert T; Makris, Nikos; Verma, Ragini; O'Donnell, Lauren J
2018-05-15
In this paper, we propose an automated white matter connectivity analysis method for machine learning classification and characterization of white matter abnormality via identification of discriminative fiber tracts. The proposed method uses diffusion MRI tractography and a data-driven approach to find fiber clusters corresponding to subdivisions of the white matter anatomy. Features extracted from each fiber cluster describe its diffusion properties and are used for machine learning. The method is demonstrated by application to a pediatric neuroimaging dataset from 149 individuals, including 70 children with autism spectrum disorder (ASD) and 79 typically developing controls (TDC). A classification accuracy of 78.33% is achieved in this cross-validation study. We investigate the discriminative diffusion features based on a two-tensor fiber tracking model. We observe that the mean fractional anisotropy from the second tensor (associated with crossing fibers) is most affected in ASD. We also find that local along-tract (central cores and endpoint regions) differences between ASD and TDC are helpful in differentiating the two groups. These altered diffusion properties in ASD are associated with multiple robustly discriminative fiber clusters, which belong to several major white matter tracts including the corpus callosum, arcuate fasciculus, uncinate fasciculus and aslant tract; and the white matter structures related to the cerebellum, brain stem, and ventral diencephalon. These discriminative fiber clusters, a small part of the whole brain tractography, represent the white matter connections that could be most affected in ASD. Our results indicate the potential of a machine learning pipeline based on white matter fiber clustering. Copyright © 2017 Elsevier Inc. All rights reserved.
Keihaninejad, Shiva; Heckemann, Rolf A.; Gousias, Ioannis S.; Hajnal, Joseph V.; Duncan, John S.; Aljabar, Paul; Rueckert, Daniel; Hammers, Alexander
2012-01-01
Brain images contain information suitable for automatically sorting subjects into categories such as healthy controls and patients. We sought to identify morphometric criteria for distinguishing controls (n = 28) from patients with unilateral temporal lobe epilepsy (TLE), 60 with and 20 without hippocampal atrophy (TLE-HA and TLE-N, respectively), and for determining the presumed side of seizure onset. The framework employs multi-atlas segmentation to estimate the volumes of 83 brain structures. A kernel-based separability criterion was then used to identify structures whose volumes discriminate between the groups. Next, we applied support vector machines (SVM) to the selected set for classification on the basis of volumes. We also computed pairwise similarities between all subjects and used spectral analysis to convert these into per-subject features. SVM was again applied to these feature data. After training on a subgroup, all TLE-HA patients were correctly distinguished from controls, achieving an accuracy of 96 ± 2% in both classification schemes. For TLE-N patients, the accuracy was 86 ± 2% based on structural volumes and 91 ± 3% using spectral analysis. Structures discriminating between patients and controls were mainly localized ipsilaterally to the presumed seizure focus. For the TLE-HA group, they were mainly in the temporal lobe; for the TLE-N group they included orbitofrontal regions, as well as the ipsilateral substantia nigra. Correct lateralization of the presumed seizure onset zone was achieved using hippocampi and parahippocampal gyri in all TLE-HA patients using either classification scheme; in the TLE-N patients, lateralization was accurate based on structural volumes in 86 ± 4%, and in 94 ± 4% with the spectral analysis approach. Unilateral TLE has imaging features that can be identified automatically, even when they are invisible to human experts. Such morphometric image features may serve as classification and lateralization criteria. The technique also detects unsuspected distinguishing features like the substantia nigra, warranting further study. PMID:22523539
NASA Astrophysics Data System (ADS)
Ding, Hao; Cao, Ming; DuPont, Andrew W.; Scott, Larry D.; Guha, Sushovan; Singhal, Shashideep; Younes, Mamoun; Pence, Isaac; Herline, Alan; Schwartz, David; Xu, Hua; Mahadevan-Jansen, Anita; Bi, Xiaohong
2016-03-01
Inflammatory bowel disease (IBD) is an idiopathic disease that is typically characterized by chronic inflammation of the gastrointestinal tract. Recently much effort has been devoted to the development of novel diagnostic tools that can assist physicians for fast, accurate, and automated diagnosis of the disease. Previous research based on Raman spectroscopy has shown promising results in differentiating IBD patients from normal screening cases. In the current study, we examined IBD patients in vivo through a colonoscope-coupled Raman system. Optical diagnosis for IBD discrimination was conducted based on full-range spectra using multivariate statistical methods. Further, we incorporated several feature selection methods in machine learning into the classification model. The diagnostic performance for disease differentiation was significantly improved after feature selection. Our results showed that improved IBD diagnosis can be achieved using Raman spectroscopy in combination with multivariate analysis and feature selection.
Trace element analysis of rough diamond by LA-ICP-MS: a case of source discrimination?
Dalpé, Claude; Hudon, Pierre; Ballantyne, David J; Williams, Darrell; Marcotte, Denis
2010-11-01
Current profiling of rough diamond source is performed using different physical and/or morphological techniques that require strong knowledge and experience in the field. More recently, chemical impurities have been used to discriminate diamond source and with the advance of laser ablation-inductively coupled plasma-mass spectrometry (LA-ICP-MS) empirical profiling of rough diamonds is possible to some extent. In this study, we present a LA-ICP-MS methodology that we developed for analyzing ultra-trace element impurities in rough diamond for origin determination ("profiling"). Diamonds from two sources were analyzed by LA-ICP-MS and were statistically classified by accepted methods. For the two diamond populations analyzed in this study, binomial logistic regression produced a better overall correct classification than linear discriminant analysis. The results suggest that an anticipated matrix match reference material would improve the robustness of our methodology for forensic applications. © 2010 American Academy of Forensic Sciences.
Differentiating clinical groups using the serial color-word test (S-CWT).
Hentschel, Uwe; Rubino, I Alex; Bijleveld, Catrien
2011-04-01
The present study attempted to differentiate 11 diagnostic groups by means of the Serial Color-Word Test (S-CWT), using multivariate discriminant analysis. Two alternative scoring systems of the S-CWT were outlined. Asample of 514 individuals who had clinical diagnoses of various types and 397 controls who had no diagnostic findings comprised the sample. The first discriminant analysis failed to differentiate the groups adequately. The groups were consequently reduced to four (schizophrenia, bipolar disorders, temporo-mandibular joint pain dysfunction syndrome, and eating disturbances), which gave better reclassification findings for a clinical application of the test. This classification gave over 55% correct assignments. The final four groups had a statistically significant discrimination on the test, which remained stable also in a bootstrap procedure. Implications for treatment indications and outcomes as well as strategies for further studies using the S-CWT are discussed.
Shankar, Vijay; Reo, Nicholas V; Paliy, Oleg
2015-12-09
We previously showed that stool samples of pre-adolescent and adolescent US children diagnosed with diarrhea-predominant IBS (IBS-D) had different compositions of microbiota and metabolites compared to healthy age-matched controls. Here we explored whether observed fecal microbiota and metabolite differences between these two adolescent populations can be used to discriminate between IBS and health. We constructed individual microbiota- and metabolite-based sample classification models based on the partial least squares multivariate analysis and then applied a Bayesian approach to integrate individual models into a single classifier. The resulting combined classification achieved 84 % accuracy of correct sample group assignment and 86 % prediction for IBS-D in cross-validation tests. The performance of the cumulative classification model was further validated by the de novo analysis of stool samples from a small independent IBS-D cohort. High-throughput microbial and metabolite profiling of subject stool samples can be used to facilitate IBS diagnosis.
Classification of the Correct Quranic Letters Pronunciation of Male and Female Reciters
NASA Astrophysics Data System (ADS)
Khairuddin, Safiah; Ahmad, Salmiah; Embong, Abdul Halim; Nur Wahidah Nik Hashim, Nik; Altamas, Tareq M. K.; Nuratikah Syd Badaruddin, Syarifah; Shahbudin Hassan, Surul
2017-11-01
Recitation of the Holy Quran with the correct Tajweed is essential for every Muslim. Islam has encouraged Quranic education since early age as the recitation of the Quran correctly will represent the correct meaning of the words of Allah. It is important to recite the Quranic verses according to its characteristics (sifaat) and from its point of articulations (makhraj). This paper presents the identification and classification analysis of Quranic letters pronunciation for both male and female reciters, to obtain the unique representation of each letter by male as compared to female expert reciters. Linear Discriminant Analysis (LDA) was used as the classifier to classify the data with Formants and Power Spectral Density (PSD) as the acoustic features. The result shows that linear classifier of PSD with band 1 and band 2 power spectral combinations gives a high percentage of classification accuracy for most of the Quranic letters. It is also shown that the pronunciation by male reciters gives better result in the classification of the Quranic letters.
Rinaldi, Maurizio; Gindro, Roberto; Barbeni, Massimo; Allegrone, Gianna
2009-01-01
Orange (Citrus sinensis L.) juice comprises a complex mixture of volatile components that are difficult to identify and quantify. Classification and discrimination of the varieties on the basis of the volatile composition could help to guarantee the quality of a juice and to detect possible adulteration of the product. To provide information on the amounts of volatile constituents in fresh-squeezed juices from four orange cultivars and to establish suitable discrimination rules to differentiate orange juices using new chemometric approaches. Fresh juices of four orange cultivars were analysed by headspace solid-phase microextraction (HS-SPME) coupled with GC-MS. Principal component analysis, linear discriminant analysis and heuristic methods, such as neural networks, allowed clustering of the data from HS-SPME analysis while genetic algorithms addressed the problem of data reduction. To check the quality of the results the chemometric techniques were also evaluated on a sample. Thirty volatile compounds were identified by HS-SPME and GC-MS analyses and their relative amounts calculated. Differences in composition of orange juice volatile components were observed. The chosen orange cultivars could be discriminated using neural networks, genetic relocation algorithms and linear discriminant analysis. Genetic algorithms applied to the data were also able to detect the most significant compounds. SPME is a useful technique to investigate orange juice volatile composition and a flexible chemometric approach is able to correctly separate the juices.
Zhang, Xufeng; Liu, Yu; Li, Ying; Zhao, Xinda
2017-03-01
Geographic traceability is an important issue for food quality and safety control of seafood. In this study,δ 13 C and δ 15 N values, as well as fatty acid (FA) content of 133 samples of A. japonicus from seven sampling points in northern China Sea were determined to evaluate their applicability in the origin traceability of A. japonicus. Principal component analysis (PCA) and discriminant analysis (DA) were applied to different data sets in order to evaluate their performance in terms of classification or predictive ability. δ 13 C and δ 15 N values could effectively discriminate between different origins of A. japonicus. Significant differences in the FA compositions showed the effectiveness of FA composition as a tool for distinguishing between different origins of A. japonicus. The two technologies, combined with multivariate statistical analysis, can be promising methods to discriminate A. japonicus from different geographical areas. Copyright © 2016. Published by Elsevier Ltd.
NASA Astrophysics Data System (ADS)
Niu, Xiaoying; Ying, Yibin; Yu, Haiyan; Xie, Lijuan; Fu, Xiaping; Zhou, Ying; Jiang, Xuesong
2007-09-01
In this paper, 104 samples of Chinese rice wines of the same variety (Shaoxing rice wine), collected in three winery ("guyuelongshan", "pagoda" brand, "kuaijishan"), three brewed years (2002, 2004, 2004-2006) were analyzed by near-infrared transmission spectroscopy between 800 and 2500 nm. The spectral differences were studied by principal components analysis (PCA), and Classifications, according the brand, were carried out by discriminant analysis (DA) and partial least squares discriminant analysis (PLSDA). The DA model gained a total accuracy of 94.23% and when used to predict the brand of the validation set samples, a better result, correctly classified all of the three kinds of Chinese rice wine up to 100%, are obtained by PLSDA model. The work reported here is a feasibility study and requires further development with considerable samples of more different brands. Further studies are needed in order to improve the accuracy and robustness, and to extend the discrimination to other Chinese rice wine varieties or brands.
Discriminative least squares regression for multiclass classification and feature selection.
Xiang, Shiming; Nie, Feiping; Meng, Gaofeng; Pan, Chunhong; Zhang, Changshui
2012-11-01
This paper presents a framework of discriminative least squares regression (LSR) for multiclass classification and feature selection. The core idea is to enlarge the distance between different classes under the conceptual framework of LSR. First, a technique called ε-dragging is introduced to force the regression targets of different classes moving along opposite directions such that the distances between classes can be enlarged. Then, the ε-draggings are integrated into the LSR model for multiclass classification. Our learning framework, referred to as discriminative LSR, has a compact model form, where there is no need to train two-class machines that are independent of each other. With its compact form, this model can be naturally extended for feature selection. This goal is achieved in terms of L2,1 norm of matrix, generating a sparse learning model for feature selection. The model for multiclass classification and its extension for feature selection are finally solved elegantly and efficiently. Experimental evaluation over a range of benchmark datasets indicates the validity of our method.
Fischedick, Justin T
2017-01-01
Introduction: With laws changing around the world regarding the legal status of Cannabis sativa (cannabis) it is important to develop objective classification systems that help explain the chemical variation found among various cultivars. Currently cannabis cultivars are named using obscure and inconsistent nomenclature. Terpenoids, responsible for the aroma of cannabis, are a useful group of compounds for distinguishing cannabis cultivars with similar cannabinoid content. Methods: In this study we analyzed terpenoid content of cannabis samples obtained from a single medical cannabis dispensary in California over the course of a year. Terpenoids were quantified by gas chromatography with flame ionization detection and peak identification was confirmed with gas chromatography mass spectrometry. Quantitative data from 16 major terpenoids were analyzed using hierarchical clustering analysis (HCA), principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and orthogonal partial least squares discriminant analysis (OPLS-DA). Results: A total of 233 samples representing 30 cultivars were used to develop a classification scheme based on quantitative data, HCA, PCA, and OPLS-DA. Initially cultivars were divided into five major groups, which were subdivided into 13 classes based on differences in terpenoid profile. Different classification models were compared with PLS-DA and found to perform best when many representative samples of a particular class were included. Conclusion: A hierarchy of terpenoid chemotypes was observed in the data set. Some cultivars fit into distinct chemotypes, whereas others seemed to represent a continuum of chemotypes. This study has demonstrated an approach to classifying cannabis cultivars based on terpenoid profile.
NASA Astrophysics Data System (ADS)
Shenoy Handiru, Vikram; Vinod, A. P.; Guan, Cuntai
2017-08-01
Objective. In electroencephalography (EEG)-based brain-computer interface (BCI) systems for motor control tasks the conventional practice is to decode motor intentions by using scalp EEG. However, scalp EEG only reveals certain limited information about the complex tasks of movement with a higher degree of freedom. Therefore, our objective is to investigate the effectiveness of source-space EEG in extracting relevant features that discriminate arm movement in multiple directions. Approach. We have proposed a novel feature extraction algorithm based on supervised factor analysis that models the data from source-space EEG. To this end, we computed the features from the source dipoles confined to Brodmann areas of interest (BA4a, BA4p and BA6). Further, we embedded class-wise labels of multi-direction (multi-class) source-space EEG to an unsupervised factor analysis to make it into a supervised learning method. Main Results. Our approach provided an average decoding accuracy of 71% for the classification of hand movement in four orthogonal directions, that is significantly higher (>10%) than the classification accuracy obtained using state-of-the-art spatial pattern features in sensor space. Also, the group analysis on the spectral characteristics of source-space EEG indicates that the slow cortical potentials from a set of cortical source dipoles reveal discriminative information regarding the movement parameter, direction. Significance. This study presents evidence that low-frequency components in the source space play an important role in movement kinematics, and thus it may lead to new strategies for BCI-based neurorehabilitation.
NASA Astrophysics Data System (ADS)
Kimuli, Daniel; Wang, Wei; Wang, Wei; Jiang, Hongzhe; Zhao, Xin; Chu, Xuan
2018-03-01
A short-wave infrared (SWIR) hyperspectral imaging system (1000-2500 nm) combined with chemometric data analysis was used to detect aflatoxin B1 (AFB1) on surfaces of 600 kernels of four yellow maize varieties from different States of the USA (Georgia, Illinois, Indiana and Nebraska). For each variety, four AFB1 solutions (10, 20, 100 and 500 ppb) were artificially deposited on kernels and a control group was generated from kernels treated with methanol solution. Principal component analysis (PCA), partial least squares discriminant analysis (PLSDA) and factorial discriminant analysis (FDA) were applied to explore and classify maize kernels according to AFB1 contamination. PCA results revealed partial separation of control kernels from AFB1 contaminated kernels for each variety while no pattern of separation was observed among pooled samples. A combination of standard normal variate and first derivative pre-treatments produced the best PLSDA classification model with accuracy of 100% and 96% in calibration and validation, respectively, from Illinois variety. The best AFB1 classification results came from FDA on raw spectra with accuracy of 100% in calibration and validation for Illinois and Nebraska varieties. However, for both PLSDA and FDA models, poor AFB1 classification results were obtained for pooled samples relative to individual varieties. SWIR spectra combined with chemometrics and spectra pre-treatments showed the possibility of detecting maize kernels of different varieties coated with AFB1. The study further suggests that increase of maize kernel constituents like water, protein, starch and lipid in a pooled sample may have influence on detection accuracy of AFB1 contamination.
Application of LANDSAT images to wetland study and land use classification in west Tennessee, part 1
NASA Technical Reports Server (NTRS)
Shahrokhi, F. (Principal Investigator); Jones, N. L.
1977-01-01
The author has identified the following significant results. densitometric analysis was performed on LANDSAT data to permit numerical classification of objects observed in the imagery on the basis of measurements of optical density. Relative light transmission measurements were taken on four types of scene elements in each of three LANDSAT black and white bands in order to determine which classification could be distinguished. The analysis of band 6 determined forest and agricultural classifications, but not the urban and wetlands. Both bands 4 and 5 showed a significant difference existed between the confirmed classification of wetlands-agriculture, and urban areas. Therefore, the combination of band 6 with either 4 or 5 would permit the separation of the urban from the wetland classification. To enhance the urban and wetland boundaries, the LANDSAT black and white bands were combined in a multispectral additive color viewer. Several combinations of filters and light intensities were used to obtain maximum discrimination between points of interest. The best results for enhancing wetland boundaries and urban areas were achieved by using a color composite (a blue, green, and red filter on bands 4, 5 and 6 respectively).
Huang, Y; Andueza, D; de Oliveira, L; Zawadzki, F; Prache, S
2015-11-01
Since consumers are showing increased interest in the origin and method of production of their food, it is important to be able to authenticate dietary history of animals by rapid and robust methods used in the ruminant products. Promising breakthroughs have been made in the use of spectroscopic methods on fat to discriminate pasture-fed and concentrate-fed lambs. However, questions remained on their discriminatory ability in more complex feeding conditions, such as concentrate-finishing after pasture-feeding. We compared the ability of visible reflectance spectroscopy (Vis RS, wavelength range: 400 to 700 nm) with that of visible-near-infrared reflectance spectroscopy (Vis-NIR RS, wavelength range: 400 to 2500 nm) to differentiate between carcasses of lambs reared with three feeding regimes, using partial least square discriminant analysis (PLS-DA) as a classification method. The sample set comprised perirenal fat of Romane male lambs fattened at pasture (P, n = 69), stall-fattened indoors on commercial concentrate and straw (S, n = 55) and finished indoors with concentrate and straw for 28 days after pasture-feeding (PS, n = 65). The overall correct classification rate was better for Vis-NIR RS than for Vis RS (99.0% v. 95.1%, P < 0.05). Vis-NIR RS allowed a correct classification rate of 98.6%, 100.0% and 98.5% for P, S and PS lambs, respectively, whereas Vis RS allowed a correct classification rate of 98.6%, 94.5% and 92.3% for P, S and PS lambs, respectively. This study suggests the likely implication of molecules absorbing light in the non-visible part of the Vis-NIR spectra (possibly fatty acids), together with carotenoid and haem pigments, in the discrimination of the three feeding regimes.
Multilevel image recognition using discriminative patches and kernel covariance descriptor
NASA Astrophysics Data System (ADS)
Lu, Le; Yao, Jianhua; Turkbey, Evrim; Summers, Ronald M.
2014-03-01
Computer-aided diagnosis of medical images has emerged as an important tool to objectively improve the performance, accuracy and consistency for clinical workflow. To computerize the medical image diagnostic recognition problem, there are three fundamental problems: where to look (i.e., where is the region of interest from the whole image/volume), image feature description/encoding, and similarity metrics for classification or matching. In this paper, we exploit the motivation, implementation and performance evaluation of task-driven iterative, discriminative image patch mining; covariance matrix based descriptor via intensity, gradient and spatial layout; and log-Euclidean distance kernel for support vector machine, to address these three aspects respectively. To cope with often visually ambiguous image patterns for the region of interest in medical diagnosis, discovery of multilabel selective discriminative patches is desired. Covariance of several image statistics summarizes their second order interactions within an image patch and is proved as an effective image descriptor, with low dimensionality compared with joint statistics and fast computation regardless of the patch size. We extensively evaluate two extended Gaussian kernels using affine-invariant Riemannian metric or log-Euclidean metric with support vector machines (SVM), on two medical image classification problems of degenerative disc disease (DDD) detection on cortical shell unwrapped CT maps and colitis detection on CT key images. The proposed approach is validated with promising quantitative results on these challenging tasks. Our experimental findings and discussion also unveil some interesting insights on the covariance feature composition with or without spatial layout for classification and retrieval, and different kernel constructions for SVM. This will also shed some light on future work using covariance feature and kernel classification for medical image analysis.
Using spectrotemporal indices to improve the fruit-tree crop classification accuracy
NASA Astrophysics Data System (ADS)
Peña, M. A.; Liao, R.; Brenning, A.
2017-06-01
This study assesses the potential of spectrotemporal indices derived from satellite image time series (SITS) to improve the classification accuracy of fruit-tree crops. Six major fruit-tree crop types in the Aconcagua Valley, Chile, were classified by applying various linear discriminant analysis (LDA) techniques on a Landsat-8 time series of nine images corresponding to the 2014-15 growing season. As features we not only used the complete spectral resolution of the SITS, but also all possible normalized difference indices (NDIs) that can be constructed from any two bands of the time series, a novel approach to derive features from SITS. Due to the high dimensionality of this "enhanced" feature set we used the lasso and ridge penalized variants of LDA (PLDA). Although classification accuracies yielded by the standard LDA applied on the full-band SITS were good (misclassification error rate, MER = 0.13), they were further improved by 23% (MER = 0.10) with ridge PLDA using the enhanced feature set. The most important bands to discriminate the crops of interest were mainly concentrated on the first two image dates of the time series, corresponding to the crops' greenup stage. Despite the high predictor weights provided by the red and near infrared bands, typically used to construct greenness spectral indices, other spectral regions were also found important for the discrimination, such as the shortwave infrared band at 2.11-2.19 μm, sensitive to foliar water changes. These findings support the usefulness of spectrotemporal indices in the context of SITS-based crop type classifications, which until now have been mainly constructed by the arithmetic combination of two bands of the same image date in order to derive greenness temporal profiles like those from the normalized difference vegetation index.
Gambetta, Joanna M; Cozzolino, Daniel; Bastian, Susan E P; Jeffery, David W
2017-01-31
The relationship between berry chemical composition, region of origin and quality grade was investigated for Chardonnay grapes sourced from vineyards located in seven South Australian Geographical Indications (GI). Measurements of basic chemical parameters, amino acids, elements, and free and bound volatiles were conducted for grapes collected during 2015 and 2016. Multiple factor analysis (MFA) was used to determine the sets of data that best discriminated each GI and quality grade. Important components for the discrimination of grapes based on GI were 2-phenylethanol, benzyl alcohol and C6 compounds, as well as Cu, Zn, and Mg, titratable acidity (TA), total soluble solids (TSS), and pH. Discriminant analysis (DA) based on MFA results correctly classified 100% of the samples into GI in 2015 and 2016. Classification according to grade was achieved based on the results for elements such as Cu, Na, Fe, volatiles including C6 and aryl alcohols, hydrolytically-released volatiles such as (Z)-linalool oxide and vitispirane, pH, TSS, alanine and proline. Correct classification through DA according to grade was 100% for both vintages. Significant correlations were observed between climate, GI, grade, and berry composition. Climate influenced the synthesis of free and bound volatiles as well as amino acids, sugars, and acids, as a result of higher temperatures and precipitation.
Artillery/mortar type classification based on detected acoustic transients
NASA Astrophysics Data System (ADS)
Morcos, Amir; Grasing, David; Desai, Sachi
2008-04-01
Feature extraction methods based on the statistical analysis of the change in event pressure levels over a period and the level of ambient pressure excitation facilitate the development of a robust classification algorithm. The features reliably discriminates mortar and artillery variants via acoustic signals produced during the launch events. Utilizing acoustic sensors to exploit the sound waveform generated from the blast for the identification of mortar and artillery variants as type A, etcetera through analysis of the waveform. Distinct characteristics arise within the different mortar/artillery variants because varying HE mortar payloads and related charges emphasize varying size events at launch. The waveform holds various harmonic properties distinct to a given mortar/artillery variant that through advanced signal processing and data mining techniques can employed to classify a given type. The skewness and other statistical processing techniques are used to extract the predominant components from the acoustic signatures at ranges exceeding 3000m. Exploiting these techniques will help develop a feature set highly independent of range, providing discrimination based on acoustic elements of the blast wave. Highly reliable discrimination will be achieved with a feed-forward neural network classifier trained on a feature space derived from the distribution of statistical coefficients, frequency spectrum, and higher frequency details found within different energy bands. The processes that are described herein extend current technologies, which emphasis acoustic sensor systems to provide such situational awareness.
Artillery/mortar round type classification to increase system situational awareness
NASA Astrophysics Data System (ADS)
Desai, Sachi; Grasing, David; Morcos, Amir; Hohil, Myron
2008-04-01
Feature extraction methods based on the statistical analysis of the change in event pressure levels over a period and the level of ambient pressure excitation facilitate the development of a robust classification algorithm. The features reliably discriminates mortar and artillery variants via acoustic signals produced during the launch events. Utilizing acoustic sensors to exploit the sound waveform generated from the blast for the identification of mortar and artillery variants as type A, etcetera through analysis of the waveform. Distinct characteristics arise within the different mortar/artillery variants because varying HE mortar payloads and related charges emphasize varying size events at launch. The waveform holds various harmonic properties distinct to a given mortar/artillery variant that through advanced signal processing and data mining techniques can employed to classify a given type. The skewness and other statistical processing techniques are used to extract the predominant components from the acoustic signatures at ranges exceeding 3000m. Exploiting these techniques will help develop a feature set highly independent of range, providing discrimination based on acoustic elements of the blast wave. Highly reliable discrimination will be achieved with a feedforward neural network classifier trained on a feature space derived from the distribution of statistical coefficients, frequency spectrum, and higher frequency details found within different energy bands. The processes that are described herein extend current technologies, which emphasis acoustic sensor systems to provide such situational awareness.
Gottfried, Jennifer L
2011-07-01
The potential of laser-induced breakdown spectroscopy (LIBS) to discriminate biological and chemical threat simulant residues prepared on multiple substrates and in the presence of interferents has been explored. The simulant samples tested include Bacillus atrophaeus spores, Escherichia coli, MS-2 bacteriophage, α-hemolysin from Staphylococcus aureus, 2-chloroethyl ethyl sulfide, and dimethyl methylphosphonate. The residue samples were prepared on polycarbonate, stainless steel and aluminum foil substrates by Battelle Eastern Science and Technology Center. LIBS spectra were collected by Battelle on a portable LIBS instrument developed by A3 Technologies. This paper presents the chemometric analysis of the LIBS spectra using partial least-squares discriminant analysis (PLS-DA). The performance of PLS-DA models developed based on the full LIBS spectra, and selected emission intensities and ratios have been compared. The full-spectra models generally provided better classification results based on the inclusion of substrate emission features; however, the intensity/ratio models were able to correctly identify more types of simulant residues in the presence of interferents. The fusion of the two types of PLS-DA models resulted in a significant improvement in classification performance for models built using multiple substrates. In addition to identifying the major components of residue mixtures, minor components such as growth media and solvents can be identified with an appropriately designed PLS-DA model.
Application of LANDSAT-2 to the management of Delaware's marine and wetland resources
NASA Technical Reports Server (NTRS)
Klemas, V. (Principal Investigator); Bartlett, D.; Philpot, W.; Davis, G.
1976-01-01
The author has identified the following significant results. Digital multispectral classification techniques can be used to discriminate coastal land use and vegetation with 87% to 94% categorization accuracy. Wetlands plant species, representing more detail than U.S.G.S. classification system level 2 categories can be discriminated using LANDSAT data with 85% to 88% accuracy at scales up to 1:24,000.
ERIC Educational Resources Information Center
Kanno, Atsushi
1989-01-01
The study was designed to investigate the learning processes in discrimination shift learning, in terms of developmental views of "logical manipulation by classification." Tasks comparing sizes of intradimensional value-classes and comparing sizes of interdimensional value-classes were devised in order to measure subjects' levels of…
Ortiz-Ramón, Rafael; Larroza, Andrés; Ruiz-España, Silvia; Arana, Estanislao; Moratal, David
2018-05-14
To examine the capability of MRI texture analysis to differentiate the primary site of origin of brain metastases following a radiomics approach. Sixty-seven untreated brain metastases (BM) were found in 3D T1-weighted MRI of 38 patients with cancer: 27 from lung cancer, 23 from melanoma and 17 from breast cancer. These lesions were segmented in 2D and 3D to compare the discriminative power of 2D and 3D texture features. The images were quantized using different number of gray-levels to test the influence of quantization. Forty-three rotation-invariant texture features were examined. Feature selection and random forest classification were implemented within a nested cross-validation structure. Classification was evaluated with the area under receiver operating characteristic curve (AUC) considering two strategies: multiclass and one-versus-one. In the multiclass approach, 3D texture features were more discriminative than 2D features. The best results were achieved for images quantized with 32 gray-levels (AUC = 0.873 ± 0.064) using the top four features provided by the feature selection method based on the p-value. In the one-versus-one approach, high accuracy was obtained when differentiating lung cancer BM from breast cancer BM (four features, AUC = 0.963 ± 0.054) and melanoma BM (eight features, AUC = 0.936 ± 0.070) using the optimal dataset (3D features, 32 gray-levels). Classification of breast cancer and melanoma BM was unsatisfactory (AUC = 0.607 ± 0.180). Volumetric MRI texture features can be useful to differentiate brain metastases from different primary cancers after quantizing the images with the proper number of gray-levels. • Texture analysis is a promising source of biomarkers for classifying brain neoplasms. • MRI texture features of brain metastases could help identifying the primary cancer. • Volumetric texture features are more discriminative than traditional 2D texture features.
Determination of glucose-6-phosphate dehydrogenase cut-off values in a Tunisian population.
Laouini, Naouel; Sahli, Chaima Abdelhafidh; Jouini, Latifa; Haloui, Sabrine; Fredj, Sondes Hadj; Daboubi, Rym; Siala, Hajer; Ouali, Faida; Becher, Meriam; Toumi, Nourelhouda; Bibi, Amina; Messsaoud, Taieb
2017-07-26
Glucose-6-phosphate dehydrogenase (G6PD) deficiency is the commonest enzymopathy worldwide. The incidence depends essentially on the methods used for the assessment. In this respect, we attempted in this study to set cut-off values of G6PD activity to discriminate among normal, heterozygous, and deficient individuals using the World Health Organization (WHO) classification and the receiver operating characteristics (ROC) curve analysis. Blood samples from 250 female and 302 male subjects were enrolled in this study. The G6PD activity was determined using a quantitative assay. The common G6PD mutations in Tunisia were determined using the amplification refractory mutation system (ARMS-PCR) method. The ROC curve was used to choice the best cut-off. Normal G6PD values were 7.69±2.37, 7.86±2.39, and 7.51±2.35 U/g Hb for the entire, male, and female groups, respectively. Cut-off values for the total, male, and female were determined using the WHO classification and ROC curves analysis. In the male population, both cut-offs established using ROC curve analysis (4.00 U/g Hb) and the 60% level (3.82 U/g Hb), respectively are sensitive and specific resulting in a good efficiency of discrimination between deficient and normal males. For the female group the ROC cut-off (5.84 U/g Hb) seems better than the 60% level cut-off (3.88 U/g Hb) to discriminate between normal and heterozygote or homozygote women with higher Youden Index. The establishment of the normal values for a population is important for a better evaluation of the assay result. The ROC curve analysis is an alternative method to determine the status of patients since it correlates DNA analysis and G6PD activity.
A Novel Hyperspectral Microscopic Imaging System for Evaluating Fresh Degree of Pork.
Xu, Yi; Chen, Quansheng; Liu, Yan; Sun, Xin; Huang, Qiping; Ouyang, Qin; Zhao, Jiewen
2018-04-01
This study proposed a rapid microscopic examination method for pork freshness evaluation by using the self-assembled hyperspectral microscopic imaging (HMI) system with the help of feature extraction algorithm and pattern recognition methods. Pork samples were stored for different days ranging from 0 to 5 days and the freshness of samples was divided into three levels which were determined by total volatile basic nitrogen (TVB-N) content. Meanwhile, hyperspectral microscopic images of samples were acquired by HMI system and processed by the following steps for the further analysis. Firstly, characteristic hyperspectral microscopic images were extracted by using principal component analysis (PCA) and then texture features were selected based on the gray level co-occurrence matrix (GLCM). Next, features data were reduced dimensionality by fisher discriminant analysis (FDA) for further building classification model. Finally, compared with linear discriminant analysis (LDA) model and support vector machine (SVM) model, good back propagation artificial neural network (BP-ANN) model obtained the best freshness classification with a 100 % accuracy rating based on the extracted data. The results confirm that the fabricated HMI system combined with multivariate algorithms has ability to evaluate the fresh degree of pork accurately in the microscopic level, which plays an important role in animal food quality control.
A Novel Hyperspectral Microscopic Imaging System for Evaluating Fresh Degree of Pork
Xu, Yi; Chen, Quansheng; Liu, Yan; Sun, Xin; Huang, Qiping; Ouyang, Qin; Zhao, Jiewen
2018-01-01
Abstract This study proposed a rapid microscopic examination method for pork freshness evaluation by using the self-assembled hyperspectral microscopic imaging (HMI) system with the help of feature extraction algorithm and pattern recognition methods. Pork samples were stored for different days ranging from 0 to 5 days and the freshness of samples was divided into three levels which were determined by total volatile basic nitrogen (TVB-N) content. Meanwhile, hyperspectral microscopic images of samples were acquired by HMI system and processed by the following steps for the further analysis. Firstly, characteristic hyperspectral microscopic images were extracted by using principal component analysis (PCA) and then texture features were selected based on the gray level co-occurrence matrix (GLCM). Next, features data were reduced dimensionality by fisher discriminant analysis (FDA) for further building classification model. Finally, compared with linear discriminant analysis (LDA) model and support vector machine (SVM) model, good back propagation artificial neural network (BP-ANN) model obtained the best freshness classification with a 100 % accuracy rating based on the extracted data. The results confirm that the fabricated HMI system combined with multivariate algorithms has ability to evaluate the fresh degree of pork accurately in the microscopic level, which plays an important role in animal food quality control. PMID:29805285
Guo, Jing; Yue, Tianli; Yuan, Yahong
2012-10-01
Apple juice is a complex mixture of volatile and nonvolatile components. To develop discrimination models on the basis of the volatile composition for an efficient classification of apple juices according to apple variety and geographical origin, chromatography volatile profiles of 50 apple juice samples belonging to 6 varieties and from 5 counties of Shaanxi (China) were obtained by headspace solid-phase microextraction coupled with gas chromatography. The volatile profiles were processed as continuous and nonspecific signals through multivariate analysis techniques. Different preprocessing methods were applied to raw chromatographic data. The blind chemometric analysis of the preprocessed chromatographic profiles was carried out. Stepwise linear discriminant analysis (SLDA) revealed satisfactory discriminations of apple juices according to variety and geographical origin, provided respectively 100% and 89.8% success rate in terms of prediction ability. Finally, the discriminant volatile compounds selected by SLDA were identified by gas chromatography-mass spectrometry. The proposed strategy was able to verify the variety and geographical origin of apple juices involving only a reduced number of discriminate retention times selected by the stepwise procedure. This result encourages the similar procedures to be considered in quality control of apple juices. This work presented a method for an efficient discrimination of apple juices according to apple variety and geographical origin using HS-SPME-GC-MS together with chemometric tools. Discrimination models developed could help to achieve greater control over the quality of the juice and to detect possible adulteration of the product. © 2012 Institute of Food Technologists®
Sweeney, Elizabeth M.; Vogelstein, Joshua T.; Cuzzocreo, Jennifer L.; Calabresi, Peter A.; Reich, Daniel S.; Crainiceanu, Ciprian M.; Shinohara, Russell T.
2014-01-01
Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance. PMID:24781953
NASA Astrophysics Data System (ADS)
Chauhan, H.; Krishna Mohan, B.
2014-11-01
The present study was undertaken with the objective to check effectiveness of spectral similarity measures to develop precise crop spectra from the collected hyperspectral field spectra. In Multispectral and Hyperspectral remote sensing, classification of pixels is obtained by statistical comparison (by means of spectral similarity) of known field or library spectra to unknown image spectra. Though these algorithms are readily used, little emphasis has been placed on use of various spectral similarity measures to select precise crop spectra from the set of field spectra. Conventionally crop spectra are developed after rejecting outliers based only on broad-spectrum analysis. Here a successful attempt has been made to develop precise crop spectra based on spectral similarity. As unevaluated data usage leads to uncertainty in the image classification, it is very crucial to evaluate the data. Hence, notwithstanding the conventional method, the data precision has been performed effectively to serve the purpose of the present research work. The effectiveness of developed precise field spectra was evaluated by spectral discrimination measures and found higher discrimination values compared to spectra developed conventionally. Overall classification accuracy for the image classified by field spectra selected conventionally is 51.89% and 75.47% for the image classified by field spectra selected precisely based on spectral similarity. KHAT values are 0.37, 0.62 and Z values are 2.77, 9.59 for image classified using conventional and precise field spectra respectively. Reasonable higher classification accuracy, KHAT and Z values shows the possibility of a new approach for field spectra selection based on spectral similarity measure.
Sweeney, Elizabeth M; Vogelstein, Joshua T; Cuzzocreo, Jennifer L; Calabresi, Peter A; Reich, Daniel S; Crainiceanu, Ciprian M; Shinohara, Russell T
2014-01-01
Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance.
Lyons-Weiler, James; Pelikan, Richard; Zeh, Herbert J; Whitcomb, David C; Malehorn, David E; Bigbee, William L; Hauskrecht, Milos
2005-01-01
Peptide profiles generated using SELDI/MALDI time of flight mass spectrometry provide a promising source of patient-specific information with high potential impact on the early detection and classification of cancer and other diseases. The new profiling technology comes, however, with numerous challenges and concerns. Particularly important are concerns of reproducibility of classification results and their significance. In this work we describe a computational validation framework, called PACE (Permutation-Achieved Classification Error), that lets us assess, for a given classification model, the significance of the Achieved Classification Error (ACE) on the profile data. The framework compares the performance statistic of the classifier on true data samples and checks if these are consistent with the behavior of the classifier on the same data with randomly reassigned class labels. A statistically significant ACE increases our belief that a discriminative signal was found in the data. The advantage of PACE analysis is that it can be easily combined with any classification model and is relatively easy to interpret. PACE analysis does not protect researchers against confounding in the experimental design, or other sources of systematic or random error. We use PACE analysis to assess significance of classification results we have achieved on a number of published data sets. The results show that many of these datasets indeed possess a signal that leads to a statistically significant ACE.
Zuo, Yamin; Deng, Xuehua; Wu, Qing
2018-05-04
Discrimination of Gastrodia elata ( G. elata ) geographical origin is of great importance to pharmaceutical companies and consumers in China. this paper focuses on the feasibility of near infrared spectrum (NIRS) combined multivariate analysis as a rapid and non-destructive method to prove its fit for this purpose. Firstly, 16 batches of G. elata samples from four main-cultivation regions in China were quantified by traditional HPLC method. It showed that samples from different origins could not be efficiently differentiated by the contents of four phenolic compounds in this study. Secondly, the raw near infrared (NIR) spectra of those samples were acquired and two different pattern recognition techniques were used to classify the geographical origins. The results showed that with spectral transformation optimized, discriminant analysis (DA) provided 97% and 99% correct classification for the calibration and validation sets of samples from discriminating of four different main-cultivation regions, and provided 98% and 99% correct classifications for the calibration and validation sets of samples from eight different cities, respectively, which all performed better than the principal component analysis (PCA) method. Thirdly, as phenolic compounds content (PCC) is highly related with the quality of G. elata , synergy interval partial least squares (Si-PLS) was applied to build the PCC prediction model. The coefficient of determination for prediction (R p ²) of the Si-PLS model was 0.9209, and root mean square error for prediction (RMSEP) was 0.338. The two regions (4800 cm −1 ⁻5200 cm −1 , and 5600 cm −1 ⁻6000 cm −1 ) selected by Si-PLS corresponded to the absorptions of aromatic ring in the basic phenolic structure. It can be concluded that NIR spectroscopy combined with PCA, DA and Si-PLS would be a potential tool to provide a reference for the quality control of G. elata.
Assessment of craniometric traits in South Indian dry skulls for sex determination.
Ramamoorthy, Balakrishnan; Pai, Mangala M; Prabhu, Latha V; Muralimanju, B V; Rai, Rajalakshmi
2016-01-01
The skeleton plays an important role in sex determination in forensic anthropology. The skull bone is considered as the second best after the pelvic bone in sex determination due to its better retention of morphological features. Different populations have varying skeletal characteristics, making population specific analysis for sex determination essential. Hence the objective of this investigation is to obtain the accuracy of sex determination using cranial parameters of adult skulls to the highest percentage in South Indian population and to provide a baseline data for sex determination in South India. Seventy adult preserved human skulls were taken and based on the morphological traits were classified into 43 male skulls and 27 female skulls. A total of 26 craniometric parameters were studied. The data were analyzed by using the SPSS discriminant function. The analysis of stepwise, multivariate, and univariate discriminant function gave an accuracy of 77.1%, 85.7%, and 72.9% respectively. Multivariate direct discriminant function analysis classified skull bones into male and female with highest levels of accuracy. Using stepwise discriminant function analysis, the most dimorphic variable to determine sex of the skull, was biauricular breadth followed by weight. Subjecting the best dimorphic variables to univariate discriminant analysis, high levels of accuracy of sexual dimorphism was obtained. Percentage classification of high accuracies were obtained in this study indicating high level of sexual dimorphism in the crania, setting specific discriminant equations for the gender determination in South Indian people. Copyright © 2015 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
Detection and classification of concealed weapons using a magnetometer-based portal
NASA Astrophysics Data System (ADS)
Kotter, Dale K.; Roybal, Lyle G.; Polk, Robert E.
2002-08-01
A concealed weapons detection technology was developed through the support of the National Institute of Justice (NIJ) to provide a non intrusive means for rapid detection, location, and archiving of data (including visual) of potential suspects and weapon threats. This technology, developed by the Idaho National Engineering and Environmental Laboratory (INEEL), has been applied in a portal style weapons detection system using passive magnetic sensors as its basis. This paper will report on enhancements to the weapon detection system to enable weapon classification and to discriminate threats from non-threats. Advanced signal processing algorithms were used to analyze the magnetic spectrum generated when a person passes through a portal. These algorithms analyzed multiple variables including variance in the magnetic signature from random weapon placement and/or orientation. They perform pattern recognition and calculate the probability that the collected magnetic signature correlates to a known database of weapon versus non-weapon responses. Neural networks were used to further discriminate weapon type and identify controlled electronic items such as cell phones and pagers. False alarms were further reduced by analyzing the magnetic detector response by using a Joint Time Frequency Analysis digital signal processing technique. The frequency components and power spectrum for a given sensor response were derived. This unique fingerprint provided additional information to aid in signal analysis. This technology has the potential to produce major improvements in weapon detection and classification.
Automated aural classification used for inter-species discrimination of cetaceans.
Binder, Carolyn M; Hines, Paul C
2014-04-01
Passive acoustic methods are in widespread use to detect and classify cetacean species; however, passive acoustic systems often suffer from large false detection rates resulting from numerous transient sources. To reduce the acoustic analyst workload, automatic recognition methods may be implemented in a two-stage process. First, a general automatic detector is implemented that produces many detections to ensure cetacean presence is noted. Then an automatic classifier is used to significantly reduce the number of false detections and classify the cetacean species. This process requires development of a robust classifier capable of performing inter-species classification. Because human analysts can aurally discriminate species, an automated aural classifier that uses perceptual signal features was tested on a cetacean data set. The classifier successfully discriminated between four species of cetaceans-bowhead, humpback, North Atlantic right, and sperm whales-with 85% accuracy. It also performed well (100% accuracy) for discriminating sperm whale clicks from right whale gunshots. An accuracy of 92% and area under the receiver operating characteristic curve of 0.97 were obtained for the relatively challenging bowhead and humpback recognition case. These results demonstrated that the perceptual features employed by the aural classifier provided powerful discrimination cues for inter-species classification of cetaceans.
Kalegowda, Yogesh; Harmer, Sarah L
2012-03-20
Time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra of mineral samples are complex, comprised of large mass ranges and many peaks. Consequently, characterization and classification analysis of these systems is challenging. In this study, different chemometric and statistical data evaluation methods, based on monolayer sensitive TOF-SIMS data, have been tested for the characterization and classification of copper-iron sulfide minerals (chalcopyrite, chalcocite, bornite, and pyrite) at different flotation pulp conditions (feed, conditioned feed, and Eh modified). The complex mass spectral data sets were analyzed using the following chemometric and statistical techniques: principal component analysis (PCA); principal component-discriminant functional analysis (PC-DFA); soft independent modeling of class analogy (SIMCA); and k-Nearest Neighbor (k-NN) classification. PCA was found to be an important first step in multivariate analysis, providing insight into both the relative grouping of samples and the elemental/molecular basis for those groupings. For samples exposed to oxidative conditions (at Eh ~430 mV), each technique (PCA, PC-DFA, SIMCA, and k-NN) was found to produce excellent classification. For samples at reductive conditions (at Eh ~ -200 mV SHE), k-NN and SIMCA produced the most accurate classification. Phase identification of particles that contain the same elements but a different crystal structure in a mixed multimetal mineral system has been achieved.
NASA Astrophysics Data System (ADS)
Garcia-Allende, Pilar Beatriz; Conde, Olga M.; Madruga, Francisco J.; Cubillas, Ana M.; Lopez-Higuera, Jose M.
2008-03-01
A non-intrusive infrared sensor for the detection of spurious elements in an industrial raw material chain has been developed. The system is an extension to the whole near infrared range of the spectrum of a previously designed system based on the Vis-NIR range (400 - 1000 nm). It incorporates a hyperspectral imaging spectrograph able to register simultaneously the NIR reflected spectrum of the material under study along all the points of an image line. The working material has been different tobacco leaf blends mixed with typical spurious elements of this field such as plastics, cardboards, etc. Spurious elements are discriminated automatically by an artificial neural network able to perform the classification with a high degree of accuracy. Due to the high amount of information involved in the process, Principal Component Analysis is first applied to perform data redundancy removal. By means of the extension to the whole NIR range of the spectrum, from 1000 to 2400 nm, the characterization of the material under test is highly improved. The developed technique could be applied to the classification and discrimination of other materials, and, as a consequence of its non-contact operation it is particularly suitable for food quality control.
NASA Astrophysics Data System (ADS)
Xin, Ni; Gu, Xiao-Feng; Wu, Hao; Hu, Yu-Zhu; Yang, Zhong-Lin
2012-04-01
Most herbal medicines could be processed to fulfill the different requirements of therapy. The purpose of this study was to discriminate between raw and processed Dipsacus asperoides, a common traditional Chinese medicine, based on their near infrared (NIR) spectra. Least squares-support vector machine (LS-SVM) and random forests (RF) were employed for full-spectrum classification. Three types of kernels, including linear kernel, polynomial kernel and radial basis function kernel (RBF), were checked for optimization of LS-SVM model. For comparison, a linear discriminant analysis (LDA) model was performed for classification, and the successive projections algorithm (SPA) was executed prior to building an LDA model to choose an appropriate subset of wavelengths. The three methods were applied to a dataset containing 40 raw herbs and 40 corresponding processed herbs. We ran 50 runs of 10-fold cross validation to evaluate the model's efficiency. The performance of the LS-SVM with RBF kernel (RBF LS-SVM) was better than the other two kernels. The RF, RBF LS-SVM and SPA-LDA successfully classified all test samples. The mean error rates for the 50 runs of 10-fold cross validation were 1.35% for RBF LS-SVM, 2.87% for RF, and 2.50% for SPA-LDA. The best classification results were obtained by using LS-SVM with RBF kernel, while RF was fast in the training and making predictions.
A Hybrid Sensing Approach for Pure and Adulterated Honey Classification
Subari, Norazian; Saleh, Junita Mohamad; Shakaff, Ali Yeon Md; Zakaria, Ammar
2012-01-01
This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach able to distinct pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data. PMID:23202033
[Social self-positioning as indicator of socioeconomic status].
Fernández, E; Alonso, R M; Quer, A; Borrell, C; Benach, J; Alonso, J; Gómez, G
2000-01-01
Self-perceived class results from directly questioning subjects about his or her social class. The aim of this investigation was to analyse self-perceived class in relation to other indicator variables of socioeconomic level. Data from the 1994 Catalan Health Interview Survey, a cross-sectional survey of a representative sample of the non-institutionalised population of Catalonia was used. We conducted a discriminant analysis to compute the degree of right classification when different socioeconomic variables potentially related to self-perceived class were considered. All subjects who directly answered the questionnaire were included (N = 12,245). With the aim of obtaining the discriminant functions in a group of subjects and to validate it in another one, the subjects were divided into two random samples, containing approximately 75% and 25% of subjects (analysis sample, n = 9,248; and validation sample, n = 2,997). The final function for men and women included level of education, social class (based in occupation) and equivalent income. This function correctly classified 40.9% of the subjects in the analysis sample and 39.2% in the validation sample. Two other functions were selected for men and women separately. In men, the function included level of education, professional category, and family income (39.2% of classification in analysis sample and 37.2% in validation sample). In women, the function (level of education, working status, and equivalent income) correctly classified 40.3% of women in analysis sample whereas the percentage was 38.9% in validation sample. The percentages of right classification were higher for the highest and lowest classes. These results show the utility of a simple variable to self-position within the social scale. Self-perceived class is related to education, income, and working determinants.
Provenance establishment of coffee using solution ICP-MS and ICP-AES.
Valentin, Jenna L; Watling, R John
2013-11-01
Statistical interpretation of the concentrations of 59 elements, determined using solution based inductively coupled plasma mass spectrometry (ICP-MS) and inductively coupled plasma emission spectroscopy (ICP-AES), was used to establish the provenance of coffee samples from 15 countries across five continents. Data confirmed that the harvest year, degree of ripeness and whether the coffees were green or roasted had little effect on the elemental composition of the coffees. The application of linear discriminant analysis and principal component analysis of the elemental concentrations permitted up to 96.9% correct classification of the coffee samples according to their continent of origin. When samples from each continent were considered separately, up to 100% correct classification of coffee samples into their countries, and plantations of origin was achieved. This research demonstrates the potential of using elemental composition, in combination with statistical classification methods, for accurate provenance establishment of coffee. Copyright © 2013 Elsevier Ltd. All rights reserved.
Mining sequential patterns for protein fold recognition.
Exarchos, Themis P; Papaloukas, Costas; Lampros, Christos; Fotiadis, Dimitrios I
2008-02-01
Protein data contain discriminative patterns that can be used in many beneficial applications if they are defined correctly. In this work sequential pattern mining (SPM) is utilized for sequence-based fold recognition. Protein classification in terms of fold recognition plays an important role in computational protein analysis, since it can contribute to the determination of the function of a protein whose structure is unknown. Specifically, one of the most efficient SPM algorithms, cSPADE, is employed for the analysis of protein sequence. A classifier uses the extracted sequential patterns to classify proteins in the appropriate fold category. For training and evaluating the proposed method we used the protein sequences from the Protein Data Bank and the annotation of the SCOP database. The method exhibited an overall accuracy of 25% in a classification problem with 36 candidate categories. The classification performance reaches up to 56% when the five most probable protein folds are considered.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jing, Yaqi; Meng, Qinghao, E-mail: qh-meng@tju.edu.cn; Qi, Peifeng
An electronic nose (e-nose) was designed to classify Chinese liquors of the same aroma style. A new method of feature reduction which combined feature selection with feature extraction was proposed. Feature selection method used 8 feature-selection algorithms based on information theory and reduced the dimension of the feature space to 41. Kernel entropy component analysis was introduced into the e-nose system as a feature extraction method and the dimension of feature space was reduced to 12. Classification of Chinese liquors was performed by using back propagation artificial neural network (BP-ANN), linear discrimination analysis (LDA), and a multi-linear classifier. The classificationmore » rate of the multi-linear classifier was 97.22%, which was higher than LDA and BP-ANN. Finally the classification of Chinese liquors according to their raw materials and geographical origins was performed using the proposed multi-linear classifier and classification rate was 98.75% and 100%, respectively.« less
NASA Astrophysics Data System (ADS)
Díaz-Ayil, G.; Amouroux, M.; Blondel, W. C. P. M.; Bourg-Heckly, G.; Leroux, A.; Guillemin, F.; Granjon, Y.
2009-07-01
This paper deals with the development and application of in vivo spatially-resolved bimodal spectroscopy (AutoFluorescence AF and Diffuse Reflectance DR), to discriminate various stages of skin precancer in a preclinical model (UV-irradiated mouse): Compensatory Hyperplasia CH, Atypical Hyperplasia AH and Dysplasia D. A programmable instrumentation was developed for acquiring AF emission spectra using 7 excitation wavelengths: 360, 368, 390, 400, 410, 420 and 430 nm, and DR spectra in the 390-720 nm wavelength range. After various steps of intensity spectra preprocessing (filtering, spectral correction and intensity normalization), several sets of spectral characteristics were extracted and selected based on their discrimination power statistically tested for every pair-wise comparison of histological classes. Data reduction with Principal Components Analysis (PCA) was performed and 3 classification methods were implemented (k-NN, LDA and SVM), in order to compare diagnostic performance of each method. Diagnostic performance was studied and assessed in terms of sensitivity (Se) and specificity (Sp) as a function of the selected features, of the combinations of 3 different inter-fibers distances and of the numbers of principal components, such that: Se and Sp ≈ 100% when discriminating CH vs. others; Sp ≈ 100% and Se > 95% when discriminating Healthy vs. AH or D; Sp ≈ 74% and Se ≈ 63%for AH vs. D.
Miao, Minmin; Zeng, Hong; Wang, Aimin; Zhao, Changsen; Liu, Feixiang
2017-02-15
Common spatial pattern (CSP) is most widely used in motor imagery based brain-computer interface (BCI) systems. In conventional CSP algorithm, pairs of the eigenvectors corresponding to both extreme eigenvalues are selected to construct the optimal spatial filter. In addition, an appropriate selection of subject-specific time segments and frequency bands plays an important role in its successful application. This study proposes to optimize spatial-frequency-temporal patterns for discriminative feature extraction. Spatial optimization is implemented by channel selection and finding discriminative spatial filters adaptively on each time-frequency segment. A novel Discernibility of Feature Sets (DFS) criteria is designed for spatial filter optimization. Besides, discriminative features located in multiple time-frequency segments are selected automatically by the proposed sparse time-frequency segment common spatial pattern (STFSCSP) method which exploits sparse regression for significant features selection. Finally, a weight determined by the sparse coefficient is assigned for each selected CSP feature and we propose a Weighted Naïve Bayesian Classifier (WNBC) for classification. Experimental results on two public EEG datasets demonstrate that optimizing spatial-frequency-temporal patterns in a data-driven manner for discriminative feature extraction greatly improves the classification performance. The proposed method gives significantly better classification accuracies in comparison with several competing methods in the literature. The proposed approach is a promising candidate for future BCI systems. Copyright © 2016 Elsevier B.V. All rights reserved.
Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Oh, Kyoungmin; Kim, Geonseob; Ham, Hyeonheui; Kim, Seongmin; Kim, Moon S.
2018-01-01
Fusarium is a common fungal disease in grains that reduces the yield of barley and wheat. In this study, a near infrared reflectance spectroscopic technique was used with a statistical prediction model to rapidly and non-destructively discriminate grain samples contaminated with Fusarium. Reflectance spectra were acquired from hulled barley, naked barley, and wheat samples contaminated with Fusarium using near infrared reflectance (NIR) spectroscopy with a wavelength range of 1175–2170 nm. After measurement, the samples were cultured in a medium to discriminate contaminated samples. A partial least square discrimination analysis (PLS-DA) prediction model was developed using the acquired reflectance spectra and the culture results. The correct classification rate (CCR) of Fusarium for the hulled barley, naked barley, and wheat samples developed using raw spectra was 98% or higher. The accuracy of discrimination prediction improved when second and third-order derivative pretreatments were applied. The grains contaminated with Fusarium could be rapidly discriminated using spectroscopy technology and a PLS-DA discrimination model, and the potential of the non-destructive discrimination method could be verified. PMID:29301319
Photoacoustic discrimination of vascular and pigmented lesions using classical and Bayesian methods
NASA Astrophysics Data System (ADS)
Swearingen, Jennifer A.; Holan, Scott H.; Feldman, Mary M.; Viator, John A.
2010-01-01
Discrimination of pigmented and vascular lesions in skin can be difficult due to factors such as size, subungual location, and the nature of lesions containing both melanin and vascularity. Misdiagnosis may lead to precancerous or cancerous lesions not receiving proper medical care. To aid in the rapid and accurate diagnosis of such pathologies, we develop a photoacoustic system to determine the nature of skin lesions in vivo. By irradiating skin with two laser wavelengths, 422 and 530 nm, we induce photoacoustic responses, and the relative response at these two wavelengths indicates whether the lesion is pigmented or vascular. This response is due to the distinct absorption spectrum of melanin and hemoglobin. In particular, pigmented lesions have ratios of photoacoustic amplitudes of approximately 1.4 to 1 at the two wavelengths, while vascular lesions have ratios of about 4.0 to 1. Furthermore, we consider two statistical methods for conducting classification of lesions: standard multivariate analysis classification techniques and a Bayesian-model-based approach. We study 15 human subjects with eight vascular and seven pigmented lesions. Using the classical method, we achieve a perfect classification rate, while the Bayesian approach has an error rate of 20%.
High-speed potato grading and quality inspection based on a color vision system
NASA Astrophysics Data System (ADS)
Noordam, Jacco C.; Otten, Gerwoud W.; Timmermans, Toine J. M.; van Zwol, Bauke H.
2000-03-01
A high-speed machine vision system for the quality inspection and grading of potatoes has been developed. The vision system grades potatoes on size, shape and external defects such as greening, mechanical damages, rhizoctonia, silver scab, common scab, cracks and growth cracks. A 3-CCD line-scan camera inspects the potatoes in flight as they pass under the camera. The use of mirrors to obtain a 360-degree view of the potato and the lack of product holders guarantee a full view of the potato. To achieve the required capacity of 12 tons/hour, 11 SHARC Digital Signal Processors perform the image processing and classification tasks. The total capacity of the system is about 50 potatoes/sec. The color segmentation procedure uses Linear Discriminant Analysis (LDA) in combination with a Mahalanobis distance classifier to classify the pixels. The procedure for the detection of misshapen potatoes uses a Fourier based shape classification technique. Features such as area, eccentricity and central moments are used to discriminate between similar colored defects. Experiments with red and yellow skin-colored potatoes have shown that the system is robust and consistent in its classification.
Shen, Shi; Wang, Jingbo; Zhuo, Qin; Chen, Xi; Liu, Tingting; Zhang, Shuang-Qing
2018-05-08
Phenolics and flavonoids in honey are considered as the main phytonutrients which not only act as natural antioxidants, but can also be used as floral markers for honey identification. In this study, the chemical profiles of phenolics and flavonoids, antioxidant competences including total phenolic content, DPPH and ABTS assays and discrimination using chemometric analysis of various Chinese monofloral honeys from six botanical origins (acacia, Vitex , linden, rapeseed, Astragalus and Codonopsis ) were examined. A reproducible and sensitive ultra-performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) method was optimized and validated for the simultaneous determination of 38 phenolics, flavonoids and abscisic acid in honey. Formononetin, ononin, calycosin and calycosin-7- O -β-d-glucoside were identified and quantified in honeys for the first time. Principal component analysis (PCA) showed obvious differences among the honey samples in three-dimensional space accounting for 72.63% of the total variance. Hierarchical cluster analysis (HCA) also revealed that the botanical origins of honey samples correlated with their phenolic and flavonoid contents. Partial least squares-discriminant analysis (PLS-DA) classification was performed to derive a model with high prediction ability. Orthogonal partial least squares-discriminant analysis (OPLS-DA) model was employed to identify markers specific to a particular honey type. The results indicated that Chinese honeys contained various and discriminative phenolics and flavonoids, as well as antioxidant competence from different botanical origins, which was an alternative approach to honey identification and nutritional evaluation.
ERIC Educational Resources Information Center
Wighting, Mervyn J.; Liu, Jing; Rovai, Alfred P.
2008-01-01
Discriminant analysis was used to determine whether classifications could be made between students enrolled in e-learning and in face-to-face university courses (N = 353) based on their scores from separate instruments measuring sense of community and motivation. Study results provide evidence that the predictors were able to distinguish between…
Sepulveda, Esteban; Franco, José G; Trzepacz, Paula T; Gaviria, Ana M; Viñuelas, Eva; Palma, José; Ferré, Gisela; Grau, Imma; Vilella, Elisabet
2015-01-01
Delirium diagnosis in elderly is often complicated by underlying dementia. We evaluated performance of the Delirium Rating Scale-Revised-98 (DRS-R98) in patients with high dementia prevalence and also assessed concordance among past and current diagnostic criteria for delirium. Cross-sectional analysis of newly admitted patients to a skilled nursing facility over 6 months, who were rated within 24-48 hours after admission. Interview for Diagnostic and Statistical Manual of Mental Disorders, 3rd edition-R (DSM)-III-R, DSM-IV, DSM-5, and International Classification of Diseases 10th edition delirium ratings, administration of the DRS-R98, and assessment of dementia using the Informant Questionnaire on Cognitive Decline in the Elderly were independently performed by 3 researchers. Discriminant analyses (receiver operating characteristics curves) were used to study DRS-R98 accuracy against different diagnostic criteria. Hanley and McNeil test compared the area under the curve for DRS-R98's discriminant performance for all diagnostic criteria. Dementia was present in 85/125 (68.0%) subjects, and 36/125 (28.8%) met criteria for delirium by at least 1 classification system, whereas only 19/36 (52.8%) did by all. DSM-III-R diagnosed the most as delirious (27.2%), followed by DSM-5 (24.8%), DSM-IV-TR (22.4%), and International Classification of Diseases 10th edition (16%). DRS-R98 had the highest AUC when discriminating DSM-III-R delirium (92.9%), followed by DSM-IV (92.4%), DSM-5 (91%), and International Classification of Diseases 10th edition (90.5%), without statistical differences among them. The best DRS-R98 cutoff score was ≥14.5 for all diagnostic systems except International Classification of Diseases 10th edition (≥15.5). There is a low concordance across diagnostic systems for identification of delirium. The DRS-R98 performs well despite differences across classification systems perhaps because it broadly assesses phenomenology, even in this population with a high prevalence of dementia. Copyright © 2015 The Academy of Psychosomatic Medicine. Published by Elsevier Inc. All rights reserved.
Classification of product inspection items using nonlinear features
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.; Lee, H.-W.
1998-03-01
Automated processing and classification of real-time x-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a system for automated non-invasive detection of defective product items on a conveyor belt. This approach involves two main steps: preprocessing and classification. Preprocessing locates individual items and segments ones that touch using a modified watershed algorithm. The second stage involves extraction of features that allow discrimination between damaged and clean items (pistachio nuts). This feature extraction and classification stage is the new aspect of this paper. We use a new nonlinear feature extraction scheme called the maximum representation and discriminating feature (MRDF) extraction method to compute nonlinear features that are used as inputs to a classifier. The MRDF is shown to provide better classification and a better ROC (receiver operating characteristic) curve than other methods.
Metacarpophalangeal pattern profile analysis in Leri-Weill dyschondrosteosis.
Laurencikas, E; Soderman, E; Grigelioniene, G; Hagenäs, L; Jorulf, H
2005-04-01
To analyze the metacarpophalangeal profile (MCPP) in individuals with Leri-Weill dyschondrosteosis (LWD) and to assess its value as a possible contributor to early diagnosis. Hand profiles of 39 individuals with a diagnosis of LWD were calculated and analyzed. Discriminant analysis was applied to differentiate between LWD and normal individuals. There was a distinct pattern profile in LWD. Mean pattern profile showed two bone-shortening gradients, with increasing shortening from distal to proximal and from medial to lateral. Distal phalanx 2 was disproportionately long and second metacarpal was disproportionately short. Discriminant analysis yielded correct classification in 72% of analyzed cases. MCPP is not age-related and the analysis can be applied at any age, facilitating early diagnosis of LWD. In view of its availability, low costs, and diagnostic value, MCPP analysis should be considered as a routine method in the patients of short stature where LWD is suspected.
Neural CMOS-integrated circuit and its application to data classification.
Göknar, Izzet Cem; Yildiz, Merih; Minaei, Shahram; Deniz, Engin
2012-05-01
Implementation and new applications of a tunable complementary metal-oxide-semiconductor-integrated circuit (CMOS-IC) of a recently proposed classifier core-cell (CC) are presented and tested with two different datasets. With two algorithms-one based on Fisher's linear discriminant analysis and the other based on perceptron learning, used to obtain CCs' tunable parameters-the Haberman and Iris datasets are classified. The parameters so obtained are used for hard-classification of datasets with a neural network structured circuit. Classification performance and coefficient calculation times for both algorithms are given. The CC has 6-ns response time and 1.8-mW power consumption. The fabrication parameters used for the IC are taken from CMOS AMS 0.35-μm technology.
Robust linear discriminant analysis with distance based estimators
NASA Astrophysics Data System (ADS)
Lim, Yai-Fung; Yahaya, Sharipah Soaad Syed; Ali, Hazlina
2017-11-01
Linear discriminant analysis (LDA) is one of the supervised classification techniques concerning relationship between a categorical variable and a set of continuous variables. The main objective of LDA is to create a function to distinguish between populations and allocating future observations to previously defined populations. Under the assumptions of normality and homoscedasticity, the LDA yields optimal linear discriminant rule (LDR) between two or more groups. However, the optimality of LDA highly relies on the sample mean and pooled sample covariance matrix which are known to be sensitive to outliers. To alleviate these conflicts, a new robust LDA using distance based estimators known as minimum variance vector (MVV) has been proposed in this study. The MVV estimators were used to substitute the classical sample mean and classical sample covariance to form a robust linear discriminant rule (RLDR). Simulation and real data study were conducted to examine on the performance of the proposed RLDR measured in terms of misclassification error rates. The computational result showed that the proposed RLDR is better than the classical LDR and was comparable with the existing robust LDR.
NASA Astrophysics Data System (ADS)
Dronova, I.; Gong, P.; Wang, L.; Clinton, N.; Fu, W.; Qi, S.
2011-12-01
Remote sensing-based vegetation classifications representing plant function such as photosynthesis and productivity are challenging in wetlands with complex cover and difficult field access. Recent advances in object-based image analysis (OBIA) and machine-learning algorithms offer new classification tools; however, few comparisons of different algorithms and spatial scales have been discussed to date. We applied OBIA to delineate wetland plant functional types (PFTs) for Poyang Lake, the largest freshwater lake in China and Ramsar wetland conservation site, from 30-m Landsat TM scene at the peak of spring growing season. We targeted major PFTs (C3 grasses, C3 forbs and different types of C4 grasses and aquatic vegetation) that are both key players in system's biogeochemical cycles and critical providers of waterbird habitat. Classification results were compared among: a) several object segmentation scales (with average object sizes 900-9000 m2); b) several families of statistical classifiers (including Bayesian, Logistic, Neural Network, Decision Trees and Support Vector Machines) and c) two hierarchical levels of vegetation classification, a generalized 3-class set and more detailed 6-class set. We found that classification benefited from object-based approach which allowed including object shape, texture and context descriptors in classification. While a number of classifiers achieved high accuracy at the finest pixel-equivalent segmentation scale, the highest accuracies and best agreement among algorithms occurred at coarser object scales. No single classifier was consistently superior across all scales, although selected algorithms of Neural Network, Logistic and K-Nearest Neighbors families frequently provided the best discrimination of classes at different scales. The choice of vegetation categories also affected classification accuracy. The 6-class set allowed for higher individual class accuracies but lower overall accuracies than the 3-class set because individual classes differed in scales at which they were best discriminated from others. Main classification challenges included a) presence of C3 grasses in C4-grass areas, particularly following harvesting of C4 reeds and b) mixtures of emergent, floating and submerged aquatic plants at sub-object and sub-pixel scales. We conclude that OBIA with advanced statistical classifiers offers useful instruments for landscape vegetation analyses, and that spatial scale considerations are critical in mapping PFTs, while multi-scale comparisons can be used to guide class selection. Future work will further apply fuzzy classification and field-collected spectral data for PFT analysis and compare results with MODIS PFT products.
NASA Astrophysics Data System (ADS)
Nitze, Ingmar; Barrett, Brian; Cawkwell, Fiona
2015-02-01
The analysis and classification of land cover is one of the principal applications in terrestrial remote sensing. Due to the seasonal variability of different vegetation types and land surface characteristics, the ability to discriminate land cover types changes over time. Multi-temporal classification can help to improve the classification accuracies, but different constraints, such as financial restrictions or atmospheric conditions, may impede their application. The optimisation of image acquisition timing and frequencies can help to increase the effectiveness of the classification process. For this purpose, the Feature Importance (FI) measure of the state-of-the art machine learning method Random Forest was used to determine the optimal image acquisition periods for a general (Grassland, Forest, Water, Settlement, Peatland) and Grassland specific (Improved Grassland, Semi-Improved Grassland) land cover classification in central Ireland based on a 9-year time-series of MODIS Terra 16 day composite data (MOD13Q1). Feature Importances for each acquisition period of the Enhanced Vegetation Index (EVI) and Normalised Difference Vegetation Index (NDVI) were calculated for both classification scenarios. In the general land cover classification, the months December and January showed the highest, and July and August the lowest separability for both VIs over the entire nine-year period. This temporal separability was reflected in the classification accuracies, where the optimal choice of image dates outperformed the worst image date by 13% using NDVI and 5% using EVI on a mono-temporal analysis. With the addition of the next best image periods to the data input the classification accuracies converged quickly to their limit at around 8-10 images. The binary classification schemes, using two classes only, showed a stronger seasonal dependency with a higher intra-annual, but lower inter-annual variation. Nonetheless anomalous weather conditions, such as the cold winter of 2009/2010 can alter the temporal separability pattern significantly. Due to the extensive use of the NDVI for land cover discrimination, the findings of this study should be transferrable to data from other optical sensors with a higher spatial resolution. However, the high impact of outliers from the general climatic pattern highlights the limitation of spatial transferability to locations with different climatic and land cover conditions. The use of high-temporal, moderate resolution data such as MODIS in conjunction with machine-learning techniques proved to be a good base for the prediction of image acquisition timing for optimal land cover classification results.
Jo, J A; Fang, Q; Papaioannou, T; Qiao, J H; Fishbein, M C; Dorafshar, A; Reil, T; Baker, D; Freischlag, J; Marcu, L
2004-01-01
This study investigates the ability of new analytical methods of time-resolved laser-induced fluorescence spectroscopy (TR-LIFS) data to characterize tissue in-vivo, such as the composition of atherosclerotic vulnerable plaques. A total of 73 TR-LIFS measurements were taken in-vivo from the aorta of 8 rabbits, and subsequently analyzed using the Laguerre deconvolution technique. The investigated spots were classified as normal aorta, thin or thick lesions, and lesions rich in either collagen or macrophages/foam-cells. Different linear and nonlinear classification algorithms (linear discriminant analysis, stepwise linear discriminant analysis, principal component analysis, and feedforward neural networks) were developed using spectral and TR features (ratios of intensity values and Laguerre expansion coefficients, respectively). Normal intima and thin lesions were discriminated from thick lesions (sensitivity >90%, specificity 100%) using only spectral features. However, both spectral and time-resolved features were necessary to discriminate thick lesions rich in collagen from thick lesions rich in foam cells (sensitivity >85%, specificity >93%), and thin lesions rich in foam cells from normal aorta and thin lesions rich in collagen (sensitivity >85%, specificity >94%). Based on these findings, we believe that TR-LIFS information derived from the Laguerre expansion coefficients can provide a valuable additional dimension for in-vivo tissue characterization.
Proposition of a Classification of Adult Patients with Hemiparesis in Chronic Phase.
Chantraine, Frédéric; Filipetti, Paul; Schreiber, Céline; Remacle, Angélique; Kolanowski, Elisabeth; Moissenet, Florent
2016-01-01
Patients who have developed hemiparesis as a result of a central nervous system lesion, often experience reduced walking capacity and worse gait quality. Although clinically, similar gait patterns have been observed, presently, no clinically driven classification has been validated to group these patients' gait abnormalities at the level of the hip, knee and ankle joints. This study has thus intended to put forward a new gait classification for adult patients with hemiparesis in chronic phase, and to validate its discriminatory capacity. Twenty-six patients with hemiparesis were included in this observational study. Following a clinical examination, a clinical gait analysis, complemented by a video analysis, was performed whereby participants were requested to walk spontaneously on a 10m walkway. A patient's classification was established from clinical examination data and video analysis. This classification was made up of three groups, including two sub-groups, defined with key abnormalities observed whilst walking. Statistical analysis was achieved on the basis of 25 parameters resulting from the clinical gait analysis in order to assess the discriminatory characteristic of the classification as displayed by the walking speed and kinematic parameters. Results revealed that the parameters related to the discriminant criteria of the proposed classification were all significantly different between groups and subgroups. More generally, nearly two thirds of the 25 parameters showed significant differences (p<0.05) between the groups and sub-groups. However, prior to being fully validated, this classification must still be tested on a larger number of patients, and the repeatability of inter-operator measures must be assessed. This classification enables patients to be grouped on the basis of key abnormalities observed whilst walking and has the advantage of being able to be used in clinical routines without necessitating complex apparatus. In the midterm, this classification may allow a decision-tree of therapies to be developed on the basis of the group in which the patient has been categorised.
Proposition of a Classification of Adult Patients with Hemiparesis in Chronic Phase
Filipetti, Paul; Remacle, Angélique; Kolanowski, Elisabeth
2016-01-01
Background Patients who have developed hemiparesis as a result of a central nervous system lesion, often experience reduced walking capacity and worse gait quality. Although clinically, similar gait patterns have been observed, presently, no clinically driven classification has been validated to group these patients’ gait abnormalities at the level of the hip, knee and ankle joints. This study has thus intended to put forward a new gait classification for adult patients with hemiparesis in chronic phase, and to validate its discriminatory capacity. Methods and Findings Twenty-six patients with hemiparesis were included in this observational study. Following a clinical examination, a clinical gait analysis, complemented by a video analysis, was performed whereby participants were requested to walk spontaneously on a 10m walkway. A patient’s classification was established from clinical examination data and video analysis. This classification was made up of three groups, including two sub-groups, defined with key abnormalities observed whilst walking. Statistical analysis was achieved on the basis of 25 parameters resulting from the clinical gait analysis in order to assess the discriminatory characteristic of the classification as displayed by the walking speed and kinematic parameters. Results revealed that the parameters related to the discriminant criteria of the proposed classification were all significantly different between groups and subgroups. More generally, nearly two thirds of the 25 parameters showed significant differences (p<0.05) between the groups and sub-groups. However, prior to being fully validated, this classification must still be tested on a larger number of patients, and the repeatability of inter-operator measures must be assessed. Conclusions This classification enables patients to be grouped on the basis of key abnormalities observed whilst walking and has the advantage of being able to be used in clinical routines without necessitating complex apparatus. In the midterm, this classification may allow a decision-tree of therapies to be developed on the basis of the group in which the patient has been categorised. PMID:27271533
Conscientious Classification: A Data Scientist's Guide to Discrimination-Aware Classification.
d'Alessandro, Brian; O'Neil, Cathy; LaGatta, Tom
2017-06-01
Recent research has helped to cultivate growing awareness that machine-learning systems fueled by big data can create or exacerbate troubling disparities in society. Much of this research comes from outside of the practicing data science community, leaving its members with little concrete guidance to proactively address these concerns. This article introduces issues of discrimination to the data science community on its own terms. In it, we tour the familiar data-mining process while providing a taxonomy of common practices that have the potential to produce unintended discrimination. We also survey how discrimination is commonly measured, and suggest how familiar development processes can be augmented to mitigate systems' discriminatory potential. We advocate that data scientists should be intentional about modeling and reducing discriminatory outcomes. Without doing so, their efforts will result in perpetuating any systemic discrimination that may exist, but under a misleading veil of data-driven objectivity.
Maione, Camila; Barbosa, Rommel Melgaço
2018-01-24
Rice is one of the most important staple foods around the world. Authentication of rice is one of the most addressed concerns in the present literature, which includes recognition of its geographical origin and variety, certification of organic rice and many other issues. Good results have been achieved by multivariate data analysis and data mining techniques when combined with specific parameters for ascertaining authenticity and many other useful characteristics of rice, such as quality, yield and others. This paper brings a review of the recent research projects on discrimination and authentication of rice using multivariate data analysis and data mining techniques. We found that data obtained from image processing, molecular and atomic spectroscopy, elemental fingerprinting, genetic markers, molecular content and others are promising sources of information regarding geographical origin, variety and other aspects of rice, being widely used combined with multivariate data analysis techniques. Principal component analysis and linear discriminant analysis are the preferred methods, but several other data classification techniques such as support vector machines, artificial neural networks and others are also frequently present in some studies and show high performance for discrimination of rice.
Application of texture analysis method for mammogram density classification
NASA Astrophysics Data System (ADS)
Nithya, R.; Santhi, B.
2017-07-01
Mammographic density is considered a major risk factor for developing breast cancer. This paper proposes an automated approach to classify breast tissue types in digital mammogram. The main objective of the proposed Computer-Aided Diagnosis (CAD) system is to investigate various feature extraction methods and classifiers to improve the diagnostic accuracy in mammogram density classification. Texture analysis methods are used to extract the features from the mammogram. Texture features are extracted by using histogram, Gray Level Co-Occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Difference Matrix (GLDM), Local Binary Pattern (LBP), Entropy, Discrete Wavelet Transform (DWT), Wavelet Packet Transform (WPT), Gabor transform and trace transform. These extracted features are selected using Analysis of Variance (ANOVA). The features selected by ANOVA are fed into the classifiers to characterize the mammogram into two-class (fatty/dense) and three-class (fatty/glandular/dense) breast density classification. This work has been carried out by using the mini-Mammographic Image Analysis Society (MIAS) database. Five classifiers are employed namely, Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), Naive Bayes (NB), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Experimental results show that ANN provides better performance than LDA, NB, KNN and SVM classifiers. The proposed methodology has achieved 97.5% accuracy for three-class and 99.37% for two-class density classification.
An integrated method for cancer classification and rule extraction from microarray data
Huang, Liang-Tsung
2009-01-01
Different microarray techniques recently have been successfully used to investigate useful information for cancer diagnosis at the gene expression level due to their ability to measure thousands of gene expression levels in a massively parallel way. One important issue is to improve classification performance of microarray data. However, it would be ideal that influential genes and even interpretable rules can be explored at the same time to offer biological insight. Introducing the concepts of system design in software engineering, this paper has presented an integrated and effective method (named X-AI) for accurate cancer classification and the acquisition of knowledge from DNA microarray data. This method included a feature selector to systematically extract the relative important genes so as to reduce the dimension and retain as much as possible of the class discriminatory information. Next, diagonal quadratic discriminant analysis (DQDA) was combined to classify tumors, and generalized rule induction (GRI) was integrated to establish association rules which can give an understanding of the relationships between cancer classes and related genes. Two non-redundant datasets of acute leukemia were used to validate the proposed X-AI, showing significantly high accuracy for discriminating different classes. On the other hand, I have presented the abilities of X-AI to extract relevant genes, as well as to develop interpretable rules. Further, a web server has been established for cancer classification and it is freely available at . PMID:19272192
Improving oil classification quality from oil spill fingerprint beyond six sigma approach.
Juahir, Hafizan; Ismail, Azimah; Mohamed, Saiful Bahri; Toriman, Mohd Ekhwan; Kassim, Azlina Md; Zain, Sharifuddin Md; Ahmad, Wan Kamaruzaman Wan; Wah, Wong Kok; Zali, Munirah Abdul; Retnam, Ananthy; Taib, Mohd Zaki Mohd; Mokhtar, Mazlin
2017-07-15
This study involves the use of quality engineering in oil spill classification based on oil spill fingerprinting from GC-FID and GC-MS employing the six-sigma approach. The oil spills are recovered from various water areas of Peninsular Malaysia and Sabah (East Malaysia). The study approach used six sigma methodologies that effectively serve as the problem solving in oil classification extracted from the complex mixtures of oil spilled dataset. The analysis of six sigma link with the quality engineering improved the organizational performance to achieve its objectivity of the environmental forensics. The study reveals that oil spills are discriminated into four groups' viz. diesel, hydrocarbon fuel oil (HFO), mixture oil lubricant and fuel oil (MOLFO) and waste oil (WO) according to the similarity of the intrinsic chemical properties. Through the validation, it confirmed that four discriminant component, diesel, hydrocarbon fuel oil (HFO), mixture oil lubricant and fuel oil (MOLFO) and waste oil (WO) dominate the oil types with a total variance of 99.51% with ANOVA giving F stat >F critical at 95% confidence level and a Chi Square goodness test of 74.87. Results obtained from this study reveals that by employing six-sigma approach in a data-driven problem such as in the case of oil spill classification, good decision making can be expedited. Copyright © 2017. Published by Elsevier Ltd.
NASA Astrophysics Data System (ADS)
Lesniak, J. M.; Hupse, R.; Blanc, R.; Karssemeijer, N.; Székely, G.
2012-08-01
False positive (FP) marks represent an obstacle for effective use of computer-aided detection (CADe) of breast masses in mammography. Typically, the problem can be approached either by developing more discriminative features or by employing different classifier designs. In this paper, the usage of support vector machine (SVM) classification for FP reduction in CADe is investigated, presenting a systematic quantitative evaluation against neural networks, k-nearest neighbor classification, linear discriminant analysis and random forests. A large database of 2516 film mammography examinations and 73 input features was used to train the classifiers and evaluate for their performance on correctly diagnosed exams as well as false negatives. Further, classifier robustness was investigated using varying training data and feature sets as input. The evaluation was based on the mean exam sensitivity in 0.05-1 FPs on normals on the free-response receiver operating characteristic curve (FROC), incorporated into a tenfold cross validation framework. It was found that SVM classification using a Gaussian kernel offered significantly increased detection performance (P = 0.0002) compared to the reference methods. Varying training data and input features, SVMs showed improved exploitation of large feature sets. It is concluded that with the SVM-based CADe a significant reduction of FPs is possible outperforming other state-of-the-art approaches for breast mass CADe.
Soliman, Essam S; Moawed, Sherif A; Hassan, Rania A
2017-08-01
Birds litter contains unutilized nitrogen in the form of uric acid that is converted into ammonia; a fact that does not only affect poultry performance but also has a negative effect on people's health around the farm and contributes in the environmental degradation. The influence of microclimatic ammonia emissions on Ross and Hubbard broilers reared in different housing systems at two consecutive seasons (fall and winter) was evaluated using a discriminant function analysis to differentiate between Ross and Hubbard breeds. A total number of 400 air samples were collected and analyzed for ammonia levels during the experimental period. Data were analyzed using univariate and multivariate statistical methods. Ammonia levels were significantly higher (p< 0.01) in the Ross compared to the Hubbard breed farm, although no significant differences (p>0.05) were found between the two farms in body weight, body weight gain, feed intake, feed conversion ratio, and performance index (PI) of broilers. Body weight; weight gain and PI had increased values (p< 0.01) during fall compared to winter irrespective of broiler breed. Ammonia emissions were positively (although weekly) correlated with the ambient relative humidity (r=0.383; p< 0.01), but not with the ambient temperature (r=-0.045; p>0.05). Test of significance of discriminant function analysis did not show a classification based on the studied traits suggesting that they cannot been used as predictor variables. The percentage of correct classification was 52% and it was improved after deletion of highly correlated traits to 57%. The study revealed that broiler's growth was negatively affected by increased microclimatic ammonia concentrations and recommended the analysis of broilers' growth performance parameters data using multivariate discriminant function analysis.
Soliman, Essam S.; Moawed, Sherif A.; Hassan, Rania A.
2017-01-01
Background and Aim: Birds litter contains unutilized nitrogen in the form of uric acid that is converted into ammonia; a fact that does not only affect poultry performance but also has a negative effect on people’s health around the farm and contributes in the environmental degradation. The influence of microclimatic ammonia emissions on Ross and Hubbard broilers reared in different housing systems at two consecutive seasons (fall and winter) was evaluated using a discriminant function analysis to differentiate between Ross and Hubbard breeds. Materials and Methods: A total number of 400 air samples were collected and analyzed for ammonia levels during the experimental period. Data were analyzed using univariate and multivariate statistical methods. Results: Ammonia levels were significantly higher (p< 0.01) in the Ross compared to the Hubbard breed farm, although no significant differences (p>0.05) were found between the two farms in body weight, body weight gain, feed intake, feed conversion ratio, and performance index (PI) of broilers. Body weight; weight gain and PI had increased values (p< 0.01) during fall compared to winter irrespective of broiler breed. Ammonia emissions were positively (although weekly) correlated with the ambient relative humidity (r=0.383; p< 0.01), but not with the ambient temperature (r=−0.045; p>0.05). Test of significance of discriminant function analysis did not show a classification based on the studied traits suggesting that they cannot been used as predictor variables. The percentage of correct classification was 52% and it was improved after deletion of highly correlated traits to 57%. Conclusion: The study revealed that broiler’s growth was negatively affected by increased microclimatic ammonia concentrations and recommended the analysis of broilers’ growth performance parameters data using multivariate discriminant function analysis. PMID:28919677
Discriminative Bayesian Dictionary Learning for Classification.
Akhtar, Naveed; Shafait, Faisal; Mian, Ajmal
2016-12-01
We propose a Bayesian approach to learn discriminative dictionaries for sparse representation of data. The proposed approach infers probability distributions over the atoms of a discriminative dictionary using a finite approximation of Beta Process. It also computes sets of Bernoulli distributions that associate class labels to the learned dictionary atoms. This association signifies the selection probabilities of the dictionary atoms in the expansion of class-specific data. Furthermore, the non-parametric character of the proposed approach allows it to infer the correct size of the dictionary. We exploit the aforementioned Bernoulli distributions in separately learning a linear classifier. The classifier uses the same hierarchical Bayesian model as the dictionary, which we present along the analytical inference solution for Gibbs sampling. For classification, a test instance is first sparsely encoded over the learned dictionary and the codes are fed to the classifier. We performed experiments for face and action recognition; and object and scene-category classification using five public datasets and compared the results with state-of-the-art discriminative sparse representation approaches. Experiments show that the proposed Bayesian approach consistently outperforms the existing approaches.
Can early hepatic fibrosis stages be discriminated by combining ultrasonic parameters?
Bouzitoune, Razika; Meziri, Mahmoud; Machado, Christiano Bittencourt; Padilla, Frédéric; Pereira, Wagner Coelho de Albuquerque
2016-05-01
In this study, we put forward a new approach to classify early stages of fibrosis based on a multiparametric characterization using backscatter ultrasonic signals. Ultrasonic parameters, such as backscatter coefficient (Bc), speed of sound (SoS), attenuation coefficient (Ac), mean scatterer spacing (MSS), and spectral slope (SS), have shown their potential to differentiate between healthy and pathologic samples in different organs (eye, breast, prostate, liver). Recently, our group looked into the characterization of stages of hepatic fibrosis using the parameters cited above. The results showed that none of them could individually distinguish between the different stages. Therefore, we explored a multiparametric approach by combining these parameters in two and three, to test their potential to discriminate between the stages of liver fibrosis: F0 (normal), F1, F3, and/without F4 (cirrhosis), according to METAVIR Score. Discriminant analysis showed that the most relevant individual parameter was Bc, followed by SoS, SS, MSS, and Ac. The combination of (Bc, SoS) along with the four stages was the best in differentiating between the stages of fibrosis and correctly classified 85% of the liver samples with a high level of significance (p<0.0001). Nevertheless, when taking into account only stages F0, F1, and F3, the discriminant analysis showed that the parameters (Bc, SoS) and (Bc, Ac) had a better classification (93%) with a high level of significance (p<0.0001). The combination of the three parameters (Bc, SoS, and Ac) led to a 100% correct classification. In conclusion, the current findings show that the multiparametric approach has great potential in differentiating between the stages of fibrosis, and thus could play an important role in the diagnosis and follow-up of hepatic fibrosis. Copyright © 2016 Elsevier B.V. All rights reserved.
Hartman, C. Alex; Ackerman, Joshua T.; Eagles-Smith, Collin A.; Herzog, Mark
2016-01-01
In birds where males and females are similar in size and plumage, sex determination by alternative means is necessary. Discriminant function analysis based on external morphometrics was used to distinguish males from females in two closely related species: Western Grebe (Aechmophorus occidentalis) and Clark's Grebe (A. clarkii). Additionally, discriminant function analysis was used to evaluate morphometric divergence between Western and Clark's grebe adults and eggs. Aechmophorus grebe adults (n = 576) and eggs (n = 130) were sampled across 29 lakes and reservoirs throughout California, USA, and adult sex was determined using molecular analysis. Both Western and Clark's grebes exhibited considerable sexual size dimorphism. Males averaged 6–26% larger than females among seven morphological measurements, with the greatest sexual size dimorphism occurring for bill morphometrics. Discriminant functions based on bill length, bill depth, and short tarsus length correctly assigned sex to 98% of Western Grebes, and a function based on bill length and bill depth correctly assigned sex to 99% of Clark's Grebes. Further, a simplified discriminant function based only on bill depth correctly assigned sex to 96% of Western Grebes and 98% of Clark's Grebes. In contrast, external morphometrics were not suitable for differentiating between Western and Clark's grebe adults or their eggs, with correct classification rates of discriminant functions of only 60%, 63%, and 61% for adult males, adult females, and eggs, respectively. Our results indicate little divergence in external morphology between species of Aechmophorus grebes, and instead separation is much greater between males and females.