Sample records for classification principal component

  1. An efficient classification method based on principal component and sparse representation.

    PubMed

    Zhai, Lin; Fu, Shujun; Zhang, Caiming; Liu, Yunxian; Wang, Lu; Liu, Guohua; Yang, Mingqiang

    2016-01-01

    As an important application of optical imaging, palmprint recognition is hampered by many unfavorable factors. An effective fusion of blockwise bi-directional two-dimensional principal component analysis and grouping sparse classification is presented. Dimension reduction and normalization are performed on palmprint images by the blockwise bi-directional two-dimensional principal component analysis to extract feature matrices, which are assembled into an overcomplete dictionary for sparse classification. A subspace orthogonal matching pursuit algorithm is designed to solve the grouping sparse representation. Finally, the classification result is obtained by comparing the residual between the test and reconstructed images. Experiments on a palmprint database show that this method is more robust to position and illumination changes in palmprint images and achieves a higher palmprint recognition rate.
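
    The residual-comparison step described above can be illustrated with a minimal sketch. It uses plain orthogonal matching pursuit rather than the authors' subspace variant, and the identity dictionary, the class labels and the names `omp` and `classify_by_residual` are hypothetical choices made only for illustration.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Plain orthogonal matching pursuit: greedily pick atoms (columns of D)
    and refit the coefficients by least squares after each pick."""
    residual = x.astype(float)
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:  # no new atom would be added; stop early
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    full = np.zeros(D.shape[1])
    full[support] = coef
    return full

def classify_by_residual(D, atom_labels, x, n_nonzero=2):
    """Assign x to the class whose atoms alone reconstruct it best."""
    coef = omp(D, x, n_nonzero)
    residuals = {}
    for c in np.unique(atom_labels):
        part = np.where(atom_labels == c, coef, 0.0)  # keep only class-c atoms
        residuals[int(c)] = float(np.linalg.norm(x - D @ part))
    return min(residuals, key=residuals.get), residuals

# Toy dictionary (assumption): four unit atoms, two per class.
D = np.eye(4)
atom_labels = np.array([0, 0, 1, 1])
x = np.array([0.05, 0.0, 0.9, 0.1])  # mostly built from class-1 atoms
predicted, residuals = classify_by_residual(D, atom_labels, x)
```

    In a real setting the dictionary columns would be the extracted feature vectors of the training images, which is an assumption here and not something the record specifies.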

  2. Wavelet based de-noising of breath air absorption spectra profiles for improved classification by principal component analysis

    NASA Astrophysics Data System (ADS)

    Kistenev, Yu. V.; Shapovalov, A. V.; Borisov, A. V.; Vrazhnov, D. A.; Nikolaev, V. V.; Nikiforova, O. Yu.

    2015-11-01

    Results are presented from a comparison of different mother wavelets used for de-noising model and experimental data in the form of absorption spectra profiles of exhaled air. The impact of wavelet de-noising on the quality of classification by principal component analysis is also discussed.

  3. Snapshot hyperspectral imaging probe with principal component analysis and confidence ellipse for classification

    NASA Astrophysics Data System (ADS)

    Lim, Hoong-Ta; Murukeshan, Vadakke Matham

    2017-06-01

    Hyperspectral imaging combines imaging and spectroscopy to provide detailed spectral information for each spatial point in the image. This gives a three-dimensional spatial-spatial-spectral datacube with hundreds of spectral images. Probe-based hyperspectral imaging systems have been developed so that they can be used in regions that are difficult for conventional table-top platforms to access. A fiber bundle, which is made up of specially arranged optical fibers, has recently been developed and integrated with a spectrograph-based hyperspectral imager. This forms a snapshot hyperspectral imaging probe, which is able to form a datacube using the information from each scan. Compared to the other configurations, which require sequential scanning to form a datacube, the snapshot configuration is preferred in real-time applications where motion artifacts and pixel misregistration can be minimized. Principal component analysis is a dimension-reducing technique that can be applied in hyperspectral imaging to convert the spectral information into uncorrelated variables known as principal components. A confidence ellipse can be used to define the region of each class in the principal component feature space and for classification. This paper demonstrates the use of the snapshot hyperspectral imaging probe to acquire data from samples of different colors. The spectral library of each sample was acquired and then analyzed using principal component analysis. A confidence ellipse was then applied to the principal components of each sample and used as the classification criterion. The results show that the applied analysis can be used to perform classification of the spectral data acquired using the snapshot hyperspectral imaging probe.
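
    As a rough illustration of the approach described above, the following sketch projects synthetic spectra onto two principal components and tests whether a point falls inside a class's confidence ellipse. The synthetic spectra, the 99% level and all function names are assumptions made for illustration; the record does not give the authors' implementation.

```python
import numpy as np

def pca_fit(X, n_components=2):
    """Return the mean and the top principal axes (as rows) of X."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, axes):
    """Project spectra onto the principal axes to get PC scores."""
    return (X - mean) @ axes.T

def fit_ellipse(scores):
    """Centre and inverse covariance of one class's 2-D PC scores."""
    centre = scores.mean(axis=0)
    cov = np.cov(scores, rowvar=False)
    return centre, np.linalg.inv(cov)

def inside_ellipse(point, centre, inv_cov, confidence=0.99):
    """True if the squared Mahalanobis distance is within the ellipse."""
    # For 2 degrees of freedom the chi-square quantile is -2*ln(1 - p).
    threshold = -2.0 * np.log(1.0 - confidence)
    d = point - centre
    return float(d @ inv_cov @ d) <= threshold

# Synthetic two-class "spectral library" (assumption, 20 bands each).
rng = np.random.default_rng(0)
bands = 20
base_a = np.linspace(0.0, 1.0, bands)
base_b = np.linspace(1.0, 0.0, bands)
class_a = base_a + 0.02 * rng.standard_normal((30, bands))
class_b = base_b + 0.02 * rng.standard_normal((30, bands))

mean, axes = pca_fit(np.vstack([class_a, class_b]))
centre_a, inv_a = fit_ellipse(pca_transform(class_a, mean, axes))
centre_b, inv_b = fit_ellipse(pca_transform(class_b, mean, axes))

# Score a new class-A-like spectrum against both ellipses.
new_score = pca_transform(base_a + 0.02 * rng.standard_normal(bands), mean, axes)
in_a = inside_ellipse(new_score, centre_a, inv_a)
in_b = inside_ellipse(new_score, centre_b, inv_b)
```

    A point that falls inside no ellipse, or inside more than one, would need a tie-breaking rule; the sketch leaves that choice open.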

  4. Applications of principal component analysis to breath air absorption spectra profiles classification

    NASA Astrophysics Data System (ADS)

    Kistenev, Yu. V.; Shapovalov, A. V.; Borisov, A. V.; Vrazhnov, D. A.; Nikolaev, V. V.; Nikiforova, O. Y.

    2015-12-01

    The results of a numerical simulation applying principal component analysis to absorption spectra of the breath air of patients with pulmonary diseases are presented. Various methods of experimental data preprocessing are analyzed.

  5. [A study of Boletus bicolor from different areas using Fourier transform infrared spectrometry].

    PubMed

    Zhou, Zai-Jin; Liu, Gang; Ren, Xian-Pei

    2010-04-01

    It is hard to differentiate wild-growing mushrooms of the same species from different areas by macromorphological features. In this paper, Fourier transform infrared (FTIR) spectroscopy combined with principal component analysis was used to identify 58 samples of Boletus bicolor from five different areas. Based on the fingerprint infrared spectra of the Boletus bicolor samples, principal component analysis was conducted on the 58 spectra in the range of 1350-750 cm(-1) using the statistical software SPSS 13.0. According to the results, the first three principal components account for a cumulative contribution of 88.87% and contain almost all the information in the samples. The two-dimensional projection plot of the first and second principal components shows satisfactory clustering for the classification and discrimination of Boletus bicolor. All samples were divided into five groups with a classification accuracy of 98.3%. The study demonstrated that wild-growing Boletus bicolor of the same species from different areas can be distinguished by FTIR spectra combined with principal component analysis.

  6. Pattern classification using an olfactory model with PCA feature selection in electronic noses: study and application.

    PubMed

    Fu, Jun; Huang, Canqin; Xing, Jianguo; Zheng, Junbao

    2012-01-01

    Biologically-inspired models and algorithms are considered promising sensor array signal processing methods for electronic noses. Feature selection is one of the most important issues for developing robust pattern recognition models in machine learning. This paper describes an investigation into the classification performance of a bionic olfactory model as the dimensions of the input feature vector (outer factor) and the number of its parallel channels (inner factor) increase. The principal component analysis technique was applied for feature selection and dimension reduction. Two data sets, three classes of wine derived from different cultivars and five classes of green tea derived from five different provinces of China, were used for experiments. In the former case the results showed that the average correct classification rate increased as more principal components were put into the feature vector. In the latter case the results showed that sufficient parallel channels should be reserved in the model to avoid pattern space crowding. We concluded that 6~8 channels of the model, with principal component feature vector values of at least 90% cumulative variance, are adequate for a classification task of 3~5 pattern classes considering the trade-off between time consumption and classification rate.
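
    The 90% cumulative-variance criterion mentioned above can be made concrete with a short sketch. The function name and the toy data matrix are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def n_components_for_variance(X, target=0.90):
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches `target`."""
    Xc = X - X.mean(axis=0)
    # Squared singular values are proportional to the component variances.
    s = np.linalg.svd(Xc, compute_uv=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    # searchsorted gives the first index where ratio >= target.
    return int(np.searchsorted(ratio, target) + 1), ratio

# Toy data (assumption): three orthogonal, zero-mean columns whose
# sums of squares are 18, 2 and 0.5.
X = np.array([
    [3.0, 0.0, 0.0], [-3.0, 0.0, 0.0],
    [0.0, 1.0, 0.0], [0.0, -1.0, 0.0],
    [0.0, 0.0, 0.5], [0.0, 0.0, -0.5],
])
k, ratio = n_components_for_variance(X, target=0.90)
```

    Here the first component alone explains 18/20.5 of the variance, just under 90%, so two components are needed to reach the target.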

  7. Superpixel-based spectral classification for the detection of head and neck cancer with hyperspectral imaging

    NASA Astrophysics Data System (ADS)

    Chung, Hyunkoo; Lu, Guolan; Tian, Zhiqiang; Wang, Dongsheng; Chen, Zhuo Georgia; Fei, Baowei

    2016-03-01

    Hyperspectral imaging (HSI) is an emerging imaging modality for medical applications. HSI acquires two dimensional images at various wavelengths. The combination of both spectral and spatial information provides quantitative information for cancer detection and diagnosis. This paper proposes using superpixels, principal component analysis (PCA), and support vector machine (SVM) to distinguish regions of tumor from healthy tissue. The classification method uses 2 principal components decomposed from hyperspectral images and obtains an average sensitivity of 93% and an average specificity of 85% for 11 mice. The hyperspectral imaging technology and classification method can have various applications in cancer research and management.
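
    For reference, sensitivity and specificity as reported above are computed from the counts of true and false positives and negatives. The labels below are made-up values for illustration only, with 1 standing for tumor and 0 for healthy tissue.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels: 4 tumor and 4 healthy regions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
```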

  8. A HIERARCHICAL STOCHASTIC MODEL OF LARGE SCALE ATMOSPHERIC CIRCULATION PATTERNS AND MULTIPLE STATION DAILY PRECIPITATION

    EPA Science Inventory

    A stochastic model of weather states and concurrent daily precipitation at multiple precipitation stations is described. Four algorithms are investigated for classification of daily weather states: k means, fuzzy clustering, principal components, and principal components coupled with ...

  9. Pattern Classification Using an Olfactory Model with PCA Feature Selection in Electronic Noses: Study and Application

    PubMed Central

    Fu, Jun; Huang, Canqin; Xing, Jianguo; Zheng, Junbao

    2012-01-01

    Biologically-inspired models and algorithms are considered promising sensor array signal processing methods for electronic noses. Feature selection is one of the most important issues for developing robust pattern recognition models in machine learning. This paper describes an investigation into the classification performance of a bionic olfactory model as the dimensions of the input feature vector (outer factor) and the number of its parallel channels (inner factor) increase. The principal component analysis technique was applied for feature selection and dimension reduction. Two data sets, three classes of wine derived from different cultivars and five classes of green tea derived from five different provinces of China, were used for experiments. In the former case the results showed that the average correct classification rate increased as more principal components were put into the feature vector. In the latter case the results showed that sufficient parallel channels should be reserved in the model to avoid pattern space crowding. We concluded that 6∼8 channels of the model, with principal component feature vector values of at least 90% cumulative variance, are adequate for a classification task of 3∼5 pattern classes considering the trade-off between time consumption and classification rate. PMID:22736979

  10. Support vector machine based classification of fast Fourier transform spectroscopy of proteins

    NASA Astrophysics Data System (ADS)

    Lazarevic, Aleksandar; Pokrajac, Dragoljub; Marcano, Aristides; Melikechi, Noureddine

    2009-02-01

    Fast Fourier transform spectroscopy has proved to be a powerful method for studying the secondary structure of proteins, since peak positions and their relative amplitudes are affected by the number of hydrogen bridges that sustain this secondary structure. However, to the best of our knowledge, the method has not yet been used for identification of proteins within a complex matrix such as a blood sample. The principal reason is the apparent similarity of protein infrared spectra, with actual differences usually masked by the solvent contribution and other interactions. In this paper, we propose a novel machine learning based method that uses protein spectra for classification and identification of such proteins within a given sample. The proposed method uses principal component analysis (PCA) to identify the most important linear combinations of the original spectral components and then applies a support vector machine (SVM) classification model to these combinations to categorize proteins into one of the given groups. Our experiments were performed on a set of four different proteins, namely Bovine Serum Albumin, Leptin, Insulin-like Growth Factor 2 and Osteopontin. Our proposed method of applying principal component analysis along with support vector machines exhibits excellent classification accuracy when identifying proteins using their infrared spectra.

  11. Feature selection for neural network based defect classification of ceramic components using high frequency ultrasound.

    PubMed

    Kesharaju, Manasa; Nagarajah, Romesh

    2015-09-01

    The motivation for this research stems from the need for a non-destructive testing method capable of detecting and locating any defects and microstructural variations within armour ceramic components before issuing them to the soldiers who rely on them for their survival. The development of an automated ultrasonic inspection based classification system would make it possible to check each ceramic component and immediately alert the operator to the presence of defects. In many classification problems, the choice of features or dimensionality reduction is both significant and very difficult, as substantial computational effort is required to evaluate possible feature subsets. In this research, a combination of artificial neural networks and genetic algorithms is used to optimize the feature subset used in classification of various defects in reaction-sintered silicon carbide ceramic components. Initially, wavelet based feature extraction is performed on the region of interest. An Artificial Neural Network classifier is employed to evaluate the performance of these features. Genetic Algorithm based feature selection is performed. Principal Component Analysis is a popular technique used for feature selection and is compared with the genetic algorithm based technique in terms of classification accuracy and selection of the optimal number of features. The experimental results confirm that features identified by Principal Component Analysis lead to better performance in terms of classification percentage (96%) than those identified by the Genetic Algorithm (94%). Copyright © 2015 Elsevier B.V. All rights reserved.

  12. Classification of time-of-flight secondary ion mass spectrometry spectra from complex Cu-Fe sulphides by principal component analysis and artificial neural networks.

    PubMed

    Kalegowda, Yogesh; Harmer, Sarah L

    2013-01-08

    Artificial neural network (ANN) and hybrid principal component analysis-artificial neural network (PCA-ANN) classifiers have been successfully implemented for classification of static time-of-flight secondary ion mass spectrometry (ToF-SIMS) mass spectra collected from complex Cu-Fe sulphides (chalcopyrite, bornite, chalcocite and pyrite) at different flotation conditions. ANNs are very good pattern classifiers because of their ability to learn and generalise patterns that are not linearly separable, their fault and noise tolerance, and their high parallelism. In the first approach, fragments from the whole ToF-SIMS spectrum were used as input to the ANN; the model yielded high overall correct classification rates of 100% for feed samples, 88% for conditioned feed samples and 91% for Eh modified samples. In the second approach, the hybrid pattern classifier PCA-ANN was used. PCA is a very effective multivariate data analysis tool applied to enhance species features and reduce data dimensionality. Principal component (PC) scores, which accounted for 95% of the raw spectral data variance, were used as input to the ANN; the model yielded high overall correct classification rates of 88% for conditioned feed samples and 95% for Eh modified samples. Copyright © 2012 Elsevier B.V. All rights reserved.

  13. Intelligence, Surveillance, and Reconnaissance Fusion for Coalition Operations

    DTIC Science & Technology

    2008-07-01

    classification of the targets of interest. The MMI features extracted in this manner have two properties that provide a sound justification for...are generalizations of well- known feature extraction methods such as Principal Components Analysis (PCA) and Independent Component Analysis (ICA...augment (without degrading performance) a large class of generic fusion processes. Ontologies Classifications Feature extraction Feature analysis

  14. Application of principal component analysis to distinguish patients with schizophrenia from healthy controls based on fractional anisotropy measurements.

    PubMed

    Caprihan, A; Pearlson, G D; Calhoun, V D

    2008-08-15

    Principal component analysis (PCA) is often used to reduce the dimension of data before applying more sophisticated data analysis methods such as non-linear classification algorithms or independent component analysis. This practice is based on selecting components corresponding to the largest eigenvalues. If the ultimate goal is separation of data into two groups, then this set of components need not have the most discriminatory power. We measured the distance between two such populations using Mahalanobis distance and chose the eigenvectors to maximize it, a modified PCA method which we call the discriminant PCA (DPCA). DPCA was applied to diffusion tensor-based fractional anisotropy images to distinguish age-matched schizophrenia subjects from healthy controls. The performance of the proposed method was evaluated by the leave-one-out method. We show that for this fractional anisotropy data set, the classification error with 60 components was close to the minimum error and that the Mahalanobis distance was twice as large with DPCA as with PCA. Finally, by masking the discriminant function with the white matter tracts of the Johns Hopkins University atlas, we identified the left superior longitudinal fasciculus as the tract which gave the least classification error. In addition, with six optimally chosen tracts the classification error was zero.
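
    The idea of choosing components by group separation instead of by eigenvalue can be sketched as below. This is a simplification and not the authors' DPCA: it treats the components as uncorrelated within groups and ranks each one by its squared mean difference divided by its pooled within-group variance. The data and the name `rank_components_by_separation` are hypothetical.

```python
import numpy as np

def rank_components_by_separation(X, y, n_keep):
    """Rank principal components by a per-component Mahalanobis-style
    separation between two groups (labels 0 and 1); keep the best ones."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ vt.T  # columns are ordered by decreasing variance
    a, b = scores[y == 0], scores[y == 1]
    # Pooled within-group variance of each component.
    pooled = ((len(a) - 1) * a.var(axis=0, ddof=1)
              + (len(b) - 1) * b.var(axis=0, ddof=1)) / (len(a) + len(b) - 2)
    separation = (a.mean(axis=0) - b.mean(axis=0)) ** 2 / pooled
    order = np.argsort(separation)[::-1]
    return order[:n_keep], separation

# Synthetic data (assumption): the highest-variance feature carries no
# group information, while a low-variance feature separates the groups.
rng = np.random.default_rng(1)
n = 100
y = np.repeat([0, 1], n)
X = np.column_stack([
    5.0 * rng.standard_normal(2 * n),
    np.where(y == 0, -1.0, 1.0) + 0.2 * rng.standard_normal(2 * n),
    0.5 * rng.standard_normal(2 * n),
])
best, separation = rank_components_by_separation(X, y, n_keep=1)
```

    In this toy case the selected component is the second principal component (index 1), not the one with the largest eigenvalue, which is the point the abstract makes about discriminatory power.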

  15. Development of neural network techniques for finger-vein pattern classification

    NASA Astrophysics Data System (ADS)

    Wu, Jian-Da; Liu, Chiung-Tsiung; Tsai, Yi-Jang; Liu, Jun-Ching; Chang, Ya-Wen

    2010-02-01

    A personal identification system using finger-vein patterns and neural network techniques is proposed in the present study. In the proposed system, the finger-vein patterns are captured by a device that can transmit near infrared through the finger and record the patterns for signal analysis and classification. The biometric system for verification consists of a combination of feature extraction using principal component analysis and pattern classification using both a back-propagation network and adaptive neuro-fuzzy inference systems. Finger-vein features are first extracted by the principal component analysis method to reduce the computational burden and remove noise residing in the discarded dimensions. The features are then used in pattern classification and identification. To verify the effect of the proposed adaptive neuro-fuzzy inference system in the pattern classification, the back-propagation network is compared with the proposed system. The experimental results indicated that the proposed system using the adaptive neuro-fuzzy inference system performed better than the back-propagation network for personal identification using finger-vein patterns.

  16. A comparison of the usefulness of canonical analysis, principal components analysis, and band selection for extraction of features from TMS data for landcover analysis

    NASA Technical Reports Server (NTRS)

    Boyd, R. K.; Brumfield, J. O.; Campbell, W. J.

    1984-01-01

    Three feature extraction methods, canonical analysis (CA), principal component analysis (PCA), and band selection, have been applied to Thematic Mapper Simulator (TMS) data in order to evaluate the relative performance of the methods. The results obtained show that CA is capable of providing a transformation of TMS data which leads to better classification results than provided by all seven bands, by PCA, or by band selection. A second conclusion drawn from the study is that TMS bands 2, 3, 4, and 7 (thermal) are most important for landcover classification.

  17. Determination of the chemical parameters and manufacturer of divins from their broadband transmission spectra

    NASA Astrophysics Data System (ADS)

    Khodasevich, M. A.; Sinitsyn, G. V.; Skorbanova, E. A.; Rogovaya, M. V.; Kambur, E. I.; Aseev, V. A.

    2016-06-01

    Analysis of multiparametric data on the transmission spectra of 24 divins (Moldovan cognacs) in the 190-2600 nm range allows outliers to be identified and removed from the sample under study before further consideration. Principal component analysis and a classification tree with a single-rank predictor constructed in the 2D space of principal components allow classification of divin manufacturers. It is shown that the accuracy of syringaldehyde, ethyl acetate, vanillin, and gallic acid concentrations in divins calculated with the regression to latent structures depends on the sample volume and is 3, 6, 16, and 20%, respectively, which is acceptable for the application.

  18. The Hughes phenomenon in hyperspectral classification based on the ground spectrum of grasslands in the region around Qinghai Lake

    NASA Astrophysics Data System (ADS)

    Ma, Weiwei; Gong, Cailan; Hu, Yong; Meng, Peng; Xu, Feifei

    2013-08-01

    Hyperspectral data, consisting of hundreds of spectral bands with high spectral resolution, enable the acquisition of continuous spectral characteristic curves and have therefore served as a powerful tool for vegetation classification. The difficulty of using hyperspectral data is that they are usually redundant, strongly correlated and subject to the Hughes phenomenon, in which classification accuracy increases gradually at first as the number of spectral bands or dimensions increases, but decreases dramatically once the band number reaches some value. In recent years, some algorithms have been proposed to overcome the Hughes phenomenon in classification, such as selecting several bands from the full set of bands and PCA- and MNF-based feature transformations. To date, however, few studies have investigated the turning point of the Hughes phenomenon (i.e., the point at which the classification accuracy begins to decline). In this paper, we first analyze the reasons for the occurrence of the Hughes phenomenon and then, based on the Mahalanobis classifier, classify the ground spectra of several grasslands, which were recorded in September 2012 using a FieldSpec3 spectrometer in the regions around Qinghai Lake, an important pasturing area in the north of China. Before classification, we extract features from the hyperspectral data by band selection and PCA-based feature transformations, and during classification we analyze how the correlation coefficient between wavebands, the number of waveband channels and the number of principal components affect the classification result. The results show that the Hughes phenomenon may occur when the correlation coefficient between wavebands is greater than 94%, the number of wavebands is greater than 6, or the number of principal components is greater than 6. The best classification result (overall grassland accuracy of 90%) is achieved when the number of wavebands equals 3 (band positions at 370 nm, 509 nm and 886 nm) or the number of principal components ranges from 4 to 6.

  19. Wavelet decomposition based principal component analysis for face recognition using MATLAB

    NASA Astrophysics Data System (ADS)

    Sharma, Mahesh Kumar; Sharma, Shashikant; Leeprechanon, Nopbhorn; Ranjan, Aashish

    2016-03-01

    For the realization of face recognition systems in static as well as real-time settings, algorithms such as principal component analysis, independent component analysis, linear discriminant analysis, neural networks and genetic algorithms have been used for decades. This paper discusses an approach based on wavelet decomposition combined with principal component analysis for face recognition. Principal component analysis is chosen over other algorithms due to its relative simplicity, efficiency, and robustness. The term face recognition stands for identifying a person from his or her facial gestures, and it resembles factor analysis in some sense, i.e. extraction of the principal components of an image. Principal component analysis is subject to some drawbacks, mainly poor discriminatory power and, in particular, the large computational load of finding eigenvectors. These drawbacks can be greatly reduced by combining wavelet transform decomposition for feature extraction with principal component analysis for pattern representation and classification, by analyzing the facial gestures in the space and time domains, where frequency and time are used interchangeably. From the experimental results, it is envisaged that this face recognition method significantly improves the recognition rate and has better computational efficiency.

  20. A comparison of autonomous techniques for multispectral image analysis and classification

    NASA Astrophysics Data System (ADS)

    Valdiviezo-N., Juan C.; Urcid, Gonzalo; Toxqui-Quitl, Carina; Padilla-Vivanco, Alfonso

    2012-10-01

    Multispectral imaging has given rise to important applications related to classification and identification of objects in a scene. Because multispectral instruments can be used to estimate the reflectance of materials in the scene, these techniques constitute fundamental tools for materials analysis and quality control. In recent years, a variety of algorithms have been developed to work with multispectral data, whose main purpose has been to perform the correct classification of the objects in the scene. The present study gives a brief review of some classical techniques, as well as a novel one, that have been used for such purposes. The use of principal component analysis and K-means clustering techniques as important classification algorithms is discussed here. Moreover, a recent method based on the min-W and max-M lattice auto-associative memories, which was proposed for endmember determination in hyperspectral imagery, is introduced as a classification method. Besides a discussion of their mathematical foundation, we emphasize their main characteristics and the results achieved for two example images composed of objects that are similar in appearance but spectrally different. The classification results show that the first components computed from principal component analysis can be used to highlight areas with different spectral characteristics. In addition, the use of lattice auto-associative memories provides good results for materials classification even in cases where some similarities appear in their spectral responses.

  1. [Spatial distribution characteristics of the physical and chemical properties of water in the Kunes River after the supply of snowmelt during spring].

    PubMed

    Liu, Xiang; Guo, Ling-Peng; Zhang, Fei-Yun; Ma, Jie; Mu, Shu-Yong; Zhao, Xin; Li, Lan-Hai

    2015-02-01

    Eight physical and chemical indicators related to water quality were monitored at nineteen sampling sites along the Kunes River at the end of the snowmelt season in spring. To investigate the spatial distribution characteristics of the physical and chemical properties of the water, cluster analysis (CA), discriminant analysis (DA) and principal component analysis (PCA) were employed. The cluster analysis showed that the Kunes River could be divided into three reaches according to the similarities of the physical and chemical properties of the water among sampling sites, representing the upstream, midstream and downstream of the river, respectively. The discriminant analysis demonstrated that the reliability of such a classification was high, and that DO, Cl- and BOD5 were the significant indexes leading to this classification. Three principal components were extracted by the principal component analysis, with a cumulative variance contribution of 86.90%. The principal component analysis also indicated that the physical and chemical properties of the water were mostly affected by EC, ORP, NO3(-)-N, NH4(+)-N, Cl- and BOD5. The sorted principal component scores at each sampling site showed that the water quality was mainly influenced by DO upstream, by pH midstream, and by the rest of the indicators downstream. The order of the comprehensive principal component scores revealed that the water quality degraded from upstream to downstream, i.e., the upstream reach had the best water quality, followed by the midstream reach, while the water quality downstream was the worst. This result corresponded exactly to the three reaches classified using cluster analysis. Anthropogenic activity and the accumulation of pollutants along the river were probably the main reasons for this spatial difference.

  2. Statistical classification of hydrogeologic regions in the fractured rock area of Maryland and parts of the District of Columbia, Virginia, West Virginia, Pennsylvania, and Delaware

    USGS Publications Warehouse

    Fleming, Brandon J.; LaMotte, Andrew E.; Sekellick, Andrew J.

    2013-01-01

    Hydrogeologic regions in the fractured rock area of Maryland were classified using geographic information system tools with principal components and cluster analyses. A study area consisting of the 8-digit Hydrologic Unit Code (HUC) watersheds with rivers that flow through the fractured rock area of Maryland and bounded by the Fall Line was further subdivided into 21,431 catchments from the National Hydrography Dataset Plus. The catchments were then used as a common hydrologic unit to compile relevant climatic, topographic, and geologic variables. A principal components analysis was performed on 10 input variables, and 4 principal components that accounted for 83 percent of the variability in the original data were identified. A subsequent cluster analysis grouped the catchments based on four principal component scores into six hydrogeologic regions. Two crystalline rock hydrogeologic regions, including large parts of the Washington, D.C. and Baltimore metropolitan regions that represent over 50 percent of the fractured rock area of Maryland, are distinguished by differences in recharge, Precipitation minus Potential Evapotranspiration, sand content in soils, and groundwater contributions to streams. This classification system will provide a georeferenced digital hydrogeologic framework for future investigations of groundwater availability in the fractured rock area of Maryland.
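
    The two-stage workflow described above, principal components followed by clustering of the component scores, can be sketched with NumPy alone. The synthetic catchment variables, the choice of three clusters and two components, and the farthest-point initialisation are assumptions for illustration; the report's actual variables and clustering method are not reproduced here.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project centered data onto its leading principal axes."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def kmeans(points, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    for _ in range(k - 1):
        # Distance of every point to its nearest existing centroid.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[np.argmax(d)])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# Synthetic "catchments" (assumption): 3 groups of 20, 4 input variables.
rng = np.random.default_rng(2)
centres = np.array([[0, 0, 0, 0], [10, 0, 0, 0], [0, 10, 0, 0]], dtype=float)
X = np.vstack([c + 0.3 * rng.standard_normal((20, 4)) for c in centres])

scores = pca_scores(X, n_components=2)
labels, _ = kmeans(scores, k=3)
```

    With real data the input variables would normally be standardised before the principal components analysis, since they are measured in different units; the sketch skips that step because its synthetic variables share one scale.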

  3. Hyperspectral imaging of polymer banknotes for building and analysis of spectral library

    NASA Astrophysics Data System (ADS)

    Lim, Hoong-Ta; Murukeshan, Vadakke Matham

    2017-11-01

    The use of counterfeit banknotes increases crime rates and cripples the economy. New countermeasures are required to stop counterfeiters who use advancing technologies with criminal intent. Many countries started adopting polymer banknotes to replace paper notes, as polymer notes are more durable and have better quality. The research on authenticating such banknotes is of much interest to the forensic investigators. Hyperspectral imaging can be employed to build a spectral library of polymer notes, which can then be used for classification to authenticate these notes. This is however not widely reported and has become a research interest in forensic identification. This paper focuses on the use of hyperspectral imaging on polymer notes to build spectral libraries, using a pushbroom hyperspectral imager which has been previously reported. As an initial study, a spectral library will be built from three arbitrarily chosen regions of interest of five circulated genuine polymer notes. Principal component analysis is used for dimension reduction and to convert the information in the spectral library to principal components. A 99% confidence ellipse is formed around the cluster of principal component scores of each class and then used as classification criteria. The potential of the adopted methodology is demonstrated by the classification of the imaged regions as training samples.

  4. [Research on discrimination of cabbage and weeds based on visible and near-infrared spectrum analysis].

    PubMed

    Zu, Qin; Zhao, Chun-Jiang; Deng, Wei; Wang, Xiu

    2013-05-01

    The automatic identification of weeds forms the basis for precision spraying of infested crops. The canopy spectral reflectance within the 350-2500 nm band of two strains of cabbage and five kinds of weeds (barnyard grass, setaria, crabgrass, goosegrass and pigweed) was acquired with an ASD spectrometer. According to the spectral curve characteristics, the data in different bands were compressed at different levels to improve operating efficiency. First, the spectra were denoised using different orders of the multiplicative scattering correction (MSC) method and the Savitzky-Golay (SG) convolution smoothing method under different parameter settings; then the model was built by using principal component analysis (PCA) to extract principal components; finally, all plant types were classified using the soft independent modeling of class analogy (SIMCA) method and the classification results were compared. The test results indicate that, after pretreating the spectral data with a combination of MSC and SG (3rd order, 5th-degree polynomial, 21 smoothing points) and using the top 10 principal components extracted by PCA as the input variables of the classification model, a 100% correct classification rate was achieved, so the method can identify cabbage and several kinds of common weeds quickly and nondestructively.

  5. Automatic classification of retinal three-dimensional optical coherence tomography images using principal component analysis network with composite kernels

    NASA Astrophysics Data System (ADS)

    Fang, Leyuan; Wang, Chong; Li, Shutao; Yan, Jun; Chen, Xiangdong; Rabbani, Hossein

    2017-11-01

    We present an automatic method, termed as the principal component analysis network with composite kernel (PCANet-CK), for the classification of three-dimensional (3-D) retinal optical coherence tomography (OCT) images. Specifically, the proposed PCANet-CK method first utilizes the PCANet to automatically learn features from each B-scan of the 3-D retinal OCT images. Then, multiple kernels are separately applied to a set of very important features of the B-scans and these kernels are fused together, which can jointly exploit the correlations among features of the 3-D OCT images. Finally, the fused (composite) kernel is incorporated into an extreme learning machine for the OCT image classification. We tested our proposed algorithm on two real 3-D spectral domain OCT (SD-OCT) datasets (of normal subjects and subjects with the macular edema and age-related macular degeneration), which demonstrated its effectiveness.

  6. Approximation-based common principal component for feature extraction in multi-class brain-computer interfaces.

    PubMed

    Hoang, Tuan; Tran, Dat; Huang, Xu

    2013-01-01

    Common Spatial Pattern (CSP) is a state-of-the-art method for feature extraction in Brain-Computer Interface (BCI) systems. However, it is designed for 2-class BCI classification problems. Current extensions of this method to multiple classes, based on subspace union and covariance matrix similarity, do not provide high performance. This paper presents a new approach to solving multi-class BCI classification problems by forming a subspace that resembles the original subspaces; the proposed method for this approach is called Approximation-based Common Principal Component (ACPC). We perform experiments on Dataset 2a used in BCI Competition IV to evaluate the proposed method. This dataset was designed for motor imagery classification with 4 classes. Preliminary experiments show that the proposed ACPC feature extraction method, when combined with Support Vector Machines, outperforms CSP-based feature extraction methods on the experimental dataset.

  7. Subacute casemix classification for stroke rehabilitation in Australia. How well does AN-SNAP v2 explain variance in outcomes?

    PubMed

    Kohler, Friedbert; Renton, Roger; Dickson, Hugh G; Estell, John; Connolly, Carol E

    2011-02-01

    We sought the best predictors for length of stay, discharge destination and functional improvement for inpatients undergoing rehabilitation following a stroke and compared these predictors against AN-SNAP v2. The Oxfordshire classification subgroup, sociodemographic data and functional data were collected for patients admitted between 1997 and 2007 with a diagnosis of recent stroke. The data were factor analysed using Principal Components Analysis for categorical data (CATPCA). Categorical regression analysis was performed to determine the best predictors of length of stay, discharge destination, and functional improvement. A total of 1154 patients were included in the study. Principal components analysis indicated that the data were effectively unidimensional, with length of stay being the most important component. Regression analysis demonstrated that the best predictor was the admission motor FIM score, explaining 38.9% of variance for length of stay, 37.4% of variance for functional improvement and 16% of variance for discharge destination. The best explanatory variable in our inpatient rehabilitation service is the admission motor FIM. AN-SNAP v2 classification is a less effective explanatory variable. This needs to be taken into account when using AN-SNAP v2 classification for clinical or funding purposes.

  8. From Periodic Properties to a Periodic Table Arrangement

    ERIC Educational Resources Information Center

    Besalú, Emili

    2013-01-01

    A periodic table is constructed from the consideration of periodic properties and the application of the principal components analysis technique. This procedure is useful for objects classification and data reduction and has been used in the field of chemistry for many applications, such as lanthanides, molecules, or conformers classification.…

  9. Geometric subspace methods and time-delay embedding for EEG artifact removal and classification.

    PubMed

    Anderson, Charles W; Knight, James N; O'Connor, Tim; Kirby, Michael J; Sokolov, Artem

    2006-06-01

    Generalized singular-value decomposition is used to separate multichannel electroencephalogram (EEG) into components found by optimizing a signal-to-noise quotient. These components are used to filter out artifacts. Short-time principal components analysis of time-delay embedded EEG is used to represent windowed EEG data to classify EEG according to which mental task is being performed. Examples are presented of the filtering of various artifacts and results are shown of classification of EEG from five mental tasks using committees of decision trees.

  10. Multivariate classification of the infrared spectra of cell and tissue samples

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Haaland, D.M.; Jones, H.D.; Thomas, E.V.

    1997-03-01

    Infrared microspectroscopy of biopsied canine lymph cells and tissue was performed to investigate the possibility of using IR spectra coupled with multivariate classification methods to classify the samples as normal, hyperplastic, or neoplastic (malignant). IR spectra were obtained in transmission mode through BaF₂ windows and in reflection mode from samples prepared on gold-coated microscope slides. Cytology and histopathology samples were prepared by a variety of methods to identify the optimal methods of sample preparation. Cytospinning procedures that yielded a monolayer of cells on the BaF₂ windows produced a limited set of IR transmission spectra. These transmission spectra were converted to absorbance and formed the basis for a classification rule that yielded 100% correct classification in a cross-validated context. Classifications of normal, hyperplastic, and neoplastic cell sample spectra were achieved by using both partial least-squares (PLS) and principal component regression (PCR) classification methods. Linear discriminant analysis applied to principal components obtained from the spectral data yielded a small number of misclassifications. PLS weight loading vectors yield valuable qualitative insight into the molecular changes that are responsible for the success of the infrared classification. These successful classification results show promise for assisting pathologists in the diagnosis of cell types and offer future potential for in vivo IR detection of some types of cancer. © 1997 Society for Applied Spectroscopy

  11. [Research on spectra recognition method for cabbages and weeds based on PCA and SIMCA].

    PubMed

    Zu, Qin; Deng, Wei; Wang, Xiu; Zhao, Chun-Jiang

    2013-10-01

    In order to improve the accuracy and efficiency of weed identification, the difference in spectral reflectance was employed to distinguish between crops and weeds. First, different combinations of the Savitzky-Golay (SG) convolutional derivation and multiplicative scattering correction (MSC) methods were applied to preprocess the raw spectral data. Then the clustering analysis of the various types of plants was completed using the principal component analysis (PCA) method, and the feature wavelengths that were sensitive for classifying the various types of plants were extracted according to the corresponding loading plots of the optimal principal components in the PCA results. Finally, with the feature wavelengths set as the input variables, the soft independent modeling of class analogy (SIMCA) classification method was used to identify the various types of plants. The experimental results of classifying cabbages and weeds showed that, on the basis of the optimal pretreatment by a combined application of MSC and SG convolutional derivation with the SG parameters set to 1st-order derivative, 3rd-degree polynomial and 51 smoothing points, 23 feature wavelengths were extracted in accordance with the top three principal components in the PCA results. When the SIMCA method was used for classification with the previously selected 23 feature wavelengths as the input variables, the classification rates of the modeling set and the prediction set were up to 98.6% and 100%, respectively.

  12. A network view on psychiatric disorders: network clusters of symptoms as elementary syndromes of psychopathology.

    PubMed

    Goekoop, Rutger; Goekoop, Jaap G

    2014-01-01

    The vast number of psychopathological syndromes that can be observed in clinical practice can be described in terms of a limited number of elementary syndromes that are differentially expressed. Previous attempts to identify elementary syndromes have shown limitations that have slowed progress in the taxonomy of psychiatric disorders. The aim was to examine the ability of network community detection (NCD) to identify elementary syndromes of psychopathology and move beyond the limitations of current classification methods in psychiatry. 192 patients with unselected mental disorders were tested on the Comprehensive Psychopathological Rating Scale (CPRS). Principal component analysis (PCA) was performed on the bootstrapped correlation matrix of symptom scores to extract the principal component structure (PCS). An undirected and weighted network graph was constructed from the same matrix. Network community structure (NCS) was optimized using a previously published technique. In the optimal network structure, network clusters showed an 89% match with principal components of psychopathology. Some 6 network clusters were found, including "Depression", "Mania", "Anxiety", "Psychosis", "Retardation", and "Behavioral Disorganization". Network metrics were used to quantify the continuities between the elementary syndromes. We present the first comprehensive network graph of psychopathology that is free from the biases of previous classifications: a 'Psychopathology Web'. Clusters within this network represent elementary syndromes that are connected via a limited number of bridge symptoms. Many problems of previous classifications can be overcome by using a network approach to psychopathology.

  13. Automated Analysis, Classification, and Display of Waveforms

    NASA Technical Reports Server (NTRS)

    Kwan, Chiman; Xu, Roger; Mayhew, David; Zhang, Frank; Zide, Alan; Bonggren, Jeff

    2004-01-01

    A computer program partly automates the analysis, classification, and display of waveforms represented by digital samples. In the original application for which the program was developed, the raw waveform data to be analyzed by the program are acquired from space-shuttle auxiliary power units (APUs) at a sampling rate of 100 Hz. The program could also be modified for application to other waveforms -- for example, electrocardiograms. The program begins by performing principal-component analysis (PCA) of 50 normal-mode APU waveforms. Each waveform is segmented. A covariance matrix is formed by use of the segmented waveforms. Three eigenvectors corresponding to three principal components are calculated. To generate features, each waveform is then projected onto the eigenvectors. These features are displayed on a three-dimensional diagram, facilitating the visualization of the trend of APU operations.
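
    The feature-generation steps listed in this abstract (form a covariance matrix from segmented waveforms, compute three eigenvectors, project each waveform onto them) can be sketched as below. This is not the NASA program: the toy 4-sample segments are invented, each row is assumed to be one already-segmented waveform, and power iteration with deflation is swapped in as a simple stdlib-only way to obtain the leading eigenvectors.

```python
import math
import random

def covariance_matrix(rows):
    """Return column means and the sample covariance matrix of the rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return means, cov

def top_eigenvectors(matrix, k, iterations=500, seed=0):
    """Leading k eigenvectors of a symmetric matrix (power iteration + deflation)."""
    d = len(matrix)
    work = [row[:] for row in matrix]
    rng = random.Random(seed)
    vectors = []
    for _component in range(k):
        v = [rng.random() + 0.5 for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _step in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the converged unit vector.
        eigenvalue = sum(v[i] * work[i][j] * v[j]
                         for i in range(d) for j in range(d))
        vectors.append(v)
        # Deflate: remove the found component so the next pass finds the next one.
        for i in range(d):
            for j in range(d):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return vectors

def project(rows, means, vectors):
    """Project mean-centered rows onto each eigenvector to get features."""
    d = len(means)
    return [[sum((r[j] - means[j]) * vec[j] for j in range(d)) for vec in vectors]
            for r in rows]

# Hypothetical segmented waveforms (one row per waveform, 4 samples each).
segments = [
    [3.0, 1.0, 0.5, 0.0],
    [-3.0, 1.0, -0.5, 0.0],
    [3.0, -1.0, -0.5, 0.0],
    [-3.0, -1.0, 0.5, 0.0],
]
means, cov = covariance_matrix(segments)
eigvecs = top_eigenvectors(cov, 3)
features = project(segments, means, eigvecs)  # three features per waveform
```

    Each row of `features` is a point that could be plotted on a three-dimensional diagram. Eigenvector signs are arbitrary, so only the magnitudes of the projections are meaningful when comparing runs.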

  14. Comparative study on fast classification of brick samples by combination of principal component analysis and linear discriminant analysis using stand-off and table-top laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Vítková, Gabriela; Prokeš, Lubomír; Novotný, Karel; Pořízka, Pavel; Novotný, Jan; Všianský, Dalibor; Čelko, Ladislav; Kaiser, Jozef

    2014-11-01

    From a historical perspective, during archeological excavation or restoration of buildings and other structures built from bricks, it is important to determine, preferably in situ and in real time, the locality from which the bricks originate. Fast classification of bricks on the basis of Laser-Induced Breakdown Spectroscopy (LIBS) spectra is possible using multivariate statistical methods. A combination of principal component analysis (PCA) and linear discriminant analysis (LDA) was applied in this case. LIBS was used to classify a total of 29 brick samples from 7 different localities. A comparative study using two different LIBS setups, stand-off and table-top, shows that stand-off LIBS has great potential for archeological in-field measurements.

  15. Automatic classification of retinal three-dimensional optical coherence tomography images using principal component analysis network with composite kernels.

    PubMed

    Fang, Leyuan; Wang, Chong; Li, Shutao; Yan, Jun; Chen, Xiangdong; Rabbani, Hossein

    2017-11-01

    We present an automatic method, termed as the principal component analysis network with composite kernel (PCANet-CK), for the classification of three-dimensional (3-D) retinal optical coherence tomography (OCT) images. Specifically, the proposed PCANet-CK method first utilizes the PCANet to automatically learn features from each B-scan of the 3-D retinal OCT images. Then, multiple kernels are separately applied to a set of very important features of the B-scans and these kernels are fused together, which can jointly exploit the correlations among features of the 3-D OCT images. Finally, the fused (composite) kernel is incorporated into an extreme learning machine for the OCT image classification. We tested our proposed algorithm on two real 3-D spectral domain OCT (SD-OCT) datasets (of normal subjects and subjects with the macular edema and age-related macular degeneration), which demonstrated its effectiveness. (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).

  16. Blood hyperviscosity identification with reflective spectroscopy of tongue tip based on principal component analysis combining artificial neural network.

    PubMed

    Liu, Ming; Zhao, Jing; Lu, XiaoZuo; Li, Gang; Wu, Taixia; Zhang, LiFu

    2018-05-10

    Noninvasive in vivo determination of blood hyperviscosity with spectral methods has considerable potential and is meaningful for clinical diagnosis. In this study, 67 male subjects (41 healthy and 26 with hyperviscosity, according to blood sample analysis results) participated. Reflectance spectra of the subjects' tongue tips were measured, and a classification method based on principal component analysis combined with an artificial neural network model was built to identify hyperviscosity. Hold-out and leave-one-out methods, which are widely accepted in model validation, were used to avoid significant bias and lessen the overfitting problem. To measure the performance of the classification, sensitivity, specificity, accuracy and F-measure were calculated. The accuracies with the 100-times hold-out method and the 67-times leave-one-out method were 88.05% and 97.01%, respectively. Experimental results indicate that the built classification model has certain practical value and demonstrate the feasibility of using spectroscopy to identify hyperviscosity noninvasively.
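
    The four performance measures named in this abstract can be computed from a binary confusion matrix as sketched below. The labels are invented, and the F-measure is assumed here to be the balanced F1 score, since the abstract does not define it.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity, accuracy and F1 for binary labels.
    Assumes both classes occur and at least one positive is predicted."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }

# Hypothetical labels: 1 = hyperviscosity, 0 = healthy.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```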

  17. The classification of the patients with pulmonary diseases using breath air samples spectral analysis

    NASA Astrophysics Data System (ADS)

    Kistenev, Yury V.; Borisov, Alexey V.; Kuzmin, Dmitry A.; Bulanova, Anna A.

    2016-08-01

    Technique of exhaled breath sampling is discussed. The procedure of wavelength auto-calibration is proposed and tested. Comparison of the experimental data with the model absorption spectra of 5% CO2 is conducted. The classification results of three study groups obtained by using support vector machine and principal component analysis methods are presented.

  18. Lightweight biometric detection system for human classification using pyroelectric infrared detectors.

    PubMed

    Burchett, John; Shankar, Mohan; Hamza, A Ben; Guenther, Bob D; Pitsianis, Nikos; Brady, David J

    2006-05-01

    We use pyroelectric detectors that are differential in nature to detect motion in humans by their heat emissions. Coded Fresnel lens arrays create boundaries that help to localize humans in space as well as to classify the nature of their motion. We design and implement a low-cost biometric tracking system by using off-the-shelf components. We demonstrate two classification methods by using data gathered from sensor clusters of dual-element pyroelectric detectors with coded Fresnel lens arrays. We propose two algorithms for person identification, a more generalized spectral clustering method and a more rigorous example that uses principal component regression to perform a blind classification.

  19. Use of principal component analysis in the evaluation of adherence to statin treatment: a method to determine a potential target population for public health intervention.

    PubMed

    Latry, Philippe; Martin-Latry, Karin; Labat, Anne; Molimard, Mathieu; Peter, Claude

    2011-08-01

    The prevalence of statin use is high but adherence low. For public health intervention to be rational, subpopulations of nonadherent subjects must be defined. To categorise statin users with respect to patterns of reimbursement, this study was performed using the main French health reimbursement database for the Aquitaine region of south-western France. The cohort included subjects who submitted a reimbursement for at least one delivery of a statin (index) during the inclusion period (1st of September 2004-31st of December 2004). Indicators of adherence from reimbursement data were considered for principal component analysis. The 119,570 subjects included and analysed had a sex ratio of 1.1, mean (SD) age of 65.9 (11.9), and 13% were considered incident statin users. Principal component analysis found three dimensions that explained 67% of the variance. Using a K-means classification combined with a hierarchical ascendant classification, six groups were characterised. One group was considered nonadherent (10% of study population) and one group least adherent (1%). This novel application of principal component analysis identified groups that may be potential targets for intervention. The least adherent group appears to be one of the most appropriate because of both its relatively small size for case review with prescribing physicians and its very poor adherence. © 2010 The Authors Fundamental and Clinical Pharmacology © 2010 Société Française de Pharmacologie et de Thérapeutique.

  20. Confocal Raman imaging for cancer cell classification

    NASA Astrophysics Data System (ADS)

    Mathieu, Evelien; Van Dorpe, Pol; Stakenborg, Tim; Liu, Chengxun; Lagae, Liesbet

    2014-05-01

    We propose confocal Raman imaging as a label-free single cell characterization method that can be used as an alternative to conventional cell identification techniques that typically require labels, long incubation times and complex sample preparation. In this study it is investigated whether cancer and blood cells can be distinguished based on their Raman spectra. 2D Raman scans are recorded of 114 single cells, i.e. 60 breast (MCF-7), 5 cervix (HeLa) and 39 prostate (LNCaP) cancer cells and 10 monocytes (from healthy donors). For each cell an average spectrum is calculated and principal component analysis is performed on all average cell spectra. The main features of these principal components indicate that the information for cell identification based on Raman spectra mainly comes from the fatty acid composition in the cell. Based on the second and third principal components, blood cells could be distinguished from cancer cells, and prostate cancer cells could be distinguished from breast and cervix cancer cells. However, it was not possible to distinguish breast and cervix cancer cells. The results obtained in this study demonstrate the potential of confocal Raman imaging for cell type classification and identification purposes.

  1. Wood identification of Dalbergia nigra (CITES Appendix I) using quantitative wood anatomy, principal components analysis and naïve Bayes classification

    PubMed Central

    Gasson, Peter; Miller, Regis; Stekel, Dov J.; Whinder, Frances; Ziemińska, Kasia

    2010-01-01

    Background and Aims: Dalbergia nigra is one of the most valuable timber species of its genus, having been traded for over 300 years. Due to over-exploitation it is facing extinction and trade has been banned under CITES Appendix I since 1992. Current methods, primarily comparative wood anatomy, are inadequate for conclusive species identification. This study aims to find a set of anatomical characters that distinguish the wood of D. nigra from other commercially important species of Dalbergia from Latin America. Methods: Qualitative and quantitative wood anatomy, principal components analysis and naïve Bayes classification were conducted on 43 specimens of Dalbergia, eight D. nigra and 35 from six other Latin American species. Key Results: Dalbergia cearensis and D. miscolobium can be distinguished from D. nigra on the basis of vessel frequency for the former, and ray frequency for the latter. Principal components analysis was unable to provide any further basis for separating the species. Naïve Bayes classification using the four characters: minimum vessel diameter; frequency of solitary vessels; mean ray width; and frequency of axially fused rays, classified all eight D. nigra correctly with no false negatives, but there was a false positive rate of 36·36 %. Conclusions: Wood anatomy alone cannot distinguish D. nigra from all other commercially important Dalbergia species likely to be encountered by customs officials, but can be used to reduce the number of specimens that would need further study. PMID:19884155

  2. Wood identification of Dalbergia nigra (CITES Appendix I) using quantitative wood anatomy, principal components analysis and naive Bayes classification.

    PubMed

    Gasson, Peter; Miller, Regis; Stekel, Dov J; Whinder, Frances; Zieminska, Kasia

    2010-01-01

    Dalbergia nigra is one of the most valuable timber species of its genus, having been traded for over 300 years. Due to over-exploitation it is facing extinction and trade has been banned under CITES Appendix I since 1992. Current methods, primarily comparative wood anatomy, are inadequate for conclusive species identification. This study aims to find a set of anatomical characters that distinguish the wood of D. nigra from other commercially important species of Dalbergia from Latin America. Qualitative and quantitative wood anatomy, principal components analysis and naïve Bayes classification were conducted on 43 specimens of Dalbergia, eight D. nigra and 35 from six other Latin American species. Dalbergia cearensis and D. miscolobium can be distinguished from D. nigra on the basis of vessel frequency for the former, and ray frequency for the latter. Principal components analysis was unable to provide any further basis for separating the species. Naïve Bayes classification using the four characters: minimum vessel diameter; frequency of solitary vessels; mean ray width; and frequency of axially fused rays, classified all eight D. nigra correctly with no false negatives, but there was a false positive rate of 36.36 %. Wood anatomy alone cannot distinguish D. nigra from all other commercially important Dalbergia species likely to be encountered by customs officials, but can be used to reduce the number of specimens that would need further study.

  3. A Network View on Psychiatric Disorders: Network Clusters of Symptoms as Elementary Syndromes of Psychopathology

    PubMed Central

    Goekoop, Rutger; Goekoop, Jaap G.

    2014-01-01

    Introduction: The vast number of psychopathological syndromes that can be observed in clinical practice can be described in terms of a limited number of elementary syndromes that are differentially expressed. Previous attempts to identify elementary syndromes have shown limitations that have slowed progress in the taxonomy of psychiatric disorders. Aim: To examine the ability of network community detection (NCD) to identify elementary syndromes of psychopathology and move beyond the limitations of current classification methods in psychiatry. Methods: 192 patients with unselected mental disorders were tested on the Comprehensive Psychopathological Rating Scale (CPRS). Principal component analysis (PCA) was performed on the bootstrapped correlation matrix of symptom scores to extract the principal component structure (PCS). An undirected and weighted network graph was constructed from the same matrix. Network community structure (NCS) was optimized using a previously published technique. Results: In the optimal network structure, network clusters showed an 89% match with principal components of psychopathology. Some 6 network clusters were found, including "DEPRESSION", "MANIA", "ANXIETY", "PSYCHOSIS", "RETARDATION", and "BEHAVIORAL DISORGANIZATION". Network metrics were used to quantify the continuities between the elementary syndromes. Conclusion: We present the first comprehensive network graph of psychopathology that is free from the biases of previous classifications: a 'Psychopathology Web'. Clusters within this network represent elementary syndromes that are connected via a limited number of bridge symptoms. Many problems of previous classifications can be overcome by using a network approach to psychopathology. PMID:25427156

  4. Support vector machine and principal component analysis for microarray data classification

    NASA Astrophysics Data System (ADS)

    Astuti, Widi; Adiwijaya

    2018-03-01

    Cancer is a leading cause of death worldwide, although a significant proportion of cases can be cured if detected early. In recent decades, microarray technology has taken an important role in the diagnosis of cancer. Using data mining techniques, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared to traditional techniques. Microarray data are characterized by small sample sizes but very high dimensionality. It is therefore a challenge for researchers to provide solutions for microarray data classification that perform well in both accuracy and running time. This research proposes the use of Principal Component Analysis (PCA) as a dimension reduction method along with a Support Vector Machine (SVM) optimized by kernel functions as a classifier for microarray data classification. The proposed scheme was applied to seven data sets using 5-fold cross validation, and evaluation and analysis were then conducted in terms of both accuracy and running time. The results showed that the scheme can obtain 100% accuracy for the Ovarian and Lung Cancer data when Linear and Cubic kernel functions are used. In terms of running time, PCA greatly reduced the running time for every data set.

  5. Online signature recognition using principal component analysis and artificial neural network

    NASA Astrophysics Data System (ADS)

    Hwang, Seung-Jun; Park, Seung-Je; Baek, Joong-Hwan

    2016-12-01

    In this paper, we propose an algorithm for on-line signature recognition using the fingertip point in the air from the depth image acquired by Kinect. We extract 10 statistical features from each of the X, Y and Z axes, which are invariant to shifting and scaling of the signature trajectories in three-dimensional space. An artificial neural network is adopted to solve the complex signature classification problem. The 30-dimensional features are converted into 10 principal components using principal component analysis, which retain 99.02% of the total variance. We implement the proposed algorithm and test it on actual on-line signatures. In the experiment, we verify that the proposed method successfully classifies 15 different on-line signatures. Experimental results show a recognition rate of 98.47% when using only 10 feature vectors.

  6. Craters on Earth, Moon, and Mars: Multivariate classification and mode of origin

    USGS Publications Warehouse

    Pike, R.J.

    1974-01-01

    Testing extraterrestrial craters and candidate terrestrial analogs for morphologic similitude is treated as a problem in numerical taxonomy. According to a principal-components solution and a cluster analysis, 402 representative craters on the Earth, the Moon, and Mars divide into two major classes of contrasting shapes and modes of origin. Craters of net accumulation of material (cratered lunar domes, Martian "calderas," and all terrestrial volcanoes except maars and tuff rings) group apart from craters of excavation (terrestrial meteorite impact and experimental explosion craters, typical Martian craters, and all other lunar craters). Maars and tuff rings belong to neither group but are transitional. The classification criteria are four independent attributes of topographic geometry derived from seven descriptive variables by the principal-components transformation. Morphometric differences between crater bowl and raised rim constitute the strongest of the four components. Although single topographic variables cannot confidently predict the genesis of individual extraterrestrial craters, multivariate statistical models constructed from several variables can distinguish consistently between large impact craters and volcanoes. © 1974.

  7. Wilderness ecology: virgin plant communities of the Boundary Waters Canoe Area.

    Treesearch

    Lewis F. Ohmann; Robert R. Ream

    1971-01-01

    Describes virgin plant communities in the Boundary Waters Canoe Area. Data from all vegetative components of 106 virgin upland stands were used to construct a community classification through a combination of agglomerative clustering and principal components analysis. Discusses the relation of communities to their environment and to past wildfires.

  8. Transforming Graph Data for Statistical Relational Learning

    DTIC Science & Technology

    2012-10-01

    Jordan, 2003), PLSA (Hofmann, 1999), ? Classification via RMN (Taskar et al., 2003) or SVM (Hasan, Chaoji, Salem, & Zaki, 2006) ? Hierarchical...dimensionality reduction methods such as Principal Component Analysis (PCA), Principal Factor Analysis (PFA), and...clustering algorithm. Journal of the Royal Statistical Society. Series C, Applied statistics, 28, 100–108. Hasan, M. A., Chaoji, V., Salem, S., & Zaki, M

  9. Using robust principal component analysis to alleviate day-to-day variability in EEG based emotion classification.

    PubMed

    Ping-Keng Jao; Yuan-Pin Lin; Yi-Hsuan Yang; Tzyy-Ping Jung

    2015-08-01

    An emerging challenge for emotion classification using electroencephalography (EEG) is how to effectively alleviate day-to-day variability in raw data. This study employed the robust principal component analysis (RPCA) to address the problem with a posed hypothesis that background or emotion-irrelevant EEG perturbations lead to certain variability across days and somehow submerge emotion-related EEG dynamics. The empirical results of this study evidently validated our hypothesis and demonstrated the RPCA's feasibility through the analysis of a five-day dataset of 12 subjects. The RPCA allowed tackling the sparse emotion-relevant EEG dynamics from the accompanied background perturbations across days. Sequentially, leveraging the RPCA-purified EEG trials from more days appeared to improve the emotion-classification performance steadily, which was not found in the case using the raw EEG features. Therefore, incorporating the RPCA with existing emotion-aware machine-learning frameworks on a longitudinal dataset of each individual may shed light on the development of a robust affective brain-computer interface (ABCI) that can alleviate ecological inter-day variability.

  10. Classification of Nortes in the Gulf of Mexico derived from wave energy maps

    NASA Astrophysics Data System (ADS)

    Appendini, C. M.; Hernández-Lasheras, J.

    2016-02-01

    Extreme wave climate in the Gulf of Mexico is determined by tropical cyclones and winds from the Central American Cold Surges, locally referred to as Nortes. While hurricanes can have catastrophic effects, extreme waves and storm surge from Nortes occur several times a year, and thus have greater impacts on human activities along the Mexican coast of the Gulf of Mexico. Despite the constant impacts from Nortes, there is no available classification that relates their characteristics (e.g. pressure gradients, wind speed) to the associated coastal impacts. This work presents a first approximation to characterize and classify Nortes, which is based on the assumption that the derived wave energy synthesizes information (i.e. wind intensity, direction and duration) of individual Norte events as they pass through the Gulf of Mexico. First, we developed an index to identify Nortes based on surface pressure differences of two locations. To validate the methodology we compared the events identified with other studies and available Nortes logs. Afterwards, we detected Nortes from the 1986/1987, 2008/2009 and 2009/2010 seasons and used their corresponding wind fields to derive the wave energy maps using a numerical wave model. We used the energy maps to classify the events into groups using manual (visual) and automatic classifications (principal component analysis and k-means). The manual classification identified 3 types of Nortes and the automatic classification identified 5, although 3 of them had a high degree of similarity. The principal component analysis indicated that all events have similar characteristics, as few components are necessary to explain almost all of the variance. The classification from the k-means indicated that 81% of analyzed Nortes affect the southeastern Gulf of Mexico, while a smaller percentage affects the northern Gulf of Mexico and even fewer affect the western Caribbean.
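
    The automatic classification described in this abstract (principal component analysis followed by k-means) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the synthetic "energy maps", the helper names `pca_scores` and `kmeans`, and the choice of two components and three clusters are all assumptions made for the example.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project the rows of X onto the leading principal components."""
    Xc = X - X.mean(axis=0)                      # center each variable
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # variance fraction per component
    return Xc @ Vt[:n_components].T, explained[:n_components]

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)             # nearest center for each row
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):    # converged
            break
        centers = new_centers
    return labels

# Hypothetical data: 30 flattened "energy maps" scattered around 3 synthetic patterns.
rng = np.random.default_rng(42)
patterns = rng.normal(size=(3, 50)) * 10.0
maps = np.vstack([p + rng.normal(size=(10, 50)) for p in patterns])

scores, explained = pca_scores(maps, n_components=2)
labels = kmeans(scores, k=3)
print("variance explained by 2 components:", explained.sum())
print("cluster labels:", labels)
```

    When a few components explain almost all of the variance, as the abstract reports, clustering the low-dimensional scores loses little information compared with clustering the full maps.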

  11. Classification of narcotics in solid mixtures using principal component analysis and Raman spectroscopy.

    PubMed

    Ryder, Alan G

    2002-03-01

    Eighty-five solid samples consisting of illegal narcotics diluted with several different materials were analyzed by near-infrared (785 nm excitation) Raman spectroscopy. Principal Component Analysis (PCA) was employed to classify the samples according to narcotic type. The best sample discrimination was obtained by using the first derivative of the Raman spectra. Furthermore, restricting the spectral variables for PCA to 2 or 3% of the original spectral data according to the most intense peaks in the Raman spectrum of the pure narcotic resulted in a rapid discrimination method for classifying samples according to narcotic type. This method allows for the easy discrimination between cocaine, heroin, and MDMA mixtures even when the Raman spectra are complex or very similar. This approach of restricting the spectral variables also decreases the computational time by a factor of 30 (compared to the complete spectrum), making the methodology attractive for rapid automatic classification and identification of suspect materials.

  12. Chemometric and multivariate statistical analysis of time-of-flight secondary ion mass spectrometry spectra from complex Cu-Fe sulfides.

    PubMed

    Kalegowda, Yogesh; Harmer, Sarah L

    2012-03-20

    Time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra of mineral samples are complex, comprised of large mass ranges and many peaks. Consequently, characterization and classification analysis of these systems is challenging. In this study, different chemometric and statistical data evaluation methods, based on monolayer sensitive TOF-SIMS data, have been tested for the characterization and classification of copper-iron sulfide minerals (chalcopyrite, chalcocite, bornite, and pyrite) at different flotation pulp conditions (feed, conditioned feed, and Eh modified). The complex mass spectral data sets were analyzed using the following chemometric and statistical techniques: principal component analysis (PCA); principal component-discriminant functional analysis (PC-DFA); soft independent modeling of class analogy (SIMCA); and k-Nearest Neighbor (k-NN) classification. PCA was found to be an important first step in multivariate analysis, providing insight into both the relative grouping of samples and the elemental/molecular basis for those groupings. For samples exposed to oxidative conditions (at Eh ~430 mV), each technique (PCA, PC-DFA, SIMCA, and k-NN) was found to produce excellent classification. For samples at reductive conditions (at Eh ~ -200 mV SHE), k-NN and SIMCA produced the most accurate classification. Phase identification of particles that contain the same elements but a different crystal structure in a mixed multimetal mineral system has been achieved.

  13. Principal Component 2-D Long Short-Term Memory for Font Recognition on Single Chinese Characters.

    PubMed

    Tao, Dapeng; Lin, Xu; Jin, Lianwen; Li, Xuelong

    2016-03-01

    Chinese character font recognition (CCFR) has received increasing attention as intelligent applications based on optical character recognition become popular. However, traditional CCFR systems do not handle noisy data effectively. By analyzing in detail the basic strokes of Chinese characters, we propose that font recognition on a single Chinese character is a sequence classification problem, which can be effectively solved by recurrent neural networks. For robust CCFR, we integrate a principal component convolution layer with the 2-D long short-term memory (2DLSTM) and develop the principal component 2DLSTM (PC-2DLSTM) algorithm. PC-2DLSTM considers two aspects: 1) the principal component layer convolution operation helps remove the noise and obtain rational and complete font information and 2) simultaneously, 2DLSTM deals with the long-range contextual processing along scan directions, which can help capture the contrast between character trajectory and background. Experiments using the frequently used CCFR dataset suggest the effectiveness of PC-2DLSTM compared with other state-of-the-art font recognition methods.

  14. Stationary Wavelet-based Two-directional Two-dimensional Principal Component Analysis for EMG Signal Classification

    NASA Astrophysics Data System (ADS)

    Ji, Yi; Sun, Shanlin; Xie, Hong-Bo

    2017-06-01

    Discrete wavelet transform (WT) followed by principal component analysis (PCA) has been a powerful approach for the analysis of biomedical signals. Wavelet coefficients at various scales and channels were usually transformed into a one-dimensional array, causing issues such as the curse of dimensionality dilemma and small sample size problem. In addition, lack of time-shift invariance of WT coefficients can be modeled as noise and degrades the classifier performance. In this study, we present a stationary wavelet-based two-directional two-dimensional principal component analysis (SW2D2PCA) method for the efficient and effective extraction of essential feature information from signals. Time-invariant multi-scale matrices are constructed in the first step. The two-directional two-dimensional principal component analysis then operates on the multi-scale matrices to reduce the dimension, rather than vectors in conventional PCA. Results are presented from an experiment to classify eight hand motions using 4-channel electromyographic (EMG) signals recorded in healthy subjects and amputees, which illustrates the efficiency and effectiveness of the proposed method for biomedical signal analysis.

  15. Radar fall detection using principal component analysis

    NASA Astrophysics Data System (ADS)

    Jokanovic, Branka; Amin, Moeness; Ahmad, Fauzia; Boashash, Boualem

    2016-05-01

    Falls are a major cause of fatal and nonfatal injuries in people aged 65 years and older. Radar has the potential to become one of the leading technologies for fall detection, thereby enabling the elderly to live independently. Existing techniques for fall detection using radar are based on manual feature extraction and require significant parameter tuning in order to provide successful detections. In this paper, we employ principal component analysis for fall detection, wherein eigen images of observed motions are employed for classification. Using real data, we demonstrate that the PCA based technique provides performance improvement over the conventional feature extraction methods.

  16. Variability search in M 31 using principal component analysis and the Hubble Source Catalogue

    NASA Astrophysics Data System (ADS)

    Moretti, M. I.; Hatzidimitriou, D.; Karampelas, A.; Sokolovsky, K. V.; Bonanos, A. Z.; Gavras, P.; Yang, M.

    2018-06-01

    Principal component analysis (PCA) is being extensively used in Astronomy but not yet exhaustively exploited for variability search. The aim of this work is to investigate the effectiveness of using the PCA as a method to search for variable stars in large photometric data sets. We apply PCA to variability indices computed for light curves of 18 152 stars in three fields in M 31 extracted from the Hubble Source Catalogue. The projection of the data into the principal components is used as a stellar variability detection and classification tool, capable of distinguishing between RR Lyrae stars, long-period variables (LPVs) and non-variables. This projection recovered more than 90 per cent of the known variables and revealed 38 previously unknown variable stars (about 30 per cent more), all LPVs except for one object of uncertain variability type. We conclude that this methodology can indeed successfully identify candidate variable stars.

  17. Principal component analysis of indocyanine green fluorescence dynamics for diagnosis of vascular diseases

    NASA Astrophysics Data System (ADS)

    Seo, Jihye; An, Yuri; Lee, Jungsul; Choi, Chulhee

    2015-03-01

    Indocyanine green (ICG), a near-infrared fluorophore, has been used in visualization of vascular structure and non-invasive diagnosis of vascular disease. Although many imaging techniques have been developed, there are still limitations in diagnosis of vascular diseases. We have recently developed a minimally invasive diagnostics system based on ICG fluorescence imaging for sensitive detection of vascular insufficiency. In this study, we used principal component analysis (PCA) to examine the ICG spatiotemporal profile and to obtain pathophysiological information from ICG dynamics. Here we demonstrated that principal components of ICG dynamics in both feet showed significant differences between normal controls and diabetic patients with vascular complications. We extracted the PCA time courses of the first three components and found a distinct pattern in diabetic patients. We propose that PCA of ICG dynamics reveals better classification performance compared to fluorescence intensity analysis. We anticipate that specific features of spatiotemporal ICG dynamics can be useful in diagnosis of various vascular diseases.

  18. Intelligent data analysis to interpret major risk factors for diabetic patients with and without ischemic stroke in a small population

    PubMed Central

    Gürgen, Fikret; Gürgen, Nurgül

    2003-01-01

    This study proposes an intelligent data analysis approach to investigate and interpret the distinctive factors of diabetes mellitus patients with and without ischemic (non-embolic type) stroke in a small population. The database consists of a total of 16 features collected from 44 diabetic patients. Features include age, gender, duration of diabetes, cholesterol, high density lipoprotein, triglyceride levels, neuropathy, nephropathy, retinopathy, peripheral vascular disease, myocardial infarction rate, glucose level, medication and blood pressure. Metric and non-metric features are distinguished. First, the mean and covariance of the data are estimated and the correlated components are observed. Second, major components are extracted by principal component analysis. Finally, as common examples of local and global classification approach, a k-nearest neighbor and a high-degree polynomial classifier such as multilayer perceptron are employed for classification with all the components and major components case. Macrovascular changes emerged as the principal distinctive factors of ischemic-stroke in diabetes mellitus. Microvascular changes were generally ineffective discriminators. Recommendations were made according to the rules of evidence-based medicine. Briefly, this case study, based on a small population, supports theories of stroke in diabetes mellitus patients and also concludes that the use of intelligent data analysis improves personalized preventive intervention. PMID:12685939

  19. Clinical study of noninvasive in vivo melanoma and nonmelanoma skin cancers using multimodal spectral diagnosis

    PubMed Central

    Lim, Liang; Nichols, Brandon; Migden, Michael R.; Rajaram, Narasimhan; Reichenberg, Jason S.; Markey, Mia K.; Ross, Merrick I.; Tunnell, James W.

    2014-01-01

    The goal of this study was to determine the diagnostic capability of a multimodal spectral diagnosis (SD) for in vivo noninvasive disease diagnosis of melanoma and nonmelanoma skin cancers. We acquired reflectance, fluorescence, and Raman spectra from 137 lesions in 76 patients using custom-built optical fiber-based clinical systems. Biopsies of lesions were classified using standard histopathology as malignant melanoma (MM), nonmelanoma pigmented lesion (PL), basal cell carcinoma (BCC), actinic keratosis (AK), and squamous cell carcinoma (SCC). Spectral data were analyzed using principal component analysis. Using multiple diagnostically relevant principal components, we built leave-one-out logistic regression classifiers. Classification results were compared with histopathology of the lesion. Sensitivity/specificity for classifying MM versus PL (12 versus 17 lesions) was 100%/100%, for SCC and BCC versus AK (57 versus 14 lesions) was 95%/71%, and for AK and SCC and BCC versus normal skin (71 versus 71 lesions) was 90%/85%. The best classification for nonmelanoma skin cancers required multiple modalities; however, the best melanoma classification occurred with Raman spectroscopy alone. The high diagnostic accuracy for classifying both melanoma and nonmelanoma skin cancer lesions demonstrates the potential for SD as a clinical diagnostic device. PMID:25375350
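
    As a rough illustration of how leave-one-out evaluation yields sensitivity and specificity figures like those quoted above, the sketch below holds out each sample in turn and classifies it from the remaining ones. It is only a sketch under stated assumptions: a nearest-class-mean rule stands in for the logistic regression used in the study, and the one-dimensional component scores and labels are invented for the example.

```python
def leave_one_out_predictions(scores, labels):
    """Predict each sample with a nearest-class-mean rule fit on all other samples."""
    predictions = []
    for i in range(len(scores)):
        # Training set: every sample except the held-out one.
        train = [(s, y) for j, (s, y) in enumerate(zip(scores, labels)) if j != i]
        pos = [s for s, y in train if y == 1]
        neg = [s for s, y in train if y == 0]
        mean_pos = sum(pos) / len(pos)
        mean_neg = sum(neg) / len(neg)
        closer_to_pos = abs(scores[i] - mean_pos) < abs(scores[i] - mean_neg)
        predictions.append(1 if closer_to_pos else 0)
    return predictions

def sensitivity_specificity(labels, predictions):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical one-dimensional component scores (1 = malignant, 0 = benign).
scores = [2.1, 1.8, 2.5, 0.9, 0.2, 0.4, 1.3, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

predictions = leave_one_out_predictions(scores, labels)
sens, spec = sensitivity_specificity(labels, predictions)
print(predictions, sens, spec)
```

    Because each prediction is made without the held-out sample, the resulting sensitivity and specificity are less optimistic than figures computed on the training data itself.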

  20. Clinical study of noninvasive in vivo melanoma and nonmelanoma skin cancers using multimodal spectral diagnosis

    NASA Astrophysics Data System (ADS)

    Lim, Liang; Nichols, Brandon; Migden, Michael R.; Rajaram, Narasimhan; Reichenberg, Jason S.; Markey, Mia K.; Ross, Merrick I.; Tunnell, James W.

    2014-11-01

    The goal of this study was to determine the diagnostic capability of a multimodal spectral diagnosis (SD) for in vivo noninvasive disease diagnosis of melanoma and nonmelanoma skin cancers. We acquired reflectance, fluorescence, and Raman spectra from 137 lesions in 76 patients using custom-built optical fiber-based clinical systems. Biopsies of lesions were classified using standard histopathology as malignant melanoma (MM), nonmelanoma pigmented lesion (PL), basal cell carcinoma (BCC), actinic keratosis (AK), and squamous cell carcinoma (SCC). Spectral data were analyzed using principal component analysis. Using multiple diagnostically relevant principal components, we built leave-one-out logistic regression classifiers. Classification results were compared with histopathology of the lesion. Sensitivity/specificity for classifying MM versus PL (12 versus 17 lesions) was 100%/100%, for SCC and BCC versus AK (57 versus 14 lesions) was 95%/71%, and for AK and SCC and BCC versus normal skin (71 versus 71 lesions) was 90%/85%. The best classification for nonmelanoma skin cancers required multiple modalities; however, the best melanoma classification occurred with Raman spectroscopy alone. The high diagnostic accuracy for classifying both melanoma and nonmelanoma skin cancer lesions demonstrates the potential for SD as a clinical diagnostic device.

  1. [Identification of varieties of textile fibers by using Vis/NIR infrared spectroscopy technique].

    PubMed

    Wu, Gui-Fang; He, Yong

    2010-02-01

    The aim of the present paper was to provide new insight into Vis/NIR spectroscopic analysis of textile fibers. In order to achieve rapid identification of the varieties of fibers, the authors selected 5 kinds of fibers (cotton, flax, wool, silk and tencel) for a study with Vis/NIR spectroscopy. Firstly, the spectra of each kind of fiber were scanned by spectrometer, and the principal component analysis (PCA) method was used to analyze the characteristics of the pattern of Vis/NIR spectra. The principal component scores scatter plot (PC1 x PC2 x PC3) of the fibers indicated the classification effect for the five varieties of fibers. The first 6 principal components (PCs) were selected according to the quantity and size of PCs. The PCA classification model was optimized by using the least-squares support vector machines (LS-SVM) method. The authors used the 6 PCs extracted by PCA as the inputs of LS-SVM, and a PCA-LS-SVM model was built to achieve varieties validation as well as mathematical model building and optimization analysis. Two hundred samples (40 samples for each variety of fibers) of five varieties of fibers were used for calibration of the PCA-LS-SVM model, and the other 50 samples (10 samples for each variety of fibers) were used for validation. The result of validation showed that the Vis/NIR spectroscopy technique based on PCA-LS-SVM had a powerful classification capability. It provides a new method for identifying varieties of fibers rapidly and in real time, so it has important significance for protecting the rights of consumers, ensuring the quality of textiles, and implementing rationalization production and transaction of textile materials and its production.

  2. Feature extraction via KPCA for classification of gait patterns.

    PubMed

    Wu, Jianning; Wang, Jue; Liu, Li

    2007-06-01

    Automated recognition of gait pattern change is important in medical diagnostics as well as in the early identification of at-risk gait in the elderly. We evaluated the use of Kernel-based Principal Component Analysis (KPCA) to extract more gait features (i.e., to obtain more significant amounts of information about human movement) and thus to improve the classification of gait patterns. 3D gait data of 24 young and 24 elderly participants were acquired using an OPTOTRAK 3020 motion analysis system during normal walking, and a total of 36 gait spatio-temporal and kinematic variables were extracted from the recorded data. KPCA was used first for nonlinear feature extraction to then evaluate its effect on a subsequent classification in combination with learning algorithms such as support vector machines (SVMs). Cross-validation test results indicated that the proposed technique could allow spreading the information about the gait's kinematic structure into more nonlinear principal components, thus providing additional discriminatory information for the improvement of gait classification performance. The feature extraction ability of KPCA was affected slightly with different kernel functions as polynomial and radial basis function. The combination of KPCA and SVM could identify young-elderly gait patterns with 91% accuracy, resulting in a markedly improved performance compared to the combination of PCA and SVM. These results suggest that nonlinear feature extraction by KPCA improves the classification of young-elderly gait patterns, and holds considerable potential for future applications in direct dimensionality reduction and interpretation of multiple gait signals.
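
    The nonlinear feature extraction step named in this abstract, kernel PCA with a radial basis function kernel, might be sketched as below. This is a generic textbook formulation rather than the authors' implementation: the function name `rbf_kernel_pca`, the `gamma` value, and the random stand-in data (48 participants by 36 variables, matching only the counts given in the abstract) are assumptions.

```python
import numpy as np

def rbf_kernel_pca(X, n_components, gamma):
    """Return the kernel-PCA projections of the training rows of X (RBF kernel)."""
    sq = np.sum(X**2, axis=1)
    sq_dists = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    K = np.exp(-gamma * np.maximum(sq_dists, 0.0))      # RBF kernel matrix

    # Center the kernel matrix in feature space.
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Projection of training point i on component k is sqrt(lambda_k) * v_k[i].
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))

# Hypothetical stand-in data: 24 "young" and 24 "elderly" rows, 36 variables each.
rng = np.random.default_rng(1)
young = rng.normal(loc=0.0, size=(24, 36))
elderly = rng.normal(loc=0.5, size=(24, 36))
X = np.vstack([young, elderly])

features = rbf_kernel_pca(X, n_components=3, gamma=1.0 / 36)
print(features.shape)
```

    The resulting nonlinear components would then be passed to a classifier such as an SVM; swapping the RBF kernel for a polynomial one only changes the line that builds `K`.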

  3. Identification and classification of upper limb motions using PCA.

    PubMed

    Veer, Karan; Vig, Renu

    2018-03-28

    This paper describes the utility of principal component analysis (PCA) in classifying upper limb signals. PCA is a powerful tool for analyzing data of high dimension. Here, two different input strategies were explored. The first method uses upper arm dual-position-based myoelectric signal acquisition and the other solely uses PCA for classifying surface electromyogram (SEMG) signals. SEMG data from the biceps and the triceps brachii muscles and four independent muscle activities of the upper arm were measured in seven subjects (total dataset=56). The datasets used for the analysis are rotated by class-specific principal component matrices to decorrelate the measured data prior to feature extraction.

  4. Multi-angle backscatter classification and sub-bottom profiling for improved seafloor characterization

    NASA Astrophysics Data System (ADS)

    Alevizos, Evangelos; Snellen, Mirjam; Simons, Dick; Siemes, Kerstin; Greinert, Jens

    2018-06-01

    This study applies three classification methods exploiting the angular dependence of acoustic seafloor backscatter along with high resolution sub-bottom profiling for seafloor sediment characterization in the Eckernförde Bay, Baltic Sea, Germany. This area is well suited for acoustic backscatter studies due to its shallowness, its smooth bathymetry and the presence of a wide range of sediment types. Backscatter data were acquired using a Seabeam1180 (180 kHz) multibeam echosounder and sub-bottom profiler data were recorded using a SES-2000 parametric sonar transmitting 6 and 12 kHz. The high density of seafloor soundings allowed extracting backscatter layers for five beam angles over a large part of the surveyed area. A Bayesian probability method was employed for sediment classification based on the backscatter variability at a single incidence angle, whereas Maximum Likelihood Classification (MLC) and Principal Components Analysis (PCA) were applied to the multi-angle layers. The Bayesian approach was used for identifying the optimum number of acoustic classes because cluster validation is carried out prior to class assignment and class outputs are ordinal categorical values. The method is based on the principle that backscatter values from a single incidence angle express a normal distribution for a particular sediment type. The resulting Bayesian classes were well correlated to median grain sizes and the percentage of coarse material. The MLC method uses angular response information from five layers of training areas extracted from the Bayesian classification map. The subsequent PCA analysis is based on the transformation of these five layers into two principal components that comprise most of the data variability. These principal components were clustered in five classes after running an external cluster validation test. In general, both MLC and PCA separated the various sediment types effectively, showing good agreement (kappa > 0.7) with the Bayesian approach, which also correlates well with ground truth data (r2 > 0.7). In addition, sub-bottom data were used in conjunction with the Bayesian classification results to characterize acoustic classes with respect to their geological and stratigraphic interpretation. The joint interpretation of seafloor and sub-seafloor data sets proved to be an efficient approach for a better understanding of seafloor backscatter patchiness and to discriminate acoustically similar classes in different geological/bathymetric settings.
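
    Agreement between two classification maps is commonly summarized with Cohen's kappa, the statistic quoted in this abstract. The sketch below computes it for two short label sequences; the labels are invented, and the abstract does not say which kappa variant or software the authors used, so this is only the standard unweighted form.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Chance agreement: product of the two marginal class frequencies, summed over classes.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical class labels for the same ten map cells from two methods.
map_a = [1, 1, 2, 2, 3, 3, 3, 1, 2, 3]
map_b = [1, 1, 2, 3, 3, 3, 3, 1, 2, 2]

kappa = cohens_kappa(map_a, map_b)
print(round(kappa, 3))
```

    Here the two maps agree on 8 of 10 cells, but because 0.34 of that agreement is expected by chance, kappa comes out near 0.70 rather than 0.80.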

  5. Analysis of genetic diversity in banana cultivars (Musa cvs.) from the South of Oman using AFLP markers and classification by phylogenetic, hierarchical clustering and principal component analyses

    PubMed Central

    Opara, Umezuruike Linus; Jacobson, Dan; Al-Saady, Nadiya Abubakar

    2010-01-01

    Banana is an important crop grown in Oman and there is a dearth of information on its genetic diversity to assist in crop breeding and improvement programs. This study employed amplified fragment length polymorphism (AFLP) to investigate the genetic variation in local banana cultivars from the southern region of Oman. Using 12 primer combinations, a total of 1094 bands were scored, of which 1012 were polymorphic. Eighty-two unique markers were identified, which revealed the distinct separation of the seven cultivars. The results obtained show that AFLP can be used to differentiate the banana cultivars. Further classification by phylogenetic, hierarchical clustering and principal component analyses showed significant differences between the clusters found with molecular markers and those clusters created by previous studies using morphological analysis. Based on the analytical results, a consensus dendrogram of the banana cultivars is presented. PMID:20443211

  6. Principal component analysis of Raman spectra for TiO2 nanoparticle characterization

    NASA Astrophysics Data System (ADS)

    Ilie, Alina Georgiana; Scarisoareanu, Monica; Morjan, Ion; Dutu, Elena; Badiceanu, Maria; Mihailescu, Ion

    2017-09-01

    The Raman spectra of anatase/rutile mixed phases of Sn-doped TiO2 nanoparticles and undoped TiO2 nanoparticles, synthesised by laser pyrolysis, with nanocrystallite dimensions varying from 8 to 28 nm, were simultaneously processed with self-written software that applies Principal Component Analysis (PCA) to the measured spectra to verify the possibility of objective auto-characterization of nanoparticles from their vibrational modes. The photo-excited process of Raman scattering is very sensitive to the material characteristics, especially in the case of nanomaterials, where more properties become relevant for the vibrational behaviour. We used PCA, a statistical procedure that performs eigenvalue decomposition of the descriptive data covariance, to automatically analyse the samples' measured Raman spectra, and to infer the correlation between nanoparticle dimensions, tin and carbon concentration, and their Principal Component values (PCs). This type of application can allow an approximation of the crystallite size, or tin concentration, only by measuring the Raman spectrum of the sample. The study of the loadings of the principal components provides information on the way the vibrational modes are affected by the nanoparticle features and on the spectral area relevant for the classification.
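
    The abstract describes PCA as an eigenvalue decomposition of the data covariance, with scores correlated against sample properties and loadings inspected per spectral channel. A minimal sketch of that procedure follows. The synthetic spectra are a toy assumption (a single Gaussian band whose position drifts linearly with crystallite size), not a physical model of TiO2, and `pca_eig` is a hypothetical helper name.

```python
import numpy as np

def pca_eig(spectra, n_components):
    """PCA via eigen-decomposition of the covariance matrix of the spectra."""
    centered = spectra - spectra.mean(axis=0)
    cov = np.cov(centered, rowvar=False)          # channels x channels covariance
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    loadings = eigvecs[:, order]                  # one column per component
    scores = centered @ loadings                  # one row per sample
    return scores, loadings, eigvals[order]

# Toy assumption: 12 samples, sizes 8 to 28 nm, band center shifts with size.
rng = np.random.default_rng(0)
sizes_nm = np.linspace(8.0, 28.0, 12)
shift_axis = np.linspace(100.0, 200.0, 200)
centers = 150.0 - 0.2 * (sizes_nm - 8.0)
spectra = np.exp(-((shift_axis[None, :] - centers[:, None]) / 8.0) ** 2)
spectra += rng.normal(scale=0.01, size=spectra.shape)

scores, loadings, variances = pca_eig(spectra, n_components=2)
corr = np.corrcoef(scores[:, 0], sizes_nm)[0, 1]
print("correlation of PC1 score with size:", corr)
```

    The sign of a principal component is arbitrary, so only the magnitude of the correlation is meaningful; the columns of `loadings` indicate which spectral channels drive each component.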

  7. Inference of Ancestry in Forensic Analysis II: Analysis of Genetic Data.

    PubMed

    Santos, Carla; Phillips, Chris; Gomez-Tato, A; Alvarez-Dios, J; Carracedo, Ángel; Lareu, Maria Victoria

    2016-01-01

    Three approaches applicable to the analysis of forensic ancestry-informative marker data (STRUCTURE, principal component analysis, and the Snipper Bayesian classification system) are reviewed. Detailed step-by-step guidance is provided for adjusting parameter settings in STRUCTURE with particular regard to their effect when differentiating populations. Several enhancements to the Snipper online forensic classification portal are described, highlighting the added functionality they bring to particular aspects of ancestry-informative SNP analysis in a forensic context.

  8. Modeling Governance KB with CATPCA to Overcome Multicollinearity in the Logistic Regression

    NASA Astrophysics Data System (ADS)

    Khikmah, L.; Wijayanto, H.; Syafitri, U. D.

    2017-04-01

    Multicollinearity is a problem often encountered in logistic regression modeling. Multicollinearity between explanatory variables biases the parameter estimates and also leads to classification errors. Stepwise regression is generally used to overcome multicollinearity in regression. Another method, which involves all variables in the prediction, is Principal Component Analysis (PCA). However, classical PCA applies only to numeric data. When the data are categorical, one method to solve the problem is Categorical Principal Component Analysis (CATPCA). The data used in this research were part of the Demographic and Population Survey Indonesia (IDHS) 2012 data. This research focuses on the characteristics of women using contraceptive methods. Classification results were evaluated using Area Under Curve (AUC) values; the higher the AUC value, the better. Based on AUC values, classification of contraceptive method use with the stepwise method (58.66%) is better than with the logistic regression model (57.39%) and CATPCA (57.39%). Evaluation of the logistic regression results using sensitivity shows the opposite: the CATPCA method (99.79%) is better than the logistic regression method (92.43%) and stepwise (92.05%). Because this study focuses on classification of the major class (using a contraceptive method), the selected model is CATPCA, as it raises the accuracy for the major class.
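
    The two evaluation measures contrasted in this abstract, AUC and sensitivity, can rank models differently because AUC is threshold-free while sensitivity depends on a cutoff. The sketch below computes both for one set of predicted probabilities; the labels, probabilities, and the 0.5 cutoff are illustrative assumptions, not values from the study.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def sensitivity(labels, scores, threshold=0.5):
    """Share of true positives among all actual positives at the given cutoff."""
    pos_scores = [s for y, s in zip(labels, scores) if y == 1]
    return sum(1 for s in pos_scores if s >= threshold) / len(pos_scores)

# Hypothetical outcomes (1 = uses a contraceptive method) and model probabilities.
labels = [1, 1, 1, 0, 1, 0, 0, 1]
probs = [0.9, 0.8, 0.4, 0.6, 0.7, 0.3, 0.2, 0.55]

model_auc = auc(labels, probs)
model_sens = sensitivity(labels, probs)
print(model_auc, model_sens)
```

    A model that assigns high probabilities to nearly everyone can reach very high sensitivity for the majority class while its AUC stays close to 0.5, which is consistent with the pattern of figures the abstract reports.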

  9. Classification of high-resolution multispectral satellite remote sensing images using extended morphological attribute profiles and independent component analysis

    NASA Astrophysics Data System (ADS)

    Wu, Yu; Zheng, Lijuan; Xie, Donghai; Zhong, Ruofei

    2017-07-01

    In this study, the extended morphological attribute profiles (EAPs) and independent component analysis (ICA) were combined for feature extraction of high-resolution multispectral satellite remote sensing images, and the regularized least squares (RLS) approach with the radial basis function (RBF) kernel was further applied for the classification. Based on the two major independent components, the geometrical features were extracted using the EAPs method. Three morphological attributes were calculated and extracted for each independent component: area, standard deviation, and moment of inertia. The extracted geometrical features were classified using the RLS approach and the commonly used LIB-SVM library of support vector machines. The Worldview-3 and Chinese GF-2 multispectral images were tested, and the results showed that the features extracted by EAPs and ICA can effectively improve the accuracy of high-resolution multispectral image classification, with accuracy 2% higher than the EAPs and principal component analysis (PCA) method and 6% higher than APs and the original high-resolution multispectral data. Moreover, the results suggest that both the GURLS and LIB-SVM libraries are well suited for multispectral remote sensing image classification. The GURLS library is easy to use with automatic parameter selection, but its computation time may be longer than that of the LIB-SVM library. This study would be helpful for the classification of high-resolution multispectral satellite remote sensing images.

  10. Intelligent Classification in Huge Heterogeneous Data Sets

    DTIC Science & Technology

    2015-06-01

    Competencies; DoD: Department of Defense; GMTI: Ground Moving Target Indicator; ISR: Intelligence, Surveillance and Reconnaissance; NCD: Noncoherent Change...Detection; OCR: Optical Character Recognition; PCA: Principal Component Analysis; SAR: Synthetic Aperture Radar; SVD: Singular Value Decomposition; USPS: United States Postal Service. Approved for Public Release; Distribution Unlimited.

  11. Metabolite Profiling and Classification of DNA-Authenticated Licorice Botanicals

    PubMed Central

    Simmler, Charlotte; Anderson, Jeffrey R.; Gauthier, Laura; Lankin, David C.; McAlpine, James B.; Chen, Shao-Nong; Pauli, Guido F.

    2015-01-01

    Raw licorice roots represent heterogeneous materials obtained from mainly three Glycyrrhiza species. G. glabra, G. uralensis, and G. inflata exhibit marked metabolite differences in terms of flavanones (Fs), chalcones (Cs), and other phenolic constituents. The principal objective of this work was to develop complementary chemometric models for the metabolite profiling, classification, and quality control of authenticated licorice. A total of 51 commercial and macroscopically verified samples were DNA authenticated. Principal component analysis and canonical discriminant analysis were performed on 1H NMR spectra and area under the curve values obtained from UHPLC-UV chromatograms, respectively. The developed chemometric models enable the identification and classification of Glycyrrhiza species according to their composition in major Fs, Cs, and species specific phenolic compounds. Further key outcomes demonstrated that DNA authentication combined with chemometric analyses enabled the characterization of mixtures, hybrids, and species outliers. This study provides a new foundation for the botanical and chemical authentication, classification, and metabolomic characterization of crude licorice botanicals and derived materials. Collectively, the proposed methods offer a comprehensive approach for the quality control of licorice as one of the most widely used botanical dietary supplements. PMID:26244884

  12. Quantitative study of flavonoids in leaves of citrus plants.

    PubMed

    Kawaii, S; Tomono, Y; Katase, E; Ogawa, K; Yano, M; Koizumi, M; Ito, C; Furukawa, H

    2000-09-01

    Leaf flavonoids were quantitatively determined in 68 representative or economically important Citrus species, cultivars, and near-Citrus relatives. Contents of 23 flavonoids including 6 polymethoxylated flavones were analyzed by means of reversed phase HPLC analysis. Principal component analysis revealed that the 7 associations according to Tanaka's classification were observed, but some do overlap each other. Group VII species could be divided into two different subgroups, namely, the first-10-species class and the last-19-species class according to Tanaka's classification numbers.

  13. Analysis and comparison of sleeping posture classification methods using pressure sensitive bed system.

    PubMed

    Hsia, C C; Liou, K J; Aung, A P W; Foo, V; Huang, W; Biswas, J

    2009-01-01

    Pressure ulcers are common problems for bedridden patients. Caregivers need to reposition a patient's sleeping posture every two hours in order to reduce the risk of ulcers. This study presents the use of kurtosis and skewness estimation, principal component analysis (PCA) and support vector machines (SVMs) for sleeping posture classification using a cost-effective pressure-sensitive mattress that can help caregivers make correct sleeping posture changes for the prevention of pressure ulcers.
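
    As a rough illustration of the kurtosis and skewness estimation mentioned above, the sketch below computes both statistics for a list of sensor readings. It assumes the simple population (biased) moment estimators; the abstract does not say which estimator the authors used, and the readings are made up.

```python
def skewness_kurtosis(values):
    """Population skewness and (non-excess) kurtosis from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness and kurtosis are undefined for constant data")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical pressure readings from one row of mattress sensors.
symmetric_row = [1, 2, 3, 4, 5]
one_sided_row = [0, 0, 0, 0, 10]

print(skewness_kurtosis(symmetric_row))  # skewness 0 for a symmetric profile
print(skewness_kurtosis(one_sided_row))  # positive skewness: load on one side
```

    Two such numbers per sensor row or column give a compact shape descriptor that can then be passed to PCA and a classifier.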

  14. Study on nondestructive discrimination of genuine and counterfeit wild ginsengs using NIRS

    NASA Astrophysics Data System (ADS)

    Lu, Q.; Fan, Y.; Peng, Z.; Ding, H.; Gao, H.

    2012-07-01

    A new approach for the nondestructive discrimination between genuine wild ginsengs and counterfeit ones by near infrared spectroscopy (NIRS) was developed. Both discriminant analysis and a back propagation artificial neural network (BP-ANN) were applied to establish the discrimination models. Optimal modeling wavelengths were determined based on the anomalous spectral information of counterfeit samples. Through principal component analysis (PCA) of various wild ginseng samples, genuine and counterfeit, the cumulative percentages of variance of the principal components were obtained, serving as a reference for principal component (PC) factor determination. Discriminant analysis achieved an identification ratio of 88.46%. With the samples' truth values as its outputs, a three-layer BP-ANN model was built, which yielded a higher discrimination accuracy of 100%. The overall results demonstrate that NIRS combined with the BP-ANN classification algorithm performs better on ginseng discrimination than discriminant analysis, and can be used as a rapid and nondestructive method for the detection of counterfeit wild ginsengs in the food and pharmaceutical industries.
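
    The use of cumulative percentages of variance as a guide for choosing the number of PCs can be sketched as below. The eigenvalues and the 85% threshold are hypothetical; the abstract reports neither.

```python
def cumulative_variance_percent(eigenvalues):
    """Cumulative share of variance (in percent) for components sorted by size."""
    total = sum(eigenvalues)
    running, result = 0.0, []
    for value in sorted(eigenvalues, reverse=True):
        running += value
        result.append(100.0 * running / total)
    return result


def components_needed(eigenvalues, threshold_percent):
    """Smallest number of leading components reaching the variance threshold."""
    cumulative = cumulative_variance_percent(eigenvalues)
    for k, percent in enumerate(cumulative, start=1):
        if percent >= threshold_percent:
            return k
    return len(eigenvalues)


# Hypothetical eigenvalues of a spectral covariance matrix.
eigenvalues = [5.0, 3.0, 1.0, 0.5, 0.5]

print(cumulative_variance_percent(eigenvalues))
print(components_needed(eigenvalues, 85.0))
```

    The selected number of components then sets the input dimension for the downstream model, for example the input layer of a neural network.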

  15. FPGA Implementation of Generalized Hebbian Algorithm for Texture Classification

    PubMed Central

    Lin, Shiow-Jyu; Hwang, Wen-Jyi; Lee, Wei-Hao

    2012-01-01

    This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit for reducing the area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented by Field Programmable Gate Array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design for attaining both high speed performance and low area costs. PMID:22778640
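
    For readers unfamiliar with the Generalized Hebbian Algorithm, a software sketch of its weight update (Sanger's rule) is given below. It is not the hardware architecture described in the paper: the learning rate, epoch count, initialization, and synthetic data are all assumptions chosen for illustration.

```python
import math
import random


def gha_train(samples, n_components, lr=0.01, epochs=30, seed=0):
    """Estimate principal directions of zero-mean data with Sanger's rule."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Small random initial synaptic weights, one row per component.
    w = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_components)]
    for _ in range(epochs):
        for x in samples:
            # Outputs y_i = w_i . x, computed before any weight changes.
            y = [sum(wi[k] * x[k] for k in range(dim)) for wi in w]
            # The residual starts as x and loses the part explained by rows 0..i.
            residual = list(x)
            for i in range(n_components):
                for k in range(dim):
                    residual[k] -= y[i] * w[i][k]
                for k in range(dim):
                    w[i][k] += lr * y[i] * residual[k]
    return w


# Synthetic 2-D data whose dominant direction is (1, 1) / sqrt(2).
rng = random.Random(1)
s = 1 / math.sqrt(2)
samples = []
for _ in range(200):
    a, b = rng.gauss(0, 2.0), rng.gauss(0, 0.5)
    samples.append([s * (a + b), s * (a - b)])

weights = gha_train(samples, n_components=2)
first = weights[0]
cosine = abs(first[0] * s + first[1] * s) / math.hypot(first[0], first[1])
print(cosine)  # alignment of the first weight vector with the dominant direction
```

    Each row uses the same multiply-accumulate pattern on a shared residual, which is the kind of regularity that makes circuit sharing between weight vectors attractive.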

  16. The Complexity of Human Walking: A Knee Osteoarthritis Study

    PubMed Central

    Kotti, Margarita; Duffell, Lynsey D.; Faisal, Aldo A.; McGregor, Alison H.

    2014-01-01

    This study proposes a framework for deconstructing complex walking patterns to create a simple principal component space before checking whether the projection to this space is suitable for identifying changes from normality. We focus on knee osteoarthritis, the most common knee joint disease and the second leading cause of disability. Knee osteoarthritis affects over 250 million people worldwide. The motivation for projecting the high-dimensional movements to a lower dimensional and simpler space is our belief that motor behaviour can be understood by identifying a simplicity via projection to a low principal component space, which may reflect upon the underlying mechanism. To study this, we recruited 180 subjects, 47 of whom reported that they had knee osteoarthritis. They were asked to walk several times along a walkway equipped with two force plates that capture their ground reaction forces along 3 axes, namely vertical, anterior-posterior, and medio-lateral, at 1000 Hz. Data from trials in which the subject did not clearly strike the force plate were excluded, leaving 1–3 gait cycles per subject. To examine the complexity of human walking, we applied dimensionality reduction via Probabilistic Principal Component Analysis. The first principal component explains 34% of the variance in the data, whereas over 80% of the variance is explained by 8 principal components or more. This proves the complexity of the underlying structure of the ground reaction forces. To examine if our musculoskeletal system generates movements that are distinguishable between normal and pathological subjects in a low dimensional principal component space, we applied a Bayes classifier. For the tested cross-validated, subject-independent experimental protocol, the classification accuracy equals 82.62%. Also, a novel complexity measure is proposed, which can be used as an objective index to facilitate clinical decision making. This measure proves that knee osteoarthritis subjects exhibit more variability in the two-dimensional principal component space. PMID:25232949

  17. Alternative ways of representing Zapotec and Cuicatec folk classification of birds: a multidimensional model and its implications for culturally-informed conservation in Oaxaca, México.

    PubMed

    Alcántara-Salinas, Graciela; Ellen, Roy F; Valiñas-Coalla, Leopoldo; Caballero, Javier; Argueta-Villamar, Arturo

    2013-12-09

    We report on a comparative ethno-ornithological study of Zapotec and Cuicatec communities in Northern Oaxaca, Mexico that provided a challenge to some existing descriptions of folk classification. Our default model was the taxonomic system of ranks developed by Brent Berlin. Fieldwork was conducted in the Zapotec village of San Miguel Tiltepec and in the Cuicatec village of San Juan Teponaxtla, using a combination of ethnographic interviews and pile-sorting tests. Post-fieldwork, Principal Component Analysis using NTSYSpc V. 2.11f was applied to obtain pattern variation for the answers from different participants. Using language and pile-sorting data analysed through Principal Component Analysis, we show how both Zapotec and Cuicatec subjects place a particular emphasis on an intermediate level of classification. These categories group birds with non-birds using ecological and behavioral criteria, and violate a strict distinction between symbolic and mundane (or ‘natural’), and between ‘general-purpose’ and ‘single-purpose’ schemes. We suggest that shared classificatory knowledge embodying everyday schemes for apprehending the world of birds might be better reflected in a multidimensional model that would also provide a more realistic basis for developing culturally-informed conservation strategies.

  18. Alternative ways of representing Zapotec and Cuicatec folk classification of birds: a multidimensional model and its implications for culturally-informed conservation in Oaxaca, México

    PubMed Central

    2013-01-01

    Background: We report on a comparative ethno-ornithological study of Zapotec and Cuicatec communities in Northern Oaxaca, Mexico that provided a challenge to some existing descriptions of folk classification. Our default model was the taxonomic system of ranks developed by Brent Berlin. Methods: Fieldwork was conducted in the Zapotec village of San Miguel Tiltepec and in the Cuicatec village of San Juan Teponaxtla, using a combination of ethnographic interviews and pile-sorting tests. Post-fieldwork, Principal Component Analysis using NTSYSpc V. 2.11f was applied to obtain pattern variation for the answers from different participants. Results and conclusion: Using language and pile-sorting data analysed through Principal Component Analysis, we show how both Zapotec and Cuicatec subjects place a particular emphasis on an intermediate level of classification. These categories group birds with non-birds using ecological and behavioral criteria, and violate a strict distinction between symbolic and mundane (or ‘natural’), and between ‘general-purpose’ and ‘single-purpose’ schemes. We suggest that shared classificatory knowledge embodying everyday schemes for apprehending the world of birds might be better reflected in a multidimensional model that would also provide a more realistic basis for developing culturally-informed conservation strategies. PMID:24321280

  19. A dimension reduction strategy for improving the efficiency of computer-aided detection for CT colonography

    NASA Astrophysics Data System (ADS)

    Song, Bowen; Zhang, Guopeng; Wang, Huafeng; Zhu, Wei; Liang, Zhengrong

    2013-02-01

    Various types of features, e.g., geometric features, texture features, projection features etc., have been introduced for polyp detection and differentiation tasks via computer aided detection and diagnosis (CAD) for computed tomography colonography (CTC). Although these features together cover more information of the data, some of them are statistically highly related to others, which makes the feature set redundant and burdens the computation task of CAD. In this paper, we propose a new dimension reduction method which combines hierarchical clustering and principal component analysis (PCA) for the false positive (FP) reduction task. First, we group all the features based on their similarity using hierarchical clustering, and then PCA is employed within each group. Different numbers of principal components are selected from each group to form the final feature set. A support vector machine is used to perform the classification. The results show that when three principal components are chosen from each group, we can achieve an area under the receiver operating characteristic curve of 0.905, which is as high as that of the original dataset. Meanwhile, the computation time is reduced by 70% and the feature set size is reduced by 77%. It can be concluded that the proposed method captures the most important information of the feature set and that the classification accuracy is not affected by the dimension reduction. The result is promising, and further investigation, such as automatic threshold setting, is worthwhile and in progress.

  20. Lippia origanoides chemotype differentiation based on essential oil GC-MS and principal component analysis.

    PubMed

    Stashenko, Elena E; Martínez, Jairo R; Ruíz, Carlos A; Arias, Ginna; Durán, Camilo; Salgar, William; Cala, Mónica

    2010-01-01

    Chromatographic (GC/flame ionization detection, GC/MS) and statistical analyses were applied to the study of essential oils and extracts obtained from flowers, leaves, and stems of Lippia origanoides plants, growing wild in different Colombian regions. Retention indices, mass spectra, and standard substances were used in the identification of 139 substances detected in these essential oils and extracts. Principal component analysis allowed L. origanoides classification into three chemotypes, characterized according to their essential oil major components. Alpha- and beta-phellandrenes, p-cymene, and limonene distinguished chemotype A; carvacrol and thymol were the distinctive major components of chemotypes B and C, respectively. Pinocembrin (5,7-dihydroxyflavanone) was found in L. origanoides chemotype A supercritical fluid (CO2) extract at a concentration of 0.83 ± 0.03 mg/g of dry plant material, which makes this plant an interesting source of an important bioactive flavanone with diverse potential applications in cosmetic, food, and pharmaceutical products.

  1. An Extended Spectral-Spatial Classification Approach for Hyperspectral Data

    NASA Astrophysics Data System (ADS)

    Akbari, D.

    2017-11-01

    In this paper, an extended classification approach for hyperspectral imagery based on both spectral and spatial information is proposed. The spatial information is obtained by an enhanced marker-based minimum spanning forest (MSF) algorithm. Three different methods of dimension reduction are first used to obtain the subspace of hyperspectral data: (1) unsupervised feature extraction methods including principal component analysis (PCA), independent component analysis (ICA), and minimum noise fraction (MNF); (2) supervised feature extraction including decision boundary feature extraction (DBFE), discriminant analysis feature extraction (DAFE), and nonparametric weighted feature extraction (NWFE); (3) genetic algorithm (GA). The spectral features obtained are then fed into the enhanced marker-based MSF classification algorithm. In the enhanced MSF algorithm, the markers are extracted from the classification maps obtained by both SVM and the watershed segmentation algorithm. To evaluate the proposed approach, the Pavia University hyperspectral data set is tested. Experimental results show that the proposed approach using GA achieves an overall accuracy approximately 8% higher than the original MSF-based algorithm.

  2. Application of Hyperspectral Imaging and Chemometric Calibrations for Variety Discrimination of Maize Seeds

    PubMed Central

    Zhang, Xiaolei; Liu, Fei; He, Yong; Li, Xiaoli

    2012-01-01

    Hyperspectral imaging in the visible and near infrared (VIS-NIR) region was used to develop a novel method for discriminating different varieties of commodity maize seeds. Firstly, hyperspectral images of 330 samples of six varieties of maize seeds were acquired using a hyperspectral imaging system in the 380–1,030 nm wavelength range. Secondly, principal component analysis (PCA) and kernel principal component analysis (KPCA) were used to explore the internal structure of the spectral data. Thirdly, three optimal wavelengths (523, 579 and 863 nm) were selected by implementing PCA directly on each image. Then four textural variables including contrast, homogeneity, energy and correlation were extracted from gray level co-occurrence matrix (GLCM) of each monochromatic image based on the optimal wavelengths. Finally, several models for maize seeds identification were established by least squares-support vector machine (LS-SVM) and back propagation neural network (BPNN) using four different combinations of principal components (PCs), kernel principal components (KPCs) and textural features as input variables, respectively. The recognition accuracy achieved in the PCA-GLCM-LS-SVM model (98.89%) was the most satisfactory one. We conclude that hyperspectral imaging combined with texture analysis can be implemented for fast classification of different varieties of maize seeds. PMID:23235456
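
    The four textural variables named above (contrast, homogeneity, energy and correlation) can be computed from a GLCM roughly as follows. This is a sketch under assumptions: a symmetric GLCM for a one-pixel horizontal offset, energy taken as the sum of squared probabilities (some libraries report its square root), and a tiny made-up image quantized to four gray levels.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, homogeneity, energy and correlation of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum(v * (i - mu_i) ** 2 for i, _, v in cells)
    var_j = sum(v * (j - mu_j) ** 2 for _, j, v in cells)
    cov = sum(v * (i - mu_i) * (j - mu_j) for i, j, v in cells)
    denom = math.sqrt(var_i * var_j)
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        # A constant image has zero variance; treat it as perfectly correlated.
        "correlation": cov / denom if denom > 0 else 1.0,
    }


# Hypothetical 4x4 monochromatic image with gray levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = texture_features(glcm(image, levels=4))
print(features)
```

    In a pipeline like the one described, these four values would be computed per selected wavelength image and concatenated with the PC scores as classifier inputs.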

  3. Fusion of Modis and Palsar Principal Component Images Through Curvelet Transform for Land Cover Classification

    NASA Astrophysics Data System (ADS)

    Singh, Dharmendra; Kumar, Harish

    Earth observation satellites provide data that cover different portions of the electromagnetic spectrum at different spatial and spectral resolutions. The increasing availability of information products generated from satellite images is extending the ability to understand the patterns and dynamics of earth resource systems at all scales of inquiry. One of the most important applications is the generation of land cover classifications from satellite images for understanding the actual status of various land cover classes. The prospect for the use of satellite images in land cover classification is an extremely promising one. The quality of satellite images available for land-use mapping is improving rapidly with the development of advanced sensor technology. Particularly noteworthy in this regard is the improved spatial and spectral resolution of the images captured by new satellite sensors like MODIS, ASTER, Landsat 7, and SPOT 5. For the full exploitation of increasingly sophisticated multisource data, fusion techniques are being developed. Fused images may enhance the interpretation capabilities. The images used for fusion have different temporal and spatial resolutions. Therefore, the fused image provides a more complete view of the observed objects. One of the main aims of image fusion is to integrate different data in order to obtain more information than can be derived from each single-sensor dataset alone. A good example of this is the fusion of images acquired by different sensors having different spatial and spectral resolutions. Researchers have been applying fusion techniques for three decades and have proposed various useful methods and techniques. High-quality synthesis of spectral information is important and well suited for land cover classification. More recently, an underlying multiresolution analysis employing the discrete wavelet transform has been used in image fusion. It was found that multisensor image fusion is a tradeoff between the spectral information from low-resolution multispectral images and the spatial information from high-resolution multispectral images. With the wavelet-transform-based fusion method, it is easy to control this tradeoff. A new transform, the curvelet transform, was used in recent years by Starck. A ridgelet transform is applied to square blocks of detail frames of an undecimated wavelet decomposition; consequently, the curvelet transform is obtained. Since the ridgelet transform possesses basis functions matching directional straight lines, the curvelet transform is capable of representing piecewise linear contours on multiple scales through few significant coefficients. This property leads to a better separation between geometric details and background noise, which may be easily reduced by thresholding curvelet coefficients before they are used for fusion. The Terra and Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) instrument provides high radiometric sensitivity (12 bit) in 36 spectral bands ranging in wavelength from 0.4 µm to 14.4 µm, and its data are freely available. Two bands are imaged at a nominal resolution of 250 m at nadir, five bands at 500 m, and the remaining 29 bands at 1 km. In this paper, band 1 (spatial resolution 250 m, bandwidth 620-670 nm) and band 2 (spatial resolution 250 m, bandwidth 842-876 nm) are considered, as these bands have special features for identifying agriculture and other land covers. In January 2006, the Advanced Land Observing Satellite (ALOS) was successfully launched by the Japan Aerospace Exploration Agency (JAXA). The Phased Array type L-band SAR (PALSAR) sensor onboard the satellite acquires SAR imagery at a wavelength of 23.5 cm (frequency 1.27 GHz) with capabilities of multimode and multipolarization observation. PALSAR can operate in several modes: the fine-beam single (FBS) polarization mode (HH), fine-beam dual (FBD) polarization mode (HH/HV or VV/VH), polarimetric (PLR) mode (HH/HV/VH/VV), and ScanSAR (WB) mode (HH/VV) [15]. These make PALSAR imagery very attractive for spatially and temporally consistent monitoring systems. In overview, Principal Component Analysis compresses most of the information within all the bands into a much smaller number of bands with little loss of information. It allows us to extract the low-dimensional subspaces that capture the main linear correlation among the high-dimensional image data. This facilitates viewing the explained variance or signal in the available imagery, allowing both gross and more subtle features in the imagery to be seen. In this paper we have explored the fusion technique for enhancing the land cover classification of low-resolution satellite data, especially freely available satellite data. For this purpose, we have considered fusing the PALSAR principal component data with MODIS principal component data. Initially, MODIS band 1 and band 2 are considered and their principal component is computed. Similarly, the PALSAR HH, HV and VV polarized data are considered and their principal component is also computed. Consequently, the PALSAR principal component image is fused with the MODIS principal component image. The aim of this paper is to analyze the effect of fusing PALSAR data with MODIS data on the classification accuracy for major land cover types like agriculture, water and urban bodies. Curvelet transformation has been applied for fusion of these two satellite images, and the Minimum Distance classification technique has been applied to the resultant fused image. It is qualitatively and visually observed that the overall classification accuracy of the MODIS image after fusion is enhanced. This type of fusion technique may be quite helpful in the near future for using freely available satellite data to develop monitoring systems for different land cover classes on the earth.
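
    The Minimum Distance classification step can be illustrated with the short sketch below. It assumes the common minimum-distance-to-means variant with Euclidean distance; the two-feature pixel values and training samples are hypothetical and do not come from the MODIS or PALSAR data.

```python
import math


def class_means(training):
    """Mean feature vector of the training pixels of each class."""
    means = {}
    for label, pixels in training.items():
        dim = len(pixels[0])
        means[label] = [sum(p[k] for p in pixels) / len(pixels) for k in range(dim)]
    return means


def classify(pixel, means):
    """Assign the pixel to the class whose mean is nearest (Euclidean)."""
    return min(means, key=lambda label: math.dist(pixel, means[label]))


# Hypothetical training pixels: two features per pixel of a fused image.
training = {
    "water": [[0.1, 0.1], [0.2, 0.1]],
    "agriculture": [[0.5, 0.8], [0.6, 0.9]],
    "urban": [[0.9, 0.3], [0.8, 0.4]],
}
means = class_means(training)

for pixel in ([0.2, 0.2], [0.5, 0.7], [0.7, 0.4]):
    print(pixel, classify(pixel, means))
```

    Because the rule uses only class means, its result depends directly on how well the fused features separate those means, which is where a fusion step can help.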

  4. Object-based land cover classification based on fusion of multifrequency SAR data and THAICHOTE optical imagery

    NASA Astrophysics Data System (ADS)

    Sukawattanavijit, Chanika; Srestasathiern, Panu

    2017-10-01

    Land Use and Land Cover (LULC) information is significant for observing and evaluating environmental change. LULC classification using remotely sensed data is a technique popularly employed at global and local scales, particularly in urban areas, which have diverse land cover types. These are essential components of the urban terrain and ecosystem. At present, object-based image analysis (OBIA) is becoming widely popular for land cover classification using high-resolution images. COSMO-SkyMed SAR data were fused with THAICHOTE (namely, THEOS: Thailand Earth Observation Satellite) optical data for land cover classification using an object-based approach. This paper presents a comparison between object-based and pixel-based approaches in image fusion. For the per-pixel method, support vector machines (SVM) were applied to the fused image based on Principal Component Analysis (PCA). For the object-based classification, a nearest neighbor (NN) classifier was applied to the fused images to separate land cover classes. Finally, the accuracy assessment compared the land cover classification generated from the fused image dataset with that from the THAICHOTE image. The object-based fusion of COSMO-SkyMed with THAICHOTE images demonstrated the best classification accuracies, well over 85%. As a result, object-based data fusion provides higher land cover classification accuracy than per-pixel data fusion.

  5. Aural Classification and Temporal Robustness

    DTIC Science & Technology

    2010-11-01

    Canada – Atlantique; November 2010. Background: This project aims to develop a robust classifier that uses... 4.2.2.2 Discriminant score; 4.2.3 Principal component analysis; ...allows class separation. Figure 7: Hypothetical clutter and target pdfs and posterior probabilities shown as surfaces

  6. Data analysis techniques

    NASA Technical Reports Server (NTRS)

    Park, Steve

    1990-01-01

    A large and diverse number of computational techniques are routinely used to process and analyze remotely sensed data. These techniques include: univariate statistics; multivariate statistics; principal component analysis; pattern recognition and classification; other multivariate techniques; geometric correction; registration and resampling; radiometric correction; enhancement; restoration; Fourier analysis; and filtering. Each of these techniques will be considered, in order.

  7. Spatially resolved bimodal spectroscopy for classification/evaluation of mouse skin inflammatory and pre-cancerous stages

    NASA Astrophysics Data System (ADS)

    Díaz-Ayil, Gilberto; Amouroux, Marine; Clanché, Fabien; Granjon, Yves; Blondel, Walter C. P. M.

    2009-07-01

    Spatially-resolved bimodal spectroscopy (multiple AutoFluorescence AF excitation and Diffuse Reflectance DR) was used in vivo to discriminate various healthy and precancerous skin stages in a pre-clinical model (UV-irradiated mouse): Compensatory Hyperplasia CH, Atypical Hyperplasia AH and Dysplasia D. A specific data preprocessing scheme was applied to intensity spectra (filtering, spectral correction and intensity normalization), and several sets of spectral characteristics were automatically extracted and selected based on their discrimination power, statistically tested for every pair-wise comparison of histological classes. Data reduction with Principal Components Analysis (PCA) was performed and 3 classification methods were implemented (k-NN, LDA and SVM) in order to compare the diagnostic performance of each method. Diagnostic performance was assessed in terms of Sensitivity (Se) and Specificity (Sp) as a function of the selected features, of the combinations of 3 different inter-fibre distances and of the number of principal components, such that: Se and Sp ~ 100% when discriminating CH vs. others; Sp ~ 100% and Se > 95% when discriminating Healthy vs. AH or D; Sp ~ 74% and Se ~ 63% for AH vs. D.

  8. Feature Extraction and Selection Strategies for Automated Target Recognition

    NASA Technical Reports Server (NTRS)

    Greene, W. Nicholas; Zhang, Yuhan; Lu, Thomas T.; Chao, Tien-Hsin

    2010-01-01

    Several feature extraction and selection methods for an existing automatic target recognition (ATR) system using JPL's Grayscale Optical Correlator (GOC) and Optimal Trade-Off Maximum Average Correlation Height (OT-MACH) filter were tested using MATLAB. The ATR system is composed of three stages: a cursory region-of-interest (ROI) search using the GOC and OT-MACH filter, a feature extraction and selection stage, and a final classification stage. Feature extraction and selection concerns transforming potential target data into more useful forms as well as selecting important subsets of that data which may aid in detection and classification. The strategies tested were built around two popular extraction methods: Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Performance was measured based on the classification accuracy and free-response receiver operating characteristic (FROC) output of a support vector machine (SVM) and a neural net (NN) classifier.

  9. Feature extraction and selection strategies for automated target recognition

    NASA Astrophysics Data System (ADS)

    Greene, W. Nicholas; Zhang, Yuhan; Lu, Thomas T.; Chao, Tien-Hsin

    2010-04-01

    Several feature extraction and selection methods for an existing automatic target recognition (ATR) system using JPL's Grayscale Optical Correlator (GOC) and Optimal Trade-Off Maximum Average Correlation Height (OT-MACH) filter were tested using MATLAB. The ATR system is composed of three stages: a cursory region-of-interest (ROI) search using the GOC and OT-MACH filter, a feature extraction and selection stage, and a final classification stage. Feature extraction and selection concerns transforming potential target data into more useful forms as well as selecting important subsets of that data which may aid in detection and classification. The strategies tested were built around two popular extraction methods: Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Performance was measured based on the classification accuracy and free-response receiver operating characteristic (FROC) output of a support vector machine (SVM) and a neural net (NN) classifier.

  10. Microglia Morphological Categorization in a Rat Model of Neuroinflammation by Hierarchical Cluster and Principal Components Analysis.

    PubMed

    Fernández-Arjona, María Del Mar; Grondona, Jesús M; Granados-Durán, Pablo; Fernández-Llebrez, Pedro; López-Ávalos, María D

    2017-01-01

    It is known that microglia morphology and function are closely related, but only a few studies have objectively described different morphological subtypes. To address this issue, morphological parameters of microglial cells were analyzed in a rat model of aseptic neuroinflammation. After the injection of a single dose of the enzyme neuraminidase (NA) within the lateral ventricle (LV), an acute inflammatory process occurs. Sections from NA-injected animals and sham controls were immunolabeled with the microglial marker IBA1, which highlights ramifications and features of the cell shape. Using images obtained by section scanning, individual microglial cells were sampled from various regions (septofimbrial nucleus, hippocampus and hypothalamus) at different times post-injection (2, 4 and 12 h). Each cell yielded a set of 15 morphological parameters by means of image analysis software. Five initial parameters (including fractal measures) were statistically different in cells from NA-injected rats (most of them IL-1β positive, i.e., M1-state) compared to those from control animals (none of them IL-1β positive, i.e., surveillant state). However, additional multimodal parameters proved more suitable for hierarchical cluster analysis (HCA). This method classified the microglia population into four clusters. Furthermore, a linear discriminant analysis (LDA) suggested three specific parameters to objectively classify any microglia by a decision tree. In addition, a principal components analysis (PCA) revealed two extra valuable variables that allowed further classification of microglia into a total of eight sub-clusters or types. The spatio-temporal distribution of these different morphotypes in our rat inflammation model allowed specific morphotypes to be related to microglial activation status and brain location. An objective method for microglia classification based on morphological parameters is proposed.
Main points Microglia undergo a quantifiable morphological change upon neuraminidase induced inflammation.Hierarchical cluster and principal components analysis allow morphological classification of microglia.Brain location of microglia is a relevant factor.

  11. Microglia Morphological Categorization in a Rat Model of Neuroinflammation by Hierarchical Cluster and Principal Components Analysis

    PubMed Central

    Fernández-Arjona, María del Mar; Grondona, Jesús M.; Granados-Durán, Pablo; Fernández-Llebrez, Pedro; López-Ávalos, María D.

    2017-01-01

    It is known that microglia morphology and function are closely related, but only a few studies have objectively described different morphological subtypes. To address this issue, morphological parameters of microglial cells were analyzed in a rat model of aseptic neuroinflammation. After the injection of a single dose of the enzyme neuraminidase (NA) within the lateral ventricle (LV) an acute inflammatory process occurs. Sections from NA-injected animals and sham controls were immunolabeled with the microglial marker IBA1, which highlights ramifications and features of the cell shape. Using images obtained by section scanning, individual microglial cells were sampled from various regions (septofimbrial nucleus, hippocampus and hypothalamus) at different times post-injection (2, 4 and 12 h). Each cell yielded a set of 15 morphological parameters by means of image analysis software. Five initial parameters (including fractal measures) were statistically different in cells from NA-injected rats (most of them IL-1β positive, i.e., M1-state) compared to those from control animals (none of them IL-1β positive, i.e., surveillant state). However, additional multimodal parameters proved more suitable for hierarchical cluster analysis (HCA). This method classified the microglia population into four clusters. Furthermore, a linear discriminant analysis (LDA) suggested three specific parameters to objectively classify any microglia by a decision tree. In addition, a principal components analysis (PCA) revealed two extra valuable variables that allowed microglia to be further classified into a total of eight sub-clusters or types. The spatio-temporal distribution of these different morphotypes in our rat inflammation model made it possible to relate specific morphotypes to microglial activation status and brain location. An objective method for microglia classification based on morphological parameters is proposed.
    Main points: Microglia undergo a quantifiable morphological change upon neuraminidase-induced inflammation. Hierarchical cluster and principal components analysis allow morphological classification of microglia. Brain location of microglia is a relevant factor. PMID:28848398

  12. Demonstrated Potential of Ion Mobility Spectrometry for Detection of Adulterated Perfumes and Plant Speciation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Clark, Jared Matthew; Daum, Keith Alvin; Kalival, J. H.

    2003-01-01

    This initial study evaluates the use of ion mobility spectrometry (IMS) as a rapid test procedure for potential detection of adulterated perfumes and speciation of plant life. Sample types measured consist of five genuine perfumes, two species of sagebrush, and four species of flowers. Each sample type is treated as a separate classification problem. It is shown that discrimination using principal component analysis with K-nearest neighbors can distinguish one class from another. Discriminatory models generated using principal component regressions are not as effective. Results from this examination are encouraging and represent an initial phase demonstrating that perfumes and plants possess characteristic chemical signatures that can be used for reliable identification.
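
    The pipeline named in this abstract, principal component scores followed by K-nearest neighbors, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the three-feature toy "spectra", the class labels, and the choice of a single component with k = 3 are all assumptions made for the example.

```python
import math

def leading_component(rows, iterations=200):
    """Return (mean, unit vector) of the first principal component via power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def score(row, mean, v):
    """Project one sample onto the leading component."""
    return sum((row[j] - mean[j]) * v[j] for j in range(len(row)))

def knn_predict(train_scores, train_labels, query_score, k=3):
    """Majority vote among the k training scores closest to the query score."""
    nearest = sorted(zip(train_scores, train_labels),
                     key=lambda p: abs(p[0] - query_score))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Hypothetical toy data: three features per sample, two classes.
train_rows = [[1.0, 2.0, 0.5], [1.2, 2.1, 0.4], [0.9, 1.8, 0.6],
              [4.0, 6.0, 0.5], [4.2, 6.3, 0.6], [3.9, 5.8, 0.4]]
train_labels = ["class_A"] * 3 + ["class_B"] * 3

mean, v = leading_component(train_rows)
train_scores = [score(r, mean, v) for r in train_rows]
print(knn_predict(train_scores, train_labels, score([1.1, 2.0, 0.5], mean, v)))
print(knn_predict(train_scores, train_labels, score([4.1, 6.1, 0.5], mean, v)))
```

    Only the first component is retained here to keep the sketch short; a real analysis would typically keep several components and choose k by validation.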

  13. Optical perception for detection of cutaneous T-cell lymphoma by multi-spectral imaging

    NASA Astrophysics Data System (ADS)

    Hsiao, Yu-Ping; Wang, Hsiang-Chen; Chen, Shih-Hua; Tsai, Chung-Hung; Yang, Jen-Hung

    2014-12-01

    In this study, the spectrum of each picture element of the patient’s skin image was obtained by multi-spectral imaging technology. Spectra of normal or pathological skin were collected from 15 patients. Principal component analysis and principal component scores of skin spectra were employed to distinguish the spectral characteristics of different diseases. Finally, skin regions with suspected cutaneous T-cell lymphoma (CTCL) lesions were successfully predicted by evaluation and classification of the spectra of pathological skin. The sensitivity and specificity of this technique were 89.65% and 95.18% after the analysis of about 109 patients. The probabilities of atopic dermatitis and psoriasis patients being misinterpreted as CTCL were 5.56% and 4.54%, respectively.
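
    Sensitivity and specificity, as reported in this abstract, are ratios read off a confusion matrix. A minimal sketch follows; the labels and counts are hypothetical and are not the study's data, and the function name is an assumption for illustration.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """Return (sensitivity, specificity), treating `positive` as the disease class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    # Sensitivity: fraction of true positives found; specificity: fraction of negatives cleared.
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical ground truth and predictions (not taken from the study).
y_true = ["CTCL"] * 10 + ["other"] * 20
y_pred = ["CTCL"] * 9 + ["other"] * 1 + ["other"] * 19 + ["CTCL"] * 1
sens, spec = sensitivity_specificity(y_true, y_pred, positive="CTCL")
print(sens, spec)
```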

  14. An initial analysis of LANDSAT 4 Thematic Mapper data for the classification of agricultural, forested wetland, and urban land covers

    NASA Technical Reports Server (NTRS)

    Quattrochi, D. A.; Anderson, J. E.; Brannon, D. P.; Hill, C. L.

    1982-01-01

    An initial analysis of LANDSAT 4 thematic mapper (TM) data for the delineation and classification of agricultural, forested wetland, and urban land covers was conducted. A study area in Poinsett County, Arkansas was used to evaluate a classification of agricultural lands derived from multitemporal LANDSAT multispectral scanner (MSS) data in comparison with a classification of TM data for the same area. Data over Reelfoot Lake in northwestern Tennessee were utilized to evaluate the TM for delineating forested wetland species. A classification of the study area was assessed for accuracy in discriminating five forested wetland categories. Finally, the TM data were used to identify urban features within a small city. A computer generated classification of Union City, Tennessee was analyzed for accuracy in delineating urban land covers. An evaluation of digitally enhanced TM data using principal components analysis to facilitate photointerpretation of urban features was also performed.

  15. Quantitation of flavonoid constituents in citrus fruits.

    PubMed

    Kawaii, S; Tomono, Y; Katase, E; Ogawa, K; Yano, M

    1999-09-01

    Twenty-four flavonoids have been determined in 66 Citrus species and near-citrus relatives, grown in the same field and year, by means of reversed phase high-performance liquid chromatography analysis. Statistical methods have been applied to find relations among the species. The F ratios of 21 flavonoids obtained by applying ANOVA analysis are significant, indicating that a classification of the species using these variables is reasonable to pursue. Principal component analysis revealed that the distributions of Citrus species belonging to different classes were largely in accordance with Tanaka's classification system.

  16. Kernel PLS-SVC for Linear and Nonlinear Discrimination

    NASA Technical Reports Server (NTRS)

    Rosipal, Roman; Trejo, Leonard J.; Matthews, Bryan

    2003-01-01

    A new methodology for discrimination is proposed. It is based on kernel orthonormalized partial least squares (PLS) dimensionality reduction of the original data space followed by support vector machines for classification. The close connection of orthonormalized PLS to Fisher's approach to linear discrimination, or equivalently to canonical correlation analysis, is described. This motivates the preference for orthonormalized PLS over principal component analysis. Good behavior of the proposed method is demonstrated on 13 different benchmark data sets and on the real-world problem of classifying finger movement periods versus non-movement periods based on electroencephalograms.

  17. Pattern classification of fMRI data: applications for analysis of spatially distributed cortical networks.

    PubMed

    Yourganov, Grigori; Schmah, Tanya; Churchill, Nathan W; Berman, Marc G; Grady, Cheryl L; Strother, Stephen C

    2014-08-01

    The field of fMRI data analysis is rapidly growing in sophistication, particularly in the domain of multivariate pattern classification. However, the interaction between the properties of the analytical model and the parameters of the BOLD signal (e.g. signal magnitude, temporal variance and functional connectivity) is still an open problem. We addressed this problem by evaluating a set of pattern classification algorithms on simulated and experimental block-design fMRI data. The set of classifiers consisted of linear and quadratic discriminants, linear support vector machine, and linear and nonlinear Gaussian naive Bayes classifiers. For linear discriminant, we used two methods of regularization: principal component analysis, and ridge regularization. The classifiers were used (1) to classify the volumes according to the behavioral task that was performed by the subject, and (2) to construct spatial maps that indicated the relative contribution of each voxel to classification. Our evaluation metrics were: (1) accuracy of out-of-sample classification and (2) reproducibility of spatial maps. In simulated data sets, we performed an additional evaluation of spatial maps with ROC analysis. We varied the magnitude, temporal variance and connectivity of simulated fMRI signal and identified the optimal classifier for each simulated environment. Overall, the best performers were linear and quadratic discriminants (operating on principal components of the data matrix) and, in some rare situations, a nonlinear Gaussian naïve Bayes classifier. The results from the simulated data were supported by within-subject analysis of experimental fMRI data, collected in a study of aging. This is the first study that systematically characterizes interactions between analysis model and signal parameters (such as magnitude, variance and correlation) on the performance of pattern classifiers for fMRI. Copyright © 2014 Elsevier Inc. All rights reserved.

  18. Facial Expression Recognition using Multiclass Ensemble Least-Square Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Lawi, Armin; Sya'Rani Machrizzandi, M.

    2018-03-01

    Facial expression is one of the behavioral characteristics of human beings. A biometric system based on facial expression characteristics makes it possible to recognize a person’s mood or emotion. The basic components of a facial expression analysis system are face detection, face image extraction, facial classification and facial expression recognition. This paper uses the Principal Component Analysis (PCA) algorithm to extract facial features with expression parameters, i.e., happy, sad, neutral, angry, fear, and disgusted. Then a Multiclass Ensemble Least-Squares Support Vector Machine (MELS-SVM) is used for the classification of facial expressions. The MELS-SVM model, evaluated on our 185 different expression images of 10 persons, showed a high accuracy of 99.998% using an RBF kernel.

  19. Enhanced Quality Control in Pharmaceutical Applications by Combining Raman Spectroscopy and Machine Learning Techniques

    NASA Astrophysics Data System (ADS)

    Martinez, J. C.; Guzmán-Sepúlveda, J. R.; Bolañoz Evia, G. R.; Córdova, T.; Guzmán-Cabrera, R.

    2018-06-01

    In this work, we applied machine learning techniques to Raman spectra for the characterization and classification of manufactured pharmaceutical products. Our measurements were taken with commercial equipment, for accurate assessment of variations with respect to one calibrated control sample. Unlike the typical use of Raman spectroscopy in pharmaceutical applications, in our approach the principal components of the Raman spectrum are used concurrently as attributes in machine learning algorithms. This permits an efficient comparison and classification of the spectra measured from the samples under study. This also allows for accurate quality control as all relevant spectral components are considered simultaneously. We demonstrate our approach with respect to the specific case of acetaminophen, which is one of the most widely used analgesics in the market. In the experiments, commercial samples from thirteen different laboratories were analyzed and compared against a control sample. The raw data were analyzed based on an arithmetic difference between the nominal active substance and the measured values in each commercial sample. The principal component analysis was applied to the data for quantitative verification (i.e., without considering the actual concentration of the active substance) of the difference in the calibrated sample. Our results show that by following this approach adulterations in pharmaceutical compositions can be clearly identified and accurately quantified.

  20. Genetic Classification of Populations Using Supervised Learning

    PubMed Central

    Bridges, Michael; Heron, Elizabeth A.; O'Dushlaine, Colm; Segurado, Ricardo; Morris, Derek; Corvin, Aiden; Gill, Michael; Pinto, Carlos

    2011-01-01

    There are many instances in genetics in which we wish to determine whether two candidate populations are distinguishable on the basis of their genetic structure. Examples include populations which are geographically separated, case–control studies and quality control (when participants in a study have been genotyped at different laboratories). This latter application is of particular importance in the era of large scale genome wide association studies, when collections of individuals genotyped at different locations are being merged to provide increased power. The traditional method for detecting structure within a population is some form of exploratory technique such as principal components analysis. Such methods, which do not utilise our prior knowledge of the membership of the candidate populations, are termed unsupervised. Supervised methods, on the other hand, are able to utilise this prior knowledge when it is available. In this paper we demonstrate that in such cases modern supervised approaches are a more appropriate tool for detecting genetic differences between populations. We apply two such methods (neural networks and support vector machines) to the classification of three populations (two from Scotland and one from Bulgaria). The sensitivity exhibited by both these methods is considerably higher than that attained by principal components analysis and in fact comfortably exceeds a recently conjectured theoretical limit on the sensitivity of unsupervised methods. In particular, our methods can distinguish between the two Scottish populations, where principal components analysis cannot. We suggest, on the basis of our results, that a supervised learning approach should be the method of choice when classifying individuals into pre-defined populations, particularly in quality control for large scale genome wide association studies. PMID:21589856

  1. Hyperspectral Image Denoising Using a Nonlocal Spectral Spatial Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Li, D.; Xu, L.; Peng, J.; Ma, J.

    2018-04-01

    Hyperspectral image (HSI) denoising is a critical research area in image processing due to its importance in improving the quality of HSIs, since noise has a negative impact on object detection, classification and so on. In this paper, we develop a noise reduction method based on principal component analysis (PCA) for hyperspectral imagery, which depends on the assumption that the noise can be removed by selecting the leading principal components. The main contribution of the paper is to introduce the spectral spatial structure and nonlocal similarity of the HSIs into the PCA denoising model. PCA with spectral spatial structure can exploit the spectral correlation and spatial correlation of an HSI by using 3D blocks instead of 2D patches. Nonlocal similarity means the similarity between the referenced pixel and other pixels in a nonlocal area, where the Mahalanobis distance is used to estimate the spatial spectral similarity by calculating the distance in 3D blocks. The proposed method is tested on both simulated and real hyperspectral images; the results demonstrate that the proposed method is superior to several other popular methods in HSI denoising.

  2. [Discrimination of varieties of brake fluid using visual-near infrared spectra].

    PubMed

    Jiang, Lu-lu; Tan, Li-hong; Qiu, Zheng-jun; Lu, Jiang-feng; He, Yong

    2008-06-01

    A new method was developed to rapidly discriminate brands of brake fluid by means of visual-near infrared spectroscopy. Five different brands of brake fluid were analyzed using a handheld near infrared spectrograph, manufactured by ASD Company, and 60 samples were obtained from each brand of brake fluid. The sample data were pretreated using average smoothing and the standard normal variate method, and then analyzed using principal component analysis (PCA). A 2-dimensional plot was drawn based on the first and the second principal components, and the plot indicated that the clustering of the different brake fluids is distinct. The first 6 principal components were taken as input variables, and the brand of brake fluid as the output variable, to build the discriminant model by the stepwise discriminant analysis method. Two hundred twenty-five randomly selected samples were used to create the model, and the remaining 75 samples to verify the model. The result showed that the distinguishing rate was 94.67%, indicating that the method proposed in this paper has good performance in classification and discrimination. It provides a new way to rapidly discriminate different brands of brake fluid.
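
    The hold-out protocol described here (300 samples, 225 drawn at random for model building and 75 held back for verification) can be sketched as below. The helper names and the fixed seed are assumptions for illustration. The count of 71 correct samples is an inference from the reported rate, since 71/75 rounds to 94.67%; the abstract itself does not state it.

```python
import random

def split_samples(sample_ids, n_train, seed=0):
    """Shuffle a copy of the sample ids and cut it into train and test parts."""
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

def accuracy_percent(n_correct, n_total):
    """Classification rate as a percentage, rounded to two decimals."""
    return round(100.0 * n_correct / n_total, 2)

# Five brands with 60 samples each, identified as (brand, index) pairs.
sample_ids = [(brand, i) for brand in range(5) for i in range(60)]
train, test = split_samples(sample_ids, n_train=225)

print(len(train), len(test))      # 225 75
print(accuracy_percent(71, 75))   # 94.67
```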

  3. Towards automatic lithological classification from remote sensing data using support vector machines

    NASA Astrophysics Data System (ADS)

    Yu, Le; Porwal, Alok; Holden, Eun-Jung; Dentith, Michael

    2010-05-01

    Remote sensing data can be effectively used as a means to build geological knowledge for poorly mapped terrains. Spectral remote sensing data from space- and air-borne sensors have been widely used for geological mapping, especially in areas of high outcrop density in arid regions. However, spectral remote sensing information by itself cannot be efficiently used for a comprehensive lithological classification of an area because (1) the diagnostic spectral response of a rock within an image pixel is conditioned by several factors including the atmospheric effects, spectral and spatial resolution of the image, sub-pixel level heterogeneity in chemical and mineralogical composition of the rock, and presence of soil and vegetation cover; and (2) it provides only surface information and is therefore highly sensitive to the noise due to weathering, soil cover, and vegetation. Consequently, for efficient lithological classification, spectral remote sensing data needs to be supplemented with other remote sensing datasets that provide geomorphological and subsurface geological information, such as digital topographic model (DEM) and aeromagnetic data. Each of the datasets contains significant information about geology that, in conjunction, can potentially be used for automated lithological classification using supervised machine learning algorithms. In this study, support vector machine (SVM), which is a kernel-based supervised learning method, was applied to automated lithological classification of a study area in northwestern India using remote sensing data, namely, ASTER, DEM and aeromagnetic data. Several digital image processing techniques were used to produce derivative datasets that contained enhanced information relevant to lithological discrimination. A series of SVMs (trained using k-fold cross-validation with grid search) were tested using various combinations of input datasets selected from among 50 datasets including the original 14 ASTER bands and 36 derivative datasets (including 14 principal component bands, 14 independent component bands, 3 band ratios, 3 DEM derivatives: slope, curvature and roughness, and 2 aeromagnetic derivatives: mean and variance of susceptibility) extracted from the ASTER, DEM and aeromagnetic data, in order to determine the optimal inputs that provide the highest classification accuracy. It was found that a combination of ASTER-derived independent components, principal components and band ratios, DEM-derived slope, curvature and roughness, and aeromagnetic-derived mean and variance of magnetic susceptibility provides the highest classification accuracy of 93.4% on independent test samples. A comparison of the classification results of the SVM with those of maximum likelihood (84.9%) and minimum distance (38.4%) classifiers clearly shows that the SVM algorithm returns much higher classification accuracy. Therefore, the SVM method can be used to produce quick and reliable geological maps from scarce geological information, which is still the case with many under-developed frontier regions of the world.
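
    The tuning procedure mentioned in this abstract, k-fold cross-validation combined with a grid search, can be sketched without any external library. Note the substitution: a one-dimensional k-nearest-neighbour classifier stands in for the SVM, and the data, grid values and function names are hypothetical. The sketch shows only the selection loop, not the study's classifier.

```python
def k_fold_indices(n, n_folds):
    """Split range(n) into n_folds contiguous folds of near-equal size."""
    folds, start = [], 0
    for f in range(n_folds):
        size = n // n_folds + (1 if f < n % n_folds else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def knn_predict(train, query, n_neighbors):
    """Majority label among the n_neighbors training points closest to query."""
    nearest = sorted(train, key=lambda p: abs(p[0] - query))[:n_neighbors]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

def cross_val_accuracy(data, n_neighbors, n_folds):
    """Fraction of samples classified correctly when each fold is held out once."""
    correct = 0
    for fold in k_fold_indices(len(data), n_folds):
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        correct += sum(knn_predict(train, data[i][0], n_neighbors) == data[i][1]
                       for i in fold)
    return correct / len(data)

def grid_search(data, grid, n_folds):
    """Score every hyperparameter value in the grid and return the best one."""
    scores = {value: cross_val_accuracy(data, value, n_folds) for value in grid}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical (feature, label) pairs; labels alternate so every fold holds both classes.
data = [(0.10, 0), (0.90, 1), (0.20, 0), (1.10, 1), (0.30, 0),
        (1.00, 1), (0.15, 0), (0.95, 1), (0.25, 0), (1.05, 1)]
best, scores = grid_search(data, grid=[1, 3, 5], n_folds=5)
print(best, scores)
```

    Odd neighbour counts are used so that a two-class vote cannot tie.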

  4. Principal components - Petrology and chemistry of polyphase units in chondritic porous interplanetary dust particles

    NASA Astrophysics Data System (ADS)

    Rietmeijer, Frans J. M.

    1997-03-01

    Chondritic porous (CP) interplanetary dust particles (IDPs) can be described as 'cosmic sediments'. It should be possible to recognize in these IDPs the 4500 Myrs old solar nebula dusts. The studies of unaltered chondritic IDPs show that their matrices are a mixture of three different principal components (PCs) that also describe variable C/Si ratios of chondritic IDPs. Among others, PCs include polyphase units (PUs) that are amorphous to holocrystalline, both ultrafine- and coarse-grained, ferromagnesiosilica(te) materials with minor Al and Ca. The properties of PCs and their alteration products define the physical and chemical processes that produced and altered these components. PCs are also cornerstones of IDP classification. For example, the bulk composition of ultrafine-grained PCs can be reconstructed using the 'butterfly method' and also allows an evaluation of the metamorphic signatures, (e.g., dynamic pyrometamorphism), in chondritic IDPs.

  5. Assessment of Gait Characteristics in Total Knee Arthroplasty Patients Using a Hierarchical Partial Least Squares Method.

    PubMed

    Wang, Wei; Ackland, David C; McClelland, Jodie A; Webster, Kate E; Halgamuge, Saman

    2018-01-01

    Quantitative gait analysis is an important tool in objective assessment and management of total knee arthroplasty (TKA) patients. Studies evaluating gait patterns in TKA patients have tended to focus on discrete data such as spatiotemporal information, joint range of motion and peak values of kinematics and kinetics, or consider selected principal components of gait waveforms for analysis. These strategies may not have the capacity to capture small variations in gait patterns associated with each joint across an entire gait cycle, and may ultimately limit the accuracy of gait classification. The aim of this study was to develop an automatic feature extraction method to analyse patterns from high-dimensional autocorrelated gait waveforms. A general linear feature extraction framework was proposed and a hierarchical partial least squares method derived for discriminant analysis of multiple gait waveforms. The effectiveness of this strategy was verified using a dataset of joint angle and ground reaction force waveforms from 43 patients after TKA surgery and 31 healthy control subjects. Compared with principal component analysis and partial least squares methods, the hierarchical partial least squares method achieved generally better classification performance on all possible combinations of waveforms, with the highest classification accuracy . The novel hierarchical partial least squares method proposed is capable of capturing virtually all significant differences between TKA patients and the controls, and provides new insights into data visualization. The proposed framework presents a foundation for more rigorous classification of gait, and may ultimately be used to evaluate the effects of interventions such as surgery and rehabilitation.

  6. Metabolic profiles are principally different between cancers of the liver, pancreas and breast.

    PubMed

    Budhu, Anuradha; Terunuma, Atsushi; Zhang, Geng; Hussain, S Perwez; Ambs, Stefan; Wang, Xin Wei

    2014-01-01

    Molecular profiling of primary tumors may facilitate the classification of patients with cancer into more homogenous biological groups to aid clinical management. Metabolomic profiling has been shown to be a powerful tool in characterizing the biological mechanisms underlying a disease but has not been evaluated for its ability to classify cancers by their tissue of origin. Thus, we assessed metabolomic profiling as a novel tool for multiclass cancer characterization. Global metabolic profiling was employed to identify metabolites in paired tumor and non-tumor liver (n=60), breast (n=130) and pancreatic (n=76) tissue specimens. Unsupervised principal component analysis showed that metabolites are principally unique to each tissue and cancer type. Such a difference can also be observed even among early stage cancers, suggesting a significant and unique alteration of global metabolic pathways associated with each cancer type. Our global high-throughput metabolomic profiling study shows that specific biochemical alterations distinguish liver, pancreatic and breast cancer and could be applied as cancer classification tools to differentiate tumors based on tissue of origin.

  7. A comparison of PCA/ICA for data preprocessing in remote sensing imagery classification

    NASA Astrophysics Data System (ADS)

    He, Hui; Yu, Xianchuan

    2005-10-01

    In this paper a performance comparison of a variety of data preprocessing algorithms in remote sensing image classification is presented. The selected algorithms are principal component analysis (PCA) and three different independent component analysis (ICA) algorithms: Fast-ICA (Aapo Hyvarinen, 1999), Kernel-ICA (KCCA and KGV; Bach & Jordan, 2002), and EFFICA (Aiyou Chen & Peter Bickel, 2003). These algorithms were applied to a remote sensing image (1600×1197) obtained from Shunyi, Beijing. For classification, an MLC method is used for the raw and preprocessed data. The results show that classification with the preprocessed data gives more confident results than with raw data and that, among the preprocessing algorithms, the ICA algorithms improve on PCA and EFFICA performs better than the others. The convergence of these ICA algorithms (for more than a million data points) is also studied; the result shows that EFFICA converges much faster than the others. Furthermore, because EFFICA is a one-step maximum likelihood estimate (MLE) which reaches asymptotic Fisher efficiency (EFFICA), its computational load is quite small, so its memory demand is greatly reduced, which resolves the "out of memory" problem that occurred with the other algorithms.

  8. Single-accelerometer-based daily physical activity classification.

    PubMed

    Long, Xi; Yin, Bin; Aarts, Ronald M

    2009-01-01

    In this study, a single tri-axial accelerometer placed on the waist was used to record the acceleration data for human physical activity classification. The data collection involved 24 subjects performing daily real-life activities in a naturalistic environment without researchers' intervention. For the purpose of assessing customers' daily energy expenditure, walking, running, cycling, driving, and sports were chosen as target activities for classification. This study compared a Bayesian classification approach with a Decision Tree based approach. A Bayes classifier has the advantage of being more extensible, requiring little effort in classifier retraining and software update upon further expansion or modification of the target activities. Principal components analysis was applied to remove the correlation among features and to reduce the feature vector dimension. Experiments using leave-one-subject-out and 10-fold cross validation protocols revealed a classification accuracy of approximately 80%, which was comparable with that obtained by a Decision Tree classifier.
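
    The leave-one-subject-out protocol mentioned here holds out all data from one subject at a time, so that no subject contributes to both training and testing within a fold. A minimal sketch of the index bookkeeping follows; the generator name and the six-window toy recording are assumptions, and no classifier is included.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each distinct subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Hypothetical recording: six feature windows collected from three subjects.
subject_ids = ["s1", "s1", "s2", "s2", "s2", "s3"]
folds = list(leave_one_subject_out(subject_ids))

for held_out, train, test in folds:
    print(held_out, train, test)
```

    With 24 subjects, as in the study, this scheme would produce 24 folds.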

  9. Automated cloud screening of AVHRR imagery using split-and-merge clustering

    NASA Technical Reports Server (NTRS)

    Gallaudet, Timothy C.; Simpson, James J.

    1991-01-01

    Previous methods to segment clouds from ocean in AVHRR imagery have shown varying degrees of success, with nighttime approaches being the most limited. An improved method of automatic image segmentation, the principal component transformation split-and-merge clustering (PCTSMC) algorithm, is presented and applied to cloud screening of both nighttime and daytime AVHRR data. The method combines spectral differencing, the principal component transformation, and split-and-merge clustering to sample objectively the natural classes in the data. This segmentation method is then augmented by supervised classification techniques to screen clouds from the imagery. Comparisons with other nighttime methods demonstrate its improved capability in this application. The sensitivity of the method to clustering parameters is presented; the results show that the method is insensitive to the split-and-merge thresholds.

  10. Rapid Elemental Analysis and Provenance Study of Blumea balsamifera DC Using Laser-Induced Breakdown Spectroscopy

    PubMed Central

    Liu, Xiaona; Zhang, Qiao; Wu, Zhisheng; Shi, Xinyuan; Zhao, Na; Qiao, Yanjiang

    2015-01-01

    Laser-induced breakdown spectroscopy (LIBS) was applied to perform a rapid elemental analysis and provenance study of Blumea balsamifera DC. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were implemented to exploit the multivariate nature of the LIBS data. Scores and loadings of computed principal components visually illustrated the differing spectral data. The PLS-DA algorithm showed good classification performance. The PLS-DA model using complete spectra as input variables had similar discrimination performance to using selected spectral lines as input variables. The down-selection of spectral lines was specifically focused on the major elements of B. balsamifera samples. Results indicated that LIBS could be used to rapidly analyze elements and to perform provenance study of B. balsamifera. PMID:25558999

  11. Independent Component Analysis of Textures

    NASA Technical Reports Server (NTRS)

    Manduchi, Roberto; Portilla, Javier

    2000-01-01

    A common method for texture representation is to use the marginal probability densities over the outputs of a set of multi-orientation, multi-scale filters as a description of the texture. We propose a technique, based on Independent Components Analysis, for choosing the set of filters that yield the most informative marginals, meaning that the product over the marginals most closely approximates the joint probability density function of the filter outputs. The algorithm is implemented using a steerable filter space. Experiments involving both texture classification and synthesis show that compared to Principal Components Analysis, ICA provides superior performance for modeling of natural and synthetic textures.

  12. The Analysis of Dimensionality Reduction Techniques in Cryptographic Object Code Classification

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jason L. Wright; Milos Manic

    2010-05-01

    This paper compares the application of three different dimension reduction techniques to the problem of locating cryptography in compiled object code. A simple classifier is used to compare dimension reduction via sorted covariance, principal component analysis, and correlation-based feature subset selection. The analysis concentrates on the classification accuracy as the number of dimensions is increased.

  13. Automated diagnosis of Alzheimer's disease with multi-atlas based whole brain segmentations

    NASA Astrophysics Data System (ADS)

    Luo, Yuan; Tang, Xiaoying

    2017-03-01

    Voxel-based analysis is widely used in quantitative analysis of structural brain magnetic resonance imaging (MRI) and automated disease detection, such as Alzheimer's disease (AD). However, noise at the voxel level may cause low sensitivity to AD-induced structural abnormalities. This can be addressed with the use of a whole brain structural segmentation approach which greatly reduces the dimension of features (the number of voxels). In this paper, we propose an automatic AD diagnosis system that combines such whole brain segmentations with advanced machine learning methods. We used a multi-atlas segmentation technique to parcellate T1-weighted images into 54 distinct brain regions and extract their structural volumes to serve as the features for principal-component-analysis-based dimension reduction and support-vector-machine-based classification. The relationship between the number of retained principal components (PCs) and the diagnosis accuracy was systematically evaluated, in a leave-one-out fashion, based on 28 AD subjects and 23 age-matched healthy subjects. Our approach yielded good classification results, with 96.08% overall accuracy achieved using the first three PCs. In addition, our approach yielded 96.43% specificity, 100% sensitivity, and 0.9891 area under the receiver operating characteristic curve.
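
    The evaluation described above (dimension reduction by PCA, then classification, scored leave-one-out for several numbers of retained PCs) can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic, the function names `pca_fit` and `loo_accuracy` are hypothetical, and a nearest-centroid rule stands in for the support vector machine used in the study.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal axes (as rows)."""
    mean = X.mean(axis=0)
    # Rows of vt are principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def loo_accuracy(X, y, n_components):
    """Leave-one-out accuracy of PCA followed by a nearest-centroid rule."""
    hits = 0
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        # Fit PCA on the training fold only, so the held-out subject
        # never influences the projection.
        mean, axes = pca_fit(X[train], n_components)
        scores = (X[train] - mean) @ axes.T
        test_score = (X[i] - mean) @ axes.T
        labels = np.unique(y[train])
        centroids = np.array([scores[y[train] == c].mean(axis=0) for c in labels])
        pred = labels[np.argmin(np.linalg.norm(centroids - test_score, axis=1))]
        hits += int(pred == y[i])
    return hits / len(y)

# Synthetic stand-in: 51 subjects x 54 regional volumes (not the study's data).
rng = np.random.default_rng(0)
y = np.array([1] * 28 + [0] * 23)
X = rng.normal(size=(51, 54))
X[y == 1, :5] += 3.0   # hypothetical group difference in five regions

for k in (1, 2, 3, 5):
    print(k, round(loo_accuracy(X, y, k), 3))
```

    Refitting the PCA inside every fold is the design choice that matters here: projecting all subjects once and then cross-validating would leak information from the held-out subject into the features.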

  14. FT-IR spectroscopy and multivariate analysis as an auxiliary tool for diagnosis of mental disorders: Bipolar and schizophrenia cases

    NASA Astrophysics Data System (ADS)

    Ogruc Ildiz, G.; Arslan, M.; Unsalan, O.; Araujo-Andrade, C.; Kurt, E.; Karatepe, H. T.; Yilmaz, A.; Yalcinkaya, O. B.; Herken, H.

    2016-01-01

    In this study, a methodology based on Fourier-transform infrared spectroscopy, principal component analysis and partial least squares methods is proposed for the analysis of blood plasma samples in order to identify spectral changes correlated with some biomarkers associated with schizophrenia and bipolarity. Our main goal was to use the spectral information for the calibration of statistical models to discriminate and classify blood plasma samples belonging to bipolar and schizophrenic patients. IR spectra of 30 blood plasma samples from each group (bipolar patients, schizophrenic patients and healthy controls) were collected. The results obtained from principal component analysis (PCA) show a clear discrimination between the bipolar (BP), schizophrenic (SZ) and control group (CG) blood samples, and also make it possible to identify three main regions that show the major differences correlated with both mental disorders (biomarkers). Furthermore, a model for the classification of the blood samples was calibrated using partial least squares discriminant analysis (PLS-DA), allowing the correct classification of BP, SZ and CG samples. The results obtained applying this methodology suggest that it can be used as a complementary diagnostic tool for the detection and discrimination of these mental diseases.

  15. A Novel Acoustic Sensor Approach to Classify Seeds Based on Sound Absorption Spectra

    PubMed Central

    Gasso-Tortajada, Vicent; Ward, Alastair J.; Mansur, Hasib; Brøchner, Torben; Sørensen, Claus G.; Green, Ole

    2010-01-01

    A non-destructive and novel in situ acoustic sensor approach based on the sound absorption spectra was developed for identifying and classifying different seed types. The absorption coefficient spectra were determined by using the impedance tube measurement method. Subsequently, a multivariate statistical analysis, i.e., principal component analysis (PCA), was performed as a way to generate a classification of the seeds based on the soft independent modelling of class analogy (SIMCA) method. The results show that the sound absorption coefficient spectra of different seed types present characteristic patterns which are highly dependent on seed size and shape. In general, seed particle size and sphericity were inversely related to the absorption coefficient. PCA presented reliable grouping capabilities within the diverse seed types, since 95% of the total spectral variance was described by the first two principal components. Furthermore, the SIMCA classification model based on the absorption spectra achieved optimal results as 100% of the evaluation samples were correctly classified. This study contains the initial structuring of an innovative method that will present new possibilities in agriculture and industry for classifying and determining physical properties of seeds and other materials. PMID:22163455

  16. Differentiation of tea varieties using UV-Vis spectra and pattern recognition techniques

    NASA Astrophysics Data System (ADS)

    Palacios-Morillo, Ana; Alcázar, Ángela.; de Pablos, Fernando; Jurado, José Marcos

    2013-02-01

    Tea, one of the most consumed beverages all over the world, is of great importance in the economies of a number of countries. Several methods have been developed to classify tea varieties or origins based on pattern recognition techniques applied to chemical data, such as metal profile, amino acids, catechins and volatile compounds. Some of these analytical methods are tedious and expensive to apply in routine work. The use of UV-Vis spectral data as discriminant variables, highly influenced by the chemical composition, can be an alternative to these methods. UV-Vis spectra of methanol-water extracts of tea have been obtained in the interval 250-800 nm. Absorbances have been used as input variables. Principal component analysis was used to reduce the number of variables and several pattern recognition methods, such as linear discriminant analysis, support vector machines and artificial neural networks, have been applied in order to differentiate the most common tea varieties. A successful classification model was built by combining principal component analysis and multilayer perceptron artificial neural networks, allowing the differentiation between tea varieties. This rapid and simple methodology can be applied to solve classification problems in the food industry, saving economic resources.

  17. Medical diagnosis of atherosclerosis from Carotid Artery Doppler Signals using principal component analysis (PCA), k-NN based weighting pre-processing and Artificial Immune Recognition System (AIRS).

    PubMed

    Latifoğlu, Fatma; Polat, Kemal; Kara, Sadik; Güneş, Salih

    2008-02-01

    In this study, we proposed a new medical diagnosis system based on principal component analysis (PCA), k-NN based weighting pre-processing, and Artificial Immune Recognition System (AIRS) for diagnosis of atherosclerosis from Carotid Artery Doppler Signals. The suggested system consists of four stages. First, in the feature extraction stage, we obtained the features related to atherosclerosis disease using Fast Fourier Transformation (FFT) modeling and by calculating the maximum frequency envelope of sonograms. Second, in the dimensionality reduction stage, the 61 features of atherosclerosis disease were reduced to 4 features using PCA. Third, in the pre-processing stage, we weighted these 4 features using different values of k in a new weighting scheme based on k-NN based weighting pre-processing. Finally, in the classification stage, the AIRS classifier was used to classify subjects as healthy or having atherosclerosis. A classification accuracy of 100% was obtained by the proposed system using 10-fold cross validation. This success shows that the proposed system is a robust and effective system in diagnosis of atherosclerosis disease.

  18. REGIONAL-SCALE WIND FIELD CLASSIFICATION EMPLOYING CLUSTER ANALYSIS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Glascoe, L G; Glaser, R E; Chin, H S

    2004-06-17

    The classification of time-varying multivariate regional-scale wind fields at a specific location can assist event planning as well as consequence and risk analysis. Further, wind field classification involves data transformation and inference techniques that effectively characterize stochastic wind field variation. Such a classification scheme is potentially useful for addressing overall atmospheric transport uncertainty and meteorological parameter sensitivity issues. Different methods to classify wind fields over a location include the principal component analysis of wind data (e.g., Hardy and Walton, 1978) and the use of cluster analysis for wind data (e.g., Green et al., 1992; Kaufmann and Weber, 1996). The goal of this study is to use a clustering method to classify the winds of a gridded data set, i.e., from meteorological simulations generated by a forecast model.

  19. The Classification of Ground Roasted Decaffeinated Coffee Using UV-VIS Spectroscopy and SIMCA Method

    NASA Astrophysics Data System (ADS)

    Yulia, M.; Asnaning, A. R.; Suhandy, D.

    2018-05-01

    In this work, the classification between decaffeinated and non-decaffeinated coffee samples using UV-VIS spectroscopy and the SIMCA method was investigated. A total of 200 samples of ground roasted coffee were used (100 samples of decaffeinated coffee and 100 samples of non-decaffeinated coffee). After extraction and dilution, the spectra of the coffee sample solutions were acquired using a UV-VIS spectrometer (Genesys™ 10S UV-VIS, Thermo Scientific, USA) in the range of 190-1100 nm. The multivariate analyses of the spectra were performed using principal component analysis (PCA) and soft independent modeling of class analogy (SIMCA). The SIMCA model discriminated between decaffeinated and non-decaffeinated coffee samples with 100% sensitivity and specificity.
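
    The core idea of SIMCA, one PCA model per class with a new sample compared against each model, can be sketched as below. This is a simplified illustration and not the procedure of the study: the spectra are synthetic, the names `ClassModel`, `simca_predict` and `make` are hypothetical, and the sample is simply assigned to the class with the smallest residual, whereas full SIMCA typically applies statistical distance limits and may assign a sample to several classes or to none.

```python
import numpy as np

class ClassModel:
    """PCA model of one class: class mean plus a few principal axes."""

    def __init__(self, X, n_components):
        self.mean = X.mean(axis=0)
        _, _, vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.axes = vt[:n_components]

    def residual(self, x):
        """Norm of the part of x that the class subspace cannot explain."""
        centered = x - self.mean
        projected = (centered @ self.axes.T) @ self.axes
        return np.linalg.norm(centered - projected)

def simca_predict(models, x):
    """Assign x to the class whose PCA model leaves the smallest residual."""
    return min(models, key=lambda label: models[label].residual(x))

rng = np.random.default_rng(1)
grid = np.linspace(0, 1, 40)

def make(peak, n):
    # Hypothetical "spectra": one Gaussian band of varying height plus noise.
    heights = rng.uniform(0.5, 1.5, size=(n, 1))
    band = np.exp(-((grid - peak) ** 2) / 0.01)
    return heights * band + rng.normal(scale=0.02, size=(n, 40))

train = {"decaf": make(0.3, 30), "regular": make(0.6, 30)}
models = {label: ClassModel(X, n_components=1) for label, X in train.items()}
print(simca_predict(models, make(0.3, 1)[0]))
```

    Because each class gets its own model, adding a further class only requires fitting one more `ClassModel`; the existing models are left untouched.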

  20. Multi-element analysis of wines by ICP-MS and ICP-OES and their classification according to geographical origin in Slovenia.

    PubMed

    Selih, Vid S; Sala, Martin; Drgan, Viktor

    2014-06-15

    Inductively coupled plasma mass spectrometry and optical emission were used to determine the multi-element composition of 272 bottled Slovenian wines. To achieve geographical classification of the wines by their elemental composition, principal component analysis (PCA) and counter-propagation artificial neural networks (CPANN) have been used. From 49 elements measured, 19 were used to build the final classification models. CPANN was used for the final predictions because of its superior results. The best model gave 82% correct predictions for an external set of the white wine samples. Taking into account the small size of whole Slovenian wine growing regions, we consider the classification results to be very good. For the red wines, which mostly came from one region, even sub-region classification was possible with great precision. From the level maps of the CPANN model, some of the most important elements for classification were identified. Copyright © 2013 Elsevier Ltd. All rights reserved.

  1. Feature extraction for ultrasonic sensor based defect detection in ceramic components

    NASA Astrophysics Data System (ADS)

    Kesharaju, Manasa; Nagarajah, Romesh

    2014-02-01

    High density silicon carbide materials are commonly used as the ceramic element of hard armour inserts used in traditional body armour systems to reduce their weight, while providing improved hardness, strength and elastic response to stress. Currently, armour ceramic tiles are inspected visually offline using an X-ray technique that is time consuming and very expensive. In addition, from X-rays multiple defects are also misinterpreted as single defects. Therefore, to address these problems the ultrasonic non-destructive approach is being investigated. Ultrasound based inspection would be far more cost effective and reliable as the methodology is applicable for on-line quality control including implementation of accept/reject criteria. This paper describes a recently developed methodology to detect, locate and classify various manufacturing defects in ceramic tiles using sub band coding of ultrasonic test signals. The wavelet transform is applied to the ultrasonic signal and wavelet coefficients in the different frequency bands are extracted and used as input features to an artificial neural network (ANN) for purposes of signal classification. Two different classifiers, using artificial neural networks (supervised) and clustering (unsupervised), are supplied with features selected using Principal Component Analysis (PCA) and their classification performance compared. This investigation establishes experimentally that Principal Component Analysis (PCA) can be effectively used as a feature selection method that provides superior results for classifying various defects in the context of ultrasonic inspection in comparison with the X-ray technique.

  2. Investigation of domain walls in PPLN by confocal raman microscopy and PCA analysis

    NASA Astrophysics Data System (ADS)

    Shur, Vladimir Ya.; Zelenovskiy, Pavel; Bourson, Patrice

    2017-07-01

    Confocal Raman microscopy (CRM) is a powerful tool for investigation of ferroelectric domains. Mechanical stresses and electric fields existing in the vicinity of neutral and charged domain walls modify the frequency, intensity and width of spectral lines [1], thus allowing visualization of micro- and nanodomain structures both at the surface and in the bulk of the crystal [2,3]. Stresses and fields are naturally coupled in ferroelectrics due to the inverse piezoelectric effect and can hardly be separated in Raman spectra. PCA is a powerful statistical method for analysis of large data matrices, providing a set of orthogonal variables called principal components (PCs). PCA is widely used for classification of experimental data, for example, in crystallization experiments, for detection of small amounts of components in solid mixtures etc. [4,5]. In Raman spectroscopy PCA was applied for analysis of phase transitions and provided critical pressure with good accuracy [6]. In the present work we applied, for the first time, the Principal Component Analysis (PCA) method to the analysis of Raman spectra measured in periodically poled lithium niobate (PPLN). We found that principal components demonstrate different sensitivity to mechanical stresses and electric fields in the vicinity of the domain walls. This allowed us to separately visualize the spatial distribution of stresses and electric fields at the surface and in the bulk of PPLN.

  3. Bioclimatic Classification of Northeast Asia for climate change response

    NASA Astrophysics Data System (ADS)

    Choi, Y.; Jeon, S. W.; Lim, C. H.

    2016-12-01

    As climate change has been getting worse, we should monitor changes in biodiversity and the distribution of species to handle the crisis and take advantage of climate change. The development of a bioclimatic map, which classifies land into homogeneous zones with similar environmental properties, is the first step in establishing a strategy. Statistically derived classifications of land provide useful spatial frameworks to support ecosystem research, monitoring and policy decisions. Many countries are trying to make this kind of map and actively use it for ecosystem conservation and management. However, Northeast Asia, including North Korea, does not have detailed environmental information and has not built an environmental classification map. Therefore, this study presents a bioclimatic map of Northeast Asia based on statistical clustering of bioclimate data. Bioclim data ver1.4, provided by WorldClim, were considered for inclusion in a model. Eight of the most relevant climate variables were selected by correlation analysis, based on previous studies. Principal Components Analysis (PCA) was used to explain 86% of the variation in three independent dimensions, which were subsequently clustered using ISODATA clustering. The bioclimatic zones of Northeast Asia could consist of 29, 35, or 50 zones. This bioclimatic map has a 30' resolution. To assess the accuracy, the correlation coefficient was calculated between the first principal component values of the classification variables and the vegetation index, Gross Primary Production (GPP). It shows a Pearson correlation coefficient of about 0.5. This study constructed a high-resolution bioclimatic map of Northeast Asia by a statistical method, but in order to better reflect reality, a wider variety of climate variables should be considered. Also, further studies should carry out more quantitative and qualitative validation in various ways. This could then be used more effectively to support decision making on climate change adaptation.
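
    A statement such as "PCA explained 86% of the variation in three dimensions" rests on the cumulative explained-variance ratio. The sketch below shows one way to compute it and to pick the number of components for a target share. It uses synthetic data with eight variables, the helper names are hypothetical, and standardising the variables first is an assumption rather than something the abstract states.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)          # standardise each variable
    eigvals = np.linalg.eigvalsh(np.cov(Z, rowvar=False))[::-1]  # descending
    return eigvals / eigvals.sum()

def components_needed(ratios, threshold):
    """Smallest number of leading components whose cumulative share
    reaches the threshold (threshold assumed to be below 1)."""
    return int(np.searchsorted(np.cumsum(ratios), threshold) + 1)

# Synthetic stand-in: 8 variables driven by 3 latent factors plus noise.
rng = np.random.default_rng(2)
latent = rng.normal(size=(500, 3))
X = latent @ rng.normal(size=(3, 8)) + 0.3 * rng.normal(size=(500, 8))

ratios = explained_variance_ratio(X)
k = components_needed(ratios, 0.86)
print(np.round(np.cumsum(ratios), 3), k)
```

    The retained component scores, rather than the original variables, would then be passed to the clustering step.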

  4. Capillary electrophoresis fingerprinting and spectrophotometric determination of antioxidant potential for classification of Mentha products.

    PubMed

    Roblová, Vendula; Bittová, Miroslava; Kubáň, Petr; Kubáň, Vlastimil

    2016-07-01

    In this work aqueous infusions from ten Mentha herbal samples (four different Mentha species and six hybrids of Mentha x piperita) and 20 different peppermint teas were screened by capillary electrophoresis with UV detection. The fingerprint separation was accomplished in a 25 mM borate background electrolyte with 10% methanol at pH 9.3. The total polyphenolic content in the extracts was determined spectrophotometrically at 765 nm by a Folin-Ciocalteu phenol assay. Total antioxidant activity was determined by scavenging of 2,2-diphenyl-1-picrylhydrazyl radical at 515 nm. The peak areas of 12 dominant peaks from CE analysis, present in all samples, and the value of total polyphenolic content and total antioxidant activity obtained by spectrophotometry was combined into a single data matrix and principal component analysis was applied. The obtained principal component analysis model resulted in distinct clusters of Mentha and peppermint tea samples distinguishing the samples according to their potential protective antioxidant effect. Principal component analysis, using a non-targeted approach with no need for compound identification, was found as a new promising tool for the screening of herbal tea products. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. Genetic variation and seed transfer guidelines for ponderosa pine in the Ochoco and Malheur National Forests of central Oregon.

    Treesearch

    Frank C. Sorensen; John C. Weber

    1994-01-01

    Adaptive genetic variation in seed and seedling traits was evaluated for 280 families from 220 locations. Factor scores from three principal components were related by multiple regression to latitude, longitude, elevation, slope, and aspect of the seed source, and by classification analysis to seed zone and elevation band in seed zone. Location variance was significant...

  6. [Discrimination of varieties of borneol using terahertz spectra based on principal component analysis and support vector machine].

    PubMed

    Li, Wu; Hu, Bing; Wang, Ming-wei

    2014-12-01

    In the present paper, a terahertz time-domain spectroscopy (THz-TDS) identification model of borneol based on principal component analysis (PCA) and support vector machine (SVM) was established. As a common agent in Chinese medicine, borneol comes from different sources and is easily confused in pharmaceutical and trade channels, so it needs a rapid, simple and accurate detection and identification method. To assure the quality of borneol products and protect consumers' rights, quick, efficient and correct identification of borneol is important for its production and trade. Terahertz time-domain spectroscopy is a new spectroscopic approach that characterizes materials using terahertz pulses. The terahertz absorption spectra of blumea camphor, borneol camphor and synthetic borneol were measured in the range of 0.2 to 2 THz with transmission THz-TDS. 2D (PC1 × PC2) and 3D (PC1 × PC2 × PC3) PCA score plots of the three kinds of borneol samples were obtained, and both show good clustering of the three kinds of borneol. The matrix of scores on the first 10 principal components (PCs) was used in place of the original spectral data; 60 samples of the three kinds of borneol were used for training, and 60 unknown samples were then identified. Four SVM models with different kernel functions were set up in this way. The results show that the identification and classification accuracy of the SVM with the radial basis function (RBF) kernel is 100% for the three kinds of borneol, so this SVM was selected to establish the borneol identification model. In addition, in the noisy case, the classification accuracy of all four SVM kernel functions is above 85%, which indicates that SVM has strong generalization ability. This study shows that PCA combined with SVM applied to borneol terahertz spectra gives good classification and identification results, and provides a new method for species identification of borneol in Chinese medicine.

  7. Functional data analysis of sleeping energy expenditure.

    PubMed

    Lee, Jong Soo; Zakeri, Issa F; Butte, Nancy F

    2017-01-01

    Adequate sleep is crucial during childhood for metabolic health, and physical and cognitive development. Inadequate sleep can disrupt metabolic homeostasis and alter sleeping energy expenditure (SEE). Functional data analysis methods were applied to SEE data to elucidate the population structure of SEE and to discriminate SEE between obese and non-obese children. Minute-by-minute SEE in 109 children, ages 5-18, was measured in room respiration calorimeters. A smoothing spline method was applied to the calorimetric data to extract the true smoothing function for each subject. Functional principal component analysis was used to capture the important modes of variation of the functional data and to identify differences in SEE patterns. Combinations of functional principal component analysis and classifier algorithm were used to classify SEE. Smoothing effectively removed instrumentation noise inherent in the room calorimeter data, providing more accurate data for analysis of the dynamics of SEE. SEE exhibited declining but subtly undulating patterns throughout the night. Mean SEE was markedly higher in obese than non-obese children, as expected due to their greater body mass. SEE was higher among the obese than non-obese children (p<0.01); however, the weight-adjusted mean SEE was not statistically different (p>0.1, after post hoc testing). Functional principal component scores for the first two components explained 77.8% of the variance in SEE and also differed between groups (p = 0.037). Logistic regression, support vector machine or random forest classification methods were able to distinguish weight-adjusted SEE between obese and non-obese participants with good classification rates (62-64%). Our results implicate other factors, yet to be uncovered, that affect the weight-adjusted SEE of obese and non-obese children. Functional data analysis revealed differences in the structure of SEE between obese and non-obese children that may contribute to disruption of metabolic homeostasis.

  8. Supernova Photometric Lightcurve Classification

    NASA Astrophysics Data System (ADS)

    Zaidi, Tayeb; Narayan, Gautham

    2016-01-01

    This is a preliminary report on photometric supernova classification. We first explore the properties of supernova light curves, and attempt to restructure the unevenly sampled and sparse data from assorted datasets to allow for processing and classification. The data was primarily drawn from the Dark Energy Survey (DES) simulated data, created for the Supernova Photometric Classification Challenge. This poster shows a method for producing a non-parametric representation of the light curve data, and applying a Random Forest classifier algorithm to distinguish between supernovae types. We examine the impact of Principal Component Analysis to reduce the dimensionality of the dataset, for future classification work. The classification code will be used in a stage of the ANTARES pipeline, created for use on the Large Synoptic Survey Telescope alert data and other wide-field surveys. The final figure-of-merit for the DES data in the r band was 60% for binary classification (Type I vs II). Zaidi was supported by the NOAO/KPNO Research Experiences for Undergraduates (REU) Program which is funded by the National Science Foundation Research Experiences for Undergraduates Program (AST-1262829).

  9. Probabilistic Geobiological Classification Using Elemental Abundance Distributions and Lossless Image Compression in Recent and Modern Organisms

    NASA Technical Reports Server (NTRS)

    Storrie-Lombardi, Michael C.; Hoover, Richard B.

    2005-01-01

    Last year we presented techniques for the detection of fossils during robotic missions to Mars using both structural and chemical signatures [Storrie-Lombardi and Hoover, 2004]. Analyses included lossless compression of photographic images to estimate the relative complexity of a putative fossil compared to the rock matrix [Corsetti and Storrie-Lombardi, 2003] and elemental abundance distributions to provide mineralogical classification of the rock matrix [Storrie-Lombardi and Fisk, 2004]. We presented a classification strategy employing two exploratory classification algorithms (Principal Component Analysis and Hierarchical Cluster Analysis) and a non-linear stochastic neural network to produce a Bayesian estimate of classification accuracy. We now present an extension of our previous experiments exploring putative fossil forms morphologically resembling cyanobacteria discovered in the Orgueil meteorite. Elemental abundances (C6, N7, O8, Na11, Mg12, Al13, Si14, P15, S16, Cl17, K19, Ca20, Fe26) obtained for both extant cyanobacteria and fossil trilobites produce signatures readily distinguishing them from meteorite targets. When compared to elemental abundance signatures for extant cyanobacteria, Orgueil structures exhibit decreased abundances for C6, N7, Na11, Al13, P15, Cl17, K19, Ca20 and increases in Mg12, S16, Fe26. Diatoms and silicified portions of cyanobacterial sheaths exhibiting high levels of silicon and correspondingly low levels of carbon cluster more closely with terrestrial fossils than with extant cyanobacteria. Compression indices verify that variations in random and redundant textural patterns between perceived forms and the background matrix contribute significantly to morphological visual identification. The results provide a quantitative probabilistic methodology for discriminating putative fossils from the surrounding rock matrix and from extant organisms using both structural and chemical information. The techniques described appear applicable to the geobiological analysis of meteoritic samples or in situ exploration of the Mars regolith. Keywords: cyanobacteria, microfossils, Mars, elemental abundances, complexity analysis, multifactor analysis, principal component analysis, hierarchical cluster analysis, artificial neural networks, paleo-biosignatures

  10. On the construction of a new stellar classification template library for the LAMOST spectral analysis pipeline

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wei, Peng; Luo, Ali; Li, Yinbi

    2014-05-01

    The LAMOST spectral analysis pipeline, called the 1D pipeline, aims to classify and measure the spectra observed in the LAMOST survey. Through this pipeline, the observed stellar spectra are classified into different subclasses by matching with template spectra. Consequently, the performance of the stellar classification greatly depends on the quality of the template spectra. In this paper, we construct a new LAMOST stellar spectral classification template library, which is supposed to improve the precision and credibility of the present LAMOST stellar classification. About one million spectra are selected from LAMOST Data Release One to construct the new stellar templates, and they are gathered in 233 groups by two criteria: (1) pseudo g – r colors obtained by convolving the LAMOST spectra with the Sloan Digital Sky Survey ugriz filter response curve, and (2) the stellar subclass given by the LAMOST pipeline. In each group, the template spectra are constructed using three steps. (1) Outliers are excluded using the Local Outlier Probabilities algorithm, and then the principal component analysis method is applied to the remaining spectra of each group. About 5% of the one million spectra are ruled out as outliers. (2) All remaining spectra are reconstructed using the first principal components of each group. (3) The weighted average spectrum is used as the template spectrum in each group. Using the previous 3 steps, we initially obtain 216 stellar template spectra. We visually inspect all template spectra, and 29 spectra are abandoned due to low spectral quality. Furthermore, the MK classification for the remaining 187 template spectra is manually determined by comparing with 3 template libraries. Meanwhile, 10 template spectra whose subclass is difficult to determine are abandoned. Finally, we obtain a new template library containing 183 LAMOST template spectra with 61 different MK classes by combining it with the current library.

  11. Signal-to-noise contribution of principal component loads in reconstructed near-infrared Raman tissue spectra.

    PubMed

    Grimbergen, M C M; van Swol, C F P; Kendall, C; Verdaasdonk, R M; Stone, N; Bosch, J L H R

    2010-01-01

    The overall quality of Raman spectra in the near-infrared region, where biological samples are often studied, has benefited from various improvements to optical instrumentation over the past decade. However, obtaining ample spectral quality for analysis is still challenging due to device requirements and short integration times required for (in vivo) clinical applications of Raman spectroscopy. Multivariate analytical methods, such as principal component analysis (PCA) and linear discriminant analysis (LDA), are routinely applied to Raman spectral datasets to develop classification models. Data compression is necessary prior to discriminant analysis to prevent or decrease the degree of over-fitting. The logical threshold for the selection of principal components (PCs) to be used in discriminant analysis is likely to be at a point before the PCs begin to introduce equivalent signal and noise and, hence, include no additional value. Assessment of the signal-to-noise ratio (SNR) at a certain peak or over a specific spectral region will depend on the sample measured. Therefore, the mean SNR over the whole spectral region (SNR(msr)) is determined in the original spectrum as well as for spectra reconstructed from an increasing number of principal components. This paper introduces a method of assessing the influence of signal and noise from individual PC loads and indicates a method of selection of PCs for LDA. To evaluate this method, two data sets with different SNRs were used. The sets were obtained with the same Raman system and the same measurement parameters on bladder tissue collected during white light cystoscopy (set A) and fluorescence-guided cystoscopy (set B). This method shows that the mean SNR over the spectral range in the original Raman spectra of these two data sets is related to the signal and noise contribution of principal component loads. The difference in mean SNR over the spectral range can also be appreciated since fewer principal components can reliably be used in the low SNR data set (set B) compared to the high SNR data set (set A). Despite the fact that no definitive threshold could be found, this method may help to determine the cutoff for the number of principal components used in discriminant analysis. Future analysis of a selection of spectral databases using this technique will allow optimum thresholds to be selected for different applications and spectral data quality levels.
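
    The idea of reconstructing spectra from an increasing number of principal components and tracking a mean SNR over the whole range can be illustrated with the sketch below. It is not the paper's procedure: the spectra are synthetic so that the noise-free signal is known, the function names are hypothetical, and the SNR definition used here (signal power over error power, averaged over all channels) is an assumption.

```python
import numpy as np

def reconstruct(X, k):
    """Rebuild each spectrum from the mean plus its first k principal components."""
    mean = X.mean(axis=0)
    u, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean + (u[:, :k] * s[:k]) @ vt[:k]

def mean_snr(estimate, clean):
    """Assumed SNR: clean-signal power over error power, across the whole range."""
    error = estimate - clean
    return float(np.mean(clean ** 2) / np.mean(error ** 2))

# Synthetic spectra: two Gaussian bands with random weights, plus white noise.
rng = np.random.default_rng(3)
grid = np.linspace(0, 1, 60)
bands = np.stack([np.exp(-((grid - c) ** 2) / 0.005) for c in (0.3, 0.7)])
clean = rng.uniform(0, 1, size=(100, 2)) @ bands
noisy = clean + rng.normal(scale=0.05, size=clean.shape)

print("original", round(mean_snr(noisy, clean), 1))
for k in (1, 2, 5, 20):
    print(k, round(mean_snr(reconstruct(noisy, k), clean), 1))
```

    In this toy setting the SNR rises while the added components carry signal and falls once they mostly add noise back, which is the kind of turning point the abstract proposes as a guide for choosing how many PCs to pass on to discriminant analysis.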

  12. Neuro-classification of multi-type Landsat Thematic Mapper data

    NASA Technical Reports Server (NTRS)

    Zhuang, Xin; Engel, Bernard A.; Fernandez, R. N.; Johannsen, Chris J.

    1991-01-01

    Neural networks have been successful in image classification and have shown potential for classifying remotely sensed data. This paper presents classifications of multitype Landsat Thematic Mapper (TM) data using neural networks. The Landsat TM image for March 23, 1987, with accompanying ground observation data for a study area in Miami County, Indiana, U.S.A., was utilized to assess recognition of crop residues. Principal components and spectral ratio transformations were performed on the TM data. In addition, a layer of the geographic information system (GIS) for the study site was incorporated to generate GIS-enhanced TM data. This paper discusses (1) the performance of neuro-classification on each type of data, (2) how neural networks recognized each type of data as a new image and (3) comparisons of the results for each type of data obtained using neural networks, maximum likelihood, and minimum distance classifiers.

  13. Chemometric classification of Chinese lager beers according to manufacturer based on data fusion of fluorescence, UV and visible spectroscopies.

    PubMed

    Tan, Jin; Li, Rong; Jiang, Zi-Tao

    2015-10-01

    We report an application of data fusion for chemometric classification of 135 canned samples of Chinese lager beers by manufacturer based on the combination of fluorescence, UV and visible spectroscopies. Right-angle synchronous fluorescence spectra (SFS) at three wavelength differences (Δλ = 30, 60 and 80 nm) and visible spectra in the range 380-700 nm of undiluted beers were recorded. UV spectra in the range 240-400 nm of diluted beers were measured. A classification model was built using principal component analysis (PCA) and linear discriminant analysis (LDA). LDA with cross-validation showed that the data fusion could achieve 78.5-86.7% correct classification (sensitivity), while those rates using individual spectroscopies ranged from 42.2% to 70.4%. The results demonstrated that the fluorescence, UV and visible spectroscopies complemented each other, yielding a higher synergistic effect. Copyright © 2015 Elsevier Ltd. All rights reserved.
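
    A PCA-then-LDA classifier of the kind named in this record can be sketched as below. This is a simplified illustration, not the authors' model: it uses synthetic data, only two classes (the study distinguishes several manufacturers), a single hold-out split rather than cross-validation, and hypothetical function names.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the training mean and the first n_components loadings."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def fit_lda_two_class(Z, y):
    """Fisher discriminant for labels 0/1: w = Sw^-1 (m1 - m0), midpoint threshold."""
    m0, m1 = Z[y == 0].mean(axis=0), Z[y == 1].mean(axis=0)
    Sw = np.cov(Z[y == 0], rowvar=False) + np.cov(Z[y == 1], rowvar=False)
    w = np.linalg.solve(Sw, m1 - m0)
    threshold = w @ (m0 + m1) / 2.0
    return w, threshold

# Synthetic stand-in for fused spectra: 40 samples per class, 50 variables.
rng = np.random.default_rng(1)
n, p = 40, 50
X = rng.normal(size=(2 * n, p))
y = np.repeat([0, 1], n)
X[y == 1, :5] += 3.0            # class 1 differs in the first five variables

train = np.r_[0:30, 40:70]      # 30 training samples per class
test = np.r_[30:40, 70:80]      # 10 held-out samples per class

# Fit PCA on the training data only, then project both sets onto the PCs.
mean, loadings = fit_pca(X[train], n_components=5)
Z_train = (X[train] - mean) @ loadings.T
Z_test = (X[test] - mean) @ loadings.T

w, threshold = fit_lda_two_class(Z_train, y[train])
predicted = (Z_test @ w > threshold).astype(int)
accuracy = float(np.mean(predicted == y[test]))
```

    Compressing to a few PCs before LDA keeps the within-class scatter matrix well conditioned when there are more variables than samples, which is the usual reason the two steps are paired.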

  14. Exploring the CAESAR database using dimensionality reduction techniques

    NASA Astrophysics Data System (ADS)

    Mendoza-Schrock, Olga; Raymer, Michael L.

    2012-06-01

    The Civilian American and European Surface Anthropometry Resource (CAESAR) database containing over 40 anthropometric measurements on over 4000 humans has been extensively explored for pattern recognition and classification purposes using the raw, original data [1-4]. However, some of the anthropometric variables would be impossible to collect in an uncontrolled environment. Here, we explore the use of dimensionality reduction methods in concert with a variety of classification algorithms for gender classification using only those variables that are readily observable in an uncontrolled environment. Several dimensionality reduction techniques are employed to learn the underlying structure of the data. These techniques include linear projections such as the classical Principal Components Analysis (PCA) and non-linear (manifold learning) techniques, such as Diffusion Maps and the Isomap technique. This paper briefly describes all three techniques, and compares three different classifiers, Naïve Bayes, Adaboost, and Support Vector Machines (SVM), for gender classification in conjunction with each of these three dimensionality reduction approaches.

  15. Automated database-guided expert-supervised orientation for immunophenotypic diagnosis and classification of acute leukemia

    PubMed Central

    Lhermitte, L; Mejstrikova, E; van der Sluijs-Gelling, A J; Grigore, G E; Sedek, L; Bras, A E; Gaipa, G; Sobral da Costa, E; Novakova, M; Sonneveld, E; Buracchi, C; de Sá Bacelar, T; te Marvelde, J G; Trinquand, A; Asnafi, V; Szczepanski, T; Matarraz, S; Lopez, A; Vidriales, B; Bulsa, J; Hrusak, O; Kalina, T; Lecrevisse, Q; Martin Ayuso, M; Brüggemann, M; Verde, J; Fernandez, P; Burgos, L; Paiva, B; Pedreira, C E; van Dongen, J J M; Orfao, A; van der Velden, V H J

    2018-01-01

    Precise classification of acute leukemia (AL) is crucial for adequate treatment. EuroFlow has previously designed an AL orientation tube (ALOT) to guide towards the relevant classification panel (T-cell acute lymphoblastic leukemia (T-ALL), B-cell precursor (BCP)-ALL and/or acute myeloid leukemia (AML)) and final diagnosis. Now we built a reference database with 656 typical AL samples (145 T-ALL, 377 BCP-ALL, 134 AML), processed and analyzed via standardized protocols. Using principal component analysis (PCA)-based plots and automated classification algorithms for direct comparison of single-cells from individual patients against the database, another 783 cases were subsequently evaluated. Depending on the database-guided results, patients were categorized as: (i) typical T, B or Myeloid without or; (ii) with a transitional component to another lineage; (iii) atypical; or (iv) mixed-lineage. Using this automated algorithm, in 781/783 cases (99.7%) the right panel was selected, and data comparable to the final WHO-diagnosis was already provided in >93% of cases (85% T-ALL, 97% BCP-ALL, 95% AML and 87% mixed-phenotype AL patients), even without data on the full-characterization panels. Our results show that database-guided analysis facilitates standardized interpretation of ALOT results and allows accurate selection of the relevant classification panels, hence providing a solid basis for designing future WHO AL classifications. PMID:29089646

  16. Characterization and classification of South American land cover types using satellite data

    NASA Technical Reports Server (NTRS)

    Townshend, J. R. G.; Justice, C. O.; Kalb, V.

    1987-01-01

    Various methods are compared for carrying out land cover classifications of South America using multitemporal Advanced Very High Resolution Radiometer data. Fifty-two images of the normalized difference vegetation index (NDVI) from a 1-year period are used to generate multitemporal data sets. Three main approaches to land cover classification are considered, namely the use of the principal components transformed images, the use of a characteristic curves procedure based on NDVI values plotted against time, and finally application of the maximum likelihood rule to multitemporal data sets. Comparison of results from training sites indicates that the last approach yields the most accurate results. Despite the reliance on training site figures for performance assessment, the results are nevertheless extremely encouraging, with accuracies for several cover types exceeding 90 per cent.
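
    For reference, the normalized difference vegetation index used in this record is conventionally defined as NDVI = (NIR - Red) / (NIR + Red). The sketch below computes it per pixel; the function name and the choice to return 0 where both bands are zero are our assumptions, not something stated in the abstract.

```python
import numpy as np

def ndvi(nir, red):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red); 0 where both bands are 0."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    out = np.zeros_like(total)
    # Divide only where the denominator is non-zero to avoid warnings.
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

# Three hypothetical pixels: vegetated, bare, and no signal.
values = ndvi([0.5, 0.3, 0.0], [0.1, 0.3, 0.0])
```

    Stacking one such image per acquisition date gives the multitemporal data set on which a principal components transformation or a classifier can then operate.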

  17. Classification and pose estimation of objects using nonlinear features

    NASA Astrophysics Data System (ADS)

    Talukder, Ashit; Casasent, David P.

    1998-03-01

    A new nonlinear feature extraction method called the maximum representation and discrimination feature (MRDF) method is presented for extraction of features from input image data. It implements transformations similar to the Sigma-Pi neural network. However, the weights of the MRDF are obtained in closed form, and offer advantages compared to nonlinear neural network implementations. The features extracted are useful for both object discrimination (classification) and object representation (pose estimation). We show its use in estimating the class and pose of images of real objects and rendered solid CAD models of machine parts from single views using a feature-space trajectory (FST) neural network classifier. We show more accurate classification and pose estimation results than are achieved by standard principal component analysis (PCA) and Fukunaga-Koontz (FK) feature extraction methods.

  18. Research on potential user identification model for electric energy substitution

    NASA Astrophysics Data System (ADS)

    Xia, Huaijian; Chen, Meiling; Lin, Haiying; Yang, Shuo; Miao, Bo; Zhu, Xinzhi

    2018-01-01

    The implementation of energy substitution plays an important role in promoting energy conservation and emission reduction in China. Based on data from an energy service management platform for alternative energy users, enterprise production value, product output, and consumption of coal and other energy are taken as potential evaluation indices. A principal component analysis model is used to reduce these indices to comprehensive characteristic indices that retain the information in the original variables, and a fuzzy clustering model is used for flexible classification of users within the same industry. Based on the comprehensive indices and the user cluster classification, a particle optimization neural network classification model is constructed to predict users' potential for electric energy substitution. The results of an example show that the model can effectively predict users' energy substitution potential.

  19. Biometric Authentication for Gender Classification Techniques: A Review

    NASA Astrophysics Data System (ADS)

    Mathivanan, P.; Poornima, K.

    2017-12-01

    One of the challenging biometric authentication applications is gender identification and age classification, which captures gait from a far distance and analyzes physical information about the subject, such as gender, race and emotional state. It is found that most gender identification techniques have focused only on the frontal pose of different human subjects, the image size and the type of database used in the process. The study also classifies different feature extraction processes, such as Principal Component Analysis (PCA) and Local Directional Pattern (LDP), that are used to extract the authentication features of a person. This paper aims to analyze different gender classification techniques, which helps in evaluating the strengths and weaknesses of existing gender identification algorithms and, therefore, in developing a novel gender classification algorithm with lower computation cost and higher accuracy. In this paper, an overview and classification of different gender identification techniques is first presented, and these techniques are then compared with other existing human identification systems in terms of their performance.

  20. Automotive System for Remote Surface Classification.

    PubMed

    Bystrov, Aleksandr; Hoare, Edward; Tran, Thuy-Yung; Clarke, Nigel; Gashinova, Marina; Cherniakov, Mikhail

    2017-04-01

    In this paper we shall discuss a novel approach to road surface recognition, based on the analysis of backscattered microwave and ultrasonic signals. The novelty of our method is sonar and polarimetric radar data fusion, extraction of features for separate swathes of illuminated surface (segmentation), and the use of a multi-stage artificial neural network for surface classification. The developed system consists of a 24 GHz radar and a 40 kHz ultrasonic sensor. The features are extracted from backscattered signals and then the procedures of principal component analysis and supervised classification are applied to feature data. Special attention is paid to the multi-stage artificial neural network, which allows an overall increase in classification accuracy. The proposed technique was tested for recognition of a large number of real surfaces in different weather conditions with an average classification accuracy of 95%. The obtained results thereby demonstrate that the use of the proposed system architecture and statistical methods allows for reliable discrimination of various road surfaces in real conditions.

  1. Automotive System for Remote Surface Classification

    PubMed Central

    Bystrov, Aleksandr; Hoare, Edward; Tran, Thuy-Yung; Clarke, Nigel; Gashinova, Marina; Cherniakov, Mikhail

    2017-01-01

    In this paper we shall discuss a novel approach to road surface recognition, based on the analysis of backscattered microwave and ultrasonic signals. The novelty of our method is sonar and polarimetric radar data fusion, extraction of features for separate swathes of illuminated surface (segmentation), and the use of a multi-stage artificial neural network for surface classification. The developed system consists of a 24 GHz radar and a 40 kHz ultrasonic sensor. The features are extracted from backscattered signals and then the procedures of principal component analysis and supervised classification are applied to feature data. Special attention is paid to the multi-stage artificial neural network, which allows an overall increase in classification accuracy. The proposed technique was tested for recognition of a large number of real surfaces in different weather conditions with an average classification accuracy of 95%. The obtained results thereby demonstrate that the use of the proposed system architecture and statistical methods allows for reliable discrimination of various road surfaces in real conditions. PMID:28368297

  2. Gender classification of running subjects using full-body kinematics

    NASA Astrophysics Data System (ADS)

    Williams, Christina M.; Flora, Jeffrey B.; Iftekharuddin, Khan M.

    2016-05-01

    This paper proposes novel automated gender classification of subjects while engaged in running activity. The machine learning techniques include preprocessing steps using principal component analysis followed by classification with linear discriminant analysis, and nonlinear support vector machines, and decision-stump with AdaBoost. The dataset consists of 49 subjects (25 males, 24 females, 2 trials each) all equipped with approximately 80 retroreflective markers. The trials are reflective of the subject's entire body moving unrestrained through a capture volume at a self-selected running speed, thus producing highly realistic data. The classification accuracy using leave-one-out cross validation for the 49 subjects is improved from 66.33% using linear discriminant analysis to 86.74% using the nonlinear support vector machine. Results are further improved to 87.76% by means of implementing a nonlinear decision stump with AdaBoost classifier. The experimental findings suggest that the linear classification approaches are inadequate in classifying gender for a large dataset with subjects running in a moderately uninhibited environment.
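
    Leave-one-out cross-validation over subjects, as mentioned in this record, can be sketched as follows. The sketch is an assumption-laden illustration: it holds out both trials of a subject together (the abstract does not say whether the authors did this), uses a simple nearest-centroid rule instead of LDA, SVM or AdaBoost, and runs on synthetic features with a hypothetical function name.

```python
import numpy as np

def leave_one_group_out_accuracy(X, y, groups):
    """Hold out all samples of one group in turn; classify by nearest class centroid."""
    correct = 0
    for g in np.unique(groups):
        held_out = groups == g
        X_train, y_train = X[~held_out], y[~held_out]
        classes = np.unique(y_train)
        centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
        # Squared Euclidean distance from each held-out sample to each centroid.
        d2 = ((X[held_out][:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        correct += int(np.sum(classes[d2.argmin(axis=1)] == y[held_out]))
    return correct / len(y)

rng = np.random.default_rng(2)
subjects = np.repeat(np.arange(49), 2)      # 49 subjects, 2 trials each
labels = (subjects < 25).astype(int)        # 25 subjects in one class, 24 in the other
X = rng.normal(size=(98, 6)) + 2.0 * labels[:, None]   # synthetic features

accuracy = leave_one_group_out_accuracy(X, labels, subjects)
```

    Any preprocessing that learns from the data, such as PCA, would normally be refitted inside each fold so that the held-out subject does not influence the projection.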

  3. A Comparative Analysis of Machine Learning with WorldView-2 Pan-Sharpened Imagery for Tea Crop Mapping

    PubMed Central

    Chuang, Yung-Chung Matt; Shiu, Yi-Shiang

    2016-01-01

    Tea is an important but vulnerable economic crop in East Asia, highly impacted by climate change. This study attempts to interpret tea land use/land cover (LULC) using very high resolution WorldView-2 imagery of central Taiwan with both pixel and object-based approaches. A total of 80 variables derived from each WorldView-2 band with pan-sharpening, standardization, principal components and gray level co-occurrence matrix (GLCM) texture indices transformation, were set as the input variables. For pixel-based image analysis (PBIA), 34 variables were selected, including seven principal components, 21 GLCM texture indices and six original WorldView-2 bands. Results showed that support vector machine (SVM) had the highest tea crop classification accuracy (OA = 84.70% and KIA = 0.690), followed by random forest (RF), maximum likelihood algorithm (ML), and logistic regression analysis (LR). However, the ML classifier achieved the highest classification accuracy (OA = 96.04% and KIA = 0.887) in object-based image analysis (OBIA) using only six variables. The contribution of this study is to create a new framework for accurately identifying tea crops in a subtropical region with real-time high-resolution WorldView-2 imagery without field survey, which could further aid agriculture land management and a sustainable agricultural product supply. PMID:27128915

  4. A Comparative Analysis of Machine Learning with WorldView-2 Pan-Sharpened Imagery for Tea Crop Mapping.

    PubMed

    Chuang, Yung-Chung Matt; Shiu, Yi-Shiang

    2016-04-26

    Tea is an important but vulnerable economic crop in East Asia, highly impacted by climate change. This study attempts to interpret tea land use/land cover (LULC) using very high resolution WorldView-2 imagery of central Taiwan with both pixel and object-based approaches. A total of 80 variables derived from each WorldView-2 band with pan-sharpening, standardization, principal components and gray level co-occurrence matrix (GLCM) texture indices transformation, were set as the input variables. For pixel-based image analysis (PBIA), 34 variables were selected, including seven principal components, 21 GLCM texture indices and six original WorldView-2 bands. Results showed that support vector machine (SVM) had the highest tea crop classification accuracy (OA = 84.70% and KIA = 0.690), followed by random forest (RF), maximum likelihood algorithm (ML), and logistic regression analysis (LR). However, the ML classifier achieved the highest classification accuracy (OA = 96.04% and KIA = 0.887) in object-based image analysis (OBIA) using only six variables. The contribution of this study is to create a new framework for accurately identifying tea crops in a subtropical region with real-time high-resolution WorldView-2 imagery without field survey, which could further aid agriculture land management and a sustainable agricultural product supply.

  5. Polarization in Raman spectroscopy helps explain bone brittleness in genetic mouse models

    NASA Astrophysics Data System (ADS)

    Makowski, Alexander J.; Pence, Isaac J.; Uppuganti, Sasidhar; Zein-Sabatto, Ahbid; Huszagh, Meredith C.; Mahadevan-Jansen, Anita; Nyman, Jeffry S.

    2014-11-01

    Raman spectroscopy (RS) has been extensively used to characterize bone composition. However, the link between bone biomechanics and RS measures is not well established. Here, we leveraged the sensitivity of RS polarization to organization, thereby assessing whether RS can explain differences in bone toughness in genetic mouse models for which traditional RS peak ratios are not informative. In the selected mutant mice (activating transcription factor 4 (ATF4) or matrix metalloproteinase 9 (MMP9) knock-outs), toughness is reduced but differences in bone strength do not exist between knock-out and corresponding wild-type controls. To incorporate differences in the RS of bone occurring at peak shoulders, a multivariate approach was used. Full spectrum principal components analysis of two paired, orthogonal bone orientations (relative to laser polarization) improved genotype classification and correlation to bone toughness when compared to traditional peak ratios. When applied to femurs from wild-type mice at 8 and 20 weeks of age, the principal components of orthogonal bone orientations improved age classification but not the explanation of the maturation-related increase in strength. Overall, increasing polarization information by collecting spectra from two bone orientations improves the ability of multivariate RS to explain variance in bone toughness, likely due to polarization sensitivity to organizational changes in both mineral and collagen.

  6. An expert system based on principal component analysis, artificial immune system and fuzzy k-NN for diagnosis of valvular heart diseases.

    PubMed

    Sengur, Abdulkadir

    2008-03-01

    In the last two decades, the use of artificial intelligence methods in medical analysis has been increasing. This is mainly because the effectiveness of classification and detection systems has improved a great deal, helping medical experts in diagnosis. In this work, we investigate the use of principal component analysis (PCA), artificial immune system (AIS) and fuzzy k-NN to determine the normal and abnormal heart valves from the Doppler heart sounds. The proposed heart valve disorder detection system is composed of three stages. The first stage is the pre-processing stage. Filtering, normalization and white de-noising are the processes that were used in this stage. The feature extraction is the second stage. During the feature extraction stage, wavelet packet decomposition was used. As a next step, wavelet entropy was considered as features. To reduce the complexity of the system, PCA was used for feature reduction. In the classification stage, AIS and fuzzy k-NN were used. To evaluate the performance of the proposed methodology, a comparative study was carried out using a data set containing 215 samples. The proposed method was validated using the sensitivity and specificity parameters; 95.9% sensitivity and 96% specificity were obtained.

  7. VizieR Online Data Catalog: RR Lyrae in SDSS Stripe 82 (Suveges+, 2012)

    NASA Astrophysics Data System (ADS)

    Suveges, M.; Sesar, B.; Varadi, M.; Mowlavi, N.; Becker, A. C.; Ivezic, Z.; Beck, M.; Nienartowicz, K.; Rimoldini, L.; Dubath, P.; Bartholdi, P.; Eyer, L.

    2013-05-01

    We propose a robust principal component analysis framework for the exploitation of multiband photometric measurements in large surveys. Period search results are improved using the time-series of the first principal component due to its optimized signal-to-noise ratio. The presence of correlated excess variations in the multivariate time-series enables the detection of weaker variability. Furthermore, the direction of the largest variance differs for certain types of variable stars. This can be used as an efficient attribute for classification. The application of the method to a subsample of Sloan Digital Sky Survey Stripe 82 data yielded 132 high-amplitude delta Scuti variables. We also found 129 new RR Lyrae variables, complementary to the catalogue of Sesar et al., extending the halo area mapped by Stripe 82 RR Lyrae stars towards the Galactic bulge. The sample also comprises 25 multiperiodic or Blazhko RR Lyrae stars. (8 data files).

  8. Portable XRF and principal component analysis for bill characterization in forensic science.

    PubMed

    Appoloni, C R; Melquiades, F L

    2014-02-01

    Several modern techniques have been applied to prevent counterfeiting of money bills. The objective of this study was to demonstrate the potential of the Portable X-ray Fluorescence (PXRF) technique and the multivariate analysis method of Principal Component Analysis (PCA) for classification of bills for use in forensic science. Bills of Dollar, Euro and Real (Brazilian currency) were measured directly at different colored regions, without any previous preparation. Spectra interpretation allowed the identification of Ca, Ti, Fe, Cu, Sr, Y, Zr and Pb. PCA separated the bills into three groups, with subgroups among the Brazilian currency. In conclusion, the samples were classified according to their origin, identifying the elements responsible for differentiation and the basic pigment composition. PXRF allied to multivariate discriminant methods is a promising technique for rapid and non-destructive identification of false bills in forensic science. Copyright © 2013 Elsevier Ltd. All rights reserved.

  9. Typification of cider brandy on the basis of cider used in its manufacture.

    PubMed

    Rodríguez Madrera, Roberto; Mangas Alonso, Juan J

    2005-04-20

    A study of typification of cider brandies on the basis of the origin of the raw material used in their manufacture was conducted using chemometric techniques (principal component analysis, linear discriminant analysis, and Bayesian analysis) together with their composition in volatile compounds, as analyzed by gas chromatography with flame ionization to detect the major volatiles and by mass spectrometry to detect the minor ones. Significant principal components computed by a double cross-validation procedure allowed the structure of the database to be visualized as a function of the raw material, that is, cider made from fresh apple juice versus cider made from apple juice concentrate. Feasible and robust discriminant rules were computed and validated by a cross-validation procedure that allowed the authors to classify fresh and concentrate cider brandies, obtaining classification hits of >92%. The most discriminating variables for typifying cider brandies according to their raw material were 1-butanol and ethyl hexanoate.

  10. Classification of adulterated honeys by multivariate analysis.

    PubMed

    Amiry, Saber; Esmaiili, Mohsen; Alizadeh, Mohammad

    2017-06-01

    In this research, honey samples were adulterated with date syrup (DS) and invert sugar syrup (IS) at three concentrations (7%, 15% and 30%). A total of 102 adulterated samples were prepared in six batches with 17 replications for each batch. For each sample, 32 parameters including color indices, rheological, physical, and chemical parameters were determined. To classify the samples based on type and concentration of adulterant, a multivariate analysis was applied using principal component analysis (PCA) followed by a linear discriminant analysis (LDA). Then, 21 principal components (PCs) were selected in five sets. Approximately two-thirds were identified correctly using color indices (62.75%) or rheological properties (67.65%). A powerful discrimination was obtained using physical properties (97.06%), and the best separations were achieved using two sets of chemical properties (set 1: lactone, diastase activity, sucrose - 100%) (set 2: free acidity, HMF, ash - 95%). Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Determination of butter adulteration with margarine using Raman spectroscopy.

    PubMed

    Uysal, Reyhan Selin; Boyaci, Ismail Hakki; Genis, Hüseyin Efe; Tamer, Ugur

    2013-12-15

    In this study, adulteration of butter with margarine was analysed using Raman spectroscopy combined with chemometric methods (principal component analysis (PCA), principal component regression (PCR), partial least squares (PLS)) and artificial neural networks (ANNs). Different butter and margarine samples were mixed at various concentrations ranging from 0% to 100% w/w. PCA analysis was applied for the classification of butters, margarines and mixtures. PCR, PLS and ANN were used for the detection of adulteration ratios of butter. Models were created using a calibration data set and developed models were evaluated using a validation data set. The coefficient of determination (R²) values between actual and predicted values obtained for PCR, PLS and ANN for the validation data set were 0.968, 0.987 and 0.978, respectively. In conclusion, a combination of Raman spectroscopy with chemometrics and ANN methods can be applied for testing butter adulteration. Copyright © 2013 Elsevier Ltd. All rights reserved.

  12. Linear Discriminant Analysis Achieves High Classification Accuracy for the BOLD fMRI Response to Naturalistic Movie Stimuli

    PubMed Central

    Mandelkow, Hendrik; de Zwart, Jacco A.; Duyn, Jeff H.

    2016-01-01

    Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI). However, conventional fMRI analysis based on statistical parametric mapping (SPM) and the general linear model (GLM) is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA), have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past, this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms known as Nearest Neighbor (NN), Gaussian Naïve Bayes (GNB), and (regularized) Linear Discriminant Analysis (LDA) in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie. Results show that LDA regularized by principal component analysis (PCA) achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2 s apart during a 300 s movie (chance level 0.7% = 2 s/300 s). The largest source of classification errors were autocorrelations in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. 
    In light of these results, the combination of naturalistic movie stimuli and classification analysis in fMRI experiments may prove to be a sensitive tool for the assessment of changes in natural cognitive processes under experimental manipulation. PMID:27065832

  13. Molecular classification of pesticides including persistent organic pollutants, phenylurea and sulphonylurea herbicides.

    PubMed

    Torrens, Francisco; Castellano, Gloria

    2014-06-05

    Pesticide residues in wine were analyzed by liquid chromatography-tandem mass spectrometry. Retentions are modelled by structure-property relationships. Bioplastic evolution is an evolutionary perspective conjugating effect of acquired characters and evolutionary indeterminacy-morphological determination-natural selection principles; its application to design co-ordination index barely improves correlations. Fractal dimensions and partition coefficient differentiate pesticides. Classification algorithms are based on information entropy and its production. Pesticides allow a structural classification by nonplanarity, and number of O, S, N and Cl atoms and cycles; different behaviours depend on number of cycles. The novelty of the approach is that the structural parameters are related to retentions. Classification algorithms are based on information entropy. When applying procedures to moderate-sized sets, excessive results appear compatible with data suffering a combinatorial explosion. However, equipartition conjecture selects criterion resulting from classification between hierarchical trees. Information entropy permits classifying compounds agreeing with principal component analyses. Periodic classification shows that pesticides in the same group present similar properties; those also in equal period, maximum resemblance. The advantage of the classification is to predict the retentions for molecules not included in the categorization. Classification extends to phenyl/sulphonylureas and the application will be to predict their retentions.

  14. Assessing Footwear Effects from Principal Features of Plantar Loading during Running.

    PubMed

    Trudeau, Matthieu B; von Tscharner, Vinzenz; Vienneau, Jordyn; Hoerzer, Stefan; Nigg, Benno M

    2015-09-01

    The effects of footwear on the musculoskeletal system are commonly assessed by interpreting the resultant force at the foot during the stance phase of running. However, this approach overlooks loading patterns across the entire foot. An alternative technique for assessing foot loading across different footwear conditions is possible using comprehensive analysis tools that extract different foot loading features, thus enhancing the functional interpretation of the differences across different interventions. The purpose of this article was to use pattern recognition techniques to develop and use a novel comprehensive method for assessing the effects of different footwear interventions on plantar loading. A principal component analysis was used to extract different loading features from the stance phase of running, and a support vector machine (SVM) was used to determine whether and how these loading features were different across three shoe conditions. The results revealed distinct loading features at the foot during the stance phase of running. The loading features determined from the principal component analysis allowed successful classification of all three shoe conditions using the SVM. Several differences were found in the location and timing of the loading across each pairwise shoe comparison using the output from the SVM. The analysis approach proposed can successfully be used to compare different loading patterns with a much greater resolution than has been reported previously. This study has several important applications. One such application is that it would not be relevant for a user to select a shoe or for a manufacturer to alter a shoe's construction if the classification across shoe conditions would not have been significant.

  15. Rapid fingerprinting of white wine oxidizable fraction and classification of white wines using disposable screen printed sensors and derivative voltammetry.

    PubMed

    Ugliano, Maurizio

    2016-12-01

    This work describes the application of disposable screen printed carbon paste sensors for the analysis of the main white wine oxidizable compounds as well as for the rapid fingerprinting and classification of white wines from different grape varieties. The response of individual white wine antioxidants such as flavanols, flavanol derivatives, phenolic acids, SO2 and ascorbic acid was first assessed in model wine. Analysis of commercial white wines gave voltammograms featuring two unresolved anodic waves corresponding to the oxidation of different compounds, mostly phenolic antioxidants. Calculation of the first order derivative of measured current vs. applied potential allowed resolving these two waves, highlighting the occurrence of several electrode processes corresponding to the oxidation of individual wine components. Through the application of Principal Component Analysis (PCA), derivative voltammograms were used to discriminate among wines of different varieties. Copyright © 2016 Elsevier Ltd. All rights reserved.
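
    Illustrative sketch (not taken from the cited work): the idea that a first-order derivative can resolve two overlapping anodic waves can be tried on synthetic data. The sigmoidal wave shape, the half-wave potentials (0.45 V and 0.75 V), the width and the half-maximum peak threshold are all assumptions chosen for illustration only.

```python
import numpy as np

def logistic_wave(potential, half_wave, width):
    """Idealised sigmoidal anodic wave of unit height (hypothetical shape)."""
    return 1.0 / (1.0 + np.exp(-(potential - half_wave) / width))

# Synthetic voltammogram made of two overlapping waves (assumed values).
potential = np.linspace(0.0, 1.2, 601)  # applied potential, V
current = logistic_wave(potential, 0.45, 0.05) + logistic_wave(potential, 0.75, 0.05)

# First-order derivative of the measured current vs. applied potential.
d_current = np.gradient(current, potential)

# Local maxima of the derivative that exceed half of its global maximum.
interior = d_current[1:-1]
is_peak = (
    (interior > d_current[:-2])
    & (interior >= d_current[2:])
    & (interior > 0.5 * d_current.max())
)
peak_potentials = potential[1:-1][is_peak]
print(peak_potentials)
```

    The raw current rises monotonically, so the two waves are hard to tell apart by eye, whereas the derivative shows one maximum per wave.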

  16. An explorative childhood pneumonia analysis based on ultrasonic imaging texture features

    NASA Astrophysics Data System (ADS)

    Zenteno, Omar; Diaz, Kristians; Lavarello, Roberto; Zimic, Mirko; Correa, Malena; Mayta, Holger; Anticona, Cynthia; Pajuelo, Monica; Oberhelman, Richard; Checkley, William; Gilman, Robert H.; Figueroa, Dante; Castañeda, Benjamín.

    2015-12-01

    According to the World Health Organization, pneumonia is the respiratory disease with the highest pediatric mortality rate, accounting for 15% of all deaths of children under 5 years old worldwide. The diagnosis of pneumonia is commonly made by clinical criteria with support from ancillary studies and also laboratory findings. Chest imaging is commonly done with chest X-rays and occasionally with a chest CT scan. Lung ultrasound is a promising alternative for chest imaging; however, interpretation is subjective and requires adequate training. In the present work, a two-class classification algorithm based on four Gray-level co-occurrence matrix texture features (i.e., Contrast, Correlation, Energy and Homogeneity) extracted from lung ultrasound images from children aged between six months and five years is presented. Ultrasound data were collected using an L14-5/38 linear transducer. The data consisted of 22 positive- and 68 negative-diagnosed B-mode cine-loops selected by a medical expert and captured in the facilities of the Instituto Nacional de Salud del Niño (Lima, Peru), for a total number of 90 videos obtained from twelve children diagnosed with pneumonia. The classification capacity of each feature was explored independently and the optimal threshold was selected by a receiver operating characteristic (ROC) curve analysis. In addition, a principal component analysis was performed to evaluate the combined performance of all the features. Contrast and correlation were the two most significant features. The classification performance of these two features by principal components was evaluated. The results revealed 82% sensitivity, 76% specificity, 78% accuracy and 0.85 area under the ROC.
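
    Illustrative sketch (not the authors' implementation): the four gray-level co-occurrence matrix (GLCM) texture features named above can be computed with NumPy. The horizontal one-pixel offset, the function name glcm_features and the exact feature definitions are assumptions; in particular, energy is taken here as the sum of squared probabilities and homogeneity uses 1 + |i - j| in the denominator, whereas some libraries use slightly different conventions.

```python
import numpy as np

def glcm_features(image, levels):
    """Texture features from a horizontal-neighbour co-occurrence matrix."""
    img = np.asarray(image).astype(int)
    left = img[:, :-1].ravel()   # reference pixels
    right = img[:, 1:].ravel()   # neighbours one pixel to the right
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (left, right), 1.0)  # count co-occurring gray-level pairs
    p = glcm / glcm.sum()                # joint probabilities

    i, j = np.indices(p.shape)
    mu_i, mu_j = np.sum(i * p), np.sum(j * p)
    sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * p))
    sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * p))
    denom = sd_i * sd_j
    covariance = np.sum((i - mu_i) * (j - mu_j) * p)
    return {
        "contrast": float(np.sum((i - j) ** 2 * p)),
        # Correlation is undefined for a constant image, so NaN is returned.
        "correlation": float(covariance / denom) if denom > 0 else float("nan"),
        "energy": float(np.sum(p ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + np.abs(i - j)))),
    }

# Tiny two-level checkerboard, used only to exercise the function.
features = glcm_features([[0, 1], [1, 0]], levels=2)
print(features)
```

    In the checkerboard every horizontal neighbour has the opposite gray level, so contrast is maximal for two levels and the correlation is negative.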

  17. Estimating persistence of brominated and chlorinated organic pollutants in air, water, soil, and sediments with the QSPR-based classification scheme.

    PubMed

    Puzyn, T; Haranczyk, M; Suzuki, N; Sakurai, T

    2011-02-01

    We have estimated degradation half-lives of both brominated and chlorinated dibenzo-p-dioxins (PBDDs and PCDDs), furans (PBDFs and PCDFs), biphenyls (PBBs and PCBs), naphthalenes (PBNs and PCNs), diphenyl ethers (PBDEs and PCDEs) as well as selected unsubstituted polycyclic aromatic hydrocarbons (PAHs) in air, surface water, surface soil, and sediments (a total of 1,431 compounds in four compartments). Next, we compared the persistence between chloro- (relatively well-studied) and bromo- (less studied) analogs. The predictions have been performed based on the quantitative structure-property relationship (QSPR) scheme using a k-nearest neighbors (kNN) classifier and a semi-quantitative system of persistence classes. The classification models utilized principal components derived from the principal component analysis of a set of 24 constitutional and quantum mechanical descriptors as input variables. Accuracies of classification (based on an external validation) were 86, 85, 87, and 75% for air, surface water, surface soil, and sediments, respectively. The persistence of all chlorinated species increased with increasing halogenation degree. In the case of brominated organic pollutants (Br-OPs), the trend was the same for air and sediments. However, we noticed the opposite trend for persistence in surface water and soil. The results suggest that, due to high photoreactivity of C-Br chemical bonds, photolytic processes occurring in surface water and soil are able to play a significant role in transforming and removing Br-OPs from these compartments. This contribution is the first attempt at classifying Br-OPs and Cl-OPs together according to their persistence in particular environmental compartments.
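
    Illustrative sketch (not the authors' model): a kNN classifier that takes principal component scores as input can be outlined as follows. The data are synthetic; the two well-separated classes, the choice of 3 components, k = 3 and the 60/20 split are assumptions, and only the count of 24 descriptors echoes the abstract.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the training mean and the leading principal axes (as rows)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def knn_predict(train_scores, train_labels, test_scores, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    diff = test_scores[:, None, :] - train_scores[None, :, :]
    d2 = (diff ** 2).sum(axis=2)                 # shape (n_test, n_train)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = train_labels[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Synthetic stand-in for a descriptor table: two classes, 24 descriptors.
rng = np.random.default_rng(0)
n_per_class, n_descriptors = 40, 24
X = np.vstack([
    rng.normal(0.0, 1.0, (n_per_class, n_descriptors)),
    rng.normal(5.0, 1.0, (n_per_class, n_descriptors)),
])
y = np.repeat([0, 1], n_per_class)

# External validation: PCA is fitted on the training part only.
order = rng.permutation(len(y))
train, test = order[:60], order[60:]
mean, components = fit_pca(X[train], n_components=3)
train_scores = (X[train] - mean) @ components.T
test_scores = (X[test] - mean) @ components.T

predicted = knn_predict(train_scores, y[train], test_scores, k=3)
accuracy = float((predicted == y[test]).mean())
print(accuracy)
```

    Fitting the PCA on the training compounds only keeps the held-out set external, in the spirit of the external validation mentioned in the abstract.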

  18. Dimensionality reduction for the quantitative evaluation of a smartphone-based Timed Up and Go test.

    PubMed

    Palmerini, Luca; Mellone, Sabato; Rocchi, Laura; Chiari, Lorenzo

    2011-01-01

    The Timed Up and Go is a clinical test to assess mobility in the elderly and in Parkinson's disease. Lately instrumented versions of the test are being considered, where inertial sensors assess motion. To improve the pervasiveness, ease of use, and cost, we consider a smartphone's accelerometer as the measurement system. Several parameters (usually highly correlated) can be computed from the signals recorded during the test. To avoid redundancy and obtain the features that are most sensitive to the locomotor performance, a dimensionality reduction was performed through principal component analysis (PCA). Forty-nine healthy subjects of different ages were tested. PCA was performed to extract new features (principal components) which are not redundant combinations of the original parameters and account for most of the data variability. They can be useful for exploratory analysis and outlier detection. Then, a reduced set of the original parameters was selected through correlation analysis with the principal components. This set could be recommended for studies based on healthy adults. The proposed procedure could be used as a first-level feature selection in classification studies (i.e. healthy-Parkinson's disease, fallers-non fallers) and could allow, in the future, a complete system for movement analysis to be incorporated in a smartphone.
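
    Illustrative sketch (not the authors' procedure): selecting a reduced set of original parameters through their correlation with the principal components might look like this. The rule used here (one parameter per leading component, the one with the largest absolute correlation) and the synthetic data are assumptions; only the sample size of 49 echoes the abstract.

```python
import numpy as np

def select_by_pc_correlation(X, n_components):
    """For each leading PC, pick the parameter most correlated with its scores."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)       # standardise the parameters
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ vt[:n_components].T               # one column of scores per PC
    selected = []
    for pc in range(n_components):
        corr = np.array([
            np.corrcoef(Z[:, j], scores[:, pc])[0, 1] for j in range(Z.shape[1])
        ])
        if selected:
            corr[selected] = 0.0                   # never pick a parameter twice
        selected.append(int(np.argmax(np.abs(corr))))
    return selected

# Synthetic data: parameters 0-2 share one latent factor, 3-4 share another.
rng = np.random.default_rng(1)
n_subjects = 49
a = rng.normal(size=n_subjects)
b = rng.normal(size=n_subjects)
noise = 0.2 * rng.normal(size=(n_subjects, 5))
X = np.column_stack([a, a, a, b, b]) + noise

selected = select_by_pc_correlation(X, n_components=2)
print(selected)
```

    Each redundant group of parameters is represented by a single member, which is the kind of first-level feature selection the abstract describes.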

  19. [Principal component analysis and cluster analysis of inorganic elements in sea cucumber Apostichopus japonicus].

    PubMed

    Liu, Xiao-Fang; Xue, Chang-Hu; Wang, Yu-Ming; Li, Zhao-Jie; Xue, Yong; Xu, Jie

    2011-11-01

    The present study investigates the feasibility of multi-element analysis for determining the geographical origin of sea cucumber Apostichopus japonicus, and selects effective tracers for assessing that origin. The contents of elements such as Al, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Mo, Cd, Hg and Pb in sea cucumber Apostichopus japonicus samples from seven places of geographical origin were determined by means of ICP-MS. The results were used to develop an element database. Cluster analysis (CA) and principal component analysis (PCA) were applied to differentiate the geographical origin of the samples. Three principal components, which accounted for over 89% of the total variance, were extracted from the standardized data. The results of Q-type cluster analysis showed that the 26 samples could be clustered reasonably into five groups; the classification results were significantly associated with the marine distribution of the sea cucumber Apostichopus japonicus samples. CA and PCA were effective methods for element analysis of sea cucumber Apostichopus japonicus samples. The contents of the mineral elements in the samples were good chemical descriptors for differentiating their geographical origins.
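
    Illustrative sketch (not the authors' computation): the share of total variance carried by the leading principal components of standardized data can be obtained from the singular values. The toy matrix below is hypothetical and is not the sea cucumber measurements; the 0.89 threshold merely echoes the figure quoted in the abstract.

```python
import numpy as np

def components_needed(X, threshold):
    """Number of PCs of the standardised data needed to reach `threshold`."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)          # standardise each element
    singular_values = np.linalg.svd(Z, compute_uv=False)
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    cumulative = np.cumsum(explained)
    n_components = int(np.searchsorted(cumulative, threshold) + 1)
    return n_components, cumulative

# Toy table: four "elements" driven by two independent patterns t and u.
t = np.array([-1.0, -1.0, 1.0, 1.0])
u = np.array([-1.0, 1.0, -1.0, 1.0])
X = np.column_stack([t, 2.0 * t, u, -u])

n_components, cumulative = components_needed(X, threshold=0.89)
print(n_components, cumulative)
```

    Because the four columns are built from only two patterns, two components carry all of the variance in this toy case.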

  20. Classification of fMRI resting-state maps using machine learning techniques: A comparative study

    NASA Astrophysics Data System (ADS)

    Gallos, Ioannis; Siettos, Constantinos

    2017-11-01

    We compare the efficiency of Principal Component Analysis (PCA) and nonlinear manifold learning algorithms (ISOMAP and Diffusion maps) for classifying brain maps between groups of schizophrenia patients and healthy controls from fMRI scans during a resting-state experiment. After a standard pre-processing pipeline, we applied spatial Independent component analysis (ICA) to reduce (a) noise and (b) spatial-temporal dimensionality of fMRI maps. On the cross-correlation matrix of the ICA components, we applied PCA, ISOMAP and Diffusion Maps to find an embedded low-dimensional space. Finally, support-vector-machines (SVM) and k-NN algorithms were used to evaluate the performance of the algorithms in classifying between the two groups.

  1. Principals' Leadership Orientation in Relationship to the Classification of Their Schools in New Jersey

    ERIC Educational Resources Information Center

    dela Cruz, Samuel

    2016-01-01

    The relationship of principals' leadership orientations to the classification of their schools in New Jersey was examined in this study. While their role has expanded over the years, school principals continue to be essential in school reform and sustainability efforts. However, they are often overshadowed by the role of teachers. This…

  2. The classification of LANDSAT data for the Orlando, Florida, urban fringe area

    NASA Technical Reports Server (NTRS)

    Walthall, C. L.; Knapp, E. M.

    1978-01-01

    Procedures used to map residential land cover on the Orlando, Florida, urban fringe zone are detailed. The NASA Bureau of the Census Applications Systems Verification and Transfer project and the test site are described as well as the LANDSAT data used as the land cover information sources. Both single-date LANDSAT data processing and multitemporal principal components LANDSAT data processing are described. A summary of significant findings is included.

  3. Classification Techniques for Multivariate Data Analysis.

    DTIC Science & Technology

    1980-03-28

    analysis among biologists, botanists, and ecologists, while some social scientists may refer "typology". Other frequently encountered terms are pattern...the determinantal equation: $|B - \lambda W| = 0$ (42). The solutions $\lambda_i$ are the eigenvalues of the matrix $W^{-1}B$, as in discriminant analysis. There are t non...Statistical Package for Social Sciences (SPSS) (14) subprogram FACTOR was used for the principal components analysis. It is designed both for the factor

  4. Multivariate Quality Control Procedures

    DTIC Science & Technology

    1988-10-01

    CLASSIFICATION OF THIS PAGE PREFACE The mathematical modeling work described in this report was authorized under Project No. IC162706A553, CB Defense and...the sum of the measurements. A CUSUM of the first principal component would detect changes in the overall thickness of the sheet. A linear trend could...development of a unique outlier rule for the specific application. 28 LITERATURE CITED 1. Mood, A.M., Graybill, F.A., and Boes, D.C., Introduction to

  5. Improving protein complex classification accuracy using amino acid composition profile.

    PubMed

    Huang, Chien-Hung; Chou, Szu-Yu; Ng, Ka-Lok

    2013-09-01

    Protein complex prediction approaches are based on the assumptions that complexes have dense protein-protein interactions and high functional similarity between their subunits. We investigated those assumptions by studying the subunits' interaction topology, sequence similarity and molecular function for human and yeast protein complexes. Inclusion of amino acids' physicochemical properties can provide better understanding of protein complex properties. Principal component analysis is carried out to determine the major features. Adopting amino acid composition profile information with the SVM classifier serves as an effective post-processing step for complexes classification. Improvement is based on primary sequence information only, which is easy to obtain. Copyright © 2013 Elsevier Ltd. All rights reserved.

  6. Comparison of classification algorithms for various methods of preprocessing radar images of the MSTAR base

    NASA Astrophysics Data System (ADS)

    Borodinov, A. A.; Myasnikov, V. V.

    2018-04-01

    The present work is devoted to comparing the accuracy of known classification algorithms in the task of recognizing local objects on radar images for various image preprocessing methods. Preprocessing involves speckle noise filtering and normalization of the object orientation in the image by the method of image moments and by a method based on the Hough transform. The following classification algorithms are compared: decision tree, support vector machine, AdaBoost, and random forest. Principal component analysis is used to reduce the dimension. The research is carried out on objects from the MSTAR radar image database. The paper presents the results of the conducted studies.

  7. Assessing therapeutic relevance of biologically interesting, ampholytic substances based on their physicochemical and spectral characteristics with chemometric tools

    NASA Astrophysics Data System (ADS)

    Judycka, U.; Jagiello, K.; Bober, L.; Błażejowski, J.; Puzyn, T.

    2018-06-01

    Chemometric tools were applied to investigate the biological behaviour of ampholytic substances in relation to their physicochemical and spectral properties. Results of the Principal Component Analysis suggest that the size of molecules and their electronic and spectral characteristics are the key properties required to predict the therapeutic relevance of the compounds examined. These properties were used for developing the structure-activity classification model. The classification model allows the therapeutic behaviour of ampholytic substances to be assessed solely on the basis of descriptor values that can be obtained computationally. Thus, prediction is possible without carrying out time-consuming and expensive laboratory tests, which is its main advantage.

  8. Rotationally Invariant Image Representation for Viewing Direction Classification in Cryo-EM

    PubMed Central

    Zhao, Zhizhen; Singer, Amit

    2014-01-01

    We introduce a new rotationally invariant viewing angle classification method for identifying, among a large number of cryo-EM projection images, similar views without prior knowledge of the molecule. Our rotationally invariant features are based on the bispectrum. Each image is denoised and compressed using steerable principal component analysis (PCA) such that rotating an image is equivalent to phase shifting the expansion coefficients. Thus we are able to extend the theory of bispectrum of 1D periodic signals to 2D images. The randomized PCA algorithm is then used to efficiently reduce the dimensionality of the bispectrum coefficients, enabling fast computation of the similarity between any pair of images. The nearest neighbors provide an initial classification of similar viewing angles. In this way, rotational alignment is only performed for images with their nearest neighbors. The initial nearest neighbor classification and alignment are further improved by a new classification method called vector diffusion maps. Our pipeline for viewing angle classification and alignment is experimentally shown to be faster and more accurate than reference-free alignment with rotationally invariant K-means clustering, MSA/MRA 2D classification, and their modern approximations. PMID:24631969
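
    Illustrative sketch (a 1D analogue only, not the authors' 2D pipeline): the bispectrum of a periodic signal is unchanged by a circular shift, because the phase factors introduced by the shift cancel in the triple product. The function name bispectrum and the random test signal are assumptions; the steerable PCA step that turns image rotation into such phase shifts is not reproduced here.

```python
import numpy as np

def bispectrum(signal):
    """B[k1, k2] = F[k1] * F[k2] * conj(F[k1 + k2]) for a periodic 1D signal."""
    f = np.fft.fft(signal)
    k = np.arange(len(signal))
    k_sum = (k[:, None] + k[None, :]) % len(signal)   # frequencies wrap around
    return f[:, None] * f[None, :] * np.conj(f[k_sum])

rng = np.random.default_rng(0)
signal = rng.normal(size=32)
shifted = np.roll(signal, 5)                          # circular shift by 5 samples

# The Fourier coefficients change phase under the shift, the bispectrum does not.
same_fourier = bool(np.allclose(np.fft.fft(signal), np.fft.fft(shifted)))
same_bispectrum = bool(np.allclose(bispectrum(signal), bispectrum(shifted)))
print(same_fourier, same_bispectrum)
```

    Unlike the power spectrum, the bispectrum keeps phase information, which is why it is attractive as a shift-invariant (and, after a suitable expansion, rotation-invariant) descriptor.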

  9. Sunspot Pattern Classification using PCA and Neural Networks (Poster)

    NASA Technical Reports Server (NTRS)

    Rajkumar, T.; Thompson, D. E.; Slater, G. L.

    2005-01-01

    The sunspot classification scheme presented in this paper is considered as a 2-D classification problem on archived datasets, and is not a real-time system. As a first step, it mirrors the Zuerich/McIntosh historical classification system and reproduces classification of sunspot patterns based on preprocessing and neural net training datasets. Ultimately, the project intends to move from more rudimentary schemes, to develop spatial-temporal-spectral classes derived by correlating spatial and temporal variations in various wavelengths to the brightness fluctuation spectrum of the sun in those wavelengths. Once the approach is generalized, then the focus will naturally move from a 2-D to an n-D classification, where "n" includes time and frequency. Here, the 2-D perspective refers both to the actual SOHO Michelson Doppler Imager (MDI) images that are processed and to the fact that a 2-D matrix is created from each image during preprocessing. The 2-D matrix is the result of running Principal Component Analysis (PCA) over the selected dataset images, and the resulting matrices and their eigenvalues are the objects that are stored in a database, classified, and compared. These matrices are indexed according to the standard McIntosh classification scheme.

  10. A hybrid sensing approach for pure and adulterated honey classification.

    PubMed

    Subari, Norazian; Mohamad Saleh, Junita; Md Shakaff, Ali Yeon; Zakaria, Ammar

    2012-10-17

    This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data.

  11. Contribution of non-negative matrix factorization to the classification of remote sensing images

    NASA Astrophysics Data System (ADS)

    Karoui, M. S.; Deville, Y.; Hosseini, S.; Ouamri, A.; Ducrot, D.

    2008-10-01

    Remote sensing has become an unavoidable tool for better managing our environment, generally by producing land cover maps using classification techniques. The classification process requires some pre-processing, especially for data size reduction. The most usual technique is Principal Component Analysis. Another approach consists in regarding each pixel of the multispectral image as a mixture of pure elements contained in the observed area. Using Blind Source Separation (BSS) methods, one can hope to unmix each pixel and to perform the recognition of the classes constituting the observed scene. Our contribution consists in using Non-negative Matrix Factorization (NMF) combined with sparse coding as a solution to BSS, in order to generate new images (which are at least partly separated images) using HRV SPOT images from the Oran area, Algeria. These images are then used as inputs of a supervised classifier integrating textural information. The results of classifications of these "separated" images show a clear improvement (correct pixel classification rate improved by more than 20%) compared to classification of initial (i.e. non separated) images. These results show the contribution of NMF as an attractive pre-processing for classification of multispectral remote sensing imagery.
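
    Illustrative sketch (not the authors' algorithm): treating each pixel as a non-negative mixture of pure elements can be tried with plain NMF using multiplicative updates for the Frobenius-norm objective. The sparse-coding term used in the cited work is not included, and the synthetic pixel-by-band matrix, the rank of 3 and the iteration count are assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor V (pixels x bands) as W @ H with non-negative W and H."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps   # per-pixel abundances
    H = rng.random((rank, V.shape[1])) + eps   # spectra of the pure elements
    errors = []
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(float(np.linalg.norm(V - W @ H)))
    return W, H, errors

# Synthetic scene: 50 pixels, 6 spectral bands, 3 pure elements.
rng = np.random.default_rng(42)
true_abundances = rng.random((50, 3))
true_spectra = rng.random((3, 6))
V = true_abundances @ true_spectra

W, H, errors = nmf(V, rank=3)
print(errors[0], errors[-1])
```

    Each column of W can be reshaped into an image of one recovered component, which is the kind of "separated" image that would then be passed to a classifier.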

  12. Comparative Study of SVM Methods Combined with Voxel Selection for Object Category Classification on fMRI Data

    PubMed Central

    Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li

    2011-01-01

    Background Support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear one. Here, a more effective non-linear SVM using radial basis function (RBF) kernel is compared with linear SVM. Unlike traditional studies, which focused merely on the evaluation of either different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes in terms of classification accuracy and computation time. Methodology/Principal Findings Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) Considering classification accuracy and computation time together, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) could achieve better accuracy in a shorter time. Conclusions/Significance The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy was of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels were the two suggested solutions; if users were more concerned about computational time, RBF SVM with a relatively small set of voxels, with part of the principal components kept as features, was a better choice. PMID:21359184

  13. An efficient rhythmic component expression and weighting synthesis strategy for classifying motor imagery EEG in a brain computer interface

    NASA Astrophysics Data System (ADS)

    Wang, Tao; He, Bin

    2004-03-01

    The recognition of mental states during motor imagery tasks is crucial for EEG-based brain computer interface research. We have developed a new algorithm by means of frequency decomposition and weighting synthesis strategy for recognizing imagined right- and left-hand movements. A frequency range from 5 to 25 Hz was divided into 20 band bins for each trial, and the corresponding envelopes of filtered EEG signals for each trial were extracted as a measure of instantaneous power at each frequency band. The dimensionality of the feature space was reduced from 200 (corresponding to 2 s) to 3 by down-sampling of envelopes of the feature signals, and subsequently applying principal component analysis. The linear discriminant analysis algorithm was then used to classify the features, due to its generalization capability. Each frequency band bin was weighted by a function determined according to the classification accuracy during the training process. The present classification algorithm was applied to a dataset of nine human subjects, and achieved a success rate of classification of 90% in training and 77% in testing. The present promising results suggest that the present classification algorithm can be used in initiating a general-purpose mental state recognition based on motor imagery tasks.

  14. Principal component directed partial least squares analysis for combining nuclear magnetic resonance and mass spectrometry data in metabolomics: application to the detection of breast cancer.

    PubMed

    Gu, Haiwei; Pan, Zhengzheng; Xi, Bowei; Asiago, Vincent; Musselman, Brian; Raftery, Daniel

    2011-02-07

    Nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) are the two most commonly used analytical tools in metabolomics, and their complementary nature makes the combination particularly attractive. A combined analytical approach can improve the potential for providing reliable methods to detect metabolic profile alterations in biofluids or tissues caused by disease, toxicity, etc. In this paper, ¹H NMR spectroscopy and direct analysis in real time (DART)-MS were used for the metabolomics analysis of serum samples from breast cancer patients and healthy controls. Principal component analysis (PCA) of the NMR data showed that the first principal component (PC1) scores could be used to separate cancer from normal samples. However, no such obvious clustering could be observed in the PCA score plot of DART-MS data, even though DART-MS can provide a rich and informative metabolic profile. Using a modified multivariate statistical approach, the DART-MS data were then reevaluated by orthogonal signal correction (OSC) pretreated partial least squares (PLS), in which the Y matrix in the regression was set to the PC1 score values from the NMR data analysis. This approach, and a similar one using the first latent variable from PLS-DA of the NMR data, resulted in a significant improvement of the separation between the disease samples and normals, and a metabolic profile related to breast cancer could be extracted from DART-MS. The new approach allows the disease classification to be expressed on a continuum as opposed to a binary scale and thus better represents the disease and healthy classifications. An improved metabolic profile obtained by combining MS and NMR by this approach may be useful to achieve more accurate disease detection and gain more insight regarding disease mechanisms and biology. Copyright © 2010 Elsevier B.V. All rights reserved.

  15. Q-mode versus R-mode principal component analysis for linear discriminant analysis (LDA)

    NASA Astrophysics Data System (ADS)

    Lee, Loong Chuen; Liong, Choong-Yeun; Jemain, Abdul Aziz

    2017-05-01

    Many studies apply Principal Component Analysis (PCA) as a preliminary visualization method, a variable construction method, or both. The focus of PCA can be on the samples (R-mode PCA) or variables (Q-mode PCA). Traditionally, R-mode PCA has been the usual approach to reduce high-dimensionality data before the application of Linear Discriminant Analysis (LDA) to solve classification problems. The output from PCA is composed of two new matrices known as the loadings and scores matrices. Each matrix can then be used to produce a plot, i.e. the loadings plot aids identification of important variables whereas the scores plot presents the spatial distribution of samples on new axes that are also known as Principal Components (PCs). Fundamentally, the scores matrix always provides the input variables for building a classification model. A recent paper uses Q-mode PCA but the focus of analysis was not on the variables but instead on the samples. As a result, the authors have exchanged the use of the loadings and scores plots: clustering of samples was studied using the loadings plot whereas the scores plot was used to identify important manifest variables. Therefore, the aim of this study is to statistically validate the proposed practice. Evaluation is based on performance of external error obtained from LDA models according to number of PCs. On top of that, bootstrapping was also conducted to evaluate the external error of each of the LDA models. Results show that LDA models produced by PCs from R-mode PCA give logical performance and the matched external errors are also unbiased, whereas the ones produced with Q-mode PCA show the opposite. With that, we conclude that PCs produced from Q-mode PCA are not statistically stable and thus should not be applied to problems of classifying samples, but rather to variables. We hope this paper will provide some insights on the disputable issues.

  16. Circulation types related to lightning activity over Catalonia and the Principality of Andorra

    NASA Astrophysics Data System (ADS)

    Pineda, N.; Esteban, P.; Trapero, L.; Soler, X.; Beck, C.

    In the present study, we use a Principal Component Analysis (PCA) to characterize the surface 6-h circulation types related to substantial lightning activity over the Catalonia area (north-eastern Iberia) and the Principality of Andorra (eastern Pyrenees) from January 2003 to December 2007. The gridded data used for classification of the circulation types is the NCEP Final Analyses of the Global Tropospheric Analyses at 1° resolution over the region 35°N-48°N by 5°W-8°E. Lightning information was collected by the SAFIR lightning detection system operated by the Meteorological Service of Catalonia (SMC), which covers the region studied. We determined nine circulation types on the basis of the S-mode orthogonal rotated Principal Component Analysis. The “extreme scores” principle was applied prior to the assignment of all cases, to obtain the number of final types and their centroids. The distinct differences identified in the resulting mean Sea Level Pressure (SLP) fields enabled us to group the types into three main patterns, taking into account their scale/dynamical origin. The first group of types shows the different distribution of the centres of action at synoptic scale associated with the occurrence of lightning. The second group is connected to mesoscale dynamics, mainly induced by the relief of the Pyrenees. The third group shows types with low gradient SLP patterns in which the lightning activity is a consequence of thermal dynamics (coastal and mountain breezes). Apart from reinforcing the consistency of the groups obtained, analysis of the resulting classification improves our understanding of the geographical distribution and genesis factors of thunderstorm activity in the study area, and provides complementary information for supporting weather forecasting. Thus, the catalogue obtained will provide advances in different climatological and meteorological applications, such as nowcasting products or detection of climate change trends.

  17. Semi-supervised vibration-based classification and condition monitoring of compressors

    NASA Astrophysics Data System (ADS)

    Potočnik, Primož; Govekar, Edvard

    2017-09-01

    Semi-supervised vibration-based classification and condition monitoring of the reciprocating compressors installed in refrigeration appliances is proposed in this paper. The method addresses the problem of industrial condition monitoring where prior class definitions are often not available or difficult to obtain from local experts. The proposed method combines feature extraction, principal component analysis, and statistical analysis for the extraction of initial class representatives, and compares the capability of various classification methods, including discriminant analysis (DA), neural networks (NN), support vector machines (SVM), and extreme learning machines (ELM). The use of the method is demonstrated on a case study which was based on industrially acquired vibration measurements of reciprocating compressors during the production of refrigeration appliances. The paper presents a comparative qualitative analysis of the applied classifiers, confirming the good performance of several nonlinear classifiers. If the model parameters are properly selected, then very good classification performance can be obtained from NN trained by Bayesian regularization, SVM and ELM classifiers. The method can be effectively applied for the industrial condition monitoring of compressors.

  18. Nonlinear features for classification and pose estimation of machined parts from single views

    NASA Astrophysics Data System (ADS)

    Talukder, Ashit; Casasent, David P.

    1998-10-01

    A new nonlinear feature extraction method is presented for classification and pose estimation of objects from single views. The feature extraction method is called the maximum representation and discrimination feature (MRDF) method. The nonlinear MRDF transformations to use are obtained in closed form, and offer significant advantages compared to nonlinear neural network implementations. The features extracted are useful for both object discrimination (classification) and object representation (pose estimation). We consider MRDFs on image data, provide a new 2-stage nonlinear MRDF solution, and show it specializes to well-known linear and nonlinear image processing transforms under certain conditions. We show the use of MRDF in estimating the class and pose of images of rendered solid CAD models of machine parts from single views using a feature-space trajectory neural network classifier. We show new results with better classification and pose estimation accuracy than are achieved by standard principal component analysis and Fukunaga-Koontz feature extraction methods.

  19. Forensic Discrimination of Latent Fingerprints Using Laser-Induced Breakdown Spectroscopy (LIBS) and Chemometric Approaches.

    PubMed

    Yang, Jun-Ho; Yoh, Jack J

    2018-01-01

    A novel technique is reported for separating overlapping latent fingerprints using chemometric approaches that combine laser-induced breakdown spectroscopy (LIBS) and multivariate analysis. The LIBS technique provides the capability of real time analysis and high frequency scanning as well as the data regarding the chemical composition of overlapping latent fingerprints. These spectra offer valuable information for the classification and reconstruction of overlapping latent fingerprints by implementing appropriate statistical multivariate analysis. The current study employs principal component analysis and partial least square methods for the classification of latent fingerprints from the LIBS spectra. This technique was successfully demonstrated through a classification study of four distinct latent fingerprints using classification methods such as soft independent modeling of class analogy (SIMCA) and partial least squares discriminant analysis (PLS-DA). The novel method yielded an accuracy of more than 85% and was proven to be sufficiently robust. Furthermore, through laser scanning analysis at a spatial interval of 125 µm, the overlapping fingerprints were reconstructed as separate two-dimensional forms.

  20. Classification of breast tissue in mammograms using efficient coding.

    PubMed

    Costa, Daniel D; Campos, Lúcio F; Barros, Allan K

    2011-06-24

    Female breast cancer is the major cause of death by cancer in western countries. Efforts in Computer Vision have been made in order to improve the diagnostic accuracy by radiologists. Some methods of lesion diagnosis in mammogram images were developed based on the technique of principal component analysis, which has been used in efficient coding of signals, and on 2D Gabor wavelets, used for computer vision applications and modeling biological vision. In this work, we present a methodology that uses efficient coding along with linear discriminant analysis to distinguish between mass and non-mass from 5090 regions of interest from mammograms. The results show that the best rates of success reached with Gabor wavelets and principal component analysis were 85.28% and 87.28%, respectively. In comparison, the model of efficient coding presented here reached up to 90.07%. Altogether, the results presented demonstrate that independent component analysis successfully performed the efficient coding in order to discriminate mass from non-mass tissues. In addition, we have observed that LDA with ICA bases showed high predictive performance for some datasets and thus provides significant support for a more detailed clinical investigation.

  1. A fuzzy-based data transformation for feature extraction to increase classification performance with small medical data sets.

    PubMed

    Li, Der-Chiang; Liu, Chiao-Wen; Hu, Susan C

    2011-05-01

    Medical data sets are usually small and have very high dimensionality. Too many attributes will make the analysis less efficient and will not necessarily increase accuracy, while too few data will decrease the modeling stability. Consequently, the main objective of this study is to extract the optimal subset of features to increase analytical performance when the data set is small. This paper proposes a fuzzy-based non-linear transformation method to extend classification related information from the original data attribute values for a small data set. Based on the new transformed data set, this study applies principal component analysis (PCA) to extract the optimal subset of features. Finally, we use the transformed data with these optimal features as the input data for a learning tool, a support vector machine (SVM). Six medical data sets: Pima Indians' diabetes, Wisconsin diagnostic breast cancer, Parkinson disease, echocardiogram, BUPA liver disorders dataset, and bladder cancer cases in Taiwan, are employed to illustrate the approach presented in this paper. This research uses the t-test to evaluate the classification accuracy for a single data set; and uses the Friedman test to show the proposed method is better than other methods over the multiple data sets. The experiment results indicate that the proposed method has better classification performance than either PCA or kernel principal component analysis (KPCA) when the data set is small, and suggest creating new purpose-related information to improve the analysis performance. This paper has shown that feature extraction is important as a function of feature selection for efficient data analysis. When the data set is small, using the fuzzy-based transformation method presented in this work to increase the information available produces better results than the PCA and KPCA approaches. Copyright © 2011 Elsevier B.V. All rights reserved.

  2. Intelligent Color Vision System for Ripeness Classification of Oil Palm Fresh Fruit Bunch

    PubMed Central

    Fadilah, Norasyikin; Mohamad-Saleh, Junita; Halim, Zaini Abdul; Ibrahim, Haidi; Ali, Syed Salim Syed

    2012-01-01

    Ripeness classification of oil palm fresh fruit bunches (FFBs) during harvesting is important to ensure that they are harvested during optimum stage for maximum oil production. This paper presents the application of color vision for automated ripeness classification of oil palm FFB. Images of oil palm FFBs of type DxP Yangambi were collected and analyzed using digital image processing techniques. Then the color features were extracted from those images and used as the inputs for Artificial Neural Network (ANN) learning. The performance of the ANN for ripeness classification of oil palm FFB was investigated using two methods: training ANN with full features and training ANN with reduced features based on the Principal Component Analysis (PCA) data reduction technique. Results showed that compared with using full features in ANN, using the ANN trained with reduced features can improve the classification accuracy by 1.66% and is more effective in developing an automated ripeness classifier for oil palm FFB. The developed ripeness classifier can act as a sensor in determining the correct oil palm FFB ripeness category. PMID:23202043
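
    The PCA data reduction step mentioned above can be sketched as follows. This is a minimal illustration only: the abstract does not say how the components were computed, so power iteration is used here as a stand-in, the function names are hypothetical, and the feature values are invented rather than taken from the study.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (column means, unit direction of the first principal component)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue,
    # provided the starting vector is not orthogonal to it.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(rows, mean, direction):
    """Reduce each row to a single score along the given direction."""
    return [sum((r[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for r in rows]

# Invented three-feature samples; the first two features are strongly correlated.
features = [[2.0, 1.9, 0.1], [4.0, 4.1, 0.0], [6.0, 6.0, 0.2], [8.0, 7.9, 0.1]]
mean, pc1 = first_principal_component(features)
scores = project(features, mean, pc1)  # one reduced feature per sample
```

    The reduced scores, rather than the full feature vectors, would then be passed to the classifier.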

  3. Intelligent color vision system for ripeness classification of oil palm fresh fruit bunch.

    PubMed

    Fadilah, Norasyikin; Mohamad-Saleh, Junita; Abdul Halim, Zaini; Ibrahim, Haidi; Syed Ali, Syed Salim

    2012-10-22

    Ripeness classification of oil palm fresh fruit bunches (FFBs) during harvesting is important to ensure that they are harvested during optimum stage for maximum oil production. This paper presents the application of color vision for automated ripeness classification of oil palm FFB. Images of oil palm FFBs of type DxP Yangambi were collected and analyzed using digital image processing techniques. Then the color features were extracted from those images and used as the inputs for Artificial Neural Network (ANN) learning. The performance of the ANN for ripeness classification of oil palm FFB was investigated using two methods: training ANN with full features and training ANN with reduced features based on the Principal Component Analysis (PCA) data reduction technique. Results showed that compared with using full features in ANN, using the ANN trained with reduced features can improve the classification accuracy by 1.66% and is more effective in developing an automated ripeness classifier for oil palm FFB. The developed ripeness classifier can act as a sensor in determining the correct oil palm FFB ripeness category.

  4. Semi-Empirical, First-Principles, and Hybrid Modeling of the Thermosphere to Enhance Data Assimilation

    DTIC Science & Technology

    2015-10-27

    CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 61102F 6. AUTHOR(S) Eric K. Sutton 5d. PROJECT NUMBER 3001 5e. TASK NUMBER PPM00018035...principal components, hybrid model, helium model, neutral composition, low-Earth orbit 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF ABSTRACT 18...difficult force to determine and predict, in the orbit propagation model of low earth orbiting satellites [36]. The drag acceleration vector, ~a

  5. Micro-Raman spectroscopy for identification and classification of UTI bacteria

    NASA Astrophysics Data System (ADS)

    Yogesha, M.; Chawla, Kiran; Acharya, Mahendra; Chidangil, Santhosh; Bankapur, Aseefhali

    2017-07-01

    Urinary tract infection (UTI) is one of the major clinical problems known to mankind, especially among adult women. Conventional methods for identification of UTI causing bacteria are time consuming and expensive. Therefore, a rapid and cost-effective method is desired. In the present study, five bacteria (one Gram-positive and four Gram-negative), most commonly known to cause UTI, have been identified and classified using micro-Raman spectroscopy combined with principal component analysis (PCA).

  6. Bimodal spectroscopic evaluation of ultra violet-irradiated mouse skin inflammatory and precancerous stages: instrumentation, spectral feature extraction/selection and classification (k-NN, LDA and SVM)

    NASA Astrophysics Data System (ADS)

    Díaz-Ayil, G.; Amouroux, M.; Blondel, W. C. P. M.; Bourg-Heckly, G.; Leroux, A.; Guillemin, F.; Granjon, Y.

    2009-07-01

    This paper deals with the development and application of in vivo spatially-resolved bimodal spectroscopy (AutoFluorescence AF and Diffuse Reflectance DR), to discriminate various stages of skin precancer in a preclinical model (UV-irradiated mouse): Compensatory Hyperplasia CH, Atypical Hyperplasia AH and Dysplasia D. A programmable instrumentation was developed for acquiring AF emission spectra using 7 excitation wavelengths: 360, 368, 390, 400, 410, 420 and 430 nm, and DR spectra in the 390-720 nm wavelength range. After various steps of intensity spectra preprocessing (filtering, spectral correction and intensity normalization), several sets of spectral characteristics were extracted and selected based on their discrimination power statistically tested for every pair-wise comparison of histological classes. Data reduction with Principal Components Analysis (PCA) was performed and 3 classification methods were implemented (k-NN, LDA and SVM), in order to compare diagnostic performance of each method. Diagnostic performance was studied and assessed in terms of sensitivity (Se) and specificity (Sp) as a function of the selected features, of the combinations of 3 different inter-fibers distances and of the numbers of principal components, such that: Se and Sp ≈ 100% when discriminating CH vs. others; Sp ≈ 100% and Se > 95% when discriminating Healthy vs. AH or D; Sp ≈ 74% and Se ≈ 63% for AH vs. D.
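
    Sensitivity and specificity, the two figures of merit quoted above, can be computed from predicted and true labels as in the sketch below. The function name and the label lists are invented for illustration and do not reproduce the study's data.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """Se = TP / (TP + FN); Sp = TN / (TN + FP), for one chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)

# Invented example: AH is treated as the positive class, D as the negative class.
y_true = ["AH", "AH", "AH", "D", "D", "D", "D", "AH"]
y_pred = ["AH", "D", "AH", "D", "D", "AH", "D", "AH"]
se, sp = sensitivity_specificity(y_true, y_pred, positive="AH")
```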

  7. Chemometric techniques in oil classification from oil spill fingerprinting.

    PubMed

    Ismail, Azimah; Toriman, Mohd Ekhwan; Juahir, Hafizan; Kassim, Azlina Md; Zain, Sharifuddin Md; Ahmad, Wan Kamaruzaman Wan; Wong, Kok Fah; Retnam, Ananthy; Zali, Munirah Abdul; Mokhtar, Mazlin; Yusri, Mohd Ayub

    2016-10-15

    Extended use of GC-FID and GC-MS in oil spill fingerprinting and matching is significantly important for oil classification from the oil spill sources collected from various areas of Peninsular Malaysia and Sabah (East Malaysia). Oil spill fingerprinting from GC-FID and GC-MS coupled with chemometric techniques (discriminant analysis and principal component analysis) is used as a diagnostic tool to classify the types of oil polluting the water. Clustering and discrimination of oil spill compounds in the water from the actual site of oil spill events are divided into four groups viz. diesel, Heavy Fuel Oil (HFO), Mixture Oil containing Light Fuel Oil (MOLFO) and Waste Oil (WO) according to the similarity of their intrinsic chemical properties. Principal component analysis (PCA) demonstrates that diesel, HFO, MOLFO and WO are types of oil or oil products from complex oil mixtures with a total variance of 85.34% and are identified with various anthropogenic activities related to either intentional releasing of oil or accidental discharge of oil into the environment. Our results show that the use of chemometric techniques is significant in providing independent validation for classifying the types of spilled oil in the investigation of oil spill pollution in Malaysia. This, in consequence would result in cost and time saving in identification of the oil spill sources. Copyright © 2016. Published by Elsevier Ltd.

  8. 47 CFR 64.2305 - Definitions.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... businesses. (c) Primary advertising classification. A primary advertising classification is the principal... advertising classification is the classification of a subscriber to telephone exchange service as a business...' telephone numbers, addresses, or primary advertising classifications (as such classifications are assigned...

  9. 47 CFR 64.2305 - Definitions.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... businesses. (c) Primary advertising classification. A primary advertising classification is the principal... advertising classification is the classification of a subscriber to telephone exchange service as a business...' telephone numbers, addresses, or primary advertising classifications (as such classifications are assigned...

  10. Multivariate classification of small order watersheds in the Quabbin Reservoir Basin, Massachusetts

    USGS Publications Warehouse

    Lent, R.M.; Waldron, M.C.; Rader, J.C.

    1998-01-01

    A multivariate approach was used to analyze hydrologic, geologic, geographic, and water-chemistry data from small order watersheds in the Quabbin Reservoir Basin in central Massachusetts. Eighty three small order watersheds were delineated and landscape attributes defining hydrologic, geologic, and geographic features of the watersheds were compiled from geographic information system data layers. Principal components analysis was used to evaluate 11 chemical constituents collected bi-weekly for 1 year at 15 surface-water stations in order to subdivide the basin into subbasins comprised of watersheds with similar water quality characteristics. Three principal components accounted for about 90 percent of the variance in water chemistry data. The principal components were defined as a biogeochemical variable related to wetland density, an acid-neutralization variable, and a road-salt variable related to density of primary roads. Three subbasins were identified. Analysis of variance and multiple comparisons of means were used to identify significant differences in stream water chemistry and landscape attributes among subbasins. All stream water constituents were significantly different among subbasins. Multiple regression techniques were used to relate stream water chemistry to landscape attributes. Important differences in landscape attributes were related to wetlands, slope, and soil type.

  11. Fast classification of hazelnut cultivars through portable infrared spectroscopy and chemometrics

    NASA Astrophysics Data System (ADS)

    Manfredi, Marcello; Robotti, Elisa; Quasso, Fabio; Mazzucco, Eleonora; Calabrese, Giorgio; Marengo, Emilio

    2018-01-01

    The authentication and traceability of hazelnuts is very important for both the consumer and the food industry, to safeguard the protected varieties and the food quality. This study investigates the use of a portable FTIR spectrometer coupled to multivariate statistical analysis for the classification of raw hazelnuts. The method discriminates hazelnuts from different origins/cultivars based on differences of the signal intensities of their IR spectra. The multivariate classification methods, namely principal component analysis (PCA) followed by linear discriminant analysis (LDA) and partial least square discriminant analysis (PLS-DA), with or without variable selection, allowed a very good discrimination among the groups, with PLS-DA coupled to variable selection providing the best results. Due to the fast analysis, high sensitivity, simplicity and no sample preparation, the proposed analytical methodology could be successfully used to verify the cultivar of hazelnuts, and the analysis can be performed quickly and directly on site.

  12. The COG database: new developments in phylogenetic classification of proteins from complete genomes

    PubMed Central

    Tatusov, Roman L.; Natale, Darren A.; Garkavtsev, Igor V.; Tatusova, Tatiana A.; Shankavaram, Uma T.; Rao, Bachoti S.; Kiryutin, Boris; Galperin, Michael Y.; Fedorova, Natalie D.; Koonin, Eugene V.

    2001-01-01

    The database of Clusters of Orthologous Groups of proteins (COGs), which represents an attempt on a phylogenetic classification of the proteins encoded in complete genomes, currently consists of 2791 COGs including 45 350 proteins from 30 genomes of bacteria, archaea and the yeast Saccharomyces cerevisiae (http://www.ncbi.nlm.nih.gov/COG). In addition, a supplement to the COGs is available, in which proteins encoded in the genomes of two multicellular eukaryotes, the nematode Caenorhabditis elegans and the fruit fly Drosophila melanogaster, and shared with bacteria and/or archaea were included. The new features added to the COG database include information pages with structural and functional details on each COG and literature references, improvements of the COGNITOR program that is used to fit new proteins into the COGs, and classification of genomes and COGs constructed by using principal component analysis. PMID:11125040

  13. A Spacecraft Electrical Characteristics Multi-Label Classification Method Based on Off-Line FCM Clustering and On-Line WPSVM

    PubMed Central

    Li, Ke; Liu, Yi; Wang, Quanxin; Wu, Yalei; Song, Shimin; Sun, Yi; Liu, Tengchong; Wang, Jun; Li, Yang; Du, Shaoyi

    2015-01-01

    This paper proposes a novel multi-label classification method for resolving the spacecraft electrical characteristics problems, which involve processing of many unlabeled test data, high-dimensional features, long computing time and a slow identification rate. Firstly, both the fuzzy c-means (FCM) offline clustering and the principal component feature extraction algorithms are applied for the feature selection process. Secondly, the approximate weighted proximal support vector machine (WPSVM) online classification algorithm is used to reduce the feature dimension and further improve the rate of recognition for spacecraft electrical characteristics. Finally, the data capture contribution method by using thresholds is proposed to guarantee the validity and consistency of the data selection. The experimental results indicate that the method proposed can obtain better data features of the spacecraft electrical characteristics, improve the accuracy of identification and shorten the computing time effectively. PMID:26544549

  14. Unsupervised Feature Learning for Heart Sounds Classification Using Autoencoder

    NASA Astrophysics Data System (ADS)

    Hu, Wei; Lv, Jiancheng; Liu, Dongbo; Chen, Yao

    2018-04-01

    Cardiovascular disease seriously threatens the health of many people. It is usually diagnosed during cardiac auscultation, which is a fast and efficient method of cardiovascular disease diagnosis. In recent years, deep learning approach using unsupervised learning has made significant breakthroughs in many fields. However, to our knowledge, deep learning has not yet been used for heart sound classification. In this paper, we first use the average Shannon energy to extract the envelope of the heart sounds, then find the highest point of S1 to extract the cardiac cycle. We convert the time-domain signals of the cardiac cycle into spectrograms and apply principal component analysis whitening to reduce the dimensionality of the spectrogram. Finally, we apply a two-layer autoencoder to extract the features of the spectrogram. The experimental results demonstrate that the features from the autoencoder are suitable for heart sound classification.

  15. Provenance establishment of coffee using solution ICP-MS and ICP-AES.

    PubMed

    Valentin, Jenna L; Watling, R John

    2013-11-01

    Statistical interpretation of the concentrations of 59 elements, determined using solution based inductively coupled plasma mass spectrometry (ICP-MS) and inductively coupled plasma emission spectroscopy (ICP-AES), was used to establish the provenance of coffee samples from 15 countries across five continents. Data confirmed that the harvest year, degree of ripeness and whether the coffees were green or roasted had little effect on the elemental composition of the coffees. The application of linear discriminant analysis and principal component analysis of the elemental concentrations permitted up to 96.9% correct classification of the coffee samples according to their continent of origin. When samples from each continent were considered separately, up to 100% correct classification of coffee samples into their countries, and plantations of origin was achieved. This research demonstrates the potential of using elemental composition, in combination with statistical classification methods, for accurate provenance establishment of coffee. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. Liquid contrabands classification based on energy dispersive X-ray diffraction and hybrid discriminant analysis

    NASA Astrophysics Data System (ADS)

    YangDai, Tianyi; Zhang, Li

    2016-02-01

    Energy dispersive X-ray diffraction (EDXRD) combined with hybrid discriminant analysis (HDA) has been utilized for classifying the liquid materials for the first time. The XRD spectra of 37 kinds of liquid contrabands and daily supplies were obtained using an EDXRD test bed facility. The unique spectra of different samples reveal XRD's capability to distinguish liquid contrabands from daily supplies. In order to create a system to detect liquid contrabands, the diffraction spectra were subjected to HDA which is the combination of principal components analysis (PCA) and linear discriminant analysis (LDA). Experiments based on the leave-one-out method demonstrate that HDA is a practical method with higher classification accuracy and lower noise sensitivity than the other methods in this application. The study shows the great capability and potential of the combination of XRD and HDA for liquid contrabands classification.
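
    The leave-one-out evaluation mentioned above can be sketched as below. Assumptions are stated plainly: a nearest-centroid classifier stands in for the HDA (PCA plus LDA) model, which is not reproduced here, and the two-feature "spectra" and function names are invented for illustration.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose mean feature vector is closest."""
    sums, counts = {}, {}
    for row, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label):
        return sum((s / counts[label] - xj) ** 2
                   for s, xj in zip(sums[label], x))

    return min(sums, key=squared_distance)

def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and count correct predictions."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            hits += 1
    return hits / len(xs)

# Invented, well-separated samples standing in for reduced diffraction spectra.
spectra = [[1.0, 0.2], [1.1, 0.1], [0.9, 0.3], [3.0, 2.1], [3.2, 1.9], [2.9, 2.0]]
labels = ["daily", "daily", "daily", "contraband", "contraband", "contraband"]
accuracy = leave_one_out_accuracy(spectra, labels)
```

    Because every sample is used for testing exactly once, this scheme suits small collections such as the 37 liquids described above.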

  17. [Classification of Children with Attention-Deficit/Hyperactivity Disorder and Typically Developing Children Based on Electroencephalogram Principal Component Analysis and k-Nearest Neighbor].

    PubMed

    Yang, Jiaojiao; Guo, Qian; Li, Wenjie; Wang, Suhong; Zou, Ling

    2016-04-01

    This paper aims to assist the individual clinical diagnosis of children with attention-deficit/hyperactivity disorder using an electroencephalogram signal detection method. Firstly, in our experiments, we obtained and studied the electroencephalogram signals from fourteen attention-deficit/hyperactivity disorder children and sixteen typically developing children during the classic interference control task of Simon-spatial Stroop, and we completed electroencephalogram data preprocessing including filtering, segmentation, removal of artifacts and so on. Secondly, we selected the subset of electroencephalogram electrodes using the principal component analysis (PCA) method, and we collected the common channels of the optimal electrodes whose occurrence rates were more than 90% in each kind of stimulation. We then extracted the latency (200~450 ms) mean amplitude features of the common electrodes. Finally, we used the k-nearest neighbor (KNN) classifier based on Euclidean distance and the support vector machine (SVM) classifier based on radial basis kernel function to classify. From the experiment, at the same kind of interference control task, the attention-deficit/hyperactivity disorder children showed lower correct response rates and longer reaction time. The N2 emerged in prefrontal cortex while P2 presented in the inferior parietal area when all kinds of stimuli were demonstrated. Meanwhile, the children with attention-deficit/hyperactivity disorder exhibited markedly reduced N2 and P2 amplitude compared to typically developing children. KNN resulted in better classification accuracy than the SVM classifier, and the best classification rate was 89.29% in the StI task. The results showed that the electroencephalogram signals were different in the brain regions of prefrontal cortex and inferior parietal cortex between attention-deficit/hyperactivity disorder and typically developing children during the interference control task, which provided a scientific basis for the clinical diagnosis of attention-deficit/hyperactivity disorder individuals.
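
    A k-nearest neighbor classifier based on Euclidean distance, as named above, can be sketched in a few lines. The function name, the choice k = 3, and the two-feature amplitude values are assumptions made for illustration; they are not the study's data or settings.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training samples closest to x (Euclidean distance)."""
    by_distance = sorted(zip(train_x, train_y),
                         key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Invented mean-amplitude features; "TD" denotes typically developing children.
train_x = [[1.0, 0.8], [1.2, 0.9], [0.9, 1.1], [2.5, 2.4], [2.7, 2.6], [2.4, 2.8]]
train_y = ["ADHD", "ADHD", "ADHD", "TD", "TD", "TD"]

low_amplitude = knn_predict(train_x, train_y, [1.1, 1.0])
high_amplitude = knn_predict(train_x, train_y, [2.6, 2.5])
```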

  18. Automatic loudness control in short-form content for broadcasting.

    PubMed

    Pires, Leandro da S; Vieira, Maurílio N; Yehia, Hani C

    2017-03-01

    During the early years of the International Telecommunication Union (ITU) loudness calculation standard for sound broadcasting [ITU-R (2006), Rec. BS Series, 1770], the need for additional loudness descriptors to evaluate short-form content, such as commercials and live inserts, was identified. This work proposes a loudness control scheme to prevent loudness jumps, which can bother audiences. It employs short-form content audio detection and dynamic range processing methods for the maximum loudness level criteria. Detection is achieved by combining principal component analysis for dimensionality reduction and support vector machines for binary classification. Subsequent processing is based on short-term loudness integrators and Hilbert transformers. The performance was assessed using quality classification metrics and demonstrated through a loudness control example.

  19. Locally linear embedding: dimension reduction of massive protostellar spectra

    NASA Astrophysics Data System (ADS)

    Ward, J. L.; Lumsden, S. L.

    2016-09-01

    We present the results of the application of locally linear embedding (LLE) to reduce the dimensionality of dereddened and continuum subtracted near-infrared spectra using a combination of models and real spectra of massive protostars selected from the Red MSX Source survey data base. A brief comparison is also made with two other dimension reduction techniques; principal component analysis (PCA) and Isomap using the same set of spectra as well as a more advanced form of LLE, Hessian locally linear embedding. We find that whilst LLE certainly has its limitations, it significantly outperforms both PCA and Isomap in classification of spectra based on the presence/absence of emission lines and provides a valuable tool for classification and analysis of large spectral data sets.

  20. Feature extraction through parallel Probabilistic Principal Component Analysis for heart disease diagnosis

    NASA Astrophysics Data System (ADS)

    Shah, Syed Muhammad Saqlain; Batool, Safeera; Khan, Imran; Ashraf, Muhammad Usman; Abbas, Syed Hussnain; Hussain, Syed Adnan

    2017-09-01

    Automatic diagnosis of human diseases is mostly achieved through decision support systems. The performance of these systems is mainly dependent on the selection of the most relevant features. This becomes harder when the dataset contains missing values for the different features. Probabilistic Principal Component Analysis (PPCA) has a reputation for dealing with the problem of missing attribute values. This research presents a methodology which uses the results of medical tests as input, extracts a reduced dimensional feature subset and provides diagnosis of heart disease. The proposed methodology extracts high impact features in a new projection by using PPCA. PPCA extracts the projection vectors which contribute to the highest covariance, and these projection vectors are used to reduce the feature dimension. The selection of projection vectors is done through Parallel Analysis (PA). The feature subset with the reduced dimension is provided to radial basis function (RBF) kernel based Support Vector Machines (SVM). The RBF based SVM serves the purpose of classification into two categories, i.e., Heart Patient (HP) and Normal Subject (NS). The proposed methodology is evaluated through accuracy, specificity and sensitivity over the three datasets of UCI, i.e., Cleveland, Switzerland and Hungarian. The statistical results achieved through the proposed technique are presented in comparison to the existing research showing its impact. The proposed technique achieved an accuracy of 82.18%, 85.82% and 91.30% for the Cleveland, Hungarian and Switzerland datasets, respectively.

  1. Morphological analysis of Trichomycterus areolatus Valenciennes, 1846 from southern Chilean rivers using a truss-based system (Siluriformes, Trichomycteridae).

    PubMed

    Colihueque, Nelson; Corrales, Olga; Yáñez, Miguel

    2017-01-01

    Trichomycterus areolatus Valenciennes, 1846 is a small endemic catfish inhabiting the Andean river basins of Chile. In this study, the morphological variability of three T. areolatus populations, collected in two river basins from southern Chile, was assessed with multivariate analyses, including principal component analysis (PCA) and discriminant function analysis (DFA). It is hypothesized that populations must segregate morphologically from each other based on the river basin that they were sampled from, since each basin presents relatively particular hydrological characteristics. Significant morphological differences among the three populations were found with PCA (ANOSIM test, r = 0.552, p < 0.0001) and DFA (Wilks's λ = 0.036, p < 0.01). PCA accounted for a total variation of 56.16% by the first two principal components. The first Principal Component (PC1) and PC2 explained 34.72 and 21.44% of the total variation, respectively. The scatter-plot of the first two discriminant functions (DF1 on DF2) also validated the existence of three different populations. In group classification using DFA, 93.3% of the specimens were correctly-classified into their original populations. Of the total of 22 transformed truss measurements, 17 exhibited highly significant ( p < 0.01) differences among populations. The data support the existence of T. areolatus morphological variation across different rivers in southern Chile, likely reflecting the geographic isolation underlying population structure of the species.
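
    The percentages of variation attributed to each principal component follow from the eigenvalues of the covariance matrix: each component's share is its eigenvalue divided by the sum of all eigenvalues. The sketch below uses invented eigenvalues and a hypothetical function name; only the final line uses figures from the abstract, checking that the PC1 and PC2 shares add up to the reported total.

```python
from itertools import accumulate

def explained_variance_percent(eigenvalues):
    """Share of total variance carried by each component, in percent."""
    total = sum(eigenvalues)
    return [100.0 * value / total for value in eigenvalues]

# Invented eigenvalues, sorted from largest to smallest.
eigenvalues = [5.0, 3.0, 1.5, 0.5]
percent = explained_variance_percent(eigenvalues)
cumulative = list(accumulate(percent))  # running total across components

# Figures from the abstract: PC1 and PC2 shares and their reported sum.
reported_total = round(34.72 + 21.44, 2)
```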

  2. Determination of the Characteristics and Classification of Near-Infrared Spectra of Patchouli Oil (Pogostemon Cablin Benth.) from Different Origin

    NASA Astrophysics Data System (ADS)

    Diego, M. C. R.; Purwanto, Y. A.; Sutrisno; Budiastra, I. W.

    2018-05-01

    Research related to the non-destructive method of near-infrared (NIR) spectroscopy in aromatic oil is still in development in Indonesia. The objectives of the study were to determine the characteristics of the near-infrared spectra of patchouli oil and to classify it based on its origin. The samples were selected from seven different places in Indonesia (Bogor and Garut from West Java; Aceh and Jambi from Sumatra; and Konawe, Masamba and Kolaka from Sulawesi Island). The spectral data of patchouli oil were obtained by an FT-NIR spectrometer at wavelengths of 1000-2500 nm, after which the samples were subjected to composition analysis using Gas Chromatography-Mass Spectrometry. The transmittance and absorbance spectra were analyzed and then principal component analysis (PCA) was carried out. Discriminant analysis (DA) of the principal components was developed to classify patchouli oil based on its origin. The results show that PCA of both spectra (transmittance and absorbance) gives similar results for discriminating the seven types of patchouli oil due to their distribution and behavior. The DA of the three principal components in both processed spectra could classify patchouli oil accurately. This result showed that NIR spectroscopy can be successfully used to classify patchouli oil based on its origin.

  3. Quad-polarized synthetic aperture radar and multispectral data classification using classification and regression tree and support vector machine-based data fusion system

    NASA Astrophysics Data System (ADS)

    Bigdeli, Behnaz; Pahlavani, Parham

    2017-01-01

    Interpretation of synthetic aperture radar (SAR) data processing is difficult because the geometry and spectral range of SAR differ from those of optical imagery. Consequently, SAR imaging can provide data complementary to multispectral (MS) optical remote sensing techniques because it does not depend on solar illumination and weather conditions. This study presents a multisensor fusion of SAR and MS data based on the use of classification and regression tree (CART) and support vector machine (SVM) through a decision fusion system. First, different feature extraction strategies were applied to the SAR and MS data to produce more spectral and textural information. To overcome the redundancy and correlation between features, an intrinsic dimension estimation method based on noise-whitened Harsanyi, Farrand, and Chang determines the proper dimension of the features. Then, principal component analysis and independent component analysis were applied to the stacked feature space of the two data sets. Afterward, SVM and CART classified each reduced feature space. Finally, a fusion strategy was utilized to fuse the classification results. To show the effectiveness of the proposed methodology, single classification on each data set was compared to the obtained results. A coregistered Radarsat-2 and WorldView-2 data set from San Francisco, USA, was available to examine the effectiveness of the proposed method. The results show that combinations of SAR data with optical sensor data based on the proposed methodology improve the classification results for most of the classes. The proposed fusion method provided approximately 93.24% and 95.44% for two different areas of the data.

  4. Characterization of Hatay honeys according to their multi-element analysis using ICP-OES combined with chemometrics.

    PubMed

    Yücel, Yasin; Sultanoğlu, Pınar

    2013-09-01

    Chemical characterisation has been carried out on 45 honey samples collected from the Hatay region of Turkey. The concentrations of 17 elements were determined by inductively coupled plasma optical emission spectrometry (ICP-OES). Ca, K, Mg and Na were the most abundant elements, with mean contents of 219.38, 446.93, 49.06 and 95.91 mg kg⁻¹, respectively. The trace element mean contents ranged between 0.03 and 15.07 mg kg⁻¹. Chemometric methods such as principal component analysis (PCA) and cluster analysis (CA) techniques were applied to classify honey according to mineral content. The first most important principal component (PC) was strongly associated with the values of Al, B, Cd and Co. CA showed eight clusters corresponding to the eight botanical origins of honey. PCA explained 75.69% of the variance with the first six PC variables. Chemometric analysis of the analytical data allowed the accurate classification of the honey samples according to origin. Copyright © 2013 Elsevier Ltd. All rights reserved.

  5. Performance analysis of a Principal Component Analysis ensemble classifier for Emotiv headset P300 spellers.

    PubMed

    Elsawy, Amr S; Eldawlatly, Seif; Taher, Mohamed; Aly, Gamal M

    2014-01-01

    The current trend to use Brain-Computer Interfaces (BCIs) with mobile devices mandates the development of efficient EEG data processing methods. In this paper, we demonstrate the performance of a Principal Component Analysis (PCA) ensemble classifier for P300-based spellers. We recorded EEG data from multiple subjects using the Emotiv neuroheadset in the context of a classical oddball P300 speller paradigm. We compare the performance of the proposed ensemble classifier to the performance of traditional feature extraction and classifier methods. Our results demonstrate the capability of the PCA ensemble classifier to classify P300 data recorded using the Emotiv neuroheadset with an average accuracy of 86.29% on cross-validation data. In addition, offline testing of the recorded data reveals an average classification accuracy of 73.3% that is significantly higher than that achieved using traditional methods. Finally, we demonstrate the effect of the parameters of the P300 speller paradigm on the performance of the method.

  6. Principal components technique analysis for vegetation and land use discrimination. [Brazilian cerrados

    NASA Technical Reports Server (NTRS)

    Parada, N. D. J. (Principal Investigator); Formaggio, A. R.; Dossantos, J. R.; Dias, L. A. V.

    1984-01-01

    An automatic pre-processing technique called Principal Components (PRINCO) was evaluated for analyzing LANDSAT digitized data for land use and vegetation cover on the Brazilian cerrados. The chosen pilot area, 223/67 of MSS/LANDSAT 3, was classified on a GE Image-100 System through a maximum-likelihood algorithm (MAXVER). The same procedure was applied to the PRINCO-treated image. PRINCO consists of a linear transformation performed on the original bands in order to eliminate the information redundancy of the LANDSAT channels. After PRINCO, only two channels were used, thus reducing computer effort. The grey levels of the original channels and the PRINCO channels for the five identified classes (grassland, "cerrado", burned areas, anthropic areas, and gallery forest) were obtained through the MAXVER algorithm. This algorithm also presented the average performance for both cases. In order to evaluate the results, the Jeffreys-Matusita distance (JM-distance) between classes was computed. The classification matrix, obtained through MAXVER after PRINCO pre-processing, showed approximately the same average performance in class separability.
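
    The Jeffreys-Matusita distance mentioned here can be sketched for the simplest case. The code below assumes two classes with Gaussian grey-level distributions in a single channel and uses the common convention JM = 2(1 - exp(-B)), where B is the Bhattacharyya distance, so that JM ranges from 0 (identical classes) to 2 (fully separable). The abstract does not state which convention or how many channels were used, and the class statistics below are invented.

```python
import math


def bhattacharyya_1d(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two univariate Gaussian classes."""
    var_avg = (var1 + var2) / 2.0
    term_mean = (mean1 - mean2) ** 2 / (8.0 * var_avg)            # mean separation
    term_var = 0.5 * math.log(var_avg / math.sqrt(var1 * var2))   # variance mismatch
    return term_mean + term_var


def jm_distance_1d(mean1, var1, mean2, var2):
    """Jeffreys-Matusita distance, bounded between 0 and 2."""
    b = bhattacharyya_1d(mean1, var1, mean2, var2)
    return 2.0 * (1.0 - math.exp(-b))


# Hypothetical grey-level statistics (mean, variance) for two land-cover classes
print(jm_distance_1d(30.0, 16.0, 42.0, 25.0))
```

    Because the measure saturates at 2, it is often preferred to a raw distance between class means when comparing separability across band sets.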

  7. SESNPCA: Principal Component Analysis Applied to Stripped-Envelope Core-Collapse Supernovae

    NASA Astrophysics Data System (ADS)

    Williamson, Marc; Bianco, Federica; Modjaz, Maryam

    2018-01-01

    In the new era of time-domain astronomy, it will become increasingly important to have rigorous, data driven models for classifying transients, including supernovae (SNe). We present the first application of principal component analysis (PCA) to stripped-envelope core-collapse supernovae (SESNe). Previous studies of SNe types Ib, IIb, Ic, and broad-line Ic (Ic-BL) focus only on specific spectral features, while our PCA algorithm uses all of the information contained in each spectrum. We use one of the largest compiled datasets of SESNe, containing over 150 SNe, each with spectra taken at multiple phases. Our work focuses on 49 SNe with spectra taken 15 ± 5 days after maximum V-band light where better distinctions can be made between SNe type Ib and Ic spectra. We find that spectra of SNe type IIb and Ic-BL are separable from the other types in PCA space, indicating that PCA is a promising option for developing a purely data driven model for SESNe classification.

  8. Texture classification of vegetation cover in high altitude wetlands zone

    NASA Astrophysics Data System (ADS)

    Wentao, Zou; Bingfang, Wu; Hongbo, Ju; Hua, Liu

    2014-03-01

    The aim of this study was to investigate the utility of datasets composed of texture measures and other features for the classification of vegetation cover, specifically wetlands. A QUEST decision tree classifier was applied to a SPOT-5 image sub-scene covering the typical wetlands area in the Three River Sources region in Qinghai province, China. The dataset used for the classification comprised: (1) spectral data and the components of principal component analysis; (2) texture measures derived on a pixel basis; (3) DEM and other ancillary data covering the research area. Image texture is an important characteristic of remote sensing images; it can represent spatial variations with spectral brightness in digital numbers. When the spectral information is not enough to separate the different land covers, the texture information can be used to increase the classification accuracy. The texture measures used in this study were calculated from the GLCM (Gray Level Co-occurrence Matrix); eight frequently used measures were chosen to conduct the classification procedure. The results showed that variance, mean and entropy calculated by GLCM with a 9×9 window were effective in distinguishing different vegetation types in the wetlands zone. The overall accuracy of this method was 84.19% and the Kappa coefficient was 0.8261. The result indicated that the introduction of texture measures can improve the overall accuracy by 12.05% and the overall kappa coefficient by 0.1407 compared with the result using spectral and ancillary data.
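
    The three GLCM measures singled out in this abstract (mean, variance, entropy) can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' procedure: it uses a single horizontal pixel offset, a non-symmetric matrix, and a tiny hand-made patch, whereas the study slid a 9×9 window over a SPOT-5 image. All names are hypothetical.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized grey-level co-occurrence matrix for one pixel offset.

    image: list of rows of integer grey levels in range(levels).
    Assumes the patch is large enough to contain at least one pixel pair.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def glcm_features(p):
    """Return (mean, variance, entropy) of a normalized GLCM."""
    levels = len(p)
    cells = [(i, p[i][j]) for i in range(levels) for j in range(levels)]
    mean = sum(i * v for i, v in cells)
    variance = sum((i - mean) ** 2 * v for i, v in cells)
    entropy = -sum(v * math.log(v) for _, v in cells if v > 0)
    return mean, variance, entropy


# Hypothetical 4x4 patch quantized to 4 grey levels
patch = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
print(glcm_features(glcm(patch, levels=4)))
```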

  9. Empirical evaluation of grouping of lower urinary tract symptoms: principal component analysis of Tampere Ageing Male Urological Study data.

    PubMed

    Pöyhönen, Antti; Häkkinen, Jukka T; Koskimäki, Juha; Hakama, Matti; Tammela, Teuvo L J; Auvinen, Anssi

    2013-03-01

    WHAT'S KNOWN ON THE SUBJECT? AND WHAT DOES THE STUDY ADD?: The ICS has divided LUTS into three groups: storage, voiding and post-micturition symptoms. The classification is based on anatomical, physiological and urodynamic considerations of a theoretical nature. We used principal component analysis (PCA) to determine the inter-correlations of various LUTS, which is a novel approach to research and can strengthen existing knowledge of the phenomenology of LUTS. After we had completed our analyses, another study was published that used a similar approach and results were very similar to those of the present study. We evaluated the constellation of LUTS using PCA of the data from a population-based study that included >4000 men. In our analysis, three components emerged from the 12 LUTS: voiding, storage and incontinence components. Our results indicated that incontinence may be separate from the other storage symptoms and post-micturition symptoms should perhaps be regarded as voiding symptoms. To determine how lower urinary tract symptoms (LUTS) relate to each other and assess if the classification proposed by the International Continence Society (ICS) is consistent with empirical findings. The information on urinary symptoms for this population-based study was collected using a self-administered postal questionnaire in 2004. The questionnaire was sent to 7470 men, aged 30-80 years, from Pirkanmaa County (Finland), of whom 4384 (58.7%) returned the questionnaire. The Danish Prostatic Symptom Score-1 questionnaire was used to evaluate urinary symptoms. Principal component analysis (PCA) was used to evaluate the inter-correlations among various urinary symptoms. The PCA produced a grouping of 12 LUTS into three categories consisting of voiding, storage and incontinence symptoms. Post-micturition symptoms were related to voiding symptoms, but incontinence symptoms were separate from storage symptoms. In the analyses by age group, similar categorization was found at ages 40, 50, 60 and 80 years, but only two groups of symptoms emerged among men aged 70 years. The prevalence among men aged 30 was too low for meaningful analysis. This population-based study suggests that LUTS can be divided into three subgroups consisting of voiding, storage and incontinence symptoms based on their inter-correlations. Our empirical findings suggest an alternative grouping of LUTS. The potential utility of such an approach requires careful consideration. © 2012 BJU International.

  10. Ecoregions and ecodistricts: Ecological regionalizations for the Netherlands' environmental policy

    NASA Astrophysics Data System (ADS)

    Klijn, Frans; de Waal, Rein W.; Oude Voshaar, Jan H.

    1995-11-01

    For communicating data on the state of the environment to policy makers, various integrative frameworks are used, including regional integration. For this kind of integration we have developed two related ecological regionalizations, ecoregions and ecodistricts, which are two levels in a series of classifications for hierarchically nested ecosystems at different spatial scale levels. We explain the compilation of the maps from existing geographical data, demonstrating the relatively holistic, a priori integrated approach. The resulting maps are submitted to discriminant analysis to test the consistency of the use of mapping characteristics, using data on individual abiotic ecosystem components from a national database on a 1-km² grid. This reveals that the spatial patterns of soil, groundwater, and geomorphology correspond with the ecoregion and ecodistrict maps. Differences between the original maps and maps formed by automatically reclassifying 1-km² cells with these discriminant components are found to be few. These differences are discussed against the background of the principal dilemma between deductive, a priori integrated, and inductive, a posteriori, classification.

  11. Symbolic dynamic filtering and language measure for behavior identification of mobile robots.

    PubMed

    Mallapragada, Goutham; Ray, Asok; Jin, Xin

    2012-06-01

    This paper presents a procedure for behavior identification of mobile robots, which requires limited or no domain knowledge of the underlying process. While the features of robot behavior are extracted by symbolic dynamic filtering of the observed time series, the behavior patterns are classified based on language measure theory. The behavior identification procedure has been experimentally validated on a networked robotic test bed by comparison with commonly used tools, namely, principal component analysis for feature extraction and Bayesian risk analysis for pattern classification.

  12. Contrast improvement of terahertz images of thin histopathologic sections

    PubMed Central

    Formanek, Florian; Brun, Marc-Aurèle; Yasuda, Akio

    2011-01-01

    We present terahertz images of 10 μm thick histopathologic sections obtained in reflection geometry with a time-domain spectrometer, and demonstrate improved contrast for sections measured in paraffin with water. Automated segmentation is applied to the complex refractive index data to generate clustered terahertz images distinguishing cancer from healthy tissues. The degree of classification of pixels is then evaluated using registered visible microscope images. Principal component analysis and propagation simulations are employed to investigate the origin and the gain of image contrast. PMID:21326635

  13. Contrast improvement of terahertz images of thin histopathologic sections.

    PubMed

    Formanek, Florian; Brun, Marc-Aurèle; Yasuda, Akio

    2010-12-03

    We present terahertz images of 10 μm thick histopathologic sections obtained in reflection geometry with a time-domain spectrometer, and demonstrate improved contrast for sections measured in paraffin with water. Automated segmentation is applied to the complex refractive index data to generate clustered terahertz images distinguishing cancer from healthy tissues. The degree of classification of pixels is then evaluated using registered visible microscope images. Principal component analysis and propagation simulations are employed to investigate the origin and the gain of image contrast.

  14. PCA based feature reduction to improve the accuracy of decision tree c4.5 classification

    NASA Astrophysics Data System (ADS)

    Nasution, M. Z. F.; Sitompul, O. S.; Ramli, M.

    2018-03-01

    Attribute splitting is a major process in Decision Tree C4.5 classification. However, this process does not have a significant impact on the construction of the decision tree in terms of removing irrelevant features. A major problem in the decision tree classification process is over-fitting, which results from noisy data and irrelevant features. In turn, over-fitting creates misclassification and data imbalance. Many algorithms have been proposed to overcome misclassification and over-fitting in Decision Tree C4.5 classification. Feature reduction is one of the important issues in classification models; it is intended to remove irrelevant data in order to improve accuracy. The feature reduction framework is used to simplify high-dimensional data to low-dimensional data with non-correlated attributes. In this research, we propose a framework for selecting relevant and non-correlated feature subsets. We consider principal component analysis (PCA) for feature reduction to perform non-correlated feature selection and the Decision Tree C4.5 algorithm for the classification. In experiments conducted on the cervical cancer data set from the UCI repository, with 858 instances and 36 attributes, we evaluated the performance of our framework based on accuracy, specificity and precision. Experimental results show that our proposed framework enhances classification accuracy, reaching an accuracy rate of 90.70%.
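
    The claim that PCA yields non-correlated attributes can be checked on a toy case. The sketch below handles only two features, where the covariance matrix can be diagonalized in closed form by a rotation; it is an illustration with invented data and a hypothetical function name, not the framework evaluated in the paper (which used 36 attributes and a C4.5 tree).

```python
import math


def pca_2d(points):
    """Project 2-D points onto their principal axes.

    Returns a list of (pc1_score, pc2_score) pairs. The rotation angle is the
    closed-form one that diagonalizes the 2x2 sample covariance matrix.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)
    # Direction of largest variance for covariance [[sxx, sxy], [sxy, syy]]
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x + s * y, -s * x + c * y) for x, y in centered]


# Invented, strongly correlated feature pairs
data = [(1.0, 1.0), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2), (5.0, 4.8)]
scores = pca_2d(data)
cross = sum(a * b for a, b in scores)   # proportional to covariance of the scores
print(f"cross-product of PC scores: {cross:.2e}")
```

    Keeping only the first score column would be the two-feature analogue of the dimension reduction described above.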

  15. Characterization of Escherichia coli isolates from different fecal sources by means of classification tree analysis of fatty acid methyl ester (FAME) profiles.

    PubMed

    Seurinck, Sylvie; Deschepper, Ellen; Deboch, Bishaw; Verstraete, Willy; Siciliano, Steven

    2006-03-01

    Microbial source tracking (MST) methods need to be rapid, inexpensive and accurate. Unfortunately, many MST methods provide a wealth of information that is difficult to interpret by the regulators who use this information to make decisions. This paper describes the use of classification tree analysis to interpret the results of an MST method based on fatty acid methyl ester (FAME) profiles of Escherichia coli isolates, and to present results in a format readily interpretable by water quality managers. Raw sewage E. coli isolates and animal E. coli isolates from cow, dog, gull, and horse were isolated and their FAME profiles collected. Correct classification rates determined with leave-one-out cross-validation resulted in an overall low correct classification rate of 61%. A higher overall correct classification rate of 85% was obtained when the animal isolates were pooled together and compared to the raw sewage isolates. Bootstrap aggregation or adaptive resampling and combining of the FAME profile data increased correct classification rates substantially. Other MST methods may be better suited to differentiate between different fecal sources but classification tree analysis has enabled us to distinguish raw sewage from animal E. coli isolates, which previously had not been possible with other multivariate methods such as principal component analysis and cluster analysis.
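
    Leave-one-out cross-validation, used here to obtain the correct classification rates, can be sketched as below. To keep the example short, a nearest-centroid rule stands in for the classification tree used in the paper; that substitution, the function names and the one-feature toy profiles are all assumptions made for illustration.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to the sample."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for k, v in enumerate(x):
            acc[k] += v
        counts[y] = counts.get(y, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [v / counts[label] for v in acc]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_rate(xs, ys, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the rest, return the fraction correct."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Toy one-feature "profiles" for two hypothetical sources
xs = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
ys = ["sewage", "sewage", "sewage", "animal", "animal", "animal"]
print(leave_one_out_rate(xs, ys))
```

    Any classifier with the same `predict(train_x, train_y, sample)` shape could be passed in place of the stand-in.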

  16. Combining various types of classifiers and features extracted from magnetic resonance imaging data in schizophrenia recognition.

    PubMed

    Janousova, Eva; Schwarz, Daniel; Kasparek, Tomas

    2015-06-30

    We investigated a combination of three classification algorithms, namely the modified maximum uncertainty linear discriminant analysis (mMLDA), the centroid method, and the average linkage, with three types of features extracted from three-dimensional T1-weighted magnetic resonance (MR) brain images, specifically MR intensities, grey matter densities, and local deformations for distinguishing 49 first episode schizophrenia male patients from 49 healthy male subjects. The feature sets were reduced using intersubject principal component analysis before classification. By combining the classifiers, we were able to obtain slightly improved results when compared with single classifiers. The best classification performance (81.6% accuracy, 75.5% sensitivity, and 87.8% specificity) was significantly better than classification by chance. We also showed that classifiers based on features calculated using more computation-intensive image preprocessing perform better; mMLDA with classification boundary calculated as weighted mean discriminative scores of the groups had improved sensitivity but similar accuracy compared to the original MLDA; reducing a number of eigenvectors during data reduction did not always lead to higher classification accuracy, since noise as well as the signal important for classification were removed. Our findings provide important information for schizophrenia research and may improve accuracy of computer-aided diagnostics of neuropsychiatric diseases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  17. A Hybrid Sensing Approach for Pure and Adulterated Honey Classification

    PubMed Central

    Subari, Norazian; Saleh, Junita Mohamad; Shakaff, Ali Yeon Md; Zakaria, Ammar

    2012-01-01

    This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data. PMID:23202033

  18. Research on Remote Sensing Image Classification Based on Feature Level Fusion

    NASA Astrophysics Data System (ADS)

    Yuan, L.; Zhu, G.

    2018-04-01

    Remote sensing image classification, as an important direction of remote sensing image processing and application, has been widely studied. However, existing classification algorithms still suffer from misclassification and missing points, which lowers the final classification accuracy. In this paper, we select Sentinel-1A and Landsat8 OLI images as data sources and propose a classification method based on feature level fusion. We compare three kinds of feature level fusion algorithms (i.e., Gram-Schmidt spectral sharpening, Principal Component Analysis transform and Brovey transform) and then select the best fused image for the classification experiment. In the classification process, we choose four kinds of image classification algorithms (i.e., Minimum distance, Mahalanobis distance, Support Vector Machine and ISODATA) for a comparative experiment. We use overall classification precision and the Kappa coefficient as the classification accuracy evaluation criteria, and the four classification results of the fused image are analysed. The experimental results show that the fusion effect of Gram-Schmidt spectral sharpening is better than that of the other methods. Among the four classification algorithms, the fused image is best suited to Support Vector Machine classification: the overall classification precision is 94.01% and the Kappa coefficient is 0.91. The fused Sentinel-1A and Landsat8 OLI image not only has more spatial information and spectral texture characteristics, but also enhances the distinguishing features of the images. The proposed method is beneficial for improving the accuracy and stability of remote sensing image classification.
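
    The two evaluation criteria used in this abstract, overall classification precision and the Kappa coefficient, are both read off a confusion matrix. The sketch below assumes Cohen's kappa, the usual choice in remote sensing accuracy assessment; the abstract does not spell out the formula, and the matrix shown is invented.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of reference class i assigned to class j.
    """
    k = len(confusion)
    n = sum(map(sum, confusion))
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the row and column marginals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix (not from the paper)
oa, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
print(f"overall accuracy={oa:.2%} kappa={kappa:.2f}")
```

    Kappa discounts the agreement that would occur by chance, which is why it is typically lower than the overall accuracy for the same matrix.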

  19. Classification and source determination of medium petroleum distillates by chemometric and artificial neural networks: a self organizing feature approach.

    PubMed

    Mat-Desa, Wan N S; Ismail, Dzulkiflee; NicDaeid, Niamh

    2011-10-15

    Three different medium petroleum distillate (MPD) products (white spirit, paint brush cleaner, and lamp oil) were purchased from commercial stores in Glasgow, Scotland. Samples of 10, 25, 50, 75, 90, and 95% evaporated product were prepared, resulting in 56 samples in total which were analyzed using gas chromatography-mass spectrometry. Data sets from the chromatographic patterns were examined and preprocessed for unsupervised multivariate analyses using principal component analysis (PCA), hierarchical cluster analysis (HCA), and a self organizing feature map (SOFM) artificial neural network. It was revealed that data sets comprised of higher boiling point hydrocarbon compounds provided a good means for the classification of the samples and successfully linked highly weathered samples back to their unevaporated counterpart in every case. The classification abilities of SOFM were further tested and validated for their predictive abilities where one set of weather data in each case was withdrawn from the sample set and used as a test set of the retrained network. This revealed SOFM to be an outstanding mechanism for sample discrimination and linkage over the more conventional PCA and HCA methods often suggested for such data analysis. SOFM also has the advantage of providing additional information through the evaluation of component planes facilitating the investigation of underlying variables that account for the classification. © 2011 American Chemical Society

  20. Wavelet packets for multi- and hyper-spectral imagery

    NASA Astrophysics Data System (ADS)

    Benedetto, J. J.; Czaja, W.; Ehler, M.; Flake, C.; Hirn, M.

    2010-01-01

    State of the art dimension reduction and classification schemes in multi- and hyper-spectral imaging rely primarily on the information contained in the spectral component. To better capture the joint spatial and spectral data distribution we combine the Wavelet Packet Transform with the linear dimension reduction method of Principal Component Analysis. Each spectral band is decomposed by means of the Wavelet Packet Transform and we consider a joint entropy across all the spectral bands as a tool to exploit the spatial information. Dimension reduction is then applied to the Wavelet Packets coefficients. We present examples of this technique for hyper-spectral satellite imaging. We also investigate the role of various shrinkage techniques to model non-linearity in our approach.

  1. Improved classification accuracy by feature extraction using genetic algorithms

    NASA Astrophysics Data System (ADS)

    Patriarche, Julia; Manduca, Armando; Erickson, Bradley J.

    2003-05-01

    A feature extraction algorithm has been developed for the purposes of improving classification accuracy. The algorithm uses a genetic algorithm / hill-climber hybrid to generate a set of linearly recombined features, which may be of reduced dimensionality compared with the original set. The genetic algorithm performs the global exploration, and a hill climber explores local neighborhoods. Hybridizing the genetic algorithm with a hill climber improves both the rate of convergence, and the final overall cost function value; it also reduces the sensitivity of the genetic algorithm to parameter selection. The genetic algorithm includes the operators: crossover, mutation, and deletion / reactivation - the last of these effects dimensionality reduction. The feature extractor is supervised, and is capable of deriving a separate feature space for each tissue (which are reintegrated during classification). A non-anatomical digital phantom was developed as a gold standard for testing purposes. In tests with the phantom, and with images of multiple sclerosis patients, classification with feature extractor derived features yielded lower error rates than using standard pulse sequences, and with features derived using principal components analysis. Using the multiple sclerosis patient data, the algorithm resulted in a mean 31% reduction in classification error of pure tissues.

  2. Geographical classification of apple based on hyperspectral imaging

    NASA Astrophysics Data System (ADS)

    Guo, Zhiming; Huang, Wenqian; Chen, Liping; Zhao, Chunjiang; Peng, Yankun

    2013-05-01

    The geographical origin of apples is often recognized and appreciated by consumers, and it is usually an important factor in determining the price of a commercial product. In this work, hyperspectral imaging technology and supervised pattern recognition were used to discriminate apples according to geographical origin. Hyperspectral images of 207 Fuji apple samples were collected by a hyperspectral camera (400-1000 nm). Principal component analysis (PCA) was performed on the hyperspectral imaging data to determine the main efficient wavelength images, and then characteristic variables were extracted from the dominant waveband image by texture analysis based on the gray level co-occurrence matrix (GLCM). All characteristic variables were obtained by fusing the data of images in the efficient spectra. A support vector machine (SVM) was used to construct the classification model and showed excellent performance in the classification results. The total classification accuracy was 92.75% in the training set and 89.86% in the prediction set. The overall results demonstrated that the hyperspectral imaging technique coupled with an SVM classifier can be efficiently utilized to discriminate Fuji apples according to geographical origin.

  3. Exploring objective climate classification for the Himalayan arc and adjacent regions using gridded data sources

    NASA Astrophysics Data System (ADS)

    Forsythe, N.; Blenkinsop, S.; Fowler, H. J.

    2015-05-01

    A three-step climate classification was applied to a spatial domain covering the Himalayan arc and adjacent plains regions using input data from four global meteorological reanalyses. Input variables were selected based on an understanding of the climatic drivers of regional water resource variability and crop yields. Principal component analysis (PCA) of those variables and k-means clustering on the PCA outputs revealed a reanalysis ensemble consensus for eight macro-climate zones. Spatial statistics of input variables for each zone revealed consistent, distinct climatologies. This climate classification approach has potential for enhancing assessment of climatic influences on water resources and food security as well as for characterising the skill and bias of gridded data sets, both meteorological reanalyses and climate models, for reproducing subregional climatologies. Through their spatial descriptors (area, geographic centroid, elevation mean range), climate classifications also provide metrics, beyond simple changes in individual variables, with which to assess the magnitude of projected climate change. Such sophisticated metrics are of particular interest for regions, including mountainous areas, where natural and anthropogenic systems are expected to be sensitive to incremental climate shifts.

  4. Characteristics of Forests in Western Sayani Mountains, Siberia from SAR Data

    NASA Technical Reports Server (NTRS)

    Ranson, K. Jon; Sun, Guoqing; Kharuk, V. I.; Kovacs, Katalin

    1998-01-01

    This paper investigated the possibility of using spaceborne radar data to map forest types and logging in the mountainous Western Sayani area in Siberia. L and C band HH, HV, and VV polarized images from the Shuttle Imaging Radar-C instrument were used in the study. Techniques to reduce topographic effects in the radar images were investigated. These included radiometric correction using illumination angle inferred from a digital elevation model, and reducing apparent effects of topography through band ratios. Forest classification was performed after terrain correction utilizing typical supervised techniques and principal component analyses. An ancillary data set of local elevations was also used to improve the forest classification. Map accuracy for each technique was estimated for training sites based on Russian forestry maps, satellite imagery and field measurements. The results indicate that it is necessary to correct for topography when attempting to classify forests in mountainous terrain. Radiometric correction based on a DEM (Digital Elevation Model) improved classification results but required reducing the SAR (Synthetic Aperture Radar) resolution to match the DEM. Using ratios of SAR channels that include cross-polarization improved classification and

  5. A bayesian hierarchical model for classification with selection of functional predictors.

    PubMed

    Zhu, Hongxiao; Vannucci, Marina; Cox, Dennis D

    2010-06-01

    In functional data classification, functional observations are often contaminated by various systematic effects, such as random batch effects caused by device artifacts, or fixed effects caused by sample-related factors. These effects may lead to classification bias and thus should not be neglected. Another issue of concern is the selection of functions when predictors consist of multiple functions, some of which may be redundant. The above issues arise in a real data application where we use fluorescence spectroscopy to detect cervical precancer. In this article, we propose a Bayesian hierarchical model that takes into account random batch effects and selects effective functions among multiple functional predictors. Fixed effects or predictors in nonfunctional form are also included in the model. The dimension of the functional data is reduced through orthonormal basis expansion or functional principal components. For posterior sampling, we use a hybrid Metropolis-Hastings/Gibbs sampler, which suffers slow mixing. An evolutionary Monte Carlo algorithm is applied to improve the mixing. Simulation and real data application show that the proposed model provides accurate selection of functional predictors as well as good classification.

  6. Classification and Recognition of Tomb Information in Hyperspectral Image

    NASA Astrophysics Data System (ADS)

    Gu, M.; Lyu, S.; Hou, M.; Ma, S.; Gao, Z.; Bai, S.; Zhou, P.

    2018-04-01

    Ancient tombs contain a large number of materials that carry important historical information. In many cases, however, these substances become obscure and indistinguishable to the naked eye or a true-colour camera. To classify and identify materials in ancient tombs effectively, this paper applied hyperspectral imaging technology to archaeological research on an ancient tomb in Shanxi province. First, the feature bands containing the main information at the bottom of the ancient tomb were selected by a Principal Component Analysis (PCA) transformation to reduce the data dimension. Then, image classification was performed using a Support Vector Machine (SVM) based on the feature bands. Finally, the material at the bottom of the ancient tomb was identified by spectral analysis and spectral matching. The results show that an SVM based on feature bands can not only ensure classification accuracy, but also shorten the data processing time and improve classification efficiency. In the material identification, it was found that what appears as the same matter in visible light is actually two different substances. This result provides a new reference and research idea for archaeological work.

  7. Prediction of activation patterns preceding hallucinations in patients with schizophrenia using machine learning with structured sparsity.

    PubMed

    de Pierrefeu, Amicie; Fovet, Thomas; Hadj-Selem, Fouad; Löfstedt, Tommy; Ciuciu, Philippe; Lefebvre, Stephanie; Thomas, Pierre; Lopes, Renaud; Jardri, Renaud; Duchesnay, Edouard

    2018-04-01

    Despite significant progress in the field, the detection of fMRI signal changes during hallucinatory events remains difficult and time-consuming. This article first proposes a machine-learning algorithm to automatically identify resting-state fMRI periods that precede hallucinations versus periods that do not. When applied to whole-brain fMRI data, state-of-the-art classification methods, such as support vector machines (SVM), yield dense solutions that are difficult to interpret. We proposed to extend the existing sparse classification methods by taking the spatial structure of brain images into account with structured sparsity using the total variation penalty. Based on this approach, we obtained reliable classifying performances associated with interpretable predictive patterns, composed of two clearly identifiable clusters in speech-related brain regions. The variation in transition-to-hallucination functional patterns not only from one patient to another but also from one occurrence to the next (e.g., also depending on the sensory modalities involved) appeared to be the major difficulty when developing effective classifiers. Consequently, second, this article aimed to characterize the variability within the prehallucination patterns using an extension of principal component analysis with spatial constraints. The principal components (PCs) and the associated basis patterns shed light on the intrinsic structures of the variability present in the dataset. Such results are promising in the scope of innovative fMRI-guided therapy for drug-resistant hallucinations, such as fMRI-based neurofeedback. © 2018 Wiley Periodicals, Inc.

  8. Assessment of computer techniques for processing digital LANDSAT MSS data for lithological discrimination of Serra do Ramalho, State of Bahia

    NASA Technical Reports Server (NTRS)

    Paradella, W. R. (Principal Investigator); Vitorello, I.; Monteiro, M. D.

    1984-01-01

    Enhancement techniques and thematic classifications were applied to the metasediments of Bambui Super Group (Upper Proterozoic) in the Region of Serra do Ramalho, SW of the state of Bahia. Linear contrast stretch, band-ratios with contrast stretch, and color-composites allow lithological discriminations. The effects of human activities and of vegetation cover mask and limit, in several ways, the lithological discrimination with digital MSS data. Principal component images and color composite of linear contrast stretch of these products, show lithological discrimination through tonal gradations. This set of products allows the delineations of several metasedimentary sequences to a level superior to reconnaissance mapping. Supervised (maximum likelihood classifier) and nonsupervised (K-Means classifier) classification of the limestone sequence, host to fluorite mineralization show satisfactory results.

  9. Propellant's differentiation using FTIR-photoacoustic detection for forensic studies of improvised explosive devices.

    PubMed

    Álvarez, Ángela; Yáñez, Jorge; Contreras, David; Saavedra, Renato; Sáez, Pedro; Amarasiriwardena, Dulasiri

    2017-11-01

    The use of propellant for making improvised explosive devices (IED) is an incipient criminal practice. Propellant can be used as an initiator in explosive mixtures along with other components such as coal, ammonium nitrate, sulfur, etc. Identifying the brand of the propellant used in homemade explosives can provide additional forensic information about this evidence. In this work, four of the most common propellant brands were characterized by Fourier-transform infrared photoacoustic spectroscopy (FTIR-PAS), which is a non-destructive micro-analytical technique. The spectra show characteristic signals of typical compounds in the propellants, such as nitrocellulose, nitroglycerin, guanidine, diphenylamine, etc. The differentiation of propellant components was achieved by using FTIR-PAS combined with chemometric methods of classification. Principal component analysis (PCA) and soft independent modelling of class analogy (SIMCA) were used to achieve an effective differentiation and classification (100%) of propellant brands. Furthermore, propellant brand differentiation was also assessed using partial least squares discriminant analysis (PLS-DA) with leave-one-out cross-validation (∼97%) and external validation (∼100%). Our results show the ability of FTIR-PAS combined with chemometric analysis to identify and differentiate propellant brands in different explosive formulations of IED. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Integration of adaptive guided filtering, deep feature learning, and edge-detection techniques for hyperspectral image classification

    NASA Astrophysics Data System (ADS)

    Wan, Xiaoqing; Zhao, Chunhui; Gao, Bing

    2017-11-01

    The integration of an edge-preserving filtering technique in the classification of a hyperspectral image (HSI) has been proven effective in enhancing classification performance. This paper proposes an ensemble strategy for HSI classification using an edge-preserving filter along with a deep learning model and edge detection. First, an adaptive guided filter is applied to the original HSI to reduce the noise in degraded images and to extract powerful spectral-spatial features. Second, the extracted features are fed as input to a stacked sparse autoencoder to adaptively exploit more invariant and deep feature representations; then, a random forest classifier is applied to fine-tune the entire pretrained network and determine the classification output. Third, a Prewitt compass operator is further performed on the HSI to extract the edges of the first principal component after dimension reduction. Moreover, the regional growth rule is applied to the resulting edge logical image to determine the local region for each unlabeled pixel. Finally, the categories of the corresponding neighborhood samples are determined in the original classification map; then, the majority voting mechanism is implemented to generate the final output. Extensive experiments proved that the proposed method achieves competitive performance compared with several traditional approaches.

  11. Laser-induced breakdown spectroscopy-based investigation and classification of pharmaceutical tablets using multivariate chemometric analysis

    PubMed Central

    Myakalwar, Ashwin Kumar; Sreedhar, S.; Barman, Ishan; Dingari, Narahara Chari; Rao, S. Venugopal; Kiran, P. Prem; Tewari, Surya P.; Kumar, G. Manoj

    2012-01-01

    We report the effectiveness of laser-induced breakdown spectroscopy (LIBS) in probing the content of pharmaceutical tablets and also investigate its feasibility for routine classification. This method is particularly beneficial in applications where its exquisite chemical specificity and suitability for remote and on site characterization significantly improves the speed and accuracy of quality control and assurance process. Our experiments reveal that in addition to the presence of carbon, hydrogen, nitrogen and oxygen, which can be primarily attributed to the active pharmaceutical ingredients, specific inorganic atoms were also present in all the tablets. Initial attempts at classification by a ratiometric approach using oxygen to nitrogen compositional values yielded an optimal value (at 746.83 nm) with the least relative standard deviation but nevertheless failed to provide an acceptable classification. To overcome this bottleneck in the detection process, two chemometric algorithms, i.e. principal component analysis (PCA) and soft independent modeling of class analogy (SIMCA), were implemented to exploit the multivariate nature of the LIBS data demonstrating that LIBS has the potential to differentiate and discriminate among pharmaceutical tablets. We report excellent prospective classification accuracy using supervised classification via the SIMCA algorithm, demonstrating its potential for future applications in process analytical technology, especially for fast on-line process control monitoring applications in the pharmaceutical industry. PMID:22099648

  12. Online Learning for Classification of Alzheimer Disease based on Cortical Thickness and Hippocampal Shape Analysis.

    PubMed

    Lee, Ga-Young; Kim, Jeonghun; Kim, Ju Han; Kim, Kiwoong; Seong, Joon-Kyung

    2014-01-01

    Mobile healthcare applications are a growing trend, and the prevalence of dementia in modern society is also rising steadily. Among the degenerative brain diseases that cause dementia, Alzheimer disease (AD) is the most common. The purpose of this study was to identify AD patients using magnetic resonance imaging in a mobile environment. We propose an incremental classification method for mobile healthcare systems. The method is based on incremental learning for AD diagnosis and AD prediction using cortical thickness data and hippocampal shape. We constructed a classifier based on principal component analysis and linear discriminant analysis, and performed initial learning and mobile subject classification. Initial learning is the group learning part, which runs on our server; our smartphone agent implements the mobile classification and displays the results. With cortical thickness data alone, the discrimination accuracy was 87.33% (sensitivity 96.49% and specificity 64.33%). When cortical thickness data and hippocampal shape were analyzed together, the accuracy was 87.52% (sensitivity 96.79% and specificity 63.24%). In this paper, we presented a classification method based on online learning for AD diagnosis that employs both cortical thickness data and hippocampal shape analysis data. Our method was implemented on smartphone devices and discriminated AD patients from the normal group.
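
    The three figures quoted in this abstract (accuracy, sensitivity, specificity) are standard two-class metrics. The sketch below shows how they are computed from predicted and true labels; the label vectors are hypothetical and are not data from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of patients correctly flagged
        "specificity": tn / (tn + fp),  # share of controls correctly cleared
    }

# Hypothetical labels (1 = AD, 0 = normal); not data from the study.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
```

    A high sensitivity paired with a much lower specificity, as reported above, means that most patients are detected but a sizeable share of normal subjects are also flagged.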

  13. Source identification and apportionment of heavy metals in urban soil profiles.

    PubMed

    Luo, Xiao-San; Xue, Yan; Wang, Yan-Ling; Cang, Long; Xu, Bo; Ding, Jing

    2015-05-01

    Because heavy metals (HMs) occurring naturally in soils accumulate continuously due to human activities, identifying and apportioning their sources becomes a challenging task for pollution prevention in urban environments. Besides the enrichment factors (EFs) and principal component analysis (PCA) for source classification, the receptor model (Absolute Principal Component Scores-Multiple Linear Regression, APCS-MLR) and Pb isotopic mixing model were also developed to quantify the source contribution for typical HMs (Cd, Co, Cr, Cu, Mn, Ni, Pb, Zn) in urban park soils of Xiamen, a representative megacity in southeast China. Furthermore, distribution patterns of their concentrations and sources in 13 soil profiles (top 20 cm) were investigated by different depths (0-5, 5-10, 10-20 cm). Currently the principal anthropogenic source for HMs in urban soil of China is atmospheric deposition from coal combustion rather than vehicle exhaust. Specifically for Pb source by isotopic model ((206)Pb/(207)Pb and (208)Pb/(207)Pb), the average contributions were natural (49%)>coal combustion (45%)≫traffic emissions (6%). Although the urban surface soils are usually more contaminated owing to recent and current human sources, leaching effects and historic vehicle emissions can also make deep soil layer contaminated by HMs. Copyright © 2015 Elsevier Ltd. All rights reserved.

  14. Differences in chewing sounds of dry-crisp snacks by multivariate data analysis

    NASA Astrophysics Data System (ADS)

    De Belie, N.; Sivertsvik, M.; De Baerdemaeker, J.

    2003-09-01

    Chewing sounds of different types of dry-crisp snacks (two types of potato chips, prawn crackers, cornflakes and low calorie snacks from extruded starch) were analysed to assess differences in sound emission patterns. The emitted sounds were recorded by a microphone placed over the ear canal. The first bite and the first subsequent chew were selected from the time signal and a fast Fourier transformation provided the power spectra. Different multivariate analysis techniques were used for classification of the snack groups. This included principal component analysis (PCA) and unfold partial least-squares (PLS) algorithms, as well as multi-way techniques such as three-way PLS, three-way PCA (Tucker3), and parallel factor analysis (PARAFAC) on the first bite and subsequent chew. The models were evaluated by calculating the classification errors and the root mean square error of prediction (RMSEP) for independent validation sets. It appeared that the logarithm of the power spectra obtained from the chewing sounds could be used successfully to distinguish the different snack groups. When different chewers were used, recalibration of the models was necessary. Multi-way models distinguished better between chewing sounds of different snack groups than PCA on bite or chew separately and than unfold PLS. From all three-way models applied, N-PLS with three components showed the best classification capabilities, resulting in classification errors of 14-18%. The major amount of incorrect classifications was due to one type of potato chips that had a very irregular shape, resulting in a wide variation of the emitted sounds.
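
    The preprocessing described in this abstract (a fast Fourier transform of a selected time segment, followed by the logarithm of the power spectrum) can be sketched as below. The sampling rate, the Hann window, and the pure-tone test signal are assumptions; the abstract does not give these details.

```python
import numpy as np

def log_power_spectrum(signal, sample_rate):
    """Log10 power spectrum of a real-valued time segment."""
    window = np.hanning(len(signal))          # taper to limit spectral leakage
    spectrum = np.fft.rfft(signal * window)
    power = np.abs(spectrum) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    # A small constant avoids log(0) for empty frequency bins.
    return freqs, np.log10(power + 1e-12)

# Hypothetical segment: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(1024) / sample_rate
signal = np.sin(2 * np.pi * 1000 * t)

freqs, log_power = log_power_spectrum(signal, sample_rate)
peak_frequency = freqs[np.argmax(log_power)]
```

    Each log power spectrum then serves as one row of the data matrix passed to PCA or PLS; for the multi-way models, the bite and chew spectra would be stacked along a third mode.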

  15. Image enhancements of Landsat 8 (OLI) and SAR data for preliminary landslide identification and mapping applied to the central region of Kenya

    NASA Astrophysics Data System (ADS)

    Mwaniki, M. W.; Kuria, D. N.; Boitt, M. K.; Ngigi, T. G.

    2017-04-01

    Image enhancements lead to improved performance and increased accuracy of feature extraction, recognition, identification, classification and hence change detection. This increases the utility of remote sensing to suit environmental applications and aid disaster monitoring of geohazards involving large areas. The main aim of this study was to compare the effect of image enhancement applied to synthetic aperture radar (SAR) data and Landsat 8 imagery in landslide identification and mapping. The methodology involved pre-processing Landsat 8 imagery, image co-registration, and despeckling of the SAR data, after which the Landsat 8 imagery was enhanced by Principal and Independent Component Analysis (PCA and ICA), a spectral index involving bands 7 and 4, and a False Colour Composite (FCC) with the components bearing the most geologic information. The SAR data were processed using textural and edge filters, and computation of SAR incoherence. The enhanced spatial, textural and edge information from the SAR data was incorporated into the spectral information from Landsat 8 imagery during the knowledge-based classification. The methodology was tested in the central highlands of Kenya, characterized by rugged terrain and frequent rainfall-induced landslides. The results showed that the SAR data complemented Landsat 8 data, which had enriched spectral information afforded by the FCC with enhanced geologic information. The SAR classification depicted landslides along the ridges and lineaments, important information lacking in the Landsat 8 image classification. The success of landslide identification and classification was attributed to the enhanced geologic features by spectral, textural and roughness properties.

  16. Classification and authentication of unknown water samples using machine learning algorithms.

    PubMed

    Kundu, Palash K; Panchariya, P C; Kundu, Madhusree

    2011-07-01

    This paper proposes the development of real-life water sample classification and authentication based on machine learning algorithms. The proposed techniques used experimental measurements from a pulse voltammetry method based on an electronic tongue (E-tongue) instrumentation system with silver and platinum electrodes. An E-tongue includes arrays of solid-state ion sensors, transducers (possibly of different types), data collectors and data analysis tools, all oriented to the classification of liquid samples and the authentication of unknown liquid samples. The time series signal and the corresponding raw data represent the measurement from a multi-sensor system. The E-tongue system, implemented in a laboratory environment for six different ISI (Bureau of Indian Standards) certified water samples (Aquafina, Bisleri, Kingfisher, Oasis, Dolphin, and McDowell), was the data source for developing two types of machine learning algorithms: classification and regression. A water data set consisting of six sample classes with 4402 features was considered. A PCA (principal component analysis) based classification and authentication tool was developed in this study as the machine learning component of the E-tongue system. A partial least squares (PLS) based classifier, dedicated to authenticating a specific category of water sample, was also developed as an integral part of the E-tongue instrumentation system. The developed PCA and PLS based E-tongue system achieved encouraging overall authentication accuracy for the aforesaid categories of water samples. Copyright © 2011 ISA. Published by Elsevier Ltd. All rights reserved.

  17. Morphological analysis of Trichomycterus areolatus Valenciennes, 1846 from southern Chilean rivers using a truss-based system (Siluriformes, Trichomycteridae)

    PubMed Central

    Colihueque, Nelson; Corrales, Olga; Yáñez, Miguel

    2017-01-01

    Trichomycterus areolatus Valenciennes, 1846 is a small endemic catfish inhabiting the Andean river basins of Chile. In this study, the morphological variability of three T. areolatus populations, collected in two river basins from southern Chile, was assessed with multivariate analyses, including principal component analysis (PCA) and discriminant function analysis (DFA). It is hypothesized that populations must segregate morphologically from each other based on the river basin that they were sampled from, since each basin presents relatively particular hydrological characteristics. Significant morphological differences among the three populations were found with PCA (ANOSIM test, r = 0.552, p < 0.0001) and DFA (Wilks’s λ = 0.036, p < 0.01). PCA accounted for a total variation of 56.16% by the first two principal components. The first Principal Component (PC1) and PC2 explained 34.72 and 21.44% of the total variation, respectively. The scatter-plot of the first two discriminant functions (DF1 on DF2) also validated the existence of three different populations. In group classification using DFA, 93.3% of the specimens were correctly-classified into their original populations. Of the total of 22 transformed truss measurements, 17 exhibited highly significant (p < 0.01) differences among populations. The data support the existence of T. areolatus morphological variation across different rivers in southern Chile, likely reflecting the geographic isolation underlying population structure of the species. PMID:29134012
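
    The percentages of variation attributed to PC1 and PC2 in this abstract are explained-variance ratios, and the cumulative figure is their sum (34.72% + 21.44% = 56.16%). The sketch below shows one common way to obtain such percentages from a data matrix; the random 30 x 5 matrix is a hypothetical stand-in for the truss measurements, not the study's data.

```python
import numpy as np

def explained_variance_percent(X):
    """Percentage of total variance carried by each principal component."""
    centered = X - X.mean(axis=0)
    # Squared singular values are proportional to the component variances.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return 100.0 * variances / variances.sum()

# Hypothetical measurement matrix: 30 specimens x 5 truss distances.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.5])

percent = explained_variance_percent(X)
first_two = percent[:2].sum()  # cumulative share of PC1 and PC2
```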

  18. A PCA-Based method for determining craniofacial relationship and sexual dimorphism of facial shapes.

    PubMed

    Shui, Wuyang; Zhou, Mingquan; Maddock, Steve; He, Taiping; Wang, Xingce; Deng, Qingqiong

    2017-11-01

    Previous studies have used principal component analysis (PCA) to investigate the craniofacial relationship, as well as sex determination using facial factors. However, few studies have investigated the extent to which the choice of principal components (PCs) affects the analysis of craniofacial relationship and sexual dimorphism. In this paper, we propose a PCA-based method for visual and quantitative analysis, using 140 samples of 3D heads (70 male and 70 female), produced from computed tomography (CT) images. There are two parts to the method. First, skull and facial landmarks are manually marked to guide the model's registration so that dense corresponding vertices occupy the same relative position in every sample. Statistical shape spaces of the skull and face in dense corresponding vertices are constructed using PCA. Variations in these vertices, captured in every principal component (PC), are visualized to observe shape variability. The correlations of skull- and face-based PC scores are analysed, and linear regression is used to fit the craniofacial relationship. We compute the PC coefficients of a face based on this craniofacial relationship and the PC scores of a skull, and apply the coefficients to estimate a 3D face for the skull. To evaluate the accuracy of the computed craniofacial relationship, the mean and standard deviation of every vertex between the two models are computed, where these models are reconstructed using real PC scores and coefficients. Second, each PC in facial space is analysed for sex determination, for which support vector machines (SVMs) are used. We examined the correlation between PCs and sex, and explored the extent to which the choice of PCs affects the expression of sexual dimorphism. Our results suggest that skull- and face-based PCs can be used to describe the craniofacial relationship and that the accuracy of the method can be improved by using an increased number of face-based PCs. The results show that the accuracy of the sex classification is related to the choice of PCs. The highest sex classification rate is 91.43% using our method. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Classification of wines according to their production regions with the contained trace elements using laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Tian, Ye; Yan, Chunhua; Zhang, Tianlong; Tang, Hongsheng; Li, Hua; Yu, Jialu; Bernard, Jérôme; Chen, Li; Martin, Serge; Delepine-Gilon, Nicole; Bocková, Jana; Veis, Pavel; Chen, Yanping; Yu, Jin

    2017-09-01

    Laser-induced breakdown spectroscopy (LIBS) has been applied to classify French wines according to their production regions. The use of the surface-assisted (or surface-enhanced) sample preparation method enabled a sub-ppm limit of detection (LOD), which led to the detection and identification of at least 22 metal and nonmetal elements in a typical wine sample including majors, minors and traces. An ensemble of 29 bottles of French wines, either red or white wines, from five production regions, Alsace, Bourgogne, Beaujolais, Bordeaux and Languedoc, was analyzed together with a wine from California, considered as an outlier. A non-supervised classification model based on principal component analysis (PCA) was first developed for the classification. The results showed that the model had limited separation power; nevertheless, a step-by-step approach made it possible to understand the physical reasons behind each step of sample separation and, in particular, to observe the influence of the matrix effect on sample classification. A supervised classification model was then developed based on random forest (RF), which is, in addition, a nonlinear algorithm. With the model parameters optimized, the classification results were satisfactory, with a classification accuracy of 100% for the tested samples. In the paper, we especially discuss the effect of spectrum normalization with an internal reference, the choice of input variables for the classification models and the optimization of parameters for the developed classification models.

  20. The chemotaxonomic classification of Rhodiola plants and its correlation with morphological characteristics and genetic taxonomy.

    PubMed

    Liu, Zhenli; Liu, Yuanyan; Liu, Chunsheng; Song, Zhiqian; Li, Qing; Zha, Qinglin; Lu, Cheng; Wang, Chun; Ning, Zhangchi; Zhang, Yuxin; Tian, Cheng; Lu, Aiping

    2013-07-12

    Rhodiola plants are used as a natural remedy in the western world and as a traditional herbal medicine in China, and are valued for their ability to enhance human resistance to stress or fatigue and to promote longevity. Due to the morphological similarities among different species, the identification of the genus remains somewhat controversial, which may affect their safety and effectiveness in clinical use. In this paper, 47 Rhodiola samples of seven species were collected from thirteen local provinces of China. They were identified by their morphological characteristics and genetic and phytochemical taxonomies. Eight bioactive chemotaxonomic markers from four chemical classes (phenylpropanoids, phenylethanol derivatives, flavonoids and phenolic acids) were determined to evaluate and distinguish the chemotaxonomy of Rhodiola samples using an HPLC-DAD/UV method. Hierarchical cluster analysis (HCA) and principal component analysis (PCA) were applied to compare the two classification methods between genetic and phytochemical taxonomy. The established chemotaxonomic classification could be effectively used for Rhodiola species identification.

  1. Rapid characterization of transgenic and non-transgenic soybean oils by chemometric methods using NIR spectroscopy

    NASA Astrophysics Data System (ADS)

    Luna, Aderval S.; da Silva, Arnaldo P.; Pinho, Jéssica S. A.; Ferré, Joan; Boqué, Ricard

    Near infrared (NIR) spectroscopy and multivariate classification were applied to discriminate soybean oil samples into non-transgenic and transgenic. Principal Component Analysis (PCA) was applied to extract relevant features from the spectral data and to remove the anomalous samples. The best results were obtained with Support Vector Machine-Discriminant Analysis (SVM-DA) and Partial Least Squares-Discriminant Analysis (PLS-DA) after mean centering plus multiplicative scatter correction. For SVM-DA, the percentage of successful classification was 100% for the training group, and 100% and 90% in the validation group for non-transgenic and transgenic soybean oil samples, respectively. For PLS-DA, the percentage of successful classification was 95% and 100% in the training group for non-transgenic and transgenic soybean oil samples, respectively, and 100% and 80% in the validation group for non-transgenic and transgenic samples, respectively. The results demonstrate that NIR spectroscopy can provide a rapid, nondestructive and reliable method to distinguish non-transgenic and transgenic soybean oils.
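
    The two preprocessing steps named in this abstract, multiplicative scatter correction (MSC) and mean centering, can be sketched as follows. The abstract does not state the order of the steps or the reference spectrum, so applying MSC first against the mean spectrum is an assumption, and the spectra are synthetic.

```python
import numpy as np

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum."""
    if reference is None:
        reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra, dtype=float)
    for i, spectrum in enumerate(spectra):
        # Fit spectrum ~ offset + slope * reference by least squares.
        slope, offset = np.polyfit(reference, spectrum, deg=1)
        corrected[i] = (spectrum - offset) / slope
    return corrected

def mean_center(X):
    """Subtract the column means so each variable has zero mean."""
    return X - X.mean(axis=0)

# Hypothetical spectra: one underlying shape with per-sample offset and gain.
wavelengths = np.linspace(0.0, 1.0, 50)
shape = np.sin(2 * np.pi * wavelengths) + 2.0
spectra = np.array([a + b * shape for a, b in [(0.1, 0.8), (0.3, 1.2), (-0.2, 1.0)]])

corrected = msc(spectra)
preprocessed = mean_center(corrected)
```

    Because the synthetic spectra differ only by an additive offset and a multiplicative gain, MSC maps them onto the same curve; with real spectra, the chemical differences between samples would remain after correction.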

  2. Automatic optical detection and classification of marine animals around MHK converters using machine vision

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brunton, Steven

    Optical systems provide valuable information for evaluating interactions and associations between organisms and MHK energy converters and for capturing potentially rare encounters between marine organisms and MHK devices. The deluge of optical data from cabled monitoring packages makes expert review time-consuming and expensive. We propose algorithms and a processing framework to automatically extract events of interest from underwater video. The open-source software framework consists of background subtraction, filtering, feature extraction and hierarchical classification algorithms. This principle classification pipeline was validated on real-world data collected with an experimental underwater monitoring package. An event detection rate of 100% was achieved using robust principal components analysis (RPCA), Fourier feature extraction and a support vector machine (SVM) binary classifier. The detected events were then further classified into more complex classes – algae | invertebrate | vertebrate, one species | multiple species of fish, and interest rank. Greater than 80% accuracy was achieved using a combination of machine learning techniques.

  3. A computer analysis of ERTS data of the Lake Gregory area of South Australia with particular emphasis on its role in terrain classification for engineering. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Lodwick, G. D. (Principal Investigator)

    1976-01-01

    A digital computer and multivariate statistical techniques were used to analyze 4-band multispectral data. A representation of the original data for each of the four bands allows a certain degree of terrain interpretation; however, variations in appearance of sites within and between bands, without additional criteria for deciding which representation should be preferred, create difficulties for classification. Investigation of the video data groups produced by principal components analysis and cluster analysis techniques shows that effective correlations with classifications of terrain produced by conventional methods could be carried out. The analyses also highlighted underlying relationships between the various elements. The approach used allows large areas (185 km by 185 km) to be classified into fundamental units within a matter of hours and can be applied to those parts of the Earth where facilities for conventional studies are poor or lacking.

  4. The chemotaxonomic classification of Rhodiola plants and its correlation with morphological characteristics and genetic taxonomy

    PubMed Central

    2013-01-01

    Background: Rhodiola plants are used as a natural remedy in the western world and as a traditional herbal medicine in China, and are valued for their ability to enhance human resistance to stress or fatigue and to promote longevity. Due to the morphological similarities among different species, the identification of the genus remains somewhat controversial, which may affect their safety and effectiveness in clinical use. Results: In this paper, 47 Rhodiola samples of seven species were collected from thirteen local provinces of China. They were identified by their morphological characteristics and genetic and phytochemical taxonomies. Eight bioactive chemotaxonomic markers from four chemical classes (phenylpropanoids, phenylethanol derivatives, flavonoids and phenolic acids) were determined to evaluate and distinguish the chemotaxonomy of Rhodiola samples using an HPLC-DAD/UV method. Hierarchical cluster analysis (HCA) and principal component analysis (PCA) were applied to compare the two classification methods between genetic and phytochemical taxonomy. Conclusions: The established chemotaxonomic classification could be effectively used for Rhodiola species identification. PMID:23844866

  5. Multivariate qualitative analysis of banned additives in food safety using surface enhanced Raman scattering spectroscopy

    NASA Astrophysics Data System (ADS)

    He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei

    2015-02-01

    A novel strategy which combines the iteratively cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectroscopy of banned food additives, such as Sudan I dye and Rhodamine B in food, and Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, using the combination of spectra preprocessing by iteratively cubic spline fitting (ICSF) baseline correction with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, the DPLS classification can discriminate the class assignment of unknown banned additives using the information of differences in relative intensities. The results demonstrate that SERS spectroscopy combined with the ICSF baseline correction method and the exploratory analysis methodology DPLS classification can potentially be used for distinguishing banned food additives in the field of food safety.

  6. Chocolate Classification by an Electronic Nose with Pressure Controlled Generated Stimulation

    PubMed Central

    Valdez, Luis F.; Gutiérrez, Juan Manuel

    2016-01-01

    In this work, we analyze the response of a Metal Oxide Gas Sensor (MOGS) array to a flow controlled stimulus generated in a pressure controlled canister produced by a homemade olfactometer to build an E-nose. The built E-nose is capable of chocolate identification among the 26 analyzed chocolate bar samples and recognition of four features (chocolate type, extra ingredient, sweetener and expiration date status). The data analysis tools used were Principal Components Analysis (PCA) and Artificial Neural Networks (ANNs). The chocolate identification E-nose average classification rate was 81.3% with 0.99 accuracy (Acc), 0.86 precision (Prc), 0.84 sensitivity (Sen) and 0.99 specificity (Spe) for test. The chocolate feature recognition E-nose gives a classification rate of 85.36% with 0.96 Acc, 0.86 Prc, 0.85 Sen and 0.96 Spe. In addition, a preliminary sample aging analysis was made. The results show that the pressure controlled generated stimulus is reliable for this type of study. PMID:27775628
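    The four figures of merit quoted in this abstract follow from confusion-matrix counts. The sketch below uses the standard definitions; the function name `binary_metrics` and the counts are hypothetical, and treating each class one-vs-rest is an assumption, since the abstract does not say how the multi-class results were averaged.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity and specificity from confusion counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # also called recall or true-positive rate
        "specificity": tn / (tn + fp),  # 1 - false-positive rate
    }


# Hypothetical one-vs-rest counts for a single class among many:
# negatives dominate, so accuracy and specificity stay high even
# when sensitivity is noticeably lower.
metrics = binary_metrics(tp=8, fp=2, tn=88, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    With many classes, true negatives dominate each one-vs-rest count, which is one plausible reason an accuracy near 0.99 can sit alongside a sensitivity near 0.84.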

  7. Progress toward the determination of correct classification rates in fire debris analysis.

    PubMed

    Waddell, Erin E; Song, Emma T; Rinke, Caitlin N; Williams, Mary R; Sigman, Michael E

    2013-07-01

    Principal components analysis (PCA), linear discriminant analysis (LDA), and quadratic discriminant analysis (QDA) were used to develop a multistep classification procedure for determining the presence of ignitable liquid residue in fire debris and assigning any ignitable liquid residue present into the classes defined under the American Society for Testing and Materials (ASTM) E 1618-10 standard method. A multistep classification procedure was tested by cross-validation based on model data sets comprised of the time-averaged mass spectra (also referred to as total ion spectra) of commercial ignitable liquids and pyrolysis products from common building materials and household furnishings (referred to simply as substrates). Fire debris samples from laboratory-scale and field test burns were also used to test the model. The optimal model's true-positive rate was 81.3% for cross-validation samples and 70.9% for fire debris samples. The false-positive rate was 9.9% for cross-validation samples and 8.9% for fire debris samples. © 2013 American Academy of Forensic Sciences.

  8. Integrated Low-Rank-Based Discriminative Feature Learning for Recognition.

    PubMed

    Zhou, Pan; Lin, Zhouchen; Zhang, Chao

    2016-05-01

    Feature learning plays a central role in pattern recognition. In recent years, many representation-based feature learning methods have been proposed and have achieved great success in many applications. However, these methods perform feature learning and subsequent classification in two separate steps, which may not be optimal for recognition tasks. In this paper, we present a supervised low-rank-based approach for learning discriminative features. By integrating latent low-rank representation (LatLRR) with a ridge regression-based classifier, our approach combines feature learning with classification, so that the regulated classification error is minimized. In this way, the extracted features are more discriminative for the recognition tasks. Our approach benefits from a recent discovery on the closed-form solutions to noiseless LatLRR. When there is noise, a robust Principal Component Analysis (PCA)-based denoising step can be added as preprocessing. When the scale of a problem is large, we utilize a fast randomized algorithm to speed up the computation of robust PCA. Extensive experimental results demonstrate the effectiveness and robustness of our method.

  9. Large and Small-Scale Cropland Classification on the Foothills of Mount Kenya Based on SPOT-5 Take-5 Data Time Series

    NASA Astrophysics Data System (ADS)

    Eckert, Sandra

    2016-08-01

    The SPOT-5 Take 5 campaign provided SPOT time series data of an unprecedented spatial and temporal resolution. We analysed 29 scenes acquired between May and September 2015 of a semi-arid region in the foothills of Mount Kenya, with two aims: first, to distinguish rainfed from irrigated cropland and cropland from natural vegetation covers, which show similar reflectance patterns; and second, to identify individual crop types. We tested several input data sets in different combinations: the spectral bands and the normalized difference vegetation index (NDVI) time series, principal components of NDVI time series, and selected NDVI time series statistics. For the classification we used random forests (RF). In the test differentiating rainfed cropland, irrigated cropland, and natural vegetation covers, the best classification accuracies were achieved using spectral bands. For the differentiation of crop types, we analysed the phenology of selected crop types based on NDVI time series. First results are promising.
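    The NDVI inputs mentioned in this abstract can be illustrated with the standard definition, NDVI = (NIR - Red) / (NIR + Red). The sketch below is an assumption-laden illustration: the function names, the reflectance values, and the particular summary statistics are hypothetical, because the abstract does not list which time series statistics were selected.

```python
from statistics import mean


def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel and one date."""
    denom = nir + red
    if denom == 0:
        return 0.0  # convention chosen here for pixels with no signal
    return (nir - red) / denom


def ndvi_series_stats(nir_series, red_series):
    """Summary statistics of an NDVI time series (illustrative choice)."""
    values = [ndvi(n, r) for n, r in zip(nir_series, red_series)]
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "range": max(values) - min(values),
    }


# Hypothetical reflectances for one pixel over four acquisition dates.
nir = [0.30, 0.45, 0.50, 0.35]
red = [0.20, 0.10, 0.08, 0.18]
features = ndvi_series_stats(nir, red)
```

    Statistics of this kind summarize crop phenology in a few numbers per pixel, which is the sort of feature vector a random forest classifier can take as input.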

  10. Chocolate Classification by an Electronic Nose with Pressure Controlled Generated Stimulation.

    PubMed

    Valdez, Luis F; Gutiérrez, Juan Manuel

    2016-10-20

    In this work, we analyze the response of a Metal Oxide Gas Sensor (MOGS) array to a flow controlled stimulus generated in a pressure controlled canister produced by a homemade olfactometer to build an E-nose. The built E-nose is capable of chocolate identification among the 26 analyzed chocolate bar samples and recognition of four features (chocolate type, extra ingredient, sweetener and expiration date status). The data analysis tools used were Principal Components Analysis (PCA) and Artificial Neural Networks (ANNs). The chocolate identification E-nose average classification rate was 81.3% with 0.99 accuracy (Acc), 0.86 precision (Prc), 0.84 sensitivity (Sen) and 0.99 specificity (Spe) for test. The chocolate feature recognition E-nose gives a classification rate of 85.36% with 0.96 Acc, 0.86 Prc, 0.85 Sen and 0.96 Spe. In addition, a preliminary sample aging analysis was made. The results show that the pressure controlled generated stimulus is reliable for this type of study.

  11. Feature extraction in MFL signals of machined defects in steel tubes

    NASA Astrophysics Data System (ADS)

    Perazzo, R.; Pignotti, A.; Reich, S.; Stickar, P.

    2001-04-01

    Thirty defects of various shapes were machined on the external and internal wall surfaces of a 177 mm diameter ferromagnetic steel pipe. MFL signals were digitized and recorded at a frequency of 4 kHz. Various magnetizing currents and relative tube-probe velocities of the order of 2 m/s were used. The identification of the location of the defect by a principal component/neural network analysis of the signal is shown to be more effective than the standard procedure of classification based on the average signal frequency.

  12. Absorption spectroscopy and multi-angle scattering measurements in the visible spectral range for the geographic classification of Italian extravirgin olive oils

    NASA Astrophysics Data System (ADS)

    Mignani, Anna G.; Ciaccheri, Leonardo; Cimato, Antonio; Sani, Graziano; Smith, Peter R.

    2004-03-01

    Absorption spectroscopy and multi-angle scattering measurements in the visible spectral range are used in an innovative way to analyze samples of extra virgin olive oils coming from selected areas of Tuscany, an Italian region famous for the production of extra virgin olive oil. The measured spectra are processed by means of the Principal Component Analysis method, so as to create a 3D map capable of clustering the Tuscan oils within the wider area of Italian extra virgin olive oils.

  13. Chemotypes of essential oil of unripe galls of Pistacia atlantica Desf. from Algeria.

    PubMed

    Sifi, Ibrahim; Gourine, Nadhir; Gaydou, Emile M; Yousfi, Mohamed

    2015-01-01

    The essential oils (EOs) of unripe galls (from male and female plants) of a total number of 52 samples of Pistacia atlantica collected from different regions in Algeria were analysed by GC/MS and GC. The yields of the extraction of the EO by hydrodistillation vary from low to high values (0.08-1.89% v/w). The results of both methods of principal component analysis and hierarchical ascendant classification revealed the presence of two different chemotypes: α-pinene chemotype and α-pinene/sabinene/terpinen-4-ol chemotype.

  14. Multimodal Neuroimaging: Basic Concepts and Classification of Neuropsychiatric Diseases.

    PubMed

    Tulay, Emine Elif; Metin, Barış; Tarhan, Nevzat; Arıkan, Mehmet Kemal

    2018-06-01

    Neuroimaging techniques are widely used in neuroscience to visualize neural activity, to improve our understanding of brain mechanisms, and to identify biomarkers, especially for psychiatric diseases; however, each neuroimaging technique has several limitations. These limitations led to the development of multimodal neuroimaging (MN), which combines data obtained from multiple neuroimaging techniques, such as electroencephalography and functional magnetic resonance imaging, and yields more detailed information about brain dynamics. There are several types of MN, including visual inspection, data integration, and data fusion. This literature review aimed to provide a brief summary and basic information about MN techniques (data fusion approaches in particular) and classification approaches. Data fusion approaches are generally categorized as asymmetric and symmetric. The present review focused exclusively on studies based on symmetric data fusion methods (data-driven methods), such as independent component analysis and principal component analysis. Machine learning techniques have recently been introduced for use in identifying diseases and biomarkers of disease. The machine learning technique most widely used by neuroscientists is classification, especially support vector machine classification. Several studies differentiated patients with psychiatric diseases from healthy controls using combined datasets. The common conclusion among these studies is that the prediction of diseases improves when data are combined via MN techniques; however, there remain a few challenges associated with MN, such as sample size. Perhaps in the future N-way fusion can be used to combine multiple neuroimaging techniques or nonimaging predictors (eg, cognitive ability) to overcome the limitations of MN.

  15. Discrimination of Rhizoma Gastrodiae (Tianma) using 3D synchronous fluorescence spectroscopy coupled with principal component analysis

    NASA Astrophysics Data System (ADS)

    Fan, Qimeng; Chen, Chaoyin; Huang, Zaiqiang; Zhang, Chunmei; Liang, Pengjuan; Zhao, Shenglan

    2015-02-01

    Rhizoma Gastrodiae (Tianma) of different variants and different geographical origins differs substantially in quality and physiological efficacy. This paper focused on the classification and identification of Tianma of six types (two variants from three different geographical origins) using three dimensional synchronous fluorescence spectroscopy (3D-SFS) coupled with principal component analysis (PCA). 3D-SF spectra of aqueous extracts, which were obtained from Tianma of the six types, were measured by an LS-50B luminescence spectrofluorometer. The experimental results showed that the characteristic fluorescent spectral regions of the 3D-SF spectra were similar, while the intensities of the characteristic regions differed significantly. By coupling these differences in peak intensities with PCA, Tianma of the six types could be discriminated successfully. In conclusion, 3D-SFS coupled with PCA, which is effective, specific, rapid and non-polluting, has an edge for discrimination of similar Chinese herbal medicines. The proposed methodology is a useful tool to classify and identify Tianma of different variants and different geographical origins.

  16. Burnt area mapping from ERS-SAR time series using the principal components transformation

    NASA Astrophysics Data System (ADS)

    Gimeno, Meritxell; San-Miguel Ayanz, Jesus; Barbosa, Paulo M.; Schmuck, Guido

    2003-03-01

    Each year thousands of hectares of forest are burnt across Southern Europe. To date, remote sensing assessments of this phenomenon have focused on the use of optical satellite imagery. However, the presence of clouds and smoke prevents the acquisition of this type of data in some areas. It is possible to overcome this problem by using synthetic aperture radar (SAR) data. Principal component analysis (PCA) was performed to quantify differences between pre- and post-fire images and to investigate the separability over a European Remote Sensing (ERS) SAR time series. Moreover, the transformation was carried out to determine the best conditions to acquire optimal SAR imagery according to meteorological parameters and the procedures to enhance burnt area discrimination for fire damage assessment. A comparative neural network classification was performed in order to map and to assess the burnt areas using a complete ERS time series or just an image before and an image after the fire according to the PCA. The results suggest that ERS is suitable to highlight areas of localized changes associated with forest fire damage in Mediterranean landcover.

  17. Dynamics and spatio-temporal variability of environmental factors in Eastern Australia using functional principal component analysis

    USGS Publications Warehouse

    Szabo, J.K.; Fedriani, E.M.; Segovia-Gonzalez, M. M.; Astheimer, L.B.; Hooper, M.J.

    2010-01-01

    This paper introduces a new technique in ecology to analyze spatial and temporal variability in environmental variables. By using simple statistics, we explore the relations between abiotic and biotic variables that influence animal distributions. However, spatial and temporal variability in rainfall, a key variable in ecological studies, can cause difficulties to any basic model including time evolution. The study was at a landscape scale (three million square kilometers in eastern Australia), mainly over the period of 1998-2004. We simultaneously considered qualitative spatial (soil and habitat types) and quantitative temporal (rainfall) variables in a Geographical Information System environment. In addition to some techniques commonly used in ecology, we applied a new method, Functional Principal Component Analysis, which proved to be very suitable for this case, as it explained more than 97% of the total variance of the rainfall data, providing us with substitute variables that are easier to manage and are even able to explain rainfall patterns. The main variable came from a habitat classification that showed strong correlations with rainfall values and soil types. © 2010 World Scientific Publishing Company.
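    The "explained more than 97% of the total variance" statement refers to the share of variance carried by the retained components. A minimal sketch of that bookkeeping is given below; it applies equally to ordinary and functional PCA once the eigenvalues are available. The function names and the eigenvalues are hypothetical and are not taken from the study.

```python
def explained_variance_ratio(eigenvalues):
    """Fraction of total variance carried by each principal component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def components_needed(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative share
    of variance reaches the threshold (eigenvalues sorted descending)."""
    cumulative = 0.0
    ratios = explained_variance_ratio(sorted(eigenvalues, reverse=True))
    for count, ratio in enumerate(ratios, start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return count
    return len(eigenvalues)


# Hypothetical eigenvalues of a covariance operator or matrix.
eigenvalues = [9.7, 0.2, 0.1]
ratios = explained_variance_ratio(eigenvalues)
n_components = components_needed(eigenvalues, threshold=0.95)
```

    When the first few ratios sum to a value close to one, the corresponding component scores can stand in for the original curves, which is the sense in which the abstract speaks of substitute variables.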

  18. Identification of milk origin and process-induced changes in milk by stable isotope ratio mass spectrometry.

    PubMed

    Scampicchio, Matteo; Mimmo, Tanja; Capici, Calogero; Huck, Christian; Innocente, Nadia; Drusch, Stephan; Cesco, Stefano

    2012-11-14

    Stable isotope values were used to develop a new analytical approach enabling the simultaneous identification of milk samples either processed with different heating regimens or from different geographical origins. The samples consisted of raw, pasteurized (HTST), and ultrapasteurized (UHT) milk from different Italian origins. The approach consisted of the analysis of the isotope ratio of δ¹³C and δ¹⁵N for the milk samples and their fractions (fat, casein, and whey). The main finding of this work is that as the heat processing affects the composition of the milk fractions, changes in δ¹³C and δ¹⁵N were also observed. These changes were used as markers to develop pattern recognition maps based on principal component analysis and supervised classification models, such as linear discriminant analysis (LDA), multivariate regression (MLR), principal component regression (PCR), and partial least-squares (PLS). The results give proof of the concept that isotope ratio mass spectroscopy can discriminate simultaneously between milk samples according to their geographical origin and type of processing.

  19. Fault Detection of Bearing Systems through EEMD and Optimization Algorithm

    PubMed Central

    Lee, Dong-Han; Ahn, Jong-Hyo; Koh, Bong-Hwan

    2017-01-01

    This study proposes a fault detection and diagnosis method for bearing systems using ensemble empirical mode decomposition (EEMD) based feature extraction, in conjunction with particle swarm optimization (PSO), principal component analysis (PCA), and Isomap. First, a mathematical model is assumed to generate vibration signals from damaged bearing components, such as the inner-race, outer-race, and rolling elements. The process of decomposing vibration signals into intrinsic mode functions (IMFs) and extracting statistical features is introduced to develop a damage-sensitive parameter vector. Finally, PCA and Isomap algorithm are used to classify and visualize this parameter vector, to separate damage characteristics from healthy bearing components. Moreover, the PSO-based optimization algorithm improves the classification performance by selecting proper weightings for the parameter vector, to maximize the visualization effect of separating and grouping of parameter vectors in three-dimensional space. PMID:29143772

  20. Using Structural Equation Modeling To Fit Models Incorporating Principal Components.

    ERIC Educational Resources Information Center

    Dolan, Conor; Bechger, Timo; Molenaar, Peter

    1999-01-01

    Considers models incorporating principal components from the perspectives of structural-equation modeling. These models include the following: (1) the principal-component analysis of patterned matrices; (2) multiple analysis of variance based on principal components; and (3) multigroup principal-components analysis. Discusses fitting these models…

  1. Differentiation of Organically and Conventionally Grown Tomatoes by Chemometric Analysis of Combined Data from Proton Nuclear Magnetic Resonance and Mid-infrared Spectroscopy and Stable Isotope Analysis.

    PubMed

    Hohmann, Monika; Monakhova, Yulia; Erich, Sarah; Christoph, Norbert; Wachter, Helmut; Holzgrabe, Ulrike

    2015-11-04

    Because the basic suitability of proton nuclear magnetic resonance spectroscopy (¹H NMR) to differentiate organic versus conventional tomatoes was recently proven, the approach to optimize ¹H NMR classification models (comprising overall 205 authentic tomato samples) by including additional data of isotope ratio mass spectrometry (IRMS, δ¹³C, δ¹⁵N, and δ¹⁸O) and mid-infrared (MIR) spectroscopy was assessed. Both individual and combined analytical methods (¹H NMR + MIR, ¹H NMR + IRMS, MIR + IRMS, and ¹H NMR + MIR + IRMS) were examined using principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), linear discriminant analysis (LDA), and common components and specific weight analysis (ComDim). With regard to classification abilities, fused data of ¹H NMR + MIR + IRMS yielded better validation results (ranging between 95.0 and 100.0%) than individual methods (¹H NMR, 91.3-100%; MIR, 75.6-91.7%), suggesting that the combined examination of analytical profiles enhances authentication of organically produced tomatoes.

  2. A combined qualitative-quantitative approach for the identification of highly co-creative technology-driven firms

    NASA Astrophysics Data System (ADS)

    Milyakov, Hristo; Tanev, Stoyan; Ruskov, Petko

    2011-03-01

    Value co-creation is an emerging business and innovation paradigm; however, there is not enough clarity on the distinctive characteristics of value co-creation as compared to more traditional value creation approaches. The present paper summarizes the results from an empirically-derived research study focusing on the development of a systematic procedure for the identification of firms that are active in value co-creation. The study is based on a sample of 273 firms that were selected for being representative of the breadth of their value co-creation activities. The results include: i) the identification of the key components of value co-creation based on a research methodology using web search and Principal Component Analysis techniques, and ii) the comparison of two different classification techniques identifying the firms with the highest degree of involvement in value co-creation practices. To the best of our knowledge this is the first study using sophisticated data collection techniques to provide a classification of firms according to the degree of their involvement in value co-creation.

  3. Authentication of fattening diet of Iberian pigs according to their volatile compounds profile from raw subcutaneous fat.

    PubMed

    Narváez-Rivas, M; Pablos, F; Jurado, J M; León-Camacho, M

    2011-02-01

    The composition of volatile components of subcutaneous fat from Iberian pig has been studied using purge and trap gas chromatography-mass spectrometry. The composition of the volatile fraction of subcutaneous fat has been used for authentication purposes of different types of Iberian pig fat. Three types of this product have been considered: montanera, extensive cebo and intensive cebo. For classification purposes, several pattern recognition techniques have been applied. In order to find out possible tendencies in the sample distribution as well as the discriminant power of the variables, principal component analysis was applied as a visualisation technique. Linear discriminant analysis (LDA) and soft independent modelling by class analogy (SIMCA) were used to obtain suitable classification models. LDA and SIMCA allowed the differentiation of three fattening diets by using the contents in 2,2,4,6,6-pentamethyl-heptane, m-xylene, 2,4-dimethyl-heptane, 6-methyl-tridecane, 1-methoxy-2-propanol, isopropyl alcohol, o-xylene, 3-ethyl-2,2-dimethyl-oxirane, 2,6-dimethyl-undecane, 3-methyl-3-pentanol and limonene.

  4. A statistical approach to root system classification

    PubMed Central

    Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter

    2013-01-01

    Plant root systems have a key role in ecology and agronomy. Despite the rapid increase in root studies, there is still no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for “plant functional type” identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong, but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Rooting types emerging from measured data were mainly distinguished as diameter/weight-dominated and density-dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity, efforts in architectural measurement techniques are essential. PMID:23914200

  5. A statistical approach to root system classification.

    PubMed

    Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter

    2013-01-01

    Plant root systems have a key role in ecology and agronomy. Despite the rapid increase in root studies, there is still no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for "plant functional type" identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong, but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Rooting types emerging from measured data were mainly distinguished as diameter/weight-dominated and density-dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity, efforts in architectural measurement techniques are essential.

  6. Chemical modeling of groundwater in the Banat Plain, southwestern Romania, with elevated As content and co-occurring species by combining diagrams and unsupervised multivariate statistical approaches.

    PubMed

    Butaciu, Sinziana; Senila, Marin; Sarbu, Costel; Ponta, Michaela; Tanaselia, Claudiu; Cadar, Oana; Roman, Marius; Radu, Emil; Sima, Mihaela; Frentiu, Tiberiu

    2017-04-01

    The study proposes a combined model based on diagrams (Gibbs, Piper, Stuyfzand Hydrogeochemical Classification System) and unsupervised statistical approaches (Cluster Analysis, Principal Component Analysis, Fuzzy Principal Component Analysis, Fuzzy Hierarchical Cross-Clustering) to describe natural enrichment of inorganic arsenic and co-occurring species in groundwater in the Banat Plain, southwestern Romania. Speciation of inorganic As (arsenite, arsenate) was performed, and ion concentrations (Na⁺, K⁺, Ca²⁺, Mg²⁺, HCO₃⁻, Cl⁻, F⁻, SO₄²⁻, PO₄³⁻, NO₃⁻), pH, redox potential, conductivity and total dissolved substances were determined. Classical diagrams provided the hydrochemical characterization, while statistical approaches were helpful to establish (i) the mechanism of natural occurrence of As and F⁻ species and the anthropogenic one for NO₃⁻, SO₄²⁻, PO₄³⁻ and K⁺ and (ii) classification of groundwater based on content of arsenic species. The HCO₃⁻ type of local groundwater and alkaline pH (8.31-8.49) were found to be responsible for the enrichment of arsenic species and occurrence of F⁻, but by different paths. The PO₄³⁻-AsO₄³⁻ ion exchange and water-rock interaction (silicates hydrolysis and desorption from clay) were associated with arsenate enrichment in the oxidizing aquifer. Fuzzy Hierarchical Cross-Clustering was the strongest tool for the rapid simultaneous classification of groundwaters as a function of arsenic content and hydrogeochemical characteristics. The approach indicated the Na⁺-F⁻-pH cluster as a marker for groundwater with naturally elevated As and highlighted which parameters need to be monitored. A chemical conceptual model illustrating the natural and anthropogenic paths and enrichment of As and co-occurring species in the local groundwater, supported by mineralogical analysis of rocks, was established. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. A Hybrid Color Space for Skin Detection Using Genetic Algorithm Heuristic Search and Principal Component Analysis Technique

    PubMed Central

    2015-01-01

    Color is one of the most prominent features of an image and is used in many skin and face detection applications. Color space transformation is widely used by researchers to improve face and skin detection performance. Despite the substantial research efforts in this area, choosing a proper color space in terms of skin and face classification performance which can address issues like illumination variations, various camera characteristics and diversity in skin color tones has remained an open issue. This research proposes a new three-dimensional hybrid color space termed SKN by employing the Genetic Algorithm heuristic and Principal Component Analysis to find the optimal representation of human skin color in over seventeen existing color spaces. The Genetic Algorithm heuristic is used to find the optimal color component combination setup in terms of skin detection accuracy while the Principal Component Analysis projects the optimal Genetic Algorithm solution to a less complex dimension. Pixel-wise skin detection was used to evaluate the performance of the proposed color space. We have employed four classifiers including Random Forest, Naïve Bayes, Support Vector Machine and Multilayer Perceptron in order to generate the human skin color predictive model. The proposed color space was compared to some existing color spaces and shows superior results in terms of pixel-wise skin detection accuracy. Experimental results show that by using the Random Forest classifier, the proposed SKN color space obtained an average F-score and True Positive Rate of 0.953 and False Positive Rate of 0.0482, which outperformed the existing color spaces in terms of pixel-wise skin detection accuracy. The results also indicate that among the classifiers used in this study, Random Forest is the most suitable classifier for pixel-wise skin detection applications. PMID:26267377
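
    The F-score, True Positive Rate and False Positive Rate quoted above can be computed from pixel-wise predictions as in the following minimal sketch. The function name and the ten labels are hypothetical and chosen only for illustration (1 = skin pixel, 0 = non-skin pixel); they are not data from the study.

```python
def detection_metrics(y_true, y_pred):
    """Return (F-score, true positive rate, false positive rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # skin predicted as skin
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-skin predicted as skin
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # skin missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-skin correctly rejected
    precision = tp / (tp + fp) if tp + fp else 0.0
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    denom = precision + tpr
    f_score = 2 * precision * tpr / denom if denom else 0.0
    return f_score, tpr, fpr


# Hypothetical ground truth and predictions for ten pixels.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
f_score, tpr, fpr = detection_metrics(y_true, y_pred)
print(f_score, tpr, fpr)
```

    A high True Positive Rate together with a low False Positive Rate is what the abstract reports for the Random Forest classifier; the F-score summarizes precision and recall in a single number.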

  8. The Application of the EIS in Li-ion Batteries Measurement

    NASA Astrophysics Data System (ADS)

    Zhai, N. S.; Li, M. W.; Wang, W. L.; Zhang, D. L.; Xu, D. G.

    2006-10-01

    The measurement and determination of the lithium ion battery's electrochemical impedance spectroscopy (EIS) and the application of EIS to battery classification are researched in this paper. The lithium ion battery has found extensive applications due to its inherent advantages over other batteries. For proper and sustainable performance, it is necessary to check the uniformity of the lithium ion batteries. In this paper, the equivalent circuit of the lithium ion battery is analyzed; the design of a hardware circuit based on DSP and of software that calculates the EIS of the lithium ion battery is carried out and evaluated. The parameter values of the Li-ion equivalent circuit are determined by the least squares method, and the application of Principal Component Analysis (PCA) to the battery classification is analyzed.

  9. Micro-Raman spectroscopy of natural and synthetic indigo samples.

    PubMed

    Vandenabeele, Peter; Moens, Luc

    2003-02-01

    In this work indigo samples from three different sources are studied by using Raman spectroscopy: the synthetic pigment and pigments from the woad (Isatis tinctoria) and the indigo plant (Indigofera tinctoria). 21 samples were obtained from 8 suppliers; for each sample 5 Raman spectra were recorded and used for further chemometrical analysis. Principal components analysis (PCA) was performed as data reduction method before applying hierarchical cluster analysis. Linear discriminant analysis (LDA) was implemented as a non-hierarchical supervised pattern recognition method to build a classification model. In order to avoid broad-shaped interferences from the fluorescence background, the influence of 1st and 2nd derivatives on the classification was studied by using cross-validation. Although chemically identical, it is shown that Raman spectroscopy in combination with suitable chemometric methods has the potential to discriminate between synthetic and natural indigo samples.

  10. Using principal component analysis to capture individual differences within a unified neuropsychological model of chronic post-stroke aphasia: Revealing the unique neural correlates of speech fluency, phonology and semantics.

    PubMed

    Halai, Ajay D; Woollams, Anna M; Lambon Ralph, Matthew A

    2017-01-01

    Individual differences in the performance profiles of neuropsychologically-impaired patients are pervasive yet there is still no resolution on the best way to model and account for the variation in their behavioural impairments and the associated neural correlates. To date, researchers have generally taken one of three different approaches: a single-case study methodology in which each case is considered separately; a case-series design in which all individual patients from a small coherent group are examined and directly compared; or group studies, in which a sample of cases are investigated as one group with the assumption that they are drawn from a homogenous category and that performance differences are of no interest. In recent research, we have developed a complementary alternative through the use of principal component analysis (PCA) of individual data from large patient cohorts. This data-driven approach not only generates a single unified model for the group as a whole (expressed in terms of the emergent principal components) but is also able to capture the individual differences between patients (in terms of their relative positions along the principal behavioural axes). We demonstrate the use of this approach by considering speech fluency, phonology and semantics in aphasia diagnosis and classification, as well as their unique neural correlates. PCA of the behavioural data from 31 patients with chronic post-stroke aphasia resulted in four statistically-independent behavioural components reflecting phonological, semantic, executive-cognitive and fluency abilities. Even after accounting for lesion volume, entering the four behavioural components simultaneously into a voxel-based correlational methodology (VBCM) analysis revealed that speech fluency (speech quanta) was uniquely correlated with left motor cortex and underlying white matter (including the anterior section of the arcuate fasciculus and the frontal aslant tract), phonological skills with regions in the superior temporal gyrus and pars opercularis, and semantics with the anterior temporal stem. Copyright © 2016 The Author(s). Published by Elsevier Ltd. All rights reserved.

  11. Classification and quantification analysis of peach kernel from different origins with near-infrared diffuse reflection spectroscopy

    PubMed Central

    Liu, Wei; Wang, Zhen-Zhong; Qing, Jian-Ping; Li, Hong-Juan; Xiao, Wei

    2014-01-01

    Background: Peach kernels, which contain several kinds of fatty acids, play an important role in the regulation of a variety of physiological and biological functions. Objective: To establish an innovative and rapid diffuse reflectance near-infrared spectroscopy (DR-NIR) analysis method along with chemometric techniques for the qualitative and quantitative determination of a peach kernel. Materials and Methods: Peach kernel samples from nine different origins were analyzed with high-performance liquid chromatography (HPLC) as a reference method. DR-NIR spectra were recorded in the spectral range 1100-2300 nm. Principal component analysis (PCA) and the partial least squares regression (PLSR) algorithm were applied to obtain prediction models. The Savitzky-Golay derivative and first derivative were adopted for the spectral pre-processing, and PCA was applied to classify the varieties of those samples. For the quantitative calibration, the models of linoleic and oleinic acids were established with the PLSR algorithm and the optimal principal component (PC) numbers were selected with leave-one-out (LOO) cross-validation. The established models were evaluated with the root mean square error of deviation (RMSED) and corresponding correlation coefficients (R²). Results: The PCA results of DR-NIR spectra yield clear classification of the two varieties of peach kernel. PLSR had a better predictive ability. The correlation coefficients of the two calibration models were above 0.99, and the RMSED of linoleic and oleinic acids were 1.266% and 1.412%, respectively. Conclusion: The DR-NIR combined with PCA and PLSR algorithm could be used efficiently to identify and quantify peach kernels and also help to solve the variety problem. PMID:25422544
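
    The leave-one-out (LOO) cross-validation mentioned above can be sketched as follows. This is only an illustration under stated assumptions: a univariate least-squares line stands in for the PLSR model, and the predictor and reference values are hypothetical, not measurements from the study.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_rmse(xs, ys):
    """Hold out each sample once, refit on the rest, and pool the squared errors."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        prediction = slope * xs[i] + intercept
        squared_errors.append((ys[i] - prediction) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical spectral feature (x) and reference concentration (y).
x_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
print(loo_rmse(x_values, y_values))
```

    In a model-selection setting, this error would be computed for each candidate number of components and the setting with the lowest value retained.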

  12. Principal component analysis-based unsupervised feature extraction applied to in silico drug discovery for posttraumatic stress disorder-mediated heart disease.

    PubMed

    Taguchi, Y-h; Iwadate, Mitsuo; Umeyama, Hideaki

    2015-04-30

    Feature extraction (FE) is difficult, particularly if there are more features than samples, as small sample numbers often result in biased outcomes or overfitting. Furthermore, multiple sample classes often complicate FE because evaluating performance, which is usual in supervised FE, is generally harder than in the two-class problem. Developing sample classification independent unsupervised methods would solve many of these problems. Two principal component analysis (PCA)-based FE methods were tested as sample classification independent unsupervised FE methods: variational Bayes PCA (VBPCA), which was extended to perform unsupervised FE, and conventional PCA (CPCA)-based unsupervised FE. VBPCA- and CPCA-based unsupervised FE both performed well when applied to simulated data, and to a posttraumatic stress disorder (PTSD)-mediated heart disease data set that had multiple categorical class observations in mRNA/microRNA expression of stressed mouse heart. A critical set of PTSD miRNAs/mRNAs were identified that show aberrant expression between treatment and control samples, and significant, negative correlation with one another. Moreover, greater stability and biological feasibility than conventional supervised FE was also demonstrated. Based on the results obtained, in silico drug discovery was performed as translational validation of the methods. Our two proposed unsupervised FE methods (CPCA- and VBPCA-based) worked well on simulated data, and outperformed two conventional supervised FE methods on a real data set. Thus, these two methods have suggested equivalence for FE on categorical multiclass data sets, with potential translational utility for in silico drug discovery.

  13. Multiband tangent space mapping and feature selection for classification of EEG during motor imagery.

    PubMed

    Islam, Md Rabiul; Tanaka, Toshihisa; Molla, Md Khademul Islam

    2018-05-08

    When designing multiclass motor imagery-based brain-computer interface (MI-BCI), a so-called tangent space mapping (TSM) method utilizing the geometric structure of covariance matrices is an effective technique. This paper aims to introduce a method using TSM for finding accurate operational frequency bands related to brain activities associated with MI tasks. A multichannel electroencephalogram (EEG) signal is decomposed into multiple subbands, and tangent features are then estimated on each subband. A mutual information analysis-based effective algorithm is implemented to select subbands containing features capable of improving motor imagery classification accuracy. The features thus obtained from the selected subbands are combined to form the feature space. A principal component analysis-based approach is employed to reduce the feature dimension and then the classification is accomplished by a support vector machine (SVM). Offline analysis demonstrates that the proposed multiband tangent space mapping with subband selection (MTSMS) approach outperforms state-of-the-art methods. It achieves the highest average classification accuracy for all datasets (BCI competition dataset 2a, IIIa, IIIb, and dataset JK-HH1). The increased classification accuracy of MI tasks with the proposed MTSMS approach can yield effective implementation of BCI. The mutual information-based subband selection method is implemented to tune operation frequency bands to represent actual motor imagery tasks.
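
    The idea of ranking subbands by mutual information can be illustrated with the minimal sketch below. It assumes that each subband feature has already been discretized into a small number of levels, which is an assumption made here for simplicity and not a detail given in the abstract; the feature values and labels are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), joint_count in count_xy.items():
        p_joint = joint_count / n
        p_product = (count_x[x] / n) * (count_y[y] / n)
        mi += p_joint * math.log2(p_joint / p_product)
    return mi


# Hypothetical class labels and two discretized subband features.
labels = [0, 0, 1, 1]
informative_band = [0, 0, 1, 1]    # tracks the label exactly
uninformative_band = [0, 1, 0, 1]  # statistically independent of the label
print(mutual_information(informative_band, labels))
print(mutual_information(uninformative_band, labels))
```

    A selection step in this spirit would keep the subbands whose features share the most information with the class labels and discard the rest before dimension reduction and classification.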

  14. Multivariate analysis of the volatile components in tobacco based on infrared-assisted extraction coupled to headspace solid-phase microextraction and gas chromatography-mass spectrometry.

    PubMed

    Yang, Yanqin; Pan, Yuanjiang; Zhou, Guojun; Chu, Guohai; Jiang, Jian; Yuan, Kailong; Xia, Qian; Cheng, Changhe

    2016-11-01

    A novel infrared-assisted extraction coupled to headspace solid-phase microextraction followed by gas chromatography with mass spectrometry method has been developed for the rapid determination of the volatile components in tobacco. The optimal extraction conditions for maximizing the extraction efficiency were as follows: 65 μm polydimethylsiloxane-divinylbenzene fiber, extraction time of 20 min, infrared power of 175 W, and distance between the infrared lamp and the headspace vial of 2 cm. Under the optimum conditions, 50 components were found to exist in all ten tobacco samples from different geographical origins. Compared with conventional water-bath heating and nonheating extraction methods, the extraction efficiency of infrared-assisted extraction was greatly improved. Furthermore, multivariate analysis including principal component analysis, hierarchical cluster analysis, and similarity analysis was performed to evaluate the chemical information of these samples and divided them into three classes: rich, moderate, and fresh flavors. These classification results were consistent with the sensory evaluation, which was pivotal and meaningful for tobacco discrimination. As a simple, fast, cost-effective, and highly efficient method, the infrared-assisted extraction coupled to headspace solid-phase microextraction technique, combined with suitable chemometrics, is powerful and promising for distinguishing the geographical origins of tobacco samples. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  15. Identification of functional parameters for the classification of older female fallers and prediction of ‘first-time’ fallers

    PubMed Central

    König, N.; Taylor, W. R.; Armbrecht, G.; Dietzel, R.; Singh, N. B.

    2014-01-01

    Falls remain a challenge for ageing societies. Strong evidence indicates that a previous fall is the strongest single screening indicator for a subsequent fall, and assessing fall risk without relying on fall history is therefore imperative. Testing in three functional domains (using a total of 92 measures) was completed in 84 older women (60–85 years of age), including muscular control, standing balance, and mean and variability of gait. Participants were retrospectively classified as fallers (n = 38) or non-fallers (n = 42) and additionally in a prospective manner to identify first-time fallers (FTFs) (n = 6) within a 12-month follow-up period. Principal component analysis revealed that seven components derived from the 92 functional measures are sufficient to depict the spectrum of functional performance. Inclusion of only three components, related to mean and temporal variability of walking, allowed classification of fallers and non-fallers with a sensitivity and specificity of 74% and 76%, respectively. Furthermore, the results indicate that FTFs show a tendency towards the performance of fallers, even before their first fall occurs. This study suggests that temporal variability and mean spatial parameters of gait are the only functional components among the 92 measures tested that differentiate fallers from non-fallers, and could therefore show efficacy in clinical screening programmes for assessing risk of first-time falling. PMID:24898021

  16. Rapid differentiation of Chinese hop varieties (Humulus lupulus) using volatile fingerprinting by HS-SPME-GC-MS combined with multivariate statistical analysis.

    PubMed

    Liu, Zechang; Wang, Liping; Liu, Yumei

    2018-01-18

    Hops impart flavor to beer, with the volatile components characterizing the various hop varieties and qualities. Fingerprinting, especially flavor fingerprinting, is often used to identify 'flavor products' because inconsistencies in the description of flavor may lead to an incorrect definition of beer quality. Compared to flavor fingerprinting, volatile fingerprinting is simpler and easier. We performed volatile fingerprinting using head space-solid phase micro-extraction gas chromatography-mass spectrometry combined with similarity analysis and principal component analysis (PCA) for evaluating and distinguishing between three major Chinese hops. Eighty-four volatiles were identified, which were classified into seven categories. Volatile fingerprinting based on similarity analysis did not yield any obvious result. By contrast, hop varieties and qualities were identified using volatile fingerprinting based on PCA. The potential variables explained the variance in the three hop varieties. In addition, the dendrogram and principal component score plot described the differences and classifications of hops. Volatile fingerprinting plus multivariate statistical analysis can rapidly differentiate between the different varieties and qualities of the three major Chinese hops. Furthermore, this method can be used as a reference in other fields. © 2018 Society of Chemical Industry.

  17. Use of Cusp Catastrophe for Risk Analysis of Navigational Environment: A Case Study of Three Gorges Reservoir Area

    PubMed Central

    Hao, Guozhu

    2016-01-01

    A water traffic system is a huge, nonlinear, complex system, and its stability is affected by various factors. Water traffic accidents can be considered to be a kind of mutation of a water traffic system caused by the coupling of multiple navigational environment factors. In this study, the catastrophe theory, principal component analysis (PCA), and multivariate statistics are integrated to establish a situation recognition model for a navigational environment with the aim of performing a quantitative analysis of the situation of this environment via the extraction and classification of its key influencing factors; in this model, the natural environment and traffic environment are considered to be two control variables. The Three Gorges Reservoir area of the Yangtze River is considered as an example, and six critical factors, i.e., the visibility, wind, current velocity, route intersection, channel dimension, and traffic flow, are classified into two principal components: the natural environment and traffic environment. These two components are assumed to have the greatest influence on the navigation risk. Then, the cusp catastrophe model is employed to identify the safety situation of the regional navigational environment in the Three Gorges Reservoir area. The simulation results indicate that the situation of the navigational environment of this area is gradually worsening from downstream to upstream. PMID:27391057

  18. Use of Cusp Catastrophe for Risk Analysis of Navigational Environment: A Case Study of Three Gorges Reservoir Area.

    PubMed

    Jiang, Dan; Hao, Guozhu; Huang, Liwen; Zhang, Dan

    2016-01-01

    A water traffic system is a huge, nonlinear, complex system, and its stability is affected by various factors. Water traffic accidents can be considered to be a kind of mutation of a water traffic system caused by the coupling of multiple navigational environment factors. In this study, the catastrophe theory, principal component analysis (PCA), and multivariate statistics are integrated to establish a situation recognition model for a navigational environment with the aim of performing a quantitative analysis of the situation of this environment via the extraction and classification of its key influencing factors; in this model, the natural environment and traffic environment are considered to be two control variables. The Three Gorges Reservoir area of the Yangtze River is considered as an example, and six critical factors, i.e., the visibility, wind, current velocity, route intersection, channel dimension, and traffic flow, are classified into two principal components: the natural environment and traffic environment. These two components are assumed to have the greatest influence on the navigation risk. Then, the cusp catastrophe model is employed to identify the safety situation of the regional navigational environment in the Three Gorges Reservoir area. The simulation results indicate that the situation of the navigational environment of this area is gradually worsening from downstream to upstream.

  19. Fossil Signatures Using Elemental Abundance Distributions and Bayesian Probabilistic Classification

    NASA Technical Reports Server (NTRS)

    Hoover, Richard B.; Storrie-Lombardi, Michael C.

    2004-01-01

    Elemental abundances (C6, N7, O8, Na11, Mg12, Al13, P15, S16, Cl17, K19, Ca20, Ti22, Mn25, Fe26, and Ni28) were obtained for a set of terrestrial fossils and the rock matrix surrounding them. Principal Component Analysis extracted five factors accounting for 92.5% of the data variance, i.e. information content, of the elemental abundance data. Hierarchical Cluster Analysis provided unsupervised sample classification distinguishing fossil from matrix samples on the basis of either raw abundances or PCA input that agreed strongly with visual classification. A stochastic, non-linear Artificial Neural Network produced a Bayesian probability of correct sample classification. The results provide a quantitative probabilistic methodology for discriminating terrestrial fossils from the surrounding rock matrix using chemical information. To demonstrate the applicability of these techniques to the assessment of meteoritic samples or in situ extraterrestrial exploration, we present preliminary data on samples of the Orgueil meteorite. In both systems an elemental signature produces target classification decisions remarkably consistent with morphological classification by a human expert using only structural (visual) information. We discuss the possibility of implementing a complexity analysis metric capable of automating certain image analysis and pattern recognition abilities of the human eye using low magnification optical microscopy images and discuss the extension of this technique across multiple scales.
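
    Statements such as "five factors accounting for 92.5% of the data variance" rest on the cumulative share of the covariance eigenvalues. The sketch below shows that bookkeeping; the eigenvalues are hypothetical placeholders and are not taken from the study.

```python
def components_for_variance(eigenvalues, target):
    """Return (k, share): the smallest number of leading components whose
    cumulative share of the total variance reaches `target`, and that share."""
    total = sum(eigenvalues)
    cumulative = 0.0
    ordered = sorted(eigenvalues, reverse=True)
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target:
            return k, cumulative / total
    # Target above 100%: every component is needed.
    return len(ordered), 1.0


# Hypothetical eigenvalues of a covariance matrix (each is one component's variance).
example_eigenvalues = [6.0, 3.0, 2.0, 1.5, 1.0, 0.5, 0.5, 0.5]
n_components, explained = components_for_variance(example_eigenvalues, 0.85)
print(n_components, explained)
```

    The remaining components carry the residual variance, which is usually treated as noise when the retained scores are passed on to clustering or a classifier.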

  20. Multi-agent Negotiation Mechanisms for Statistical Target Classification in Wireless Multimedia Sensor Networks

    PubMed Central

    Wang, Xue; Bi, Dao-wei; Ding, Liang; Wang, Sheng

    2007-01-01

    The recent availability of low cost and miniaturized hardware has allowed wireless sensor networks (WSNs) to retrieve audio and video data in real world applications, which has fostered the development of wireless multimedia sensor networks (WMSNs). Resource constraints and challenging multimedia data volume make development of efficient algorithms to perform in-network processing of multimedia contents imperative. This paper proposes solving problems in the domain of WMSNs from the perspective of multi-agent systems. The multi-agent framework enables flexible network configuration and efficient collaborative in-network processing. The focus is placed on target classification in WMSNs where audio information is retrieved by microphones. To deal with the uncertainties related to audio information retrieval, the statistical approaches of power spectral density estimates, principal component analysis and Gaussian process classification are employed. A multi-agent negotiation mechanism is specially developed to efficiently utilize limited resources and simultaneously enhance classification accuracy and reliability. The negotiation is composed of two phases, where an auction based approach is first exploited to allocate the classification task among the agents and then individual agent decisions are combined by the committee decision mechanism. Simulation experiments with real world data are conducted and the results show that the proposed statistical approaches and negotiation mechanism not only reduce memory and computation requirements in WMSNs but also significantly enhance classification accuracy and reliability. PMID:28903223
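
    The abstract does not say how the committee combines individual agent decisions. Purely as an assumption for illustration, the sketch below uses simple majority voting and reports the share of agents that agree with the winning label; the function name and labels are hypothetical.

```python
from collections import Counter


def committee_decision(agent_labels):
    """Majority vote over agent decisions.

    Returns (label, agreement), where agreement is the fraction of agents
    that voted for the winning label. Ties go to the label seen first.
    """
    counts = Counter(agent_labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(agent_labels)


# Hypothetical decisions from three sensor-node agents.
decision, agreement = committee_decision(["vehicle", "person", "vehicle"])
print(decision, agreement)
```

    A weighted vote, for example one that weights each agent by its classifier confidence, would be a natural variation on the same idea.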

  1. Classification of plum spirit drinks by synchronous fluorescence spectroscopy.

    PubMed

    Sádecká, J; Jakubíková, M; Májek, P; Kleinová, A

    2016-04-01

    Synchronous fluorescence spectroscopy was used in combination with principal component analysis (PCA) and linear discriminant analysis (LDA) for the differentiation of plum spirits according to their geographical origin. A total of 14 Czech, 12 Hungarian and 18 Slovak plum spirit samples were used. The samples were divided into two categories: colorless (22 samples) and colored (22 samples). Synchronous fluorescence spectra (SFS) obtained at a wavelength difference of 60 nm provided the best results. Considering the PCA-LDA applied to the SFS of all samples, Czech, Hungarian and Slovak colorless samples were properly classified in both the calibration and prediction sets. 100% correct classification was also obtained for Czech and Hungarian colored samples. However, one group of Slovak colored samples was classified as belonging to the Hungarian group in the calibration set. Thus, the total correct classifications obtained were 94% and 100% for the calibration and prediction steps, respectively. The results were compared with those obtained using near-infrared (NIR) spectroscopy. Applying PCA-LDA to NIR spectra (5500-6000 cm⁻¹), the total correct classifications were 91% and 92% for the calibration and prediction steps, respectively, which were slightly lower than those obtained using SFS. Copyright © 2015 Elsevier Ltd. All rights reserved.

  2. An Evaluation of Feature Learning Methods for High Resolution Image Classification

    NASA Astrophysics Data System (ADS)

    Tokarczyk, P.; Montoya, J.; Schindler, K.

    2012-07-01

    Automatic image classification is one of the fundamental problems of remote sensing research. The classification problem is even more challenging in high-resolution images of urban areas, where the objects are small and heterogeneous. Two questions arise, namely which features to extract from the raw sensor data to capture the local radiometry and image structure at each pixel or segment, and which classification method to apply to the feature vectors. While classifiers are nowadays well understood, selecting the right features remains a largely empirical process. Here we concentrate on the features. Several methods are evaluated which allow one to learn suitable features from unlabelled image data by analysing the image statistics. In a comparative study, we evaluate unsupervised feature learning with different linear and non-linear learning methods, including principal component analysis (PCA) and deep belief networks (DBN). We also compare these automatically learned features with popular choices of ad-hoc features including raw intensity values, standard combinations like the NDVI, a few PCA channels, and texture filters. The comparison is done in a unified framework using the same images, the target classes, reference data and a Random Forest classifier.

  3. Creating a Taxonomy of Local Boards of Health Based on Local Health Departments’ Perspectives

    PubMed Central

    Shah, Gulzar H.; Sotnikov, Sergey; Leep, Carolyn J.; Ye, Jiali; Van Wave, Timothy W.

    2017-01-01

    Objectives: To develop a local board of health (LBoH) classification scheme and empirical definitions to provide a coherent framework for describing variation in the LBoHs. Methods: This study is based on data from the 2015 Local Board of Health Survey, conducted among a nationally representative sample of local health department administrators, with 394 responses. The classification development consisted of the following steps: (1) theoretically guided initial domain development, (2) mapping of the survey variables to the proposed domains, (3) data reduction using principal component analysis and group consensus, and (4) scale development and testing for internal consistency. Results: The final classification scheme included 60 items across 6 governance function domains and an additional domain (LBoH characteristics and strengths, such as meeting frequency, composition, and diversity of information sources). Application of this classification strongly supports the premise that LBoHs differ in their performance of governance functions and in other characteristics. Conclusions: The LBoH taxonomy provides an empirically tested standardized tool for classifying LBoHs from the viewpoint of local health department administrators. Future studies can use this taxonomy to better characterize the impact of LBoHs. PMID:27854524

  4. A Study of Feature Combination for Vehicle Detection Based on Image Processing

    PubMed Central

    2014-01-01

    Video analytics play a critical role in most recent traffic monitoring and driver assistance systems. In this context, the correct detection and classification of surrounding vehicles through image analysis has been the focus of extensive research in the last years. Most of the pieces of work reported for image-based vehicle verification make use of supervised classification approaches and resort to techniques, such as histograms of oriented gradients (HOG), principal component analysis (PCA), and Gabor filters, among others. Unfortunately, existing approaches are lacking in two respects: first, comparison between methods using a common body of work has not been addressed; second, no study of the combination potentiality of popular features for vehicle classification has been reported. In this study the performance of the different techniques is first reviewed and compared using a common public database. Then, the combination capabilities of these techniques are explored and a methodology is presented for the fusion of classifiers built upon them, taking into account also the vehicle pose. The study unveils the limitations of single-feature based classification and makes clear that fusion of classifiers is highly beneficial for vehicle verification. PMID:24672299

  5. Automatic age and gender classification using supervised appearance model

    NASA Astrophysics Data System (ADS)

    Bukar, Ali Maina; Ugail, Hassan; Connah, David

    2016-11-01

    Age and gender classification are two important problems that recently gained popularity in the research community, due to their wide range of applications. Research has shown that both age and gender information are encoded in the face shape and texture, hence the active appearance model (AAM), a statistical model that captures shape and texture variations, has been one of the most widely used feature extraction techniques for the aforementioned problems. However, AAM suffers from some drawbacks, especially when used for classification. This is primarily because principal component analysis (PCA), which is at the core of the model, works in an unsupervised manner, i.e., PCA dimensionality reduction does not take into account how the predictor variables relate to the response (class labels). Rather, it explores only the underlying structure of the predictor variables, thus, it is no surprise if PCA discards valuable parts of the data that represent discriminatory features. Toward this end, we propose a supervised appearance model (sAM) that improves on AAM by replacing PCA with partial least-squares regression. This feature extraction technique is then used for the problems of age and gender classification. Our experiments show that sAM has better predictive power than the conventional AAM.
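
    The point that PCA ignores class labels can be made concrete with a small two-dimensional sketch. The data below are hypothetical: two classes differ only along the low-variance vertical axis, while most of the variance lies along the horizontal axis. The closed-form angle used for the leading eigenvector holds for a 2x2 symmetric covariance matrix.

```python
import math


def pca_2d(points):
    """Return (mean, axis): the centroid and the unit vector of the first principal axis."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Orientation of the leading eigenvector of [[sxx, sxy], [sxy, syy]].
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mean_x, mean_y), (math.cos(theta), math.sin(theta))


def scores(points, mean, axis):
    """Project centered points onto the given axis."""
    return [(x - mean[0]) * axis[0] + (y - mean[1]) * axis[1] for x, y in points]


# Hypothetical classes that are separable only along the vertical coordinate.
class_a = [(-10.0, 0.0), (-5.0, 0.0), (0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
class_b = [(-10.0, 1.0), (-5.0, 1.0), (0.0, 1.0), (5.0, 1.0), (10.0, 1.0)]
mean, axis = pca_2d(class_a + class_b)
scores_a = scores(class_a, mean, axis)
scores_b = scores(class_b, mean, axis)
print(axis, scores_a, scores_b)
```

    Keeping only the first principal component here yields the same scores for both classes, so the discriminating direction is lost; a supervised projection such as partial least squares uses the labels to avoid this.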

  6. Feasibility of laser-induced breakdown spectroscopy (LIBS) for classification of sea salts.

    PubMed

    Tan, Man Minh; Cui, Sheng; Yoo, Jonghyun; Han, Song-Hee; Ham, Kyung-Sik; Nam, Sang-Ho; Lee, Yonghoon

    2012-03-01

    We have investigated the feasibility of laser-induced breakdown spectroscopy (LIBS) as a fast, reliable classification tool for sea salts. For 11 kinds of sea salts, potassium (K), magnesium (Mg), calcium (Ca), and aluminum (Al) concentrations were measured by inductively coupled plasma-atomic emission spectroscopy (ICP-AES), and the LIBS spectra were recorded in the narrow wavelength region between 760 and 800 nm where K (I), Mg (I), Ca (II), Al (I), and cyanide (CN) band emissions are observed. The ICP-AES measurements revealed that the K, Mg, Ca, and Al concentrations varied significantly with the provenance of each salt. The relative intensities of the K (I), Mg (I), Ca (II), and Al (I) peaks observed in the LIBS spectra are consistent with the results using ICP-AES. The principal component analysis of the LIBS spectra provided the score plot with quite a high degree of clustering. This indicates that classification of sea salts by chemometric analysis of LIBS spectra is very promising. Classification models were developed by partial least squares discriminant analysis (PLS-DA) and evaluated. In addition, the Al (I) peaks enabled us to discriminate between different production methods of the salts. © 2012 Society for Applied Spectroscopy

  7. Quantifying tolerance indicator values for common stream fish species of the United States

    USGS Publications Warehouse

    Meador, M.R.; Carlisle, D.M.

    2007-01-01

    The classification of fish species tolerance to environmental disturbance is often used as a means to assess ecosystem conditions. Its use, however, may be problematic because the approach to tolerance classification is based on subjective judgment. We analyzed fish and physicochemical data from 773 stream sites collected as part of the U.S. Geological Survey's National Water-Quality Assessment Program to calculate tolerance indicator values for 10 physicochemical variables using weighted averaging. Tolerance indicator values (TIVs) for ammonia, chloride, dissolved oxygen, nitrite plus nitrate, pH, phosphorus, specific conductance, sulfate, suspended sediment, and water temperature were calculated for 105 common fish species of the United States. Tolerance indicator values for specific conductance and sulfate were correlated (rho = 0.87), and thus, fish species may be co-tolerant to these water-quality variables. We integrated TIVs for each species into an overall tolerance classification for comparisons with judgment-based tolerance classifications. Principal components analysis indicated that the distinction between tolerant and intolerant classifications was determined largely by tolerance to suspended sediment, specific conductance, chloride, and total phosphorus. Factors such as water temperature, dissolved oxygen, and pH may not be as important in distinguishing between tolerant and intolerant classifications, but may help to segregate species classified as moderate. Empirically derived tolerance classifications were 58.8% in agreement with judgment-derived tolerance classifications. Canonical discriminant analysis revealed that few TIVs, primarily chloride, could discriminate among judgment-derived tolerance classifications of tolerant, moderate, and intolerant. To our knowledge, this is the first empirically based understanding of fish species tolerance for stream fishes in the United States.
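    As a rough illustration of the weighted averaging mentioned above, the sketch below computes an abundance-weighted mean of one environmental variable across sites. The abstract does not spell out the weighting scheme, so weighting by raw counts, the function name, and the sample numbers are all assumptions.

```python
def tolerance_indicator_value(abundances, env_values):
    """Abundance-weighted mean of an environmental variable across sites."""
    total = sum(abundances)
    if total == 0:
        raise ValueError("species was not observed at any site")
    return sum(a * x for a, x in zip(abundances, env_values)) / total

# Hypothetical data: counts of one species at four sites and the chloride
# concentration (mg/L) measured at each of those sites.
counts = [0, 5, 15, 30]
chloride = [5.0, 10.0, 20.0, 40.0]

# (0*5 + 5*10 + 15*20 + 30*40) / 50 = 1550 / 50
tiv = tolerance_indicator_value(counts, chloride)
```

    A species found mostly at high-chloride sites receives a high value for chloride, which is the sense in which the value indicates tolerance.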

  8. Evaluation of different classification methods for the diagnosis of schizophrenia based on functional near-infrared spectroscopy.

    PubMed

    Li, Zhaohua; Wang, Yuduo; Quan, Wenxiang; Wu, Tongning; Lv, Bin

    2015-02-15

    Based on near-infrared spectroscopy (NIRS), recent converging evidence has been observed that patients with schizophrenia exhibit abnormal functional activities in the prefrontal cortex during a verbal fluency task (VFT). Therefore, some studies have attempted to employ NIRS measurements to differentiate schizophrenia patients from healthy controls with different classification methods. However, no systematic evaluation was conducted to compare their respective classification performances on the same study population. In this study, we evaluated the classification performance of four classification methods (including linear discriminant analysis, k-nearest neighbors, Gaussian process classifier, and support vector machines) on an NIRS-aided schizophrenia diagnosis. We recruited a large sample of 120 schizophrenia patients and 120 healthy controls and measured the hemoglobin response in the prefrontal cortex during the VFT using a multichannel NIRS system. Features for classification were extracted from three types of NIRS data in each channel. We subsequently performed a principal component analysis (PCA) for feature selection prior to comparison of the different classification methods. We achieved a maximum accuracy of 85.83% and an overall mean accuracy of 83.37% using a PCA-based feature selection on oxygenated hemoglobin signals and support vector machine classifier. This is the first comprehensive evaluation of different classification methods for the diagnosis of schizophrenia based on different types of NIRS signals. Our results suggested that, using the appropriate classification method, NIRS has the potential capacity to be an effective objective biomarker for the diagnosis of schizophrenia. Copyright © 2014 Elsevier B.V. All rights reserved.

  9. Study on pattern recognition of Raman spectrum based on fuzzy neural network

    NASA Astrophysics Data System (ADS)

    Zheng, Xiangxiang; Lv, Xiaoyi; Mo, Jiaqing

    2017-10-01

    Hydatid disease is a serious parasitic disease in many regions worldwide, especially in Xinjiang, China. The Raman spectrum of the serum of patients with echinococcosis was selected as the research object in this paper. The Raman spectra of blood samples from healthy people and patients with echinococcosis are measured, and their spectral characteristics are analyzed. The fuzzy neural network combines the ability of fuzzy logic to deal with uncertain information with the ability of a neural network to store knowledge, so it is applied to the problem of disease diagnosis based on the Raman spectrum. Firstly, principal component analysis (PCA) is used to extract the principal components of the Raman spectrum, reducing the network input and improving the prediction speed and accuracy of the network while retaining the information in the original data. Then, the extracted principal components are used as the input of the neural network; the hidden layer of the network carries out rule generation and inference, and the output layer produces the fuzzy classification output. Finally, a subset of the samples is randomly selected to train the network, the trained network is used to predict the remaining samples, and the predicted results are compared with those of a general BP neural network to illustrate the feasibility and advantages of the fuzzy neural network. Success in this endeavor would be helpful for research on the spectroscopic diagnosis of disease, and the approach can be applied in practice in many other fields of spectral analysis.

  10. Quality Evaluation and Chemical Markers Screening of Salvia miltiorrhiza Bge. (Danshen) Based on HPLC Fingerprints and HPLC-MSn Coupled with Chemometrics.

    PubMed

    Liang, Wenyi; Chen, Wenjing; Wu, Lingfang; Li, Shi; Qi, Qi; Cui, Yaping; Liang, Linjin; Ye, Ting; Zhang, Lanzhen

    2017-03-17

    Danshen, the dried root of Salvia miltiorrhiza Bge., is a widely used commercially available herbal drug, and unstable quality of different samples is a current issue. This study focused on a comprehensive and systematic method combining fingerprints and chemical identification with chemometrics for discrimination and quality assessment of Danshen samples. Twenty-five samples were analyzed by HPLC-PAD and HPLC-MSn. Forty-nine components were identified and characteristic fragmentation regularities were summarized for further interpretation of bioactive components. Chemometric analysis, including hierarchical cluster analysis, principal component analysis, and partial least squares discriminant analysis, was employed to differentiate samples and clarify the quality differences of Danshen. The methods consistently divided the samples into three categories, which reflected the difference in quality of Danshen samples. By analyzing the reasons for sample classification, it was revealed that the processing method had a more obvious impact on sample classification than the geographical origin; it induced different contents of bioactive compounds and finally led to different qualities. Cryptotanshinone, trijuganone B, and 15,16-dihydrotanshinone I were screened out as markers to distinguish samples by different processing methods. The developed strategy could provide a reference for evaluation and discrimination of other traditional herbal medicines.

  11. Supercritical Fluid Chromatography of Drugs: Parallel Factor Analysis for Column Testing in a Wide Range of Operational Conditions

    PubMed Central

    Al-Degs, Yahya; Andri, Bertyl; Thiébaut, Didier; Vial, Jérôme

    2017-01-01

    Retention mechanisms involved in supercritical fluid chromatography (SFC) are influenced by interdependent parameters (temperature, pressure, chemistry of the mobile phase, and nature of the stationary phase), a complexity which makes the selection of a proper stationary phase for a given separation a challenging step. For the first time in SFC studies, Parallel Factor Analysis (PARAFAC) was employed to evaluate the chromatographic behavior of eight different stationary phases in a wide range of chromatographic conditions (temperature, pressure, and gradient elution composition). Design of Experiment was used to optimize experiments involving 14 pharmaceutical compounds present in biological and/or environmental samples and with dissimilar physicochemical properties. The results showed the superiority of PARAFAC for the analysis of the three-way (column × drug × condition) data array over unfolding the multiway array to matrices and performing several classical principal component analyses. Thanks to the PARAFAC components, similarity in columns' function, chromatographic trend of drugs, and correlation between separation conditions could be simply depicted: columns were grouped according to their H-bonding forces, while gradient composition was dominating for condition classification. Also, the number of drugs could be efficiently reduced for columns classification as some of them exhibited a similar behavior, as shown by hierarchical clustering based on PARAFAC components. PMID:28695040

  12. Capturing multidimensionality in stroke aphasia: mapping principal behavioural components to neural structures

    PubMed Central

    Butler, Rebecca A.

    2014-01-01

    Stroke aphasia is a multidimensional disorder in which patient profiles reflect variation along multiple behavioural continua. We present a novel approach to separating the principal aspects of chronic aphasic performance and isolating their neural bases. Principal components analysis was used to extract core factors underlying performance of 31 participants with chronic stroke aphasia on a large, detailed battery of behavioural assessments. The rotated principal components analysis revealed three key factors, which we labelled as phonology, semantic and executive/cognition on the basis of the common elements in the tests that loaded most strongly on each component. The phonology factor explained the most variance, followed by the semantic factor and then the executive-cognition factor. The use of principal components analysis rendered participants’ scores on these three factors orthogonal and therefore ideal for use as simultaneous continuous predictors in a voxel-based correlational methodology analysis of high resolution structural scans. Phonological processing ability was uniquely related to left posterior perisylvian regions including Heschl’s gyrus, posterior middle and superior temporal gyri and superior temporal sulcus, as well as the white matter underlying the posterior superior temporal gyrus. The semantic factor was uniquely related to left anterior middle temporal gyrus and the underlying temporal stem. The executive-cognition factor was not correlated selectively with the structural integrity of any particular region, as might be expected in light of the widely-distributed and multi-functional nature of the regions that support executive functions. The identified phonological and semantic areas align well with those highlighted by other methodologies such as functional neuroimaging and neurostimulation. The use of principal components analysis allowed us to characterize the neural bases of participants’ behavioural performance more robustly and selectively than the use of raw assessment scores or diagnostic classifications because principal components analysis extracts statistically unique, orthogonal behavioural components of interest. As such, in addition to improving our understanding of lesion–symptom mapping in stroke aphasia, the same approach could be used to clarify brain–behaviour relationships in other neurological disorders. PMID:25348632

  13. Classification of circulation type sequences applied to snow avalanches over the eastern Pyrenees (Andorra and Catalonia)

    NASA Astrophysics Data System (ADS)

    Esteban, Pere; Beck, Christoph; Philipp, Andreas

    2010-05-01

    Using data associated with accidents or damages caused by snow avalanches over the eastern Pyrenees (Andorra and Catalonia), several atmospheric circulation type catalogues have been obtained. For this purpose, different circulation type classification methods based on Principal Component Analysis (T-mode and S-mode using the extreme scores) and on optimization procedures (Improved K-means and SANDRA) were applied. Considering the characteristics of the phenomena studied, not only single day circulation patterns were taken into account but also sequences of circulation types of varying length. Thus different classifications with different numbers of types and for different sequence lengths were obtained using the different classification methods. Simple between-type variability, within-type variability, and outlier detection procedures have been applied to select the best snow avalanche type classifications. Furthermore, days without occurrence of the hazards were also related to the avalanche centroids using pattern correlations, facilitating the calculation of the anomalies between hazardous and non-hazardous days, and also of the frequencies of occurrence of hazardous events for each circulation type. Finally, the catalogues statistically considered the best are evaluated using the expert knowledge of avalanche forecasters. A consistent explanation of snow avalanche occurrence by means of circulation sequences is obtained, but always considering results from classifications with different sequence lengths. This work has been developed in the framework of the COST Action 733 (Harmonisation and Applications of Weather Type Classifications for European regions).

  14. Predication of different stages of Alzheimer's disease using neighborhood component analysis and ensemble decision tree.

    PubMed

    Jin, Mingwu; Deng, Weishu

    2018-05-15

    There is a spectrum of the progression from healthy control (HC) to mild cognitive impairment (MCI) without conversion to Alzheimer's disease (AD), to MCI with conversion to AD (cMCI), and to AD. This study aims to predict the different disease stages using brain structural information provided by magnetic resonance imaging (MRI) data. The neighborhood component analysis (NCA) is applied to select the most powerful features for prediction. The ensemble decision tree classifier is built to predict which group the subject belongs to. The best features and model parameters are determined by cross validation of the training data. Our results show that 16 out of a total of 429 features were selected by NCA using 240 training subjects, including MMSE score and structural measures in memory-related regions. The boosting tree model with NCA features can achieve prediction accuracy of 56.25% on 160 test subjects. Principal component analysis (PCA) and sequential feature selection (SFS) are also used for feature selection, while support vector machine (SVM) is also used for classification. The boosting tree model with NCA features outperforms all other combinations of feature selection and classification methods. The results suggest that NCA is a better feature selection strategy than PCA and SFS for the data used in this study. Ensemble tree classifier with boosting is more powerful than SVM to predict the subject group. However, more advanced feature selection and classification methods or additional measures besides structural MRI may be needed to improve the prediction performance. Copyright © 2018 Elsevier B.V. All rights reserved.

  15. Landslides Identification Using Airborne Laser Scanning Data Derived Topographic Terrain Attributes and Support Vector Machine Classification

    NASA Astrophysics Data System (ADS)

    Pawłuszek, Kamila; Borkowski, Andrzej

    2016-06-01

    Since the availability of high-resolution Airborne Laser Scanning (ALS) data, substantial progress has been made in geomorphological research, especially in landslide analysis. First and second order derivatives of the Digital Terrain Model (DTM) have become a popular and powerful tool in landslide inventory mapping. Nevertheless, automatic landslide mapping based on sophisticated classifiers including Support Vector Machine (SVM), Artificial Neural Network or Random Forests is often computationally time consuming. The objective of this research is to deeply explore topographic information provided by ALS data and overcome the computational time limitation. For this reason, an extended set of topographic features and the Principal Component Analysis (PCA) were used to reduce redundant information. The proposed novel approach was tested on a susceptible area affected by more than 50 landslides located on Rożnów Lake in the Carpathian Mountains, Poland. The initial seven PCA components, with 90% of the total variability in the original topographic attributes, were used for SVM classification. Comparing results with the landslide inventory map, the average user's accuracy (UA), producer's accuracy (PA), and overall accuracy (OA) were calculated for two models according to the classification results. For the PCA-feature-reduced model, UA, PA, and OA were found to be 72%, 76%, and 72%, respectively. Similarly, UA, PA, and OA in the non-reduced original topographic model were 74%, 77%, and 74%, respectively. Using the initial seven PCA components instead of the twenty original topographic attributes does not significantly change identification accuracy but reduces computational time.
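    Retaining the leading components that account for a fixed share of the variance, as in the 90% criterion above, can be sketched as follows. The eigenvalues and the function name are hypothetical; the sketch assumes the eigenvalues of the attribute covariance matrix have already been computed.

```python
def components_for_variance(eigenvalues, target=0.90):
    """Smallest number of leading components whose cumulative share of the
    total variance reaches the target fraction."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k
    return len(ordered)

# Hypothetical eigenvalues for six correlated attributes.
eigs = [5.0, 2.5, 1.7, 0.4, 0.25, 0.15]

k90 = components_for_variance(eigs)        # cumulative shares: 0.50, 0.75, 0.92
k95 = components_for_variance(eigs, 0.95)  # the next share is 0.96
```

    The trade-off reported in the abstract follows the same logic: fewer inputs to the classifier at the cost of the variance left in the discarded components.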

  16. Scoliosis curve type classification using kernel machine from 3D trunk image

    NASA Astrophysics Data System (ADS)

    Adankon, Mathias M.; Dansereau, Jean; Parent, Stefan; Labelle, Hubert; Cheriet, Farida

    2012-03-01

    Adolescent idiopathic scoliosis (AIS) is a deformity of the spine manifested by asymmetry and deformities of the external surface of the trunk. Classification of scoliosis deformities according to curve type is used to plan management of scoliosis patients. Currently, scoliosis curve type is determined based on X-ray exam. However, cumulative exposure to X-ray radiation significantly increases the risk for certain cancers. In this paper, we propose a robust system that can classify the scoliosis curve type from noninvasive acquisition of the 3D trunk surface of the patients. The 3D image of the trunk is divided into patches, and local geometric descriptors characterizing the surface of the back are computed from each patch to form the features. We reduce the dimensionality using Principal Component Analysis, retaining 53 components. In this work a multi-class classifier is built with a least-squares support vector machine (LS-SVM), which is a kernel classifier. For this study, a new kernel was designed in order to achieve a robust classifier in comparison with polynomial and Gaussian kernels. The proposed system was validated using data of 103 patients with different scoliosis curve types diagnosed and classified by an orthopedic surgeon from the X-ray images. The average rate of successful classification was 93.3%, with a better rate of prediction for the major thoracic and lumbar/thoracolumbar types.

  17. A Model Comparison for Characterizing Protein Motions from Structure

    NASA Astrophysics Data System (ADS)

    David, Charles; Jacobs, Donald

    2011-10-01

    A comparative study is made using three computational models that characterize native state dynamics starting from known protein structures taken from four distinct SCOP classifications. A geometrical simulation is performed, and the results are compared to the elastic network model and molecular dynamics. The essential dynamics is quantified by a direct analysis of a mode subspace constructed from ANM and a principal component analysis on both the FRODA and MD trajectories using root mean square inner product and principal angles. Relative subspace sizes and overlaps are visualized using the projection of displacement vectors on the model modes. Additionally, a mode subspace is constructed from PCA on an exemplar set of X-ray crystal structures in order to determine similarity with respect to the generated ensembles. Quantitative analysis reveals there is significant overlap across the three model subspaces and the model independent subspace. These results indicate that structure is the key determinant for native state dynamics.
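    The root mean square inner product named above is commonly defined as the square root of the summed squared dot products between two sets of D orthonormal modes, divided by D. The sketch below follows that common definition. It assumes both sets contain the same number of orthonormal vectors, and the example modes are hypothetical.

```python
import math

def rmsip(modes_a, modes_b):
    """Root mean square inner product between two equally sized sets of
    orthonormal mode vectors (1.0 = identical subspaces, 0.0 = orthogonal)."""
    if len(modes_a) != len(modes_b):
        raise ValueError("both mode sets must have the same number of vectors")
    total = 0.0
    for u in modes_a:
        for v in modes_b:
            dot = sum(ui * vi for ui, vi in zip(u, v))
            total += dot * dot
    return math.sqrt(total / len(modes_a))

# Hypothetical 3-D example: the two subspaces share exactly one direction.
e1, e2, e3 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

same = rmsip([e1, e2], [e1, e2])     # identical subspaces
partial = rmsip([e1, e2], [e1, e3])  # one of two directions shared
```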

  18. Discrimination of a chestnut-oak forest unit for geologic mapping by means of a principal component enhancement of Landsat multispectral scanner data.

    USGS Publications Warehouse

    Krohn, M.D.; Milton, N.M.; Segal, D.; Enland, A.

    1981-01-01

    A principal component image enhancement has been effective in applying Landsat data to geologic mapping in a heavily forested area of E Virginia. The image enhancement procedure consists of a principal component transformation, a histogram normalization, and the inverse principal component transformation. The enhancement preserves the independence of the principal components, yet produces a more readily interpretable image than does a single principal component transformation. -from Authors

  19. Classification of white wine aromas with an electronic nose.

    PubMed

    Lozano, J; Santos, J P; Horrillo, M C

    2005-09-15

    This paper reports the use of a tin dioxide multisensor array based electronic nose for recognition of 29 typical aromas in white wine. Headspace technique has been used to extract aroma of the wine. Multivariate analysis, including principal component analysis (PCA) as well as probabilistic neural networks (PNNs), has been used to identify the main aroma added to the wine. The results showed that in spite of the strong influence of ethanol and other majority compounds of wine, the system could discriminate correctly the aromatic compounds added to the wine with a minimum accuracy of 97.2%.

  20. Use of Raman spectroscopy in the analysis of nickel allergy

    NASA Astrophysics Data System (ADS)

    Alda, Javier; Castillo-Martinez, Claudio; Valdes-Rodriguez, Rodrigo; Hernández-Blanco, Diana; Moncada, Benjamin; González, Francisco J.

    2013-06-01

    Raman spectra of the skin of subjects with nickel allergy are analyzed and compared to the spectra of healthy subjects to detect possible biochemical differences in the structure of the skin that could help diagnose metal allergies in a noninvasive manner. Results show differences between the two groups of Raman spectra. These spectral differences can be classified using principal component analysis. Based on these findings, a novel computational technique to make a fast evaluation and classification of the Raman spectra of the skin is presented and proposed as a noninvasive technique for the detection of nickel allergy.

  1. Diagnostic analysis of liver B ultrasonic texture features based on LM neural network

    NASA Astrophysics Data System (ADS)

    Chi, Qingyun; Hua, Hu; Liu, Menglin; Jiang, Xiuying

    2017-03-01

    In this study, B ultrasound images of 124 patients with benign or malignant conditions were randomly selected as the study objects. The B ultrasound images of the liver were enhanced and de-noised. The gray level co-occurrence matrix, which reflects the information of each angle, was constructed, and 22 texture features were extracted, processed by Principal Component Analysis, and combined with an LM neural network for diagnosis and classification. Experimental results show that this method is a rapid and effective diagnostic method for liver imaging, which provides a quantitative basis for clinical diagnosis of liver diseases.

  2. The biometric-based module of smart grid system

    NASA Astrophysics Data System (ADS)

    Engel, E.; Kovalev, I. V.; Ermoshkina, A.

    2015-10-01

    Within the Smart Grid concept, a flexible biometric-based module based on Principal Component Analysis (PCA) and a selective Neural Network is developed. To form the selective Neural Network, the biometric-based module uses a method that includes three main stages: preliminary processing of the image, face localization, and face recognition. Experiments on the Yale face database show that (i) the selective Neural Network exhibits promising classification capability for face detection and recognition problems; and (ii) the proposed biometric-based module achieves near real-time face detection and recognition speed and competitive performance compared to some existing subspace-based methods.

  3. Alcoholism detection in magnetic resonance imaging by Haar wavelet transform and back propagation neural network

    NASA Astrophysics Data System (ADS)

    Yu, Yali; Wang, Mengxia; Lima, Dimas

    2018-04-01

    In order to develop a novel alcoholism detection method, we proposed a magnetic resonance imaging (MRI)-based computer vision approach. We first use contrast equalization to increase the contrast of brain slices. Then, we perform Haar wavelet transform and principal component analysis. Finally, we use back propagation neural network (BPNN) as the classification tool. Our method yields a sensitivity of 81.71±4.51%, a specificity of 81.43±4.52%, and an accuracy of 81.57±2.18%. The Haar wavelet gives better performance than db4 wavelet and sym3 wavelet.

  4. PCANet: A Simple Deep Learning Baseline for Image Classification?

    PubMed

    Chan, Tsung-Han; Jia, Kui; Gao, Shenghua; Lu, Jiwen; Zeng, Zinan; Ma, Yi

    2015-12-01

    In this paper, we propose a very simple deep learning network for image classification that is based on very basic data processing components: 1) cascaded principal component analysis (PCA); 2) binary hashing; and 3) blockwise histograms. In the proposed architecture, the PCA is employed to learn multistage filter banks. This is followed by simple binary hashing and block histograms for indexing and pooling. This architecture is thus called the PCA network (PCANet) and can be extremely easily and efficiently designed and learned. For comparison and to provide a better understanding, we also introduce and study two simple variations of PCANet: 1) RandNet and 2) LDANet. They share the same topology as PCANet, but their cascaded filters are either randomly selected or learned from linear discriminant analysis. We have extensively tested these basic networks on many benchmark visual data sets for different tasks, including Labeled Faces in the Wild (LFW) for face verification; the MultiPIE, Extended Yale B, AR, Facial Recognition Technology (FERET) data sets for face recognition; and MNIST for hand-written digit recognition. Surprisingly, for all tasks, such a seemingly naive PCANet model is on par with the state-of-the-art features either prefixed, highly hand-crafted, or carefully learned [by deep neural networks (DNNs)]. Even more surprisingly, the model sets new records for many classification tasks on the Extended Yale B, AR, and FERET data sets and on MNIST variations. Additional experiments on other public data sets also demonstrate the potential of PCANet to serve as a simple but highly competitive baseline for texture classification and object recognition.
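    The binary hashing and histogram stages listed above can be sketched in a few lines. This is a simplified reading, not the published implementation: the filter responses are hypothetical stand-ins for PCA filter outputs, each response map is thresholded at zero and treated as one bit of an integer code, and a single block covering the whole map is assumed.

```python
def binary_hash(filter_outputs):
    """Combine L same-sized response maps into one integer-coded map:
    bit l of each pixel's code is 1 where response map l is positive."""
    rows, cols = len(filter_outputs[0]), len(filter_outputs[0][0])
    coded = [[0] * cols for _ in range(rows)]
    for bit, response in enumerate(filter_outputs):
        for r in range(rows):
            for c in range(cols):
                if response[r][c] > 0:       # step function on the response
                    coded[r][c] += 1 << bit  # weight 2**bit
    return coded

def block_histogram(coded, n_bits):
    """Histogram of integer codes over one block, with 2**n_bits bins."""
    hist = [0] * (1 << n_bits)
    for row in coded:
        for value in row:
            hist[value] += 1
    return hist

# Hypothetical responses of two filters on a 2x2 patch grid.
map_a = [[0.5, -1.0], [2.0, -0.1]]
map_b = [[-0.3, 0.7], [1.2, -2.0]]

codes = binary_hash([map_a, map_b])          # [[1, 2], [3, 0]]
features = block_histogram(codes, n_bits=2)  # one count per code 0..3
```

    In a full pipeline the histograms of many blocks would be concatenated into the feature vector passed to a classifier.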

  5. Discriminating semiarid vegetation using airborne imaging spectrometer data - A preliminary assessment

    NASA Technical Reports Server (NTRS)

    Thomas, Randall W.; Ustin, Susan L.

    1987-01-01

    A preliminary assessment was made of Airborne Imaging Spectrometer (AIS) data for discriminating and characterizing vegetation in a semiarid environment. May and October AIS data sets were acquired over a large alluvial fan in eastern California, on which were found Great Basin desert shrub communities. Maximum likelihood classification of a principal components representation of the May AIS data enabled discrimination of subtle spatial detail in images relating to vegetation and soil characteristics. The spatial patterns in the May AIS classification were, however, too detailed for complete interpretation with existing ground data. A similar analysis of the October AIS data yielded poor results. Comparison of AIS results with a similar analysis of May Landsat Thematic Mapper data showed that the May AIS data contained approximately three to four times as much spectrally coherent information. When only two shortwave infrared TM bands were used, results were similar to those from AIS data acquired in October.

  6. Combined Raman and autofluorescence ex vivo diagnostics of skin cancer in near-infrared and visible regions

    NASA Astrophysics Data System (ADS)

    Bratchenko, Ivan A.; Artemyev, Dmitry N.; Myakinin, Oleg O.; Khristoforova, Yulia A.; Moryatov, Alexander A.; Kozlov, Sergey V.; Zakharov, Valery P.

    2017-02-01

    The differentiation of skin melanomas and basal cell carcinomas (BCCs) was demonstrated based on combined analysis of Raman and autofluorescence spectra stimulated by visible and NIR lasers. It was ex vivo tested on 39 melanomas and 40 BCCs. Six spectroscopic criteria utilizing information about alteration of melanin, porphyrins, flavins, lipids, and collagen content in tumor with a comparison to healthy skin were proposed. The measured correlation between the proposed criteria makes it possible to define weakly correlated criteria groups for discriminant analysis and principal components analysis application. It was shown that the accuracy of cancerous tissues classification reaches 97.3% for a combined 6-criteria multimodal algorithm, while the accuracy determined separately for each modality does not exceed 79%. The combined 6-D method is a rapid and reliable tool for malignant skin detection and classification.

  7. Non-negative matrix factorization in texture feature for classification of dementia with MRI data

    NASA Astrophysics Data System (ADS)

    Sarwinda, D.; Bustamam, A.; Ardaneswari, G.

    2017-07-01

    This paper investigates the application of non-negative matrix factorization as a feature selection method to select features from the gray level co-occurrence matrix. The proposed approach is used to classify dementia using MRI data. In this study, texture analysis using the gray level co-occurrence matrix is used for feature extraction. In the feature extraction process of MRI data, we found seven features from the gray level co-occurrence matrix. Non-negative matrix factorization selected the three most influential of the features produced by feature extraction. A Naïve Bayes classifier is adapted to classify dementia, i.e., Alzheimer's disease, Mild Cognitive Impairment (MCI), and normal control. The experimental results show that non-negative matrix factorization as a feature selection method is able to achieve an accuracy of 96.4% for classification of Alzheimer's and normal control. The proposed method is also compared with other feature selection methods, i.e., Principal Component Analysis (PCA).

  8. Classification of fracture and non-fracture groups by analysis of coherent X-ray scatter

    PubMed Central

    Dicken, A. J.; Evans, J. P. O.; Rogers, K. D.; Stone, N.; Greenwood, C.; Godber, S. X.; Clement, J. G.; Lyburn, I. D.; Martin, R. M.; Zioupos, P.

    2016-01-01

    Osteoporotic fractures present a significant social and economic burden, which is set to rise commensurately with the aging population. Greater understanding of the physicochemical differences between osteoporotic and normal conditions will facilitate the development of diagnostic technologies with increased performance and treatments with increased efficacy. Using coherent X-ray scattering we have evaluated a population of 108 ex vivo human bone samples comprised of non-fracture and fracture groups. Principal component fed linear discriminant analysis was used to develop a classification model to discern each condition resulting in a sensitivity and specificity of 93% and 91%, respectively. Evaluating the coherent X-ray scatter differences from each condition supports the hypothesis that a causal physicochemical change has occurred in the fracture group. This work is a critical step along the path towards developing an in vivo diagnostic tool for fracture risk prediction. PMID:27363947

  9. Classification of smoke tainted wines using mid-infrared spectroscopy and chemometrics.

    PubMed

    Fudge, Anthea L; Wilkinson, Kerry L; Ristic, Renata; Cozzolino, Daniel

    2012-01-11

    In this study, the suitability of mid-infrared (MIR) spectroscopy, combined with principal component analysis (PCA) and linear discriminant analysis (LDA), was evaluated as a rapid analytical technique to identify smoke tainted wines. Control (i.e., unsmoked) and smoke-affected wines (260 in total) from experimental and commercial sources were analyzed by MIR spectroscopy and chemometrics. The concentrations of guaiacol and 4-methylguaiacol were also determined using gas chromatography-mass spectrometry (GC-MS), as markers of smoke taint. LDA models correctly classified 61% of control wines and 70% of smoke-affected wines. Classification rates were found to be influenced by the extent of smoke taint (based on GC-MS and informal sensory assessment), as well as qualitative differences in wine composition due to grape variety and oak maturation. Overall, the potential application of MIR spectroscopy combined with chemometrics as a rapid analytical technique for screening smoke-affected wines was demonstrated.

  10. Improved Classification of Orthosiphon stamineus by Data Fusion of Electronic Nose and Tongue Sensors

    PubMed Central

    Zakaria, Ammar; Shakaff, Ali Yeon Md.; Adom, Abdul Hamid; Ahmad, Mohd Noor; Masnan, Maz Jamilah; Aziz, Abdul Hallis Abdul; Fikri, Nazifah Ahmad; Abdullah, Abu Hassan; Kamarudin, Latifah Munirah

    2010-01-01

    An improved classification of Orthosiphon stamineus using a data fusion technique is presented. Five different commercial sources along with freshly prepared samples were discriminated using an electronic nose (e-nose) and an electronic tongue (e-tongue). Samples from the different commercial brands were evaluated by the e-tongue and then followed by the e-nose. Applying Principal Component Analysis (PCA) separately on the respective e-tongue and e-nose data, only five distinct groups were projected. However, by employing a low level data fusion technique, six distinct groupings were achieved. Hence, this technique can enhance the ability of PCA to analyze the complex samples of Orthosiphon stamineus. Linear Discriminant Analysis (LDA) was then used to further validate and classify the samples. It was found that the LDA performance was also improved when the responses from the e-nose and e-tongue were fused together. PMID:22163381
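
    The low level data fusion mentioned in this abstract is commonly understood as joining the preprocessed sensor responses of both instruments into one feature vector per sample before PCA or LDA. The sketch below illustrates that idea only; the z-score scaling, the function names, and the sensor readings are assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean, pstdev

def zscore_columns(rows):
    """Scale every sensor column to zero mean and unit variance."""
    scaled = []
    for col in zip(*rows):
        mu, sigma = mean(col), pstdev(col)
        scaled.append([(v - mu) / sigma if sigma else 0.0 for v in col])
    return [list(r) for r in zip(*scaled)]

def low_level_fusion(block_a, block_b):
    """Concatenate the scaled responses of two instruments sample by sample."""
    a, b = zscore_columns(block_a), zscore_columns(block_b)
    return [ra + rb for ra, rb in zip(a, b)]

# Hypothetical readings: three samples, two e-nose sensors, one e-tongue sensor.
e_nose = [[0.10, 0.30], [0.20, 0.50], [0.30, 0.70]]
e_tongue = [[12.0], [15.0], [18.0]]
fused = low_level_fusion(e_nose, e_tongue)  # three rows of three fused features
```

    Scaling each block first keeps one instrument from dominating the fused matrix merely because of its measurement units; the fused rows would then be passed to PCA or LDA.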

  11. Improved classification of Orthosiphon stamineus by data fusion of electronic nose and tongue sensors.

    PubMed

    Zakaria, Ammar; Shakaff, Ali Yeon Md; Adom, Abdul Hamid; Ahmad, Mohd Noor; Masnan, Maz Jamilah; Aziz, Abdul Hallis Abdul; Fikri, Nazifah Ahmad; Abdullah, Abu Hassan; Kamarudin, Latifah Munirah

    2010-01-01

    An improved classification of Orthosiphon stamineus using a data fusion technique is presented. Five different commercial sources along with freshly prepared samples were discriminated using an electronic nose (e-nose) and an electronic tongue (e-tongue). Samples from the different commercial brands were evaluated by the e-tongue and then followed by the e-nose. Applying Principal Component Analysis (PCA) separately on the respective e-tongue and e-nose data, only five distinct groups were projected. However, by employing a low level data fusion technique, six distinct groupings were achieved. Hence, this technique can enhance the ability of PCA to analyze the complex samples of Orthosiphon stamineus. Linear Discriminant Analysis (LDA) was then used to further validate and classify the samples. It was found that the LDA performance was also improved when the responses from the e-nose and e-tongue were fused together.

  12. Classification of LC columns based on the QSRR method and selectivity toward moclobemide and its metabolites.

    PubMed

    Plenis, Alina; Olędzka, Ilona; Bączek, Tomasz

    2013-05-05

    This paper focuses on a comparative study of the column classification system based on the quantitative structure-retention relationships (QSRR method) and column performance in real biomedical analysis. The assay was carried out for the LC separation of moclobemide and its metabolites in human plasma, using a set of 24 stationary phases. The QSRR models established for the studied stationary phases were compared with the column test performance results under two chemometric techniques - the principal component analysis (PCA) and the hierarchical clustering analysis (HCA). The study confirmed that the stationary phase classes found closely related by the QSRR approach yielded comparable separation for moclobemide and its metabolites. Therefore, the QSRR method could be considered supportive in the selection of a suitable column for the biomedical analysis offering the selection of similar or dissimilar columns with a relatively higher certainty. Copyright © 2013 Elsevier B.V. All rights reserved.

  13. Comprehensive Chemical Fingerprinting of High-Quality Cocoa at Early Stages of Processing: Effectiveness of Combined Untargeted and Targeted Approaches for Classification and Discrimination.

    PubMed

    Magagna, Federico; Guglielmetti, Alessandro; Liberto, Erica; Reichenbach, Stephen E; Allegrucci, Elena; Gobino, Guido; Bicchi, Carlo; Cordero, Chiara

    2017-08-02

    This study investigates chemical information of volatile fractions of high-quality cocoa (Theobroma cacao L. Malvaceae) from different origins (Mexico, Ecuador, Venezuela, Colombia, Java, Trinidad, and São Tomé) produced for fine chocolate. This study explores the evolution of the entire pattern of volatiles in relation to cocoa processing (raw, roasted, steamed, and ground beans). Advanced chemical fingerprinting (e.g., combined untargeted and targeted fingerprinting) with comprehensive two-dimensional gas chromatography coupled with mass spectrometry allows advanced pattern recognition for classification, discrimination, and sensory-quality characterization. The entire data set is analyzed for 595 reliable two-dimensional peak regions, including 130 known analytes and 13 potent odorants. Multivariate analysis with unsupervised exploration (principal component analysis) and simple supervised discrimination methods (Fisher ratios and linear regression trees) reveals informative patterns of similarities and differences and identifies characteristic compounds related to sample origin and manufacturing step.

  14. Diagnosis of oral lichen planus from analysis of saliva samples using terahertz time-domain spectroscopy and chemometrics

    NASA Astrophysics Data System (ADS)

    Kistenev, Yury V.; Borisov, Alexey V.; Titarenko, Maria A.; Baydik, Olga D.; Shapovalov, Alexander V.

    2018-04-01

    The ability to diagnose oral lichen planus (OLP) based on saliva analysis using THz time-domain spectroscopy and chemometrics is discussed. The study involved 30 patients (2 male and 28 female) with OLP. This group consisted of two subgroups with the erosive form of OLP (n = 15) and with the reticular and papular forms of OLP (n = 15). The control group consisted of six healthy volunteers (one male and five females) without inflammation in the mucous membrane in the oral cavity and without periodontitis. Principal component analysis was used to reveal informative features in the experimental data. The one-versus-one multiclass classifier using support vector machine binary classifiers was used. The two-stage classification approach using several absorption spectra scans for an individual saliva sample provided 100% accuracy of differential classification between OLP subgroups and control group.
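
    The one-versus-one scheme named in this abstract trains one binary classifier per pair of classes and lets them vote. The sketch below shows only the voting logic; simple threshold rules on a single made-up feature stand in for the support vector machines, and the class labels, thresholds, and function names are hypothetical. The two-stage procedure of the study is not reproduced.

```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(sample, classes, binary_classifiers):
    """Return the class that wins the most pairwise votes."""
    votes = Counter()
    for pair in combinations(classes, 2):
        votes[binary_classifiers[pair](sample)] += 1
    return votes.most_common(1)[0][0]

classes = ["control", "erosive", "reticular"]

# Hypothetical stand-ins for trained binary SVMs, one per class pair.
binary_classifiers = {
    ("control", "erosive"): lambda x: "control" if x < 1.0 else "erosive",
    ("control", "reticular"): lambda x: "control" if x < 2.0 else "reticular",
    ("erosive", "reticular"): lambda x: "erosive" if x < 3.0 else "reticular",
}

labels = [one_vs_one_predict(x, classes, binary_classifiers) for x in (0.5, 2.5, 3.5)]
```

    With three classes there are three pairwise classifiers; a tie between classes would need an explicit rule, which this sketch leaves to the ordering of `Counter.most_common`.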

  15. Classification of 'Chemlali' accessions according to the geographical area using chemometric methods of phenolic profiles analysed by HPLC-ESI-TOF-MS.

    PubMed

    Taamalli, Amani; Arráez Román, David; Zarrouk, Mokhtar; Segura-Carretero, Antonio; Fernández-Gutiérrez, Alberto

    2012-05-01

    The present work describes a classification method of Tunisian 'Chemlali' olive oils based on their phenolic composition and geographical area. For this purpose, the data obtained by HPLC-ESI-TOF-MS from 13 samples of extra virgin olive oils, obtained from different production areas throughout the country, were used for this study, focusing on 23 phenolic compounds detected. The quantitative results showed a significant variability among the analysed oil samples. Factor analysis using principal components was applied to the data in order to reduce the number of factors which explain the variability of the selected compounds. The data matrix constructed was subjected to a canonical discriminant analysis (CDA) in order to classify the oil samples. These results showed that 100% of cross-validated original group cases were correctly classified, which proves the usefulness of the selected variables. Copyright © 2011 Elsevier Ltd. All rights reserved.

  16. Discrimination of genetically modified sugar beets based on terahertz spectroscopy

    NASA Astrophysics Data System (ADS)

    Chen, Tao; Li, Zhi; Yin, Xianhua; Hu, Fangrong; Hu, Cong

    2016-01-01

    The objective of this paper was to apply terahertz (THz) spectroscopy combined with chemometrics techniques for discrimination of genetically modified (GM) and non-GM sugar beets. In this paper, the THz spectra of 84 sugar beet samples (36 GM sugar beets and 48 non-GM ones) were obtained using a terahertz time-domain spectroscopy (THz-TDS) system in the frequency range from 0.2 to 1.2 THz. Three chemometrics methods, principal component analysis (PCA), discriminant analysis (DA) and discriminant partial least squares (DPLS), were employed to classify sugar beet samples into two groups: genetically modified organisms (GMOs) and non-GMOs. The DPLS method yielded the best classification result, and the percentages of successful classification for GM and non-GM sugar beets were both 100%. Results of the present study demonstrate the usefulness of THz spectroscopy together with chemometrics methods as a powerful tool to distinguish GM and non-GM sugar beets.

  17. Fast discrimination of hydroxypropyl methyl cellulose using portable Raman spectrometer and multivariate methods

    NASA Astrophysics Data System (ADS)

    Song, Biao; Lu, Dan; Peng, Ming; Li, Xia; Zou, Ye; Huang, Meizhen; Lu, Feng

    2017-02-01

    Raman spectroscopy is developed as a fast and non-destructive method for the discrimination and classification of hydroxypropyl methyl cellulose (HPMC) samples. 44 E series and 41 K series HPMC samples are measured with a self-developed portable Raman spectrometer (Hx-Raman) excited by a 785 nm diode laser, with a spectral range of 200-2700 cm-1 and a resolution (FWHM) of 6 cm-1. Multivariate analysis is applied for discrimination of E series from K series. By methods of principal components analysis (PCA) and Fisher discriminant analysis (FDA), a discrimination result with sensitivity of 90.91% and specificity of 95.12% is achieved. The corresponding area under the receiver operating characteristic (ROC) curve is 0.99, indicating the accuracy of the predictive model. This result demonstrates the prospect of the portable Raman spectrometer for rapid, non-destructive classification and discrimination of E series and K series samples of HPMC.
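
    The reported sensitivity and specificity can be related to the sample counts with a short calculation. The abstract does not give the confusion matrix, so the counts below are an assumption: they treat the E series as the positive class and pick the only whole-number split of 44 and 41 samples that reproduces the stated percentages.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as fractions between 0 and 1."""
    sensitivity = tp / (tp + fn)   # share of positives recognized as positive
    specificity = tn / (tn + fp)   # share of negatives recognized as negative
    return sensitivity, specificity

# Assumed counts: 40 of 44 E series and 39 of 41 K series samples correct.
sens, spec = sensitivity_specificity(tp=40, fn=4, tn=39, fp=2)
print(f"sensitivity {sens:.2%}, specificity {spec:.2%}")
```

    Under these assumed counts, 40/44 and 39/41 round to 90.91% and 95.12%, matching the figures quoted in the abstract.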

  18. Principal component regression analysis with SPSS.

    PubMed

    Liu, R X; Kuang, J; Gong, Q; Hou, X L

    2003-06-01

    The paper introduces the indices used for multicollinearity diagnosis, the basic principle of principal component regression, and the method for determining the 'best' equation. The paper uses an example to describe how to do principal component regression analysis with SPSS 10.0, including all calculating processes of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity. Carrying it out with SPSS gives a simplified, faster and accurate statistical analysis.
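
    The basic principle of principal component regression can be sketched without SPSS: standardize the collinear predictors, replace them with their leading principal component, regress the response on that component, and map the slope back to the predictors. The code below is a minimal one-component illustration in plain Python; the data, the function names, and the use of power iteration are assumptions and do not reproduce the SPSS 10.0 procedure described in the paper.

```python
from statistics import mean, pstdev

def standardize(col):
    """Z-score a column using the population standard deviation."""
    mu, sigma = mean(col), pstdev(col)
    return [(v - mu) / sigma for v in col]

def first_component(rows, iterations=200):
    """Leading eigenvector of the correlation matrix, found by power iteration."""
    n, p = len(rows), len(rows[0])
    corr = [[sum(r[i] * r[j] for r in rows) / n for j in range(p)] for i in range(p)]
    vec = [1.0] * p
    for _ in range(iterations):
        vec = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = sum(v * v for v in vec) ** 0.5
        vec = [v / norm for v in vec]
    return vec

# Hypothetical data: x2 is almost a multiple of x1 (strong multicollinearity).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]
y = [3.0, 6.1, 8.9, 12.2, 15.0]

rows = list(zip(standardize(x1), standardize(x2)))
loading = first_component(rows)
scores = [sum(l * v for l, v in zip(loading, r)) for r in rows]

# Least-squares slope of y on the component scores (scores have zero mean).
y_mean = mean(y)
slope = sum(s * (t - y_mean) for s, t in zip(scores, y)) / sum(s * s for s in scores)

# Coefficients expressed on the standardized predictors.
coefs = [slope * l for l in loading]
```

    Because the regression uses a single well-conditioned component instead of two nearly identical predictors, the resulting coefficients are stable, which is the motivation the abstract gives for the method.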

  19. Application of visible and near-infrared spectroscopy to classification of Miscanthus species

    DOE PAGES

    Jin, Xiaoli; Chen, Xiaoling; Xiao, Liang; ...

    2017-04-03

    The feasibility of visible and near infrared (NIR) spectroscopy as a tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. floridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using linear discriminant analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost the same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. floridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. floridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species.

  20. Application of visible and near-infrared spectroscopy to classification of Miscanthus species

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jin, Xiaoli; Chen, Xiaoling; Xiao, Liang

    The feasibility of visible and near infrared (NIR) spectroscopy as a tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. floridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using linear discriminant analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost the same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. floridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. floridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species.

  1. Application of visible and near-infrared spectroscopy to classification of Miscanthus species.

    PubMed

    Jin, Xiaoli; Chen, Xiaoling; Xiao, Liang; Shi, Chunhai; Chen, Liang; Yu, Bin; Yi, Zili; Yoo, Ji Hye; Heo, Kweon; Yu, Chang Yeon; Yamada, Toshihiko; Sacks, Erik J; Peng, Junhua

    2017-01-01

    The feasibility of visible and near infrared (NIR) spectroscopy as a tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. floridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using linear discriminant analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost the same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. floridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. floridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species.

  2. Application of visible and near-infrared spectroscopy to classification of Miscanthus species

    PubMed Central

    Shi, Chunhai; Chen, Liang; Yu, Bin; Yi, Zili; Yoo, Ji Hye; Heo, Kweon; Yu, Chang Yeon; Yamada, Toshihiko; Sacks, Erik J.; Peng, Junhua

    2017-01-01

    The feasibility of visible and near infrared (NIR) spectroscopy as a tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. floridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using linear discriminant analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost the same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. floridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. floridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species. PMID:28369059

  3. Encephalization and quantitative brain composition in bats in relation to their life-habits.

    PubMed

    Pirlot, P; Pottier, J

    1977-12-01

    A quantitative analysis of the brains of 43 bat species is presented. Eleven brain components were studied. The species were arranged according to seven distinct dietary groups and it was found that the relative development of the principal components is related to those groups. The importance of neocorticalization as a reflection of evolution of all the bats in contrast to specialization in some species is stressed. This work gives a clearer view of Chiropteran progressiveness or primitiveness: the insectivorous forms occupy the least advanced, although most specialized, level; the vampires, the carnivorous species and the flying foxes are at the top of the scale. The importance of behaviour and the relative development of the central nervous system in the hierarchical classification of mammals is stressed.

  4. Discrimination of rectal cancer through human serum using surface-enhanced Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Li, Xiaozhou; Yang, Tianyue; Li, Siqi; Zhang, Su; Jin, Lili

    2015-05-01

    In this paper, surface-enhanced Raman spectroscopy (SERS) was used to detect the changes in blood serum components that accompany rectal cancer. The differences in serum SERS data between rectal cancer patients and healthy controls were examined. Postoperative rectal cancer patients also participated in the comparison to monitor the effects of cancer treatments. The results show that there are significant variations at certain wavenumbers which indicates alteration of corresponding biological substances. Principal component analysis (PCA) and parameters of intensity ratios were used on the original SERS spectra for the extraction of featured variables. These featured variables then underwent linear discriminant analysis (LDA) and classification and regression tree (CART) for the discrimination analysis. Accuracies of 93.5 and 92.4 % were obtained for PCA-LDA and parameter-CART, respectively.

  5. Multivariate classification of edible salts: Simultaneous Laser-Induced Breakdown Spectroscopy and Laser-Ablation Inductively Coupled Plasma Mass Spectrometry Analysis

    NASA Astrophysics Data System (ADS)

    Lee, Yonghoon; Nam, Sang-Ho; Ham, Kyung-Sik; Gonzalez, Jhanis; Oropeza, Dayana; Quarles, Derrick; Yoo, Jonghyun; Russo, Richard E.

    2016-04-01

    Laser-Induced Breakdown Spectroscopy (LIBS) and Laser-Ablation Inductively Coupled Plasma Mass Spectrometry (LA-ICP-MS), both based on laser ablation sampling, can be employed simultaneously to obtain different chemical fingerprints from a sample. We demonstrated that this analysis approach can provide complementary information for improved classification of edible salts. LIBS could detect several of the minor metallic elements along with Na and Cl, while LA-ICP-MS spectra were used to measure non-metallic and trace heavy metal elements. Principal component analysis using LIBS and LA-ICP-MS spectra showed that their major spectral variations classified the sample salts in different ways. Three classification models were developed by using partial least squares-discriminant analysis based on the LIBS, LA-ICP-MS, and their fused data. From the cross-validation performances and confusion matrices of these models, the minor metallic elements (Mg, Ca, and K) detected by LIBS and the non-metallic (I) and trace heavy metal (Ba, W, and Pb) elements detected by LA-ICP-MS provided complementary chemical information to distinguish particular salt samples.

  6. Automatic detection of malaria parasite in blood images using two parameters.

    PubMed

    Kim, Jong-Dae; Nam, Kyeong-Min; Park, Chan-Young; Kim, Yu-Seop; Song, Hye-Jeong

    2015-01-01

    Malaria must be diagnosed quickly and accurately at the initial infection stage and treated early to cure it properly. The malaria diagnosis method using a microscope requires much labor and time of a skilled expert and the diagnosis results vary greatly between individual diagnosticians. Therefore, to be able to measure the malaria parasite infection quickly and accurately, studies have been conducted for automated classification techniques using various parameters. In this study, by measuring classification technique performance according to changes of two parameters, the parameter values were determined that best distinguish normal from plasmodium-infected red blood cells. To reduce the stain deviation of the acquired images, a principal component analysis (PCA) grayscale conversion method was used, and as parameters, we used a malaria infected area and a threshold value used in binarization. The parameter values with the best classification performance were determined by selecting the value (72) corresponding to the lowest error rate on the basis of cell threshold value 128 for the malaria threshold value for detecting plasmodium-infected red blood cells.
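
    One common reading of a PCA grayscale conversion is to project each RGB pixel onto the first principal component of the image's color distribution, so that the gray value follows the direction of greatest color variation rather than fixed channel weights. The sketch below illustrates that reading; the pixel values, the function name, and the power-iteration solver are assumptions, and the exact conversion used in the study is not specified in the abstract.

```python
def pca_grayscale(pixels, iterations=100):
    """Project RGB pixels onto the first principal axis of their colors."""
    n = len(pixels)
    means = [sum(p[k] for p in pixels) / n for k in range(3)]
    centered = [[p[k] - means[k] for k in range(3)] for p in pixels]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(3)]
           for i in range(3)]
    axis = [1.0, 1.0, 1.0]
    for _ in range(iterations):  # power iteration for the leading eigenvector
        axis = [sum(cov[i][j] * axis[j] for j in range(3)) for i in range(3)]
        norm = sum(a * a for a in axis) ** 0.5
        axis = [a / norm for a in axis]
    return [sum(a * v for a, v in zip(axis, c)) for c in centered]

# Hypothetical pixels: two pale background pixels and two darker stained ones.
pixels = [(200, 180, 190), (120, 60, 140), (210, 190, 200), (110, 50, 130)]
gray = pca_grayscale(pixels)
```

    The projected values are centered on zero and their sign depends on the orientation of the principal axis, so they would normally be rescaled to the 0-255 range before a binarization threshold is applied.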

  7. [Identification of green tea brand based on hyperspectral imaging technology].

    PubMed

    Zhang, Hai-Liang; Liu, Xiao-Li; Zhu, Feng-Le; He, Yong

    2014-05-01

    Hyperspectral imaging technology was developed to identify different brands of famous green tea based on the fusion of PCA information and image information. First, 512 spectral images of six brands of famous green tea in the 380-1023 nm wavelength range were collected, and principal component analysis (PCA) was performed with the goal of selecting two characteristic bands (545 and 611 nm) that could potentially be used for a classification system. Then, 12 gray level co-occurrence matrix (GLCM) features (i.e., mean, covariance, homogeneity, energy, contrast, correlation, entropy, inverse gap, contrast, difference from the second-order and autocorrelation) based on the statistical moment were extracted from each characteristic band image. Finally, the 12 texture features and three PCA spectral characteristics of each green tea sample were combined as the input of LS-SVM. Experimental results showed that the discrimination rate was 100% in the prediction set. The receiver operating characteristic (ROC) curve was used to evaluate the LS-SVM classification algorithm. Overall results sufficiently demonstrate that hyperspectral imaging technology can be used to perform classification of green tea.
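
    As a rough illustration of the gray level co-occurrence matrix (GLCM) step, the sketch below builds a normalized GLCM for a single pixel offset and derives three of the texture features named in the abstract. The 4x4 test image, the horizontal offset, the non-symmetric counting, and the particular feature formulas (energy taken as the sum of squared probabilities) are assumptions for illustration; the abstract does not state which definitions were used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Contrast, energy and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }

# Hypothetical 4-level image; each value is a quantized gray level.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
features = texture_features(glcm(image, levels=4))
```

    In a pipeline like the one described, such features would be computed for each characteristic band image and joined with the spectral features before classification.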

  8. Classification of M1/M2-polarized human macrophages by label-free hyperspectral reflectance confocal microscopy and multivariate analysis.

    PubMed

    Bertani, Francesca R; Mozetic, Pamela; Fioramonti, Marco; Iuliani, Michele; Ribelli, Giulia; Pantano, Francesco; Santini, Daniele; Tonini, Giuseppe; Trombetta, Marcella; Businaro, Luca; Selci, Stefano; Rainer, Alberto

    2017-08-21

    The possibility of detecting and classifying living cells in a label-free and non-invasive manner holds significant theranostic potential. In this work, Hyperspectral Imaging (HSI) has been successfully applied to the analysis of macrophagic polarization, given its central role in several pathological settings, including the regulation of tumour microenvironment. Human monocyte derived macrophages have been investigated using hyperspectral reflectance confocal microscopy, and hyperspectral datasets have been analysed in terms of M1 vs. M2 polarization by Principal Components Analysis (PCA). Following PCA, Linear Discriminant Analysis has been implemented for semi-automatic classification of macrophagic polarization from HSI data. Our results confirm the possibility to perform single-cell-level in vitro classification of M1 vs. M2 macrophages in a non-invasive and label-free manner with a high accuracy (above 98% for cells deriving from the same donor), supporting the idea of applying the technique to the study of complex interacting cellular systems, such as in the case of tumour-immunity in vitro models.

  9. An electronic nose for reliable measurement and correct classification of beverages.

    PubMed

    Mamat, Mazlina; Samad, Salina Abdul; Hannan, Mahammad A

    2011-01-01

    This paper reports the design of an electronic nose (E-nose) prototype for reliable measurement and correct classification of beverages. The prototype was developed and fabricated in the laboratory using commercially available metal oxide gas sensors and a temperature sensor. The repeatability, reproducibility and discriminative ability of the developed E-nose prototype were tested on odors emanating from different beverages such as blackcurrant juice, mango juice and orange juice. Repeated measurements of three beverages showed very high correlation (r > 0.97) between the same beverages to verify the repeatability. The prototype also produced highly correlated patterns (r > 0.97) in the measurement of beverages using different sensor batches to verify its reproducibility. The E-nose prototype also possessed good discriminative ability whereby it was able to produce different patterns for different beverages, different milk heat treatments (ultra high temperature, pasteurization) and fresh and spoiled milks. The discriminative ability of the E-nose was evaluated using Principal Component Analysis and a Multilayer Perceptron Neural Network, with both methods showing good classification results.

  10. Classification of bacteria by simultaneous methylation-solid phase microextraction and gas chromatography/mass spectrometry analysis of fatty acid methyl esters.

    PubMed

    Lu, Yao; Harrington, Peter B

    2010-08-01

    Direct methylation and solid-phase microextraction (SPME) were used as a sample preparation technique for classification of bacteria based on fatty acid methyl ester (FAME) profiles. Methanolic tetramethylammonium hydroxide was applied as a dual-function reagent to saponify and derivatize whole-cell bacterial fatty acids into FAMEs in one step, and SPME was used to extract the bacterial FAMEs from the headspace. Compared with traditional alkaline saponification and sample preparation using liquid-liquid extraction, the method presented in this work avoids using comparatively large amounts of inorganic and organic solvents and greatly decreases the sample preparation time as well. Characteristic gas chromatography/mass spectrometry (GC/MS) of FAME profiles was achieved for six bacterial species. The difference between Gram-positive and Gram-negative bacteria was clearly visualized with the application of principal component analysis of the GC/MS data of bacterial FAMEs. A cross-validation study using ten bootstrap Latin partitions and the fuzzy rule building expert system demonstrated 87 +/- 3% correct classification efficiency.

  11. Comparison of remote sensing image processing techniques to identify tornado damage areas from Landsat TM data

    USGS Publications Warehouse

    Myint, S.W.; Yuan, M.; Cerveny, R.S.; Giri, C.P.

    2008-01-01

    Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time and post-event analyses. The paper aims to compare accuracy of common image processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. © 2008 by MDPI.
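
    The Kappa coefficient mentioned in this abstract can be computed directly from an error matrix. The following is a minimal sketch of the standard Cohen's kappa formula; the function name `cohen_kappa` and the 2x2 matrix are hypothetical illustrations, not data or code from the paper.

```python
import numpy as np


def cohen_kappa(error_matrix):
    """Cohen's kappa from a square error (confusion) matrix of counts."""
    m = np.asarray(error_matrix, dtype=float)
    total = m.sum()
    # Observed agreement: fraction of cells on the diagonal.
    observed = np.trace(m) / total
    # Chance agreement: sum over classes of (column total * row total) / N^2.
    expected = (m.sum(axis=0) * m.sum(axis=1)).sum() / total**2
    return (observed - expected) / (1.0 - expected)


# Hypothetical error matrix (rows: reference class, columns: mapped class).
example = [[40, 10],
           [5, 45]]
kappa = cohen_kappa(example)
```

    Kappa discounts the agreement expected by chance, so it is lower than the raw fraction of correctly identified cells (0.85 for the hypothetical matrix above).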

  12. Comparison of Remote Sensing Image Processing Techniques to Identify Tornado Damage Areas from Landsat TM Data

    PubMed Central

    Myint, Soe W.; Yuan, May; Cerveny, Randall S.; Giri, Chandra P.

    2008-01-01

    Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time and post-event analyses. The paper aims to compare accuracy of common image processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. PMID:27879757

  13. A Novel Approach for Lie Detection Based on F-Score and Extreme Learning Machine

    PubMed Central

    Gao, Junfeng; Wang, Zhao; Yang, Yong; Zhang, Wenjia; Tao, Chunyi; Guan, Jinan; Rao, Nini

    2013-01-01

    A new machine learning method referred to as F-score_ELM was proposed to classify the lying and truth-telling using the electroencephalogram (EEG) signals from 28 guilty and innocent subjects. Thirty-one features were extracted from the probe responses from these subjects. Then, a recently-developed classifier called extreme learning machine (ELM) was combined with F-score, a simple but effective feature selection method, to jointly optimize the number of the hidden nodes of ELM and the feature subset by a grid-searching training procedure. The method was compared to two classification models combining principal component analysis with back-propagation network and support vector machine classifiers. We thoroughly assessed the performance of these classification models including the training and testing time, sensitivity and specificity from the training and testing sets, as well as network size. The experimental results showed that the number of the hidden nodes can be effectively optimized by the proposed method. Also, F-score_ELM obtained the best classification accuracy and required the shortest training and testing time. PMID:23755136
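
    The abstract does not state the F-score formula. The sketch below assumes the commonly used two-class definition (squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances); the function name `f_scores` and the toy data are hypothetical.

```python
import numpy as np


def f_scores(X, y):
    """Per-feature F-score for a two-class problem (labels 0 and 1).

    Assumed definition: between-class separation of the feature means
    divided by the sum of the within-class sample variances.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos, neg = X[y == 1], X[y == 0]
    overall_mean = X.mean(axis=0)
    numerator = (pos.mean(axis=0) - overall_mean) ** 2 \
        + (neg.mean(axis=0) - overall_mean) ** 2
    denominator = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    return numerator / denominator


# Toy data: feature 0 separates the classes, feature 1 does not.
X_toy = np.array([[0.0, 1.0], [1.0, 2.0], [10.0, 1.0], [11.0, 2.0]])
y_toy = np.array([0, 0, 1, 1])
scores = f_scores(X_toy, y_toy)
ranking = np.argsort(scores)[::-1]  # feature indices, most discriminative first
```

    A feature subset could then be formed from the top-ranked features before training a classifier; how the subset size is tuned together with the ELM hidden nodes is specific to the paper and not reproduced here.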

  14. An Electronic Nose for Reliable Measurement and Correct Classification of Beverages

    PubMed Central

    Mamat, Mazlina; Samad, Salina Abdul; Hannan, Mahammad A.

    2011-01-01

    This paper reports the design of an electronic nose (E-nose) prototype for reliable measurement and correct classification of beverages. The prototype was developed and fabricated in the laboratory using commercially available metal oxide gas sensors and a temperature sensor. The repeatability, reproducibility and discriminative ability of the developed E-nose prototype were tested on odors emanating from different beverages such as blackcurrant juice, mango juice and orange juice, respectively. Repeated measurements of three beverages showed very high correlation (r > 0.97) between the same beverages to verify the repeatability. The prototype also produced highly correlated patterns (r > 0.97) in the measurement of beverages using different sensor batches to verify its reproducibility. The E-nose prototype also possessed good discriminative ability whereby it was able to produce different patterns for different beverages, different milk heat treatments (ultra high temperature, pasteurization) and fresh and spoiled milks. The discriminative ability of the E-nose was evaluated using Principal Component Analysis and a Multi Layer Perceptron Neural Network, with both methods showing good classification results. PMID:22163964

  15. Hydrometeorological application of an extratropical cyclone classification scheme in the southern United States

    NASA Astrophysics Data System (ADS)

    Senkbeil, J. C.; Brommer, D. M.; Comstock, I. J.; Loyd, T.

    2012-07-01

    Extratropical cyclones (ETCs) in the southern United States are often overlooked when compared with tropical cyclones in the region and ETCs in the northern United States. Although southern ETCs are significant weather events, there is currently not an operational scheme used for identifying and discussing these nameless storms. In this research, we classified 84 ETCs (1970-2009). We manually identified five distinct formation regions and seven unique ETC types using statistical classification. Statistical classification employed the use of principal components analysis and two methods of cluster analysis. Both manual and statistical storm types generally showed positive (negative) relationships with El Niño (La Niña). Manual storm types displayed precipitation swaths consistent with discrete storm tracks which further legitimizes the existence of multiple modes of southern ETCs. Statistical storm types also displayed unique precipitation intensity swaths, but these swaths were less indicative of track location. It is hoped that by classifying southern ETCs into types, that forecasters, hydrologists, and broadcast meteorologists might be able to better anticipate projected amounts of precipitation at their locations.

  16. Multivariate qualitative analysis of banned additives in food safety using surface enhanced Raman scattering spectroscopy.

    PubMed

    He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei

    2015-02-25

    A novel strategy which combines iteratively cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectroscopy of banned food additives, such as Sudan I dye and Rhodamine B in food, Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, using the combination of spectra preprocessing iteratively cubic spline fitting (ICSF) baseline correction with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, the DPLS classification can discriminate the class assignment of unknown banned additives using the information of differences in relative intensities. The results demonstrate that SERS spectroscopy combined with ICSF baseline correction method and exploratory analysis methodology DPLS classification can be potentially used for distinguishing the banned food additives in field of food safety. Copyright © 2014 Elsevier B.V. All rights reserved.

  17. 3D Texture Analysis in Renal Cell Carcinoma Tissue Image Grading

    PubMed Central

    Cho, Nam-Hoon; Choi, Heung-Kook

    2014-01-01

    One of the most significant processes in cancer cell and tissue image analysis is the efficient extraction of features for grading purposes. This research applied two types of three-dimensional texture analysis methods to the extraction of feature values from renal cell carcinoma tissue images, and then evaluated the validity of the methods statistically through grade classification. First, we used a confocal laser scanning microscope to obtain image slices of four grades of renal cell carcinoma, which were then reconstructed into 3D volumes. Next, we extracted quantitative values using a 3D gray level cooccurrence matrix (GLCM) and a 3D wavelet based on two types of basis functions. To evaluate their validity, we predefined 6 different statistical classifiers and applied these to the extracted feature sets. In the grade classification results, 3D Haar wavelet texture features combined with principal component analysis showed the best discrimination results. Classification using 3D wavelet texture features was significantly better than 3D GLCM, suggesting that the former has potential for use in a computer-based grading system. PMID:25371701

  18. Automatic Cataract Hardness Classification Ex Vivo by Ultrasound Techniques.

    PubMed

    Caixinha, Miguel; Santos, Mário; Santos, Jaime

    2016-04-01

    To demonstrate the feasibility of a new methodology for cataract hardness characterization and automatic classification using ultrasound techniques, different cataract degrees were induced in 210 porcine lenses. A 25-MHz ultrasound transducer was used to obtain acoustical parameters (velocity and attenuation) and backscattering signals. B-Scan and parametric Nakagami images were constructed. Ninety-seven parameters were extracted and subjected to a Principal Component Analysis. Bayes, K-Nearest-Neighbours, Fisher Linear Discriminant and Support Vector Machine (SVM) classifiers were used to automatically classify the different cataract severities. Statistically significant increases with cataract formation were found for velocity, attenuation, mean brightness intensity of the B-Scan images and mean Nakagami m parameter (p < 0.01). The four classifiers showed a good performance for healthy versus cataractous lenses (F-measure ≥ 92.68%), while for initial versus severe cataracts the SVM classifier showed the higher performance (90.62%). The results showed that ultrasound techniques can be used for non-invasive cataract hardness characterization and automatic classification. Copyright © 2016 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.

  19. Feature Extraction of Electronic Nose Signals Using QPSO-Based Multiple KFDA Signal Processing

    PubMed Central

    Wen, Tailai; Huang, Daoyu; Lu, Kun; Deng, Changjian; Zeng, Tanyue; Yu, Song; He, Zhiyi

    2018-01-01

    The aim of this research was to enhance the classification accuracy of an electronic nose (E-nose) in different detecting applications. During the learning process of the E-nose to predict the types of different odors, the prediction accuracy was not quite satisfying because the raw features extracted from sensors’ responses were regarded as the input of a classifier without any feature extraction processing. Therefore, in order to obtain more useful information and improve the E-nose’s classification accuracy, in this paper, a Weighted Kernels Fisher Discriminant Analysis (WKFDA) combined with Quantum-behaved Particle Swarm Optimization (QPSO), i.e., QWKFDA, was presented to reprocess the original feature matrix. In addition, we have also compared the proposed method with quite a few previously existing ones including Principal Component Analysis (PCA), Locality Preserving Projections (LPP), Fisher Discriminant Analysis (FDA) and Kernels Fisher Discriminant Analysis (KFDA). Experimental results proved that QWKFDA is an effective feature extraction method for E-nose in predicting the types of wound infection and inflammable gases, which shared much higher classification accuracy than those of the contrast methods. PMID:29382146

  20. Evaluation of SLAR and simulated thematic mapper MSS data for forest cover mapping using computer-aided analysis techniques

    NASA Technical Reports Server (NTRS)

    Hoffer, R. M.; Dean, M. E.; Knowlton, D. J.; Latty, R. S.

    1982-01-01

    Kershaw County, South Carolina was selected as the study site for analyzing simulated thematic mapper MSS data and dual-polarized X-band synthetic aperture radar (SAR) data. The impact of the improved spatial and spectral characteristics of the LANDSAT D thematic mapper data on computer aided analysis for forest cover type mapping was examined as well as the value of synthetic aperture radar data for differentiating forest and other cover types. The utility of pattern recognition techniques for analyzing SAR data was assessed. Topics covered include: (1) collection and of TMS and reference data; (2) reformatting, geometric and radiometric rectification, and spatial resolution degradation of TMS data; (3) development of training statistics and test data sets; (4) evaluation of different numbers and combinations of wavelength bands on classification performance; (5) comparison among three classification algorithms; and (6) the effectiveness of the principal component transformation in data analysis. The collection, digitization, reformatting, and geometric adjustment of SAR data are also discussed. Image interpretation results and classification results are presented.

  1. Feature Extraction of Electronic Nose Signals Using QPSO-Based Multiple KFDA Signal Processing.

    PubMed

    Wen, Tailai; Yan, Jia; Huang, Daoyu; Lu, Kun; Deng, Changjian; Zeng, Tanyue; Yu, Song; He, Zhiyi

    2018-01-29

    The aim of this research was to enhance the classification accuracy of an electronic nose (E-nose) in different detecting applications. During the learning process of the E-nose to predict the types of different odors, the prediction accuracy was not quite satisfying because the raw features extracted from sensors' responses were regarded as the input of a classifier without any feature extraction processing. Therefore, in order to obtain more useful information and improve the E-nose's classification accuracy, in this paper, a Weighted Kernels Fisher Discriminant Analysis (WKFDA) combined with Quantum-behaved Particle Swarm Optimization (QPSO), i.e., QWKFDA, was presented to reprocess the original feature matrix. In addition, we have also compared the proposed method with quite a few previously existing ones including Principal Component Analysis (PCA), Locality Preserving Projections (LPP), Fisher Discriminant Analysis (FDA) and Kernels Fisher Discriminant Analysis (KFDA). Experimental results proved that QWKFDA is an effective feature extraction method for E-nose in predicting the types of wound infection and inflammable gases, which shared much higher classification accuracy than those of the contrast methods.

  2. Exploration of computational methods for classification of movement intention during human voluntary movement from single trial EEG.

    PubMed

    Bai, Ou; Lin, Peter; Vorbach, Sherry; Li, Jiang; Furlani, Steve; Hallett, Mark

    2007-12-01

    To explore effective combinations of computational methods for the prediction of movement intention preceding the production of self-paced right and left hand movements from single trial scalp electroencephalogram (EEG). Twelve naïve subjects performed self-paced movements consisting of three key strokes with either hand. EEG was recorded from 128 channels. The exploration was performed offline on single trial EEG data. We proposed that a successful computational procedure for classification would consist of spatial filtering, temporal filtering, feature selection, and pattern classification. A systematic investigation was performed with combinations of spatial filtering using principal component analysis (PCA), independent component analysis (ICA), common spatial patterns analysis (CSP), and surface Laplacian derivation (SLD); temporal filtering using power spectral density estimation (PSD) and discrete wavelet transform (DWT); pattern classification using linear Mahalanobis distance classifier (LMD), quadratic Mahalanobis distance classifier (QMD), Bayesian classifier (BSC), multi-layer perceptron neural network (MLP), probabilistic neural network (PNN), and support vector machine (SVM). A robust multivariate feature selection strategy using a genetic algorithm was employed. The combinations of spatial filtering using ICA and SLD, temporal filtering using PSD and DWT, and classification methods using LMD, QMD, BSC and SVM provided higher performance than those of other combinations. Utilizing one of the better combinations of ICA, PSD and SVM, the discrimination accuracy was as high as 75%. Further feature analysis showed that beta band EEG activity of the channels over right sensorimotor cortex was most appropriate for discrimination of right and left hand movement intention. Effective combinations of computational methods provide possible classification of human movement intention from single trial EEG. Such a method could be the basis for a potential brain-computer interface based on human natural movement, which might reduce the requirement of long-term training. Effective combinations of computational methods can classify human movement intention from single trial EEG with reasonable accuracy.

  3. FAST TRACK COMMUNICATION Algebraic classification of the Weyl tensor in higher dimensions based on its 'superenergy' tensor

    NASA Astrophysics Data System (ADS)

    Senovilla, José M. M.

    2010-11-01

    The algebraic classification of the Weyl tensor in the arbitrary dimension n is recovered by means of the principal directions of its 'superenergy' tensor. This point of view can be helpful in order to compute the Weyl aligned null directions explicitly, and permits one to obtain the algebraic type of the Weyl tensor by computing the principal eigenvalue of rank-2 symmetric future tensors. The algebraic types compatible with states of intrinsic gravitational radiation can then be explored. The underlying ideas are general, so that a classification of arbitrary tensors in the general dimension can be achieved.

  4. Applying Ancestry and Sex Computation as a Quality Control Tool in Targeted Next-Generation Sequencing.

    PubMed

    Mathias, Patrick C; Turner, Emily H; Scroggins, Sheena M; Salipante, Stephen J; Hoffman, Noah G; Pritchard, Colin C; Shirts, Brian H

    2016-03-01

    To apply techniques for ancestry and sex computation from next-generation sequencing (NGS) data as an approach to confirm sample identity and detect sample processing errors. We combined a principal component analysis method with k-nearest neighbors classification to compute the ancestry of patients undergoing NGS testing. By combining this calculation with X chromosome copy number data, we determined the sex and ancestry of patients for comparison with self-report. We also modeled the sensitivity of this technique in detecting sample processing errors. We applied this technique to 859 patient samples with reliable self-report data. Our k-nearest neighbors ancestry screen had an accuracy of 98.7% for patients reporting a single ancestry. Visual inspection of principal component plots was consistent with self-report in 99.6% of single-ancestry and mixed-ancestry patients. Our model demonstrates that approximately two-thirds of potential sample swaps could be detected in our patient population using this technique. Patient ancestry can be estimated from NGS data incidentally sequenced in targeted panels, enabling an inexpensive quality control method when coupled with patient self-report. © American Society for Clinical Pathology, 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
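
    The combination of principal component analysis with k-nearest neighbors classification described in this abstract can be sketched with NumPy alone. Everything below is an illustrative assumption: the synthetic two-group data stand in for genotype features, and the function names, the number of components and k are hypothetical, not the authors' pipeline.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return the feature means and the top principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(X, mean, components):
    """Project samples onto the retained principal directions."""
    return (X - mean) @ components.T


def knn_predict(train_scores, train_labels, query_scores, k=3):
    """Majority vote among the k nearest training samples (Euclidean)."""
    predictions = []
    for q in query_scores:
        distances = np.linalg.norm(train_scores - q, axis=1)
        nearest = train_labels[np.argsort(distances)[:k]]
        values, counts = np.unique(nearest, return_counts=True)
        predictions.append(values[np.argmax(counts)])
    return np.array(predictions)


# Synthetic reference panel with two well-separated groups (assumption).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(30, 10))
group_b = rng.normal(loc=3.0, scale=0.5, size=(30, 10))
X_train = np.vstack([group_a, group_b])
y_train = np.array(["A"] * 30 + ["B"] * 30)

mean, components = pca_fit(X_train, n_components=2)
train_scores = pca_transform(X_train, mean, components)

# Two query samples, one at the centre of each group.
queries = np.vstack([np.zeros(10), np.full(10, 3.0)])
predicted = knn_predict(train_scores, y_train,
                        pca_transform(queries, mean, components), k=5)
```

    For quality control in the spirit of the abstract, the predicted label would be compared with the self-reported one and mismatches flagged for review.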

  5. Reduction of the dimension of neural network models in problems of pattern recognition and forecasting

    NASA Astrophysics Data System (ADS)

    Nasertdinova, A. D.; Bochkarev, V. V.

    2017-11-01

    Deep neural networks with a large number of parameters are a powerful tool for solving problems of pattern recognition, prediction and classification. Nevertheless, overfitting remains a serious problem in the use of such networks. A method of solving the problem of overfitting is proposed in this article. This method is based on reducing the number of independent parameters of a neural network model using principal component analysis, and can be implemented using existing libraries of neural computing. The algorithm was tested on the problem of recognition of handwritten symbols from the MNIST database, as well as on the task of predicting time series (series of the average monthly number of sunspots and series of the Lorenz system were used). It is shown that the application of principal component analysis makes it possible to reduce the number of parameters of the neural network model while retaining good results. The average error rate for the recognition of handwritten digits from the MNIST database was 1.12% (which is comparable to the results obtained using "deep learning" methods), while the number of parameters of the neural network can be reduced by a factor of up to 130.

  6. Detection of Fungus Infection on Petals of Rapeseed (Brassica napus L.) Using NIR Hyperspectral Imaging

    NASA Astrophysics Data System (ADS)

    Zhao, Yan-Ru; Yu, Ke-Qiang; Li, Xiaoli; He, Yong

    2016-12-01

    Infected petals are often regarded as the source for the spread of the fungus Sclerotinia sclerotiorum throughout the growing process of rapeseed (Brassica napus L.) plants. This research aimed to detect fungal infection of rapeseed petals by applying hyperspectral imaging in the spectral region of 874-1734 nm coupled with chemometrics. Reflectance was extracted from regions of interest (ROIs) in the hyperspectral image of each sample. Firstly, principal component analysis (PCA) was applied to conduct a cluster analysis with the first several principal components (PCs). Then, two methods including X-loadings of PCA and random frog (RF) algorithm were used and compared for optimizing waveband selection. Least squares-support vector machine (LS-SVM) methodology was employed to establish discriminative models based on the optimal and full wavebands. Finally, area under the receiver operating characteristics curve (AUC) was utilized to evaluate classification performance of these LS-SVM models. It was found that LS-SVM based on the combination of all optimal wavebands had the best performance with AUC of 0.929. These results were promising and demonstrated the potential of applying hyperspectral imaging in fungus infection detection on rapeseed petals.

  7. Raman spectroscopy differentiates between sensitive and resistant multiple myeloma cell lines

    NASA Astrophysics Data System (ADS)

    Franco, Domenico; Trusso, Sebastiano; Fazio, Enza; Allegra, Alessandro; Musolino, Caterina; Speciale, Antonio; Cimino, Francesco; Saija, Antonella; Neri, Fortunato; Nicolò, Marco S.; Guglielmino, Salvatore P. P.

    2017-12-01

    Current methods for identifying neoplastic cells and discerning them from their normal counterparts are often nonspecific and biologically perturbing. Here, we show that single-cell micro-Raman spectroscopy can be used to discriminate between resistant and sensitive multiple myeloma cell lines based on their highly reproducible biomolecular spectral signatures. In order to demonstrate robustness of the proposed approach, we used two different cell lines of multiple myeloma, namely MM.1S and U266B1, and their counterparts MM.1R and U266/BTZ-R subtypes, resistant to dexamethasone and bortezomib, respectively. Micro-Raman spectroscopy thus provides an accurate and noninvasive method for cancer detection for both research and clinical environments. Characteristic peaks, mostly due to different DNA/RNA ratio, nucleic acids, lipids and protein concentrations, allow for discerning the sensitive and resistant subtypes. We also explored principal component analysis (PCA) for resistant cell identification and classification. Sensitive and resistant cells form distinct clusters that can be defined using just two principal components. The identification of drug-resistant cells by confocal micro-Raman spectroscopy is thus proposed as a clinical tool to assess the development of resistance to glucocorticoids and proteasome inhibitors in myeloma cells.

  8. Sand/cement ratio evaluation on mortar using neural networks and ultrasonic transmission inspection.

    PubMed

    Molero, M; Segura, I; Izquierdo, M A G; Fuente, J V; Anaya, J J

    2009-02-01

    The quality and degradation state of building materials can be determined by nondestructive testing (NDT). These materials are composed of a cementitious matrix and particles or fragments of aggregates. Sand/cement ratio (s/c) provides the final material quality; however, the sand content can mask the matrix properties in a nondestructive measurement. Therefore, s/c ratio estimation is needed in nondestructive characterization of cementitious materials. In this study, a methodology to classify the sand content in mortar is presented. The methodology is based on ultrasonic transmission inspection, data reduction, and features extraction by principal components analysis (PCA), and neural network classification. This evaluation is carried out with several mortar samples, which were made while taking into account different cement types and s/c ratios. The estimated s/c ratio is determined by ultrasonic spectral attenuation with three different broadband transducers (0.5, 1, and 2 MHz). Statistical PCA to reduce the dimension of the captured traces has been applied. Feed-forward neural networks (NNs) are trained using principal components (PCs) and their outputs are used to display the estimated s/c ratios in false color images, showing the s/c ratio distribution of the mortar samples.

  9. Status of Vegetation Classification in Redwood Ecosystems

    Treesearch

    Thomas M. Mahony; John D. Stuart

    2007-01-01

    Vegetation classifications, based primarily on physiognomic variability and canopy dominants and derived principally from remotely sensed imagery, have been completed for the entire redwood range (Eyre 1980, Fox 1989). However, systematic, quantitative, floristic-based vegetation classifications in old-growth redwood forests have not been completed for large portions...

  10. [Establishment of the Mathematical Model for PMI Estimation Using FTIR Spectroscopy and Data Mining Method].

    PubMed

    Wang, L; Qin, X C; Lin, H C; Deng, K F; Luo, Y W; Sun, Q R; Du, Q X; Wang, Z Y; Tuo, Y; Sun, J H

    2018-02-01

    To analyse the relationship between Fourier transform infrared (FTIR) spectrum of rat's spleen tissue and postmortem interval (PMI) for PMI estimation using FTIR spectroscopy combined with data mining method. Rats were sacrificed by cervical dislocation, and the cadavers were placed at 20 ℃. The FTIR spectrum data of rats' spleen tissues were taken and measured at different time points. After pretreatment, the data was analysed by data mining method. The absorption peak intensity of rat's spleen tissue spectrum changed with the PMI, while the absorption peak position was unchanged. The results of principal component analysis (PCA) showed that the cumulative contribution rate of the first three principal components was 96%. There was an obvious clustering tendency for the spectrum sample at each time point. The methods of partial least squares discriminant analysis (PLS-DA) and support vector machine classification (SVMC) effectively divided the spectrum samples with different PMI into four categories (0-24 h, 48-72 h, 96-120 h and 144-168 h). The determination coefficient (R²) of the PMI estimation model established by PLS regression analysis was 0.96, and the root mean square error of calibration (RMSEC) and root mean square error of cross validation (RMSECV) were 9.90 h and 11.39 h respectively. In prediction set, the R² was 0.97, and the root mean square error of prediction (RMSEP) was 10.49 h. The FTIR spectrum of the rat's spleen tissue can be effectively analyzed qualitatively and quantitatively by the combination of FTIR spectroscopy and data mining method, and the classification and PLS regression models can be established for PMI estimation. Copyright © by the Editorial Department of Journal of Forensic Medicine.
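
    The "cumulative contribution rate" of the leading principal components reported in this abstract is the cumulative fraction of total variance they explain. A minimal sketch, assuming mean-centred PCA via singular values; the function name and the synthetic "spectra" are hypothetical and unrelated to the study's data.

```python
import numpy as np


def cumulative_contribution(X):
    """Cumulative fraction of variance explained by successive PCs."""
    centered = X - X.mean(axis=0)
    # Squared singular values are proportional to the PC variances.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return np.cumsum(variances) / variances.sum()


# Synthetic data: 40 samples, 6 variables with very unequal spread.
rng = np.random.default_rng(1)
spectra = rng.normal(size=(40, 6)) * np.array([5.0, 3.0, 1.0, 0.2, 0.1, 0.05])
rates = cumulative_contribution(spectra)
first_three = rates[2]  # share of variance captured by PC1 to PC3
```

    A value of `rates[2]` near 0.96 would correspond to the situation described in the abstract, where three components carry almost all of the spectral variance.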

  11. Detection of goat body fat adulteration in pure ghee using ATR-FTIR spectroscopy coupled with chemometric strategy.

    PubMed

    Upadhyay, Neelam; Jaiswal, Pranita; Jha, Shyam Narayan

    2016-10-01

    Ghee forms an important component of the diet of human beings due to its rich flavor and high nutritive value. This high priced fat is prone to adulteration with cheaper fats. ATR-FTIR spectroscopy coupled with chemometrics was applied for determining the presence of goat body fat in ghee (at 1, 3, 5, 10, 15 and 20% levels in the laboratory made/spiked samples). The spectra of pure (ghee and goat body fat) and spiked samples were taken in the wavenumber range of 4000-500 cm⁻¹. Separated clusters of pure ghee and spiked samples were obtained on applying principal component analysis at 5% level of significance in the selected wavenumber ranges (1786-1680, 1490-919 and 1260-1040 cm⁻¹). SIMCA was applied for classification of samples and pure ghee showed 100% classification efficiency. The value of R² was found to be >0.99 for calibration and validation sets using the partial least squares method in all the selected wavenumber ranges, which indicates that the model was well developed. The study revealed that the spiked samples of goat body fat could be detected even at 1% level in ghee.

  12. PCA feature extraction for change detection in multidimensional unlabeled data.

    PubMed

    Kuncheva, Ludmila I; Faithfull, William J

    2014-01-01

    When classifiers are deployed in real-world applications, it is assumed that the distribution of the incoming data matches the distribution of the data used to train the classifier. This assumption is often incorrect, which necessitates some form of change detection or adaptive classification. While there has been a lot of work on change detection based on the classification error monitored over the course of the operation of the classifier, finding changes in multidimensional unlabeled data is still a challenge. Here, we propose to apply principal component analysis (PCA) for feature extraction prior to the change detection. Supported by a theoretical example, we argue that the components with the lowest variance should be retained as the extracted features because they are more likely to be affected by a change. We chose a recently proposed semiparametric log-likelihood change detection criterion that is sensitive to changes in both mean and variance of the multidimensional distribution. An experiment with 35 datasets and an illustration with a simple video segmentation demonstrate the advantage of using extracted features compared to raw data. Further analysis shows that feature extraction through PCA is beneficial, specifically for data with multiple balanced classes.
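
    The feature-extraction step argued for in this abstract, keeping the principal components with the lowest variance rather than the highest, can be sketched as follows. This is an illustration only: the function names and synthetic data are hypothetical, and the semiparametric log-likelihood change detection criterion itself is not implemented.

```python
import numpy as np


def lowest_variance_components(X, n_keep):
    """Return the mean, the n_keep lowest-variance PCs, and their variances."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    # eigh returns eigenvalues in ascending order, so the first columns
    # of eigvecs are the lowest-variance principal directions.
    eigvals, eigvecs = np.linalg.eigh(cov)
    return mean, eigvecs[:, :n_keep], eigvals[:n_keep]


def extract_features(X, mean, components):
    """Project data onto the retained low-variance directions."""
    return (X - mean) @ components


# Synthetic reference window: variance is largest on axis 0, smallest on axis 2.
rng = np.random.default_rng(2)
reference = rng.normal(size=(500, 3)) * np.array([10.0, 1.0, 0.1])
mean, components, variances = lowest_variance_components(reference, n_keep=1)

# A small shift along the low-variance axis is large relative to that
# component's spread, which is the intuition behind the paper's argument.
shifted = reference + np.array([0.0, 0.0, 0.5])
shift_in_feature = abs(extract_features(shifted, mean, components).mean())
```

    In the synthetic example the shift of 0.5 is about five standard deviations of the retained component, whereas the same shift along the highest-variance axis would be only a twentieth of a standard deviation.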

  13. Deep-learning-based classification of FDG-PET data for Alzheimer's disease categories

    NASA Astrophysics Data System (ADS)

    Singh, Shibani; Srivastava, Anant; Mi, Liang; Caselli, Richard J.; Chen, Kewei; Goradia, Dhruman; Reiman, Eric M.; Wang, Yalin

    2017-11-01

    Fluorodeoxyglucose (FDG) positron emission tomography (PET) measures the decline in the regional cerebral metabolic rate for glucose, offering a reliable metabolic biomarker even in presymptomatic Alzheimer's disease (AD) patients. PET scans provide functional information that is unique and unavailable using other types of imaging. However, the computational efficacy of FDG-PET data alone for the classification of various Alzheimer's diagnostic categories has not been well studied. This motivates us to correctly discriminate various AD diagnostic categories using FDG-PET data. Deep learning has improved state-of-the-art classification accuracies in the areas of speech, signal, image, video, text mining and recognition. We propose novel methods that involve probabilistic principal component analysis on max-pooled data and mean-pooled data for dimensionality reduction, and a multilayer feed-forward neural network that performs binary classification. Our experimental dataset consists of baseline data of subjects including 186 cognitively unimpaired (CU) subjects, 336 mild cognitive impairment (MCI) subjects with 158 Late MCI and 178 Early MCI, and 146 AD patients from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. We measured F1-measure, precision, recall, negative and positive predictive values with a 10-fold cross validation scheme. Our results indicate that our designed classifiers achieve competitive results, while max pooling achieves better classification performance compared to mean-pooled features. Our deep-model-based research may advance FDG-PET analysis by demonstrating its potential as an effective imaging biomarker of AD.

  14. Voxel-Based Neighborhood for Spatial Shape Pattern Classification of Lidar Point Clouds with Supervised Learning.

    PubMed

    Plaza-Leiva, Victoria; Gomez-Ruiz, Jose Antonio; Mandow, Anthony; García-Cerezo, Alfonso

    2017-03-15

    Improving the effectiveness of spatial shape features classification from 3D lidar data is very relevant because it is largely used as a fundamental step towards higher level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhood for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN) method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM), Gaussian processes (GP), and Gaussian mixture models (GMM). A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl). Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhood.
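
    As an illustration of PCA-based shape features for the points in one voxel, the sketch below computes eigenvalue-difference saliencies. This is one commonly used definition and is an assumption here; it is not necessarily any of the five feature vector definitions compared in the paper, and the function names are hypothetical.

```python
import numpy as np

def voxel_shape_features(points):
    """points: (n, 3) array with the lidar points that fall inside one voxel.

    Returns saliency values built from the eigenvalues l1 >= l2 >= l3 of the
    3x3 covariance matrix of the points (assumed definition, see lead-in).
    """
    cov = np.cov(points, rowvar=False)
    l3, l2, l1 = np.linalg.eigvalsh(cov)  # eigvalsh returns ascending order
    return {
        "scatter": l3,        # all three eigenvalues comparable
        "tubular": l1 - l2,   # one dominant direction
        "planar": l2 - l3,    # two dominant directions
    }

def dominant_shape(points):
    """Name of the largest saliency for the voxel."""
    feats = voxel_shape_features(points)
    return max(feats, key=feats.get)

# Synthetic voxels: a thin line, a thin plane and an isotropic blob.
rng = np.random.default_rng(0)
line = np.column_stack([rng.uniform(size=500), 0.01 * rng.normal(size=(500, 2))])
plane = np.column_stack([rng.uniform(size=(500, 2)), 0.01 * rng.normal(size=500)])
blob = rng.normal(size=(2000, 3))

for name, pts in [("line", line), ("plane", plane), ("blob", blob)]:
    print(name, "->", dominant_shape(pts))
```

    In a voxel-based scheme such as the one described, a feature vector like this would be computed once per voxel and passed to the chosen classifier, so every point in the voxel receives the same class.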

  15. A probability index for surface zonda wind occurrence at Mendoza city through vertical sounding principal components analysis

    NASA Astrophysics Data System (ADS)

    Otero, Federico; Norte, Federico; Araneo, Diego

    2018-01-01

    The aim of this work is to obtain an index for predicting the probability of occurrence of zonda events at surface level from sounding data at Mendoza city, Argentina. To accomplish this goal, surface zonda wind events were first identified with an objective classification method (OCM) considering only the surface station values. Once the dates and onset time of each event were obtained, the closest prior sounding for each event was used to perform a principal component analysis (PCA) that identifies the leading patterns of the vertical structure of the atmosphere prior to a zonda wind event. These components were used to construct the index model. For the PCA, an input matrix of temperature (T) and dew point temperature (Td) anomalies for the standard levels between 850 and 300 hPa was built. The analysis yielded six significant components explaining 94% of the variance, and the leading patterns of favorable weather conditions for the development of the phenomenon were obtained. A zonda/non-zonda indicator c can be estimated by a multiple logistic regression on the PCA component loadings, giving a zonda probability index ĉ that is calculable from T and Td profiles and depends on the climatological features of the region. The index showed 74.7% efficiency. The same analysis was performed adding surface values of T and Td from the Mendoza Aero station, which increased the index efficiency to 87.8%. The results revealed four significantly correlated PCs with a major improvement in differentiating zonda cases and a reduction of the uncertainty interval.
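
    The chain described in this abstract (anomaly matrix, PCA, logistic regression, probability index) can be sketched as below. This is a hedged illustration only: the function names are hypothetical, the regression is fitted by plain gradient descent on the projections of each sounding onto the retained components, and the 94% variance target is borrowed from the abstract as a default rather than being the authors' selection rule.

```python
import numpy as np

def fit_pca(A, var_target=0.94):
    """A: (n_soundings, n_variables) matrix of T and Td anomalies.

    Keeps the smallest number of leading components whose cumulative
    explained variance reaches var_target.
    """
    mean = A.mean(axis=0)
    _, s, Vt = np.linalg.svd(A - mean, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(explained, var_target)) + 1, len(s))
    return mean, Vt[:k]

def fit_logistic(Z, y, lr=0.1, n_iter=2000):
    """Logistic regression with intercept, fitted by gradient descent."""
    Zb = np.column_stack([np.ones(len(Z)), Z])
    w = np.zeros(Zb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Zb @ w)))
        w -= lr * Zb.T @ (p - y) / len(y)
    return w

def probability_index(A_new, mean, components, w):
    """Estimated event probability for each row of A_new."""
    Z = (A_new - mean) @ components.T
    Zb = np.column_stack([np.ones(len(Z)), Z])
    return 1.0 / (1.0 + np.exp(-(Zb @ w)))
```

    A typical call sequence would be `mean, comps = fit_pca(A)`, then `w = fit_logistic((A - mean) @ comps.T, y)` with `y` holding 1 for zonda and 0 for non-zonda soundings, and finally `probability_index(A_new, mean, comps, w)` for new profiles.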

  16. Optimal Non-Invasive Fault Classification Model for Packaged Ceramic Tile Quality Monitoring Using MMW Imaging

    NASA Astrophysics Data System (ADS)

    Agarwal, Smriti; Singh, Dharmendra

    2016-04-01

    Millimeter wave (MMW) frequency has emerged as an efficient tool for different stand-off imaging applications. In this paper, we have dealt with a novel MMW imaging application, i.e., non-invasive packaged goods quality estimation for industrial quality monitoring applications. An active MMW imaging radar operating at 60 GHz has been ingeniously designed for concealed fault estimation. Ceramic tiles covered with commonly used packaging cardboard were used as concealed targets for undercover fault classification. A comparison of computer vision-based state-of-the-art feature extraction techniques, viz., discrete Fourier transform (DFT), wavelet transform (WT), principal component analysis (PCA), gray level co-occurrence texture (GLCM), and histogram of oriented gradient (HOG), has been done with respect to their efficient and differentiable feature vector generation capability for undercover target fault classification. An extensive number of experiments were performed with different ceramic tile fault configurations, viz., vertical crack, horizontal crack, random crack, and diagonal crack, along with the non-faulty tiles. Further, an independent algorithm validation was done demonstrating classification accuracy: 80, 86.67, 73.33, and 93.33 % for DFT, WT, PCA, GLCM, and HOG feature-based artificial neural network (ANN) classifier models, respectively. Classification results show good capability for the HOG feature extraction technique towards non-destructive quality inspection with appreciably low false alarm as compared to other techniques. Thereby, a robust and optimal image feature-based neural network classification model has been proposed for non-invasive, automatic fault monitoring for financially and commercially competitive industrial growth.

  17. Estimation of the Age and Amount of Brown Rice Plant Hoppers Based on Bionic Electronic Nose Use

    PubMed Central

    Xu, Sai; Zhou, Zhiyan; Lu, Huazhong; Luo, Xiwen; Lan, Yubin; Zhang, Yang; Li, Yanfang

    2014-01-01

    The brown rice plant hopper (BRPH), Nilaparvata lugens (Stal), is one of the most important insect pests affecting rice and causes serious damage to the yield and quality of rice plants in Asia. This study used bionic electronic nose technology to sample BRPH volatiles, which vary in age and amount. Principal component analysis (PCA), linear discrimination analysis (LDA), probabilistic neural network (PNN), BP neural network (BPNN) and loading analysis (Loadings) techniques were used to analyze the sampling data. The results indicate that the PCA and LDA classification ability is poor, but the LDA classification displays superior performance relative to PCA. When a PNN was used to evaluate the BRPH age and amount, the classification rates of the training set were 100% and 96.67%, respectively, and the classification rates of the test set were 90.67% and 64.67%, respectively. When BPNN was used for the evaluation of the BRPH age and amount, the classification accuracies of the training set were 100% and 48.93%, respectively, and the classification accuracies of the test set were 96.67% and 47.33%, respectively. Loadings for BRPH volatiles indicate that the main elements of BRPHs' volatiles are sulfur-containing organics, aromatics, sulfur- and chlorine-containing organics and nitrogen oxides, which provide a reference for sensors chosen when exploited in specialized BRPH identification devices. This research proves the feasibility and broad application prospects of bionic electronic noses for BRPH recognition. PMID:25268913

  18. Typology of person-environment fit constellations: a platform addressing accessibility problems in the built environment for people with functional limitations.

    PubMed

    Slaug, Björn; Schilling, Oliver; Iwarsson, Susanne; Carlsson, Gunilla

    2015-09-02

    Making the built environment accessible for all regardless of functional capacity is an important goal for public health efforts. Considerable impediments to achieving this goal suggest the need for valid measurements of accessibility and for greater attention to the complexity of person-environment fit issues. To address these needs, this study aimed to provide a methodological platform, useful for further research and instrument development within accessibility research. This was accomplished by the construction of a typology of problematic person-environment fit constellations, utilizing an existing methodology developed to assess and analyze accessibility problems in the built environment. By means of qualitative review and statistical methods we classified the person-environment fit components covered by an existing application which targets housing accessibility: the Housing Enabler (HE) instrument. The International Classification of Functioning, Disability and Health (ICF) was used as a conceptual framework. Qualitative classification principles were based on conceptual similarities, and for quantitative analysis of similarities, Principal Component Analysis was carried out. We present a typology of problematic person-environment fit constellations classified along three dimensions: 1) accessibility problem range and severity, 2) aspects of functioning, and 3) environmental context. As a result of the classification of the HE components, 48 typical person-environment fit constellations were recognised. The main contribution of this study is the proposed typology of person-environment fit constellations. The typology provides a methodological platform for the identification and quantification of problematic person-environment fit constellations. Its link to the globally accepted ICF classification system facilitates communication within the scientific and health care practice communities. The typology also highlights how relations between aspects of functioning and physical environmental barriers generate typical accessibility problems, and thereby furnishes a reference point for research oriented to how the built environment may be designed to be supportive for activity, participation and health.

  19. 8 CFR 245.23 - Adjustment of aliens in T nonimmigrant classification.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 8 Aliens and Nationality 1 2013-01-01 2013-01-01 false Adjustment of aliens in T nonimmigrant classification. 245.23 Section 245.23 Aliens and Nationality DEPARTMENT OF HOMELAND SECURITY IMMIGRATION... aliens in T nonimmigrant classification. (a) Eligibility of principal T-1 applicants. Except as described...

  20. 8 CFR 245.23 - Adjustment of aliens in T nonimmigrant classification.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 8 Aliens and Nationality 1 2012-01-01 2012-01-01 false Adjustment of aliens in T nonimmigrant classification. 245.23 Section 245.23 Aliens and Nationality DEPARTMENT OF HOMELAND SECURITY IMMIGRATION... aliens in T nonimmigrant classification. (a) Eligibility of principal T-1 applicants. Except as described...

  1. 8 CFR 245.23 - Adjustment of aliens in T nonimmigrant classification.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 8 Aliens and Nationality 1 2011-01-01 2011-01-01 false Adjustment of aliens in T nonimmigrant classification. 245.23 Section 245.23 Aliens and Nationality DEPARTMENT OF HOMELAND SECURITY IMMIGRATION... aliens in T nonimmigrant classification. (a) Eligibility of principal T-1 applicants. Except as described...

  2. 8 CFR 245.23 - Adjustment of aliens in T nonimmigrant classification.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 8 Aliens and Nationality 1 2014-01-01 2014-01-01 false Adjustment of aliens in T nonimmigrant classification. 245.23 Section 245.23 Aliens and Nationality DEPARTMENT OF HOMELAND SECURITY IMMIGRATION... aliens in T nonimmigrant classification. (a) Eligibility of principal T-1 applicants. Except as described...

  3. 8 CFR 245.23 - Adjustment of aliens in T nonimmigrant classification.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 8 Aliens and Nationality 1 2010-01-01 2010-01-01 false Adjustment of aliens in T nonimmigrant classification. 245.23 Section 245.23 Aliens and Nationality DEPARTMENT OF HOMELAND SECURITY IMMIGRATION... aliens in T nonimmigrant classification. (a) Eligibility of principal T-1 applicants. Except as described...

  4. Health care resource utilization in patients with active epilepsy.

    PubMed

    Kurth, Tobias; Lewis, Barbara E; Walker, Alexander M

    2010-05-01

    To evaluate health care resource utilization (HRU) in active epilepsy. Thomson-Reuters insurance databases included 14 million persons in 2005-2007. We extracted information for individuals with insurance claims suggestive of epilepsy. Using iterative expert classification, we sorted patients by type of epilepsy. For each type we calculated prevalence and HRU. A distance analysis identified closely similar types, and a principal components analysis revealed dimensions of variation in HRU. The prevalence of active epilepsy was 3.4 per 1,000. The most common diagnoses among 46,847 patients were generalized convulsive epilepsy (33.3%) and complex partial seizures (24.8%). Patients averaged 10 physician visits per year, 24 diagnostic tests/procedures per year, >30 drug dispensings per year, and <1 emergency room (ER) visit per year, the minority of each of these being related to epilepsy. Female patients generally had more HRU, and HRU increased with age. Patients were hospitalized most frequently for disorders other than epilepsy. HRU was similar for most epilepsy types, excepting grand mal status, epilepsia partialis continua, and infantile spasms. The first principal component of HRU variation was nonepilepsy HRU, followed by components of epilepsy-related medications, other epilepsy/emergency care, and epilepsy visits/diagnostic procedures. The prevalence of active epilepsy in the United States is substantially less than the prevalence of any history of recurrent seizure. Nonepilepsy-related HRU dominated HRU in epilepsy patients and was the principal source of variation. There is a core set of epilepsy diagnoses, the HRU patterns of which are indistinguishable, whereas patients with grand mal status, epilepsia partialis continua, and infantile spasms all have distinct patterns. To provide more specific insights into the economic impact of the condition, studies of HRU in epilepsy should distinguish between epilepsy-related and unrelated care.

  5. Principal Component Analysis of Cerebellar Shape on MRI Separates SCA Types 2 and 6 into Two Archetypal Modes of Degeneration

    PubMed Central

    Jung, Brian C.; Choi, Soo I.; Du, Annie X.; Cuzzocreo, Jennifer L.; Geng, Zhuo Z.; Ying, Howard S.; Perlman, Susan L.; Toga, Arthur W.; Prince, Jerry L.

    2014-01-01

    Although “cerebellar ataxia” is often used in reference to a disease process, presumably there are different underlying pathogenetic mechanisms for different subtypes. Indeed, spinocerebellar ataxia (SCA) types 2 and 6 demonstrate complementary phenotypes, thus predicting a different anatomic pattern of degeneration. Here, we show that an unsupervised classification method, based on principal component analysis (PCA) of cerebellar shape characteristics, can be used to separate SCA2 and SCA6 into two classes, which may represent disease-specific archetypes. Patients with SCA2 (n=11) and SCA6 (n=7) were compared against controls (n=15) using PCA to classify cerebellar anatomic shape characteristics. Within the first three principal components, SCA2 and SCA6 differed from controls and from each other. In a secondary analysis, we studied five additional subjects and found that these patients were consistent with the previously defined archetypal clusters of clinical and anatomical characteristics. Secondary analysis of five subjects with related diagnoses showed that disease groups that were clinically and pathophysiologically similar also shared similar anatomic characteristics. Specifically, Archetype #1 consisted of SCA3 (n=1) and SCA2, suggesting that cerebellar syndromes accompanied by atrophy of the pons may be associated with a characteristic pattern of cerebellar neurodegeneration. In comparison, Archetype #2 was comprised of disease groups with pure cerebellar atrophy (episodic ataxia type 2 (n=1), idiopathic late-onset cerebellar ataxias (n=3), and SCA6). This suggests that cerebellar shape analysis could aid in discriminating between different pathologies. Our findings further suggest that magnetic resonance imaging is a promising imaging biomarker that could aid in the diagnosis and therapeutic management in patients with cerebellar syndromes. PMID:22258915

  6. [Study on discrimination of varieties of fire resistive coating for steel structure based on near-infrared spectroscopy].

    PubMed

    Xue, Gang; Song, Wen-qi; Li, Shu-chao

    2015-01-01

    To achieve rapid identification of different brands of fire resistive coating for steel structure in circulation, a new method for the fast discrimination of varieties of fire resistive coating for steel structure by means of near-infrared spectroscopy was proposed. A raster-scanning near-infrared spectroscopy instrument and near-infrared diffuse reflectance spectroscopy were applied to collect the spectral curves of different brands of fire resistive coating for steel structure, and the spectral data were preprocessed with standard normal variate transformation (SNV) and the Norris second derivative. Principal component analysis (PCA) was applied to the near-infrared spectra for cluster analysis. The analysis results showed that the cumulative reliability of PC1 to PC5 was 99.791%. A 3-dimensional plot was drawn with the scores of PC1, PC2 and PC3 × 10, which appeared to provide the best clustering of the varieties of fire resistive coating for steel structure. A total of 150 fire resistive coating samples were divided randomly into a calibration set and a validation set; the calibration set had 125 samples with 25 samples of each variety, and the validation set had 25 samples with 5 samples of each variety. According to the principal component scores of unknown samples, Mahalanobis distance values between each variety and the unknown samples were calculated to discriminate the different varieties. The qualitative analysis model for external verification of unknown samples gave a 10% recognition ratio. The results demonstrated that this identification method can be used as a rapid, accurate method to identify the classification of fire resistive coating for steel structure and provide a technical reference for market regulation.
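
    The discrimination step described here (principal component scores followed by a Mahalanobis distance to each variety) can be sketched as follows. The spectra below are synthetic stand-ins, the function names are hypothetical, and the choice of four components is an assumption made for the illustration; only the set sizes (5 varieties, 25 calibration and 5 validation samples each) follow the abstract.

```python
import numpy as np

def fit_pca(X, k):
    """Column means and first k principal directions of X (rows = spectra)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k].T

def fit_varieties(scores, labels):
    """Per-variety mean and inverse covariance of the PC scores."""
    model = {}
    for c in np.unique(labels):
        S = scores[labels == c]
        cov = np.atleast_2d(np.cov(S, rowvar=False))
        model[c] = (S.mean(axis=0), np.linalg.inv(cov))
    return model

def classify(score, model):
    """Assign a score vector to the variety with the smallest squared
    Mahalanobis distance."""
    def d2(c):
        mu, P = model[c]
        diff = score - mu
        return float(diff @ P @ diff)
    return min(model, key=d2)

# Synthetic "spectra": 5 varieties, 40 wavelengths (made-up data).
rng = np.random.default_rng(0)
n_var, n_cal, n_val, n_wl = 5, 25, 5, 40
centers = rng.normal(size=(n_var, n_wl))

def sample(n):
    X = np.vstack([c + 0.05 * rng.normal(size=(n, n_wl)) for c in centers])
    y = np.repeat(np.arange(n_var), n)
    return X, y

X_cal, y_cal = sample(n_cal)
X_val, y_val = sample(n_val)

mean, W = fit_pca(X_cal, k=4)
model = fit_varieties((X_cal - mean) @ W, y_cal)
pred = np.array([classify(s, model) for s in (X_val - mean) @ W])
accuracy = float(np.mean(pred == y_val))
print(f"validation accuracy on synthetic data: {accuracy:.2f}")
```

    Using a separate covariance matrix per variety lets each class have its own spread in score space; pooling the covariance across varieties would be a simpler alternative when calibration samples are scarce.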

  7. EEG Subspace Analysis and Classification Using Principal Angles for Brain-Computer Interfaces

    NASA Astrophysics Data System (ADS)

    Ashari, Rehab Bahaaddin

    Brain-Computer Interfaces (BCIs) help paralyzed people who have lost some or all of their ability to communicate and control the outside environment from loss of voluntary muscle control. Most BCIs are based on the classification of multichannel electroencephalography (EEG) signals recorded from users as they respond to external stimuli or perform various mental activities. The classification process is fraught with difficulties caused by electrical noise, signal artifacts, and nonstationarity. One approach to reducing the effects of similar difficulties in other domains is the use of principal angles between subspaces, which has been applied mostly to video sequences. This dissertation studies and examines different ideas using principal angles and subspaces concepts. It introduces a novel mathematical approach for comparing sets of EEG signals for use in new BCI technology. The presented results show that principal angles are also a useful approach to the classification of EEG signals that are recorded during a BCI typing application. In this application, the appearance of a subject's desired letter is detected by identifying a P300-wave within a one-second window of EEG following the flash of a letter. Smoothing the signals before using them is the only preprocessing step that was implemented in this study. The smoothing process, based on minimizing the second derivative in time, is implemented to increase the classification accuracy instead of using a bandpass filter that relies on assumptions about the frequency content of EEG. This study examines four different ways of removing outliers that are based on the principal angles and shows that the outlier removal methods did not help in the presented situations. One of the concepts that this dissertation focused on is the effect of the number of trials on the classification accuracies. The achievement of good classification results using a small number of trials, starting from only two trials, should make this approach more appropriate for online BCI applications. In order to understand and test how EEG signals differ from one subject to another, different users are tested in this dissertation, some with motor impairments. Furthermore, the concept of transferring information between subjects is examined by training the approach on one subject and testing it on the other subject, using the training subject's EEG subspaces to classify the testing subject's trials.
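
    Principal angles between two subspaces can be computed from the singular values of the product of their orthonormal bases. The sketch below shows that standard computation together with a hypothetical nearest-subspace rule; the rule (smallest sum of squared angles) and the function names are assumptions for illustration and are not taken from the dissertation.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, ascending) between the column spaces
    of A (n x p) and B (n x q)."""
    Qa, _ = np.linalg.qr(A)   # orthonormal basis of span(A)
    Qb, _ = np.linalg.qr(B)   # orthonormal basis of span(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

def nearest_subspace(trial, class_bases):
    """Label of the class subspace closest to the subspace spanned by the
    columns of `trial` (assumed rule: smallest sum of squared angles)."""
    return min(
        class_bases,
        key=lambda c: float(np.sum(principal_angles(trial, class_bases[c]) ** 2)),
    )

# Small check in R^4 with coordinate subspaces.
e = np.eye(4)
print(principal_angles(e[:, :1], e[:, 1:2]))   # orthogonal lines
bases = {"target": e[:, :2], "nontarget": e[:, 2:]}
print(nearest_subspace(e[:, :2] + 0.01, bases))
```

    For EEG, the columns of each matrix would hold the (smoothed) signals from a set of trials, so that each set of trials is compared as a subspace rather than sample by sample.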

  8. Design of Medians for Principal Arterials

    DOT National Transportation Integrated Search

    2001-08-01

    Public highways and streets have dual but competing roles: to provide property access and to move through traffic. Highway functional classification systems recognize the competition between access and flow, generally specifying that principal arteri...

  9. Integration of spectral, spatial and morphometric data into lithological mapping: A comparison of different Machine Learning Algorithms in the Kurdistan Region, NE Iraq

    NASA Astrophysics Data System (ADS)

    Othman, Arsalan A.; Gloaguen, Richard

    2017-09-01

    Lithological mapping in mountainous regions is often impeded by limited accessibility due to relief. This study aims to evaluate (1) the performance of different supervised classification approaches using remote sensing data and (2) the use of additional information such as geomorphology. We exemplify the methodology in the Bardi-Zard area in NE Iraq, a part of the Zagros Fold - Thrust Belt, known for its chromite deposits. We highlighted the improvement of remote sensing geological classification by integrating geomorphic features and spatial information in the classification scheme. We performed a Maximum Likelihood (ML) classification method besides two Machine Learning Algorithms (MLA): Support Vector Machine (SVM) and Random Forest (RF) to allow the joint use of geomorphic features, Band Ratio (BR), Principal Component Analysis (PCA), spatial information (spatial coordinates) and multispectral data of the Advanced Space-borne Thermal Emission and Reflection radiometer (ASTER) satellite. The RF algorithm showed reliable results and discriminated serpentinite, talus and terrace deposits, red argillites with conglomerates and limestone, limy conglomerates and limestone conglomerates, tuffites interbedded with basic lavas, limestone and Metamorphosed limestone and reddish green shales. The best overall accuracy (∼80%) was achieved by Random Forest (RF) algorithms in the majority of the sixteen tested combination datasets.

  10. Fast, reagentless and reliable screening of "white powders" during the bioterrorism hoaxes.

    PubMed

    Włodarski, Maksymilian; Kaliszewski, Miron; Trafny, Elżbieta Anna; Szpakowska, Małgorzata; Lewandowski, Rafał; Bombalska, Aneta; Kwaśny, Mirosław; Kopczyński, Krzysztof; Mularczyk-Oliwa, Monika

    2015-03-01

    The classification of dry powder samples is an important step in managing the consequences of terrorist incidents. Fluorescence decays of these samples (vegetative bacteria, bacterial endospores, fungi, albumins and several flours) were measured with a stroboscopic technique using an EasyLife LS system (PTI). Three pulsed nanosecond LED sources, emitting at 280, 340 and 460 nm, were employed for sample excitation. The usefulness of a new 460 nm light source for fluorescence measurements of dry microbial cells has been demonstrated. Principal component analysis (PCA) and hierarchical cluster analysis (HCA) were used for classification of dry biological samples. The analysis showed that a single excitation wavelength was not sufficient for differentiation of biological samples of diverse origin. However, merging fluorescence decays from two or three excitation wavelengths allowed classification of these samples. An experimental setup allowing the practical implementation of this method for real-time fluorescence decay measurement was designed. It consisted of an LED emitting nanosecond pulses at 280 nm and two fast photomultiplier tubes (PMTs) for signal detection in two fluorescence bands simultaneously. The positive results of the dry powder sample measurements confirmed that the fluorescence decay-based technique could be a useful tool for fast classification of suspected "white powders" performed by first responders. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  11. Comparison of nine tractography algorithms for detecting abnormal structural brain networks in Alzheimer’s disease

    PubMed Central

    Zhan, Liang; Zhou, Jiayu; Wang, Yalin; Jin, Yan; Jahanshad, Neda; Prasad, Gautam; Nir, Talia M.; Leonardo, Cassandra D.; Ye, Jieping; Thompson, Paul M.; for the Alzheimer’s Disease Neuroimaging Initiative

    2015-01-01

    Alzheimer’s disease (AD) involves a gradual breakdown of brain connectivity, and network analyses offer a promising new approach to track and understand disease progression. Even so, our ability to detect degenerative changes in brain networks depends on the methods used. Here we compared several tractography and feature extraction methods to see which ones gave best diagnostic classification for 202 people with AD, mild cognitive impairment or normal cognition, scanned with 41-gradient diffusion-weighted magnetic resonance imaging as part of the Alzheimer’s Disease Neuroimaging Initiative (ADNI) project. We computed brain networks based on whole brain tractography with nine different methods – four of them tensor-based deterministic (FACT, RK2, SL, and TL), two orientation distribution function (ODF)-based deterministic (FACT, RK2), two ODF-based probabilistic approaches (Hough and PICo), and one “ball-and-stick” approach (Probtrackx). Brain networks derived from different tractography algorithms did not differ in terms of classification performance on ADNI, but performing principal components analysis on networks helped classification in some cases. Small differences may still be detectable in a truly vast cohort, but these experiments help assess the relative advantages of different tractography algorithms, and different post-processing choices, when used for classification. PMID:25926791

  12. Classification of Partial Discharge Measured under Different Levels of Noise Contamination.

    PubMed

    Jee Keen Raymond, Wong; Illias, Hazlee Azil; Abu Bakar, Ab Halim

    2017-01-01

    Cable joint insulation breakdown may cause a huge loss to power companies. Therefore, it is vital to diagnose the insulation quality to detect early signs of insulation failure. It is well known that there is a correlation between partial discharge (PD) and the insulation quality. Although much work has been done on PD pattern recognition, it is usually performed in a noise-free environment. Also, work on PD pattern recognition in actual cable joints is rarely found in the literature. Therefore, in this work, classification of actual cable joint defect types from partial discharge data contaminated by noise was performed. Five cross-linked polyethylene (XLPE) cable joints with artificially created defects were prepared based on the defects commonly encountered on site. Three different types of input feature were extracted from the PD pattern under an artificially created noisy environment. These include statistical features, fractal features and principal component analysis (PCA) features. These input features were used to train the classifiers to classify each PD defect type. Classifications were performed using three different artificial intelligence classifiers: Artificial Neural Networks (ANN), Adaptive Neuro-Fuzzy Inference System (ANFIS) and Support Vector Machine (SVM). It was found that the classification accuracy decreases with higher noise level, but PCA features used in SVM and ANN showed the strongest tolerance against noise contamination.

  13. Chromatographic profiles of Phyllanthus aqueous extracts samples: a proposition of classification using chemometric models.

    PubMed

    Martins, Lucia Regina Rocha; Pereira-Filho, Edenir Rodrigues; Cass, Quezia Bezerra

    2011-04-01

    Taking into consideration the global analysis of complex samples proposed by the metabolomic approach, the chromatographic fingerprint offers an attractive chemical characterization of herbal medicines. Thus, it can be used as a tool in quality control analysis of phytomedicines. The generated multivariate data are better evaluated by chemometric analyses, and they can be modeled by classification methods. "Stone breaker" is a popular Brazilian plant of the Phyllanthus genus, used worldwide to treat renal calculus, hepatitis, and many other diseases. In this study, gradient elution under reversed-phase conditions with detection in the ultraviolet region was used to obtain chemical profiles (fingerprints) of botanically identified samples of six Phyllanthus species. The obtained chromatograms, at 275 nm, were organized in data matrices, and the time shifts of peaks were adjusted using the Correlation Optimized Warping algorithm. Principal Component Analyses were performed to evaluate similarities among cultivated and uncultivated samples and the discrimination among the species and, after that, the samples were used to compose three classification models using Soft Independent Modeling of Class Analogy, K-Nearest Neighbor, and Partial Least Squares for Discriminant Analysis. The ability of the classification models was discussed after their successful application for authenticity evaluation of 25 commercial samples of "stone breaker."

  14. An artificial intelligence based improved classification of two-phase flow patterns with feature extracted from acquired images.

    PubMed

    Shanthi, C; Pappa, N

    2017-05-01

    Flow pattern recognition is necessary to select design equations for finding operating details of the process and to perform computational simulations. Visual image processing can be used to automate the interpretation of patterns in two-phase flow. In this paper, an attempt has been made to improve the classification accuracy of the flow pattern of gas/liquid two-phase flow using fuzzy logic and Support Vector Machine (SVM) with Principal Component Analysis (PCA). Videos of six different types of flow patterns, namely annular flow, bubble flow, churn flow, plug flow, slug flow and stratified flow, are recorded for a period and converted to 2D images for processing. The textural and shape features extracted using image processing are applied as inputs to various classification schemes, namely fuzzy logic, SVM and SVM with PCA, in order to identify the type of flow pattern. The results obtained are compared and it is observed that SVM with features reduced using PCA gives better classification accuracy and is computationally less intensive than the other two schemes. The results of this study cover industrial application needs including oil and gas and any other gas-liquid two-phase flows. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.

  15. A hierarchical classification approach for recognition of low-density (LDPE) and high-density polyethylene (HDPE) in mixed plastic waste based on short-wave infrared (SWIR) hyperspectral imaging

    NASA Astrophysics Data System (ADS)

    Bonifazi, Giuseppe; Capobianco, Giuseppe; Serranti, Silvia

    2018-06-01

    The aim of this work was to recognize different polymer flakes from mixed plastic waste through an innovative hierarchical classification strategy based on hyperspectral imaging, with particular reference to low density polyethylene (LDPE) and high-density polyethylene (HDPE). A plastic waste composition assessment, including also LDPE and HDPE identification, may help to define optimal recycling strategies for product quality control. Correct handling of plastic waste is essential for its further "sustainable" recovery, maximizing the sorting performance in particular for plastics with similar characteristics as LDPE and HDPE. Five different plastic waste samples were chosen for the investigation: polypropylene (PP), LDPE, HDPE, polystyrene (PS) and polyvinyl chloride (PVC). A calibration dataset was realized utilizing the corresponding virgin polymers. Hyperspectral imaging in the short-wave infrared range (1000-2500 nm) was thus applied to evaluate the different plastic spectral attributes finalized to perform their recognition/classification. After exploring polymer spectral differences by principal component analysis (PCA), a hierarchical partial least squares discriminant analysis (PLS-DA) model was built allowing the five different polymers to be recognized. The proposed methodology, based on hierarchical classification, is very powerful and fast, allowing to recognize the five different polymers in a single step.

  16. Classification of fresh and frozen-thawed pork muscles using visible and near infrared hyperspectral imaging and textural analysis.

    PubMed

    Pu, Hongbin; Sun, Da-Wen; Ma, Ji; Cheng, Jun-Hu

    2015-01-01

    The potential of visible and near infrared hyperspectral imaging was investigated as a rapid and nondestructive technique for classifying fresh and frozen-thawed meats by integrating critical spectral and image features extracted from hyperspectral images in the region of 400-1000 nm. Six feature wavelengths (400, 446, 477, 516, 592 and 686 nm) were identified using uninformative variable elimination and successive projections algorithm. Image textural features of the principal component images from hyperspectral images were obtained using histogram statistics (HS), gray level co-occurrence matrix (GLCM) and gray level-gradient co-occurrence matrix (GLGCM). By these spectral and textural features, probabilistic neural network (PNN) models for classification of fresh and frozen-thawed pork meats were established. Compared with the models using the optimum wavelengths only, optimum wavelengths with HS image features, and optimum wavelengths with GLCM image features, the model integrating optimum wavelengths with GLGCM gave the highest classification rate of 93.14% and 90.91% for calibration and validation sets, respectively. Results indicated that the classification accuracy can be improved by combining spectral features with textural features and the fusion of critical spectral and textural features had better potential than single spectral extraction in classifying fresh and frozen-thawed pork meat. Copyright © 2014 Elsevier Ltd. All rights reserved.

  17. The Systematic Classification of Gallbladder Stones

    PubMed Central

    Qiao, Tie; Ma, Rui-hong; Luo, Xiao-bing; Yang, Liu-qing; Luo, Zhen-liang; Zheng, Pei-ming

    2013-01-01

    Background To develop a method for systematic classification of gallbladder stones, analyze the clinical characteristics of each type of stone and provide a theoretical basis for the study of the formation mechanism of different types of gallbladder stones. Methodology A total of 807 consecutive patients with gallbladder stones were enrolled and their gallstones were studied. The material composition of gallbladder stones was analyzed using Fourier Transform Infrared spectroscopy and the distribution and microstructure of material components was observed with Scanning Electron Microscopy. The composition and distribution of elements were analyzed by an X-ray energy spectrometer. Gallbladder stones were classified accordingly, and then, gender, age, medical history and BMI of patients with each type of stone were analyzed. Principal Findings Gallbladder stones were classified into 8 types and more than ten subtypes, including cholesterol stones (297), pigment stones (217), calcium carbonate stones (139), phosphate stones (12), calcium stearate stones (9), protein stones (3), cystine stones (1) and mixed stones (129). Mixed stones were those stones with two or more than two kinds of material components and the content of each component was similar. A total of 11 subtypes of mixed stones were found in this study. Patients with cholesterol stones were mainly female between the ages of 30 and 50, with higher BMI and shorter medical history than patients with pigment stones (P<0.05), however, patients with pigment, calcium carbonate, phosphate stones were mainly male between the ages of 40 and 60. Conclusion The systematic classification of gallbladder stones indicates that different types of stones have different characteristics in terms of the microstructure, elemental composition and distribution, providing an important basis for the mechanistic study of gallbladder stones. PMID:24124459

  18. Neuroanatomical basis of paroxysmal sympathetic hyperactivity: A diffusion tensor imaging analysis

    PubMed Central

    Hinson, Holly E.; Puybasset, Louis; Weiss, Nicolas; Perlbarg, Vincent; Benali, Habib; Galanaud, Damien; Lasarev, Mike; Stevens, Robert D.

    2015-01-01

    Primary objective Paroxysmal sympathetic hyperactivity (PSH) is observed in a sub-set of patients with moderate-to-severe traumatic brain injury (TBI). The neuroanatomical basis of PSH is poorly understood. It is hypothesized that PSH is linked to changes in connectivity within the central autonomic network. Research design Retrospective analysis in a sub-set of patients from a multi-centre, prospective cohort study. Methods and procedures Adult patients who were <3 weeks after severe TBI were enrolled and screened for PSH using a standard definition. Patients underwent multimodal MRI, which included quantitative diffusion tensor imaging. Main outcomes and results Principal component analysis (PCA) was used to resolve the set of tracts into components. Ability to predict PSH was evaluated via area under the receiver operating characteristic (AUROC) and tree-based classification analyses. Among 102 enrolled patients, 16 met criteria for PSH. The first principal component was significantly associated (p = 0.024, AUROC = 0.867) with PSH status even after controlling for age and admission GCS. In a classification tree analysis, age, GCS and decreased FA in the splenium of the corpus callosum and in the right posterior limb of the internal capsule discriminated PSH vs no PSH with an AUROC of 0.933. Conclusions Disconnection involving the posterior corpus callosum and the posterior limb of the internal capsule may play a role in the pathogenesis or expression of PSH. PMID:25565392
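
    The AUROC values quoted in this record can be read as the probability that a randomly chosen PSH patient receives a higher score than a randomly chosen patient without PSH, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name `auroc` and the toy scores are illustrative assumptions, not the study's data or code.

```python
def auroc(scores, labels):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    scores: numeric predictions, higher meaning "more likely positive".
    labels: 1 for the positive class, 0 for the negative class.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical scores: one positive case is ranked below one negative case.
print(auroc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

    The quadratic pair loop is kept for readability; a rank-based formulation would be preferable for large cohorts.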

  19. On the Fallibility of Principal Components in Research

    ERIC Educational Resources Information Center

    Raykov, Tenko; Marcoulides, George A.; Li, Tenglong

    2017-01-01

    The measurement error in principal components extracted from a set of fallible measures is discussed and evaluated. It is shown that as long as one or more measures in a given set of observed variables contains error of measurement, so also does any principal component obtained from the set. The error variance in any principal component is shown…

  20. THE BRIEF PSYCHIATRIC RATING SCALE IN POSITIVE AND NEGATIVE SUBTYPES OF SCHIZOPHRENIA

    PubMed Central

    Kulhara, P.; Mattoo, S.K.; Avasthi, A.; Malhotra, A.

    1987-01-01

    SUMMARY Usefulness of the Brief Psychiatric Rating Scale (BPRS) in distinguishing positive and negative subtypes of schizophrenia is presented. Ninety five schizophrenic patients were assessed on BPRS. Significant differences emerged between positive and negative subtypes of schizophrenia on items like emotional withdrawal, guilt feelings, tension, hallucinatory behaviour, motor retardation, blunted affect and excitement. Discriminant function equation generated by these items had a high rate of prediction of group membership either to positive or negative schizophrenia group. Principal components analysis of BPRS scores yielded factors which favour categorization of patients in positive, negative subtypes. The study provides support for classification of schizophrenia into these subtypes. PMID:21927241

  1. An ECG signals compression method and its validation using NNs.

    PubMed

    Fira, Catalina Monica; Goras, Liviu

    2008-04-01

    This paper presents a new algorithm for electrocardiogram (ECG) signal compression based on local extreme extraction, adaptive hysteretic filtering and Lempel-Ziv-Welch (LZW) coding. The algorithm has been verified using eight of the most frequent normal and pathological types of cardiac beats and a multi-layer perceptron (MLP) neural network trained with original cardiac patterns and tested with reconstructed ones. Aspects regarding the possibility of using principal component analysis (PCA) for cardiac pattern classification have been investigated as well. A new compression measure called "quality score," which takes into account both the reconstruction errors and the compression ratio, is proposed.

  2. Classification of stevia sweeteners in soft drinks using liquid chromatography and time-of-flight mass spectrometry.

    PubMed

    Kakigi, Y; Suzuki, T; Icho, T; Uyama, A; Mochizuki, N

    2013-01-01

    The aim of this study was to develop a comprehensive analytical method for the characterisation of stevia sweeteners in soft drinks. By using LC and time-of-flight MS, we detected 30 steviol glycosides from nine stevia sweeteners. The mass spectral data of these compounds were applied to the analysis to determine steviol glycosides in nine soft drinks. On the basis of chromatographic data and principal-component analysis, these soft drinks were classified into three groups, and the soft drinks of each group, respectively, contained high-rebaudioside A extract, normal stevia extract or alfa-glucosyltransferase-treated stevia extract.

  3. Discrimination of Medicine Radix Astragali from Different Geographic Origins Using Multiple Spectroscopies Combined with Data Fusion Methods

    NASA Astrophysics Data System (ADS)

    Wang, Hai-Yan; Song, Chao; Sha, Min; Liu, Jun; Li, Li-Ping; Zhang, Zheng-Yong

    2018-05-01

    Raman spectra and ultraviolet-visible absorption spectra of four different geographic origins of Radix Astragali were collected. These data were analyzed using kernel principal component analysis combined with sparse representation classification. The results showed that the recognition rate reached 70.44% using Raman spectra for data input and 90.34% using ultraviolet-visible absorption spectra for data input. A new fusion method based on Raman combined with ultraviolet-visible data was investigated and the recognition rate was increased to 96.43%. The experimental results suggested that the proposed data fusion method effectively improved the utilization rate of the original data.

  4. Using recurrence plot analysis for software execution interpretation and fault detection

    NASA Astrophysics Data System (ADS)

    Mosdorf, M.

    2015-09-01

    This paper shows a method targeted at software execution interpretation and fault detection using recurrence plot analysis. In the proposed approach, recurrence plot analysis is applied to a software execution trace that contains the executed assembly instructions. Results of this analysis are subject to further processing with the PCA (Principal Component Analysis) method, which reduces the number of coefficients used for software execution classification. This method was used for the analysis of five algorithms: Bubble Sort, Quick Sort, Median Filter, FIR, SHA-1. Results show that some of the collected traces could be easily assigned to particular algorithms (logs from Bubble Sort and FIR algorithms) while others are more difficult to distinguish.
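
    As a rough illustration of the recurrence plot idea in this record, the sketch below builds a binary recurrence matrix for a symbolic trace, marking two positions as recurrent when they hold the same instruction. The equality criterion, the function names and the toy trace are assumptions made for illustration; the abstract does not specify how recurrence was defined or which coefficients were passed on to PCA.

```python
def recurrence_matrix(trace):
    """Return R where R[i][j] == 1 if trace[i] equals trace[j], else 0."""
    return [[1 if a == b else 0 for b in trace] for a in trace]


def recurrence_rate(matrix):
    """Fraction of recurrent points, one simple summary coefficient."""
    n = len(matrix)
    if n == 0:
        raise ValueError("empty trace")
    return sum(sum(row) for row in matrix) / (n * n)


# Hypothetical four-instruction trace; "mov" recurs at positions 0 and 2.
trace = ["mov", "add", "mov", "cmp"]
R = recurrence_matrix(trace)
for row in R:
    print(row)
print(recurrence_rate(R))
```

    Summary coefficients of this kind, computed per trace, are the sort of feature vector that a subsequent PCA step could compress.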

  5. Detection of Poisonous Herbs by Terahertz Time-Domain Spectroscopy

    NASA Astrophysics Data System (ADS)

    Zhang, H.; Li, Z.; Chen, T.; Liu, J.-J.

    2018-03-01

    The aim of this paper is the application of terahertz (THz) spectroscopy combined with chemometrics techniques to distinguish poisonous and non-poisonous herbs that have a similar appearance. Spectra of one poisonous and two non-poisonous herbs (Gelsemium elegans, Lonicera japonica Thunb, and Ficus Hirta Vahl) were obtained in the range 0.2-1.4 THz by using a THz time-domain spectroscopy system. Principal component analysis (PCA) was used for feature extraction. The prediction accuracy of classification is between 97.78 and 100%. The results demonstrate an efficient and applicable method to distinguish poisonous herbs, and it may be implemented by using THz spectroscopy combined with chemometric algorithms.

  6. Rapid detection of milk adulteration using intact protein flow injection mass spectrometric fingerprints combined with chemometrics.

    PubMed

    Du, Lijuan; Lu, Weiying; Cai, Zhenzhen Julia; Bao, Lei; Hartmann, Christoph; Gao, Boyan; Yu, Liangli Lucy

    2018-02-01

    Flow injection mass spectrometry (FIMS) combined with chemometrics was evaluated for rapidly detecting economically motivated adulteration (EMA) of milk. Twenty-two pure milk and thirty-five counterparts adulterated with soybean, pea, and whey protein isolates at 0.5, 1, 3, 5, and 10% (w/w) levels were analyzed. The principal component analysis (PCA), partial least-squares-discriminant analysis (PLS-DA), and support vector machine (SVM) classification models indicated that the adulterated milks could successfully be classified from the pure milks. FIMS combined with chemometrics might be an effective method to detect possible EMA in milk. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Exploring the complementarity of THz pulse imaging and DCE-MRIs: Toward a unified multi-channel classification and a deep learning framework.

    PubMed

    Yin, X-X; Zhang, Y; Cao, J; Wu, J-L; Hadjiloucas, S

    2016-12-01

    We provide a comprehensive account of recent advances in biomedical image analysis and classification from two complementary imaging modalities: terahertz (THz) pulse imaging and dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). The work aims to highlight underlying commonalities in both data structures so that a common multi-channel data fusion framework can be developed. Signal pre-processing in both datasets is discussed briefly taking into consideration advances in multi-resolution analysis and model based fractional order calculus system identification. Developments in statistical signal processing using principal component and independent component analysis are also considered. These algorithms have been developed independently by the THz-pulse imaging and DCE-MRI communities, and there is scope to place them in a common multi-channel framework to provide better software standardization at the pre-processing de-noising stage. A comprehensive discussion of feature selection strategies is also provided and the importance of preserving textural information is highlighted. Feature extraction and classification methods taking into consideration recent advances in support vector machine (SVM) and extreme learning machine (ELM) classifiers and their complex extensions are presented. An outlook on Clifford algebra classifiers and deep learning techniques suitable to both types of datasets is also provided. The work points toward the direction of developing a new unified multi-channel signal processing framework for biomedical image analysis that will explore synergies from both sensing modalities for inferring disease proliferation. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  8. In situ FTIR and flash pyrolysis/GC-MS characterization of Protosalvinia (Upper Devonian, Kentucky, USA): Implications for maceral classification

    USGS Publications Warehouse

    Mastalerz, Maria; Hower, J.C.; Carmo, A.

    1998-01-01

    Protosalvinia from Devonian rocks in Kentucky has been analyzed using petrographic and in situ FTIR and flash pyrolysis/GC-MS techniques in order to discuss its origin and placement in organic matter classification. In reflected light, Protosalvinia resembles cutinite in shape, color and reflectance, whereas in fluorescent mode it reveals yellow-green fluorescence, reminiscent of alginite. Alkylbenzenes, alkylnaphthalenes, and n-alkanes are the principal compounds in the pyrolyzates, whereas alkylphenols and n-alk-l-enes are present in minor concentrations. FTIR results show that aliphatic bands (both in stretching and bending modes) are prominent. Protosalvinia also reveals well developed aromatic bands in the out-of-plane region. Such a mixture of aliphatic and aromatic components is not known in documented organic matter types of either marine or terrestrial origin. It is suggested that Protosalvinia might belong to rare marine organisms that yield aromatic pyrolyzates. Based on morphological features and optical properties Protosalvinia should be classified as a maceral of the liptinite group. It does not, however, fit precisely within any of the established categories of the liptinite macerals.

  9. Principal component analysis of Mn(salen) catalysts.

    PubMed

    Teixeira, Filipe; Mosquera, Ricardo A; Melo, André; Freire, Cristina; Cordeiro, M Natália D S

    2014-12-14

    The theoretical study of Mn(salen) catalysts has been traditionally performed under the assumption that Mn(acacen') (acacen' = 3,3'-(ethane-1,2-diylbis(azanylylidene))bis(prop-1-en-olate)) is an appropriate surrogate for the larger Mn(salen) complexes. In this work, the geometry and the electronic structure of several Mn(salen) and Mn(acacen') model complexes were studied using Density Functional Theory (DFT) at diverse levels of approximation, with the aim of understanding the effects of truncation, metal oxidation, axial coordination, substitution on the aromatic rings of the salen ligand and chirality of the diimine bridge, as well as the choice of the density functional and basis set. To achieve this goal, geometric and structural data, obtained from these calculations, were subjected to Principal Component Analysis (PCA) and PCA with orthogonal rotation of the components (rPCA). The results show the choice of basis set to be of paramount importance, accounting for up to 30% of the variance in the data, while the differences between salen and acacen' complexes account for about 9% of the variance in the data, and are mostly related to the conformation of the salen/acacen' ligand around the metal centre. Variations in the spin state and oxidation state of the metal centre also account for large fractions of the total variance (up to 10% and 9%, respectively). Other effects, such as the nature of the diimine bridge or the presence of an alkyl substituent in the 3,3 and 5,5 positions of the aldehyde moiety, were found to be less important in terms of explaining the variance within the data set. A matrix of discriminants was compiled using the loadings of the principal and rotated components that best performed in the classification of the entries in the data. The scores obtained from its application to the data set were used as independent variables for devising linear models of different properties, with satisfactory prediction capabilities.

  10. 26 CFR 601.102 - Classification of taxes collected by the Internal Revenue Service.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... Rules § 601.102 Classification of taxes collected by the Internal Revenue Service. (a) Principal... 26 Internal Revenue 20 2010-04-01 2010-04-01 false Classification of taxes collected by the Internal Revenue Service. 601.102 Section 601.102 Internal Revenue INTERNAL REVENUE SERVICE, DEPARTMENT OF...

  11. 29 CFR 779.360 - Classification of liquefied-petroleum-gas sales.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Establishments Liquefied-Petroleum-Gas and Fuel Oil Dealers § 779.360 Classification of liquefied-petroleum-gas... an essential ingredient or principal raw material, such as sales of liquefied-petroleum-gas for the...

  12. 29 CFR 779.360 - Classification of liquefied-petroleum-gas sales.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... Establishments Liquefied-Petroleum-Gas and Fuel Oil Dealers § 779.360 Classification of liquefied-petroleum-gas... an essential ingredient or principal raw material, such as sales of liquefied-petroleum-gas for the...

  13. 29 CFR 779.360 - Classification of liquefied-petroleum-gas sales.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... Establishments Liquefied-Petroleum-Gas and Fuel Oil Dealers § 779.360 Classification of liquefied-petroleum-gas... an essential ingredient or principal raw material, such as sales of liquefied-petroleum-gas for the...

  14. Investigating the sex-related geometric variation of the human cranium.

    PubMed

    Bertsatos, Andreas; Papageorgopoulou, Christina; Valakos, Efstratios; Chovalopoulou, Maria-Eleni

    2018-01-29

    Accurate sexing methods are of great importance in forensic anthropology since sex assessment is among the principal tasks when examining human skeletal remains. The present study explores a novel approach in assessing the most accurate metric traits of the human cranium for sex estimation based on 80 ectocranial landmarks from 176 modern individuals of known age and sex from the Athens Collection. The purpose of the study is to identify those distance and angle measurements that can be most effectively used in sex assessment. Three-dimensional landmark coordinates were digitized with a Microscribe 3DX and analyzed in GNU Octave. An iterative linear discriminant analysis of all possible combinations of landmarks was performed for each unique set of the 3160 distances and 246,480 angles. Cross-validated correct classification as well as multivariate DFA on top performing variables reported 13 craniometric distances with over 85% classification accuracy, 7 angles over 78%, as well as certain multivariate combinations yielding over 95%. Linear regression of these variables with the centroid size was used to assess their relation to the size of the cranium. In contrast to the use of generalized Procrustes analysis (GPA) and principal component analysis (PCA), which constitute the common analytical workflow for such data, our method, although computationally intensive, produced easily applicable discriminant functions of high accuracy, while at the same time exploring the maximum of cranial variability.
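
    The counts of 3160 distances and 246,480 angles in this record are consistent with taking every pair of the 80 landmarks as a distance and, for every triple of landmarks, one angle at each of its three vertices. The short check below reproduces both figures under that assumption; the function name `count_measurements` is hypothetical and the three-angles-per-triple reading is an inference from the numbers, not something the abstract states.

```python
from math import comb


def count_measurements(n_landmarks):
    """Return (number of pairwise distances, number of vertex angles)."""
    distances = comb(n_landmarks, 2)   # one distance per unordered pair
    angles = 3 * comb(n_landmarks, 3)  # three vertex angles per triple
    return distances, angles


print(count_measurements(80))  # (3160, 246480)
```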

  15. Recovery of a spectrum based on a compressive-sensing algorithm with weighted principal component analysis

    NASA Astrophysics Data System (ADS)

    Dafu, Shen; Leihong, Zhang; Dong, Liang; Bei, Li; Yi, Kang

    2017-07-01

    The purpose of this study is to improve the reconstruction precision and better copy the color of spectral image surfaces. A new spectral reflectance reconstruction algorithm based on an iterative threshold combined with weighted principal component space is presented in this paper, and the principal component with weighted visual features is the sparse basis. Different numbers of color cards are selected as the training samples, a multispectral image is the testing sample, and the color differences in the reconstructions are compared. The channel response value is obtained by a Mega Vision high-accuracy, multi-channel imaging system. The results show that spectral reconstruction based on weighted principal component space is superior in performance to that based on traditional principal component space. Therefore, the color difference obtained using the compressive-sensing algorithm with weighted principal component analysis is less than that obtained using the algorithm with traditional principal component analysis, and better reconstructed color consistency with human eye vision is achieved.

  16. Predicting Survival within the Lung Cancer Histopathological Hierarchy Using a Multi-Scale Genomic Model of Development

    PubMed Central

    Liu, Hongye; Kho, Alvin T; Kohane, Isaac S; Sun, Yao

    2006-01-01

    Background The histopathologic heterogeneity of lung cancer remains a significant confounding factor in its diagnosis and prognosis—spurring numerous recent efforts to find a molecular classification of the disease that has clinical relevance. Methods and Findings Molecular profiles of tumors from 186 patients representing four different lung cancer subtypes (and 17 normal lung tissue samples) were compared with a mouse lung development model using principal component analysis in both temporal and genomic domains. An algorithm for the classification of lung cancers using a multi-scale developmental framework was developed. Kaplan–Meier survival analysis was conducted for lung adenocarcinoma patient subgroups identified via their developmental association. We found multi-scale genomic similarities between four human lung cancer subtypes and the developing mouse lung that are prognostically meaningful. Significant association was observed between the localization of human lung cancer cases along the principal mouse lung development trajectory and the corresponding patient survival rate at three distinct levels of classical histopathologic resolution: among different lung cancer subtypes, among patients within the adenocarcinoma subtype, and within the stage I adenocarcinoma subclass. The earlier the genomic association between a human tumor profile and the mouse lung development sequence, the poorer the patient's prognosis. Furthermore, decomposing this principal lung development trajectory identified a gene set that was significantly enriched for pyrimidine metabolism and cell-adhesion functions specific to lung development and oncogenesis. Conclusions From a multi-scale disease modeling perspective, the molecular dynamics of murine lung development provide an effective framework that is not only data driven but also informed by the biology of development for elucidating the mechanisms of human lung cancer biology and its clinical outcome. PMID:16800721

  17. Reflecting on the structure of soil classification systems: insights from a proposal for integrating subsoil data into soil information systems

    NASA Astrophysics Data System (ADS)

    Dondeyne, Stefaan; Juilleret, Jérôme; Vancampenhout, Karen; Deckers, Jozef; Hissler, Christophe

    2017-04-01

    Classification of soils in both World Reference Base for soil resources (WRB) and Soil Taxonomy hinges on the identification of diagnostic horizons and characteristics. However as these features often occur within the first 100 cm, these classification systems convey little information on subsoil characteristics. An integrated knowledge of the soil, soil-to-substratum and deeper substratum continuum is required when dealing with environmental issues such as vegetation ecology, water quality or the Critical Zone in general. Therefore, we recently proposed a classification system of the subsolum complementing current soil classification systems. By reflecting on the structure of the subsoil classification system which is inspired by WRB, we aim at fostering a discussion on some potential future developments of WRB. For classifying the subsolum we define Regolite, Saprolite, Saprock and Bedrock as four Subsolum Reference Groups each corresponding to different weathering stages of the subsoil. Principal qualifiers can be used to categorize intergrades of these Subsoil Reference Groups while morphologic and lithologic characteristics can be presented with supplementary qualifiers. We argue that adopting a low hierarchical structure - akin to WRB and in contrast to a strong hierarchical structure as in Soil Taxonomy - offers the advantage of having an open classification system avoiding the need for a priori knowledge of all possible combinations which may be encountered in the field. Just as in WRB we also propose to use principal and supplementary qualifiers as a second level of classification. However, in contrast to WRB we propose to reserve the principal qualifiers for intergrades and to regroup the supplementary qualifiers into thematic categories (morphologic or lithologic). 
Structuring the qualifiers in this manner should facilitate the integration and handling of both soil and subsoil classification units into soil information systems and calls for paying attention to these structural issues in future developments of WRB.

  18. Targeting specific facial variation for different identification tasks.

    PubMed

    Aeria, Gillian; Claes, Peter; Vandermeulen, Dirk; Clement, John Gerald

    2010-09-10

    A conceptual framework that allows faces to be studied and compared objectively with biological validity is presented. The framework is a logical extension of modern morphometrics and statistical shape analysis techniques. Three-dimensional (3D) facial scans were collected from 255 healthy young adults. One scan depicted a smiling facial expression and another scan depicted a neutral expression. These facial scans were modelled in a Principal Component Analysis (PCA) space where Euclidean (ED) and Mahalanobis (MD) distances were used to form similarity measures. Within this PCA space, property pathways were calculated that expressed the direction of change in facial expression. Decomposition of distances into property-independent (D1) and property-dependent (D2) components along these pathways enabled the comparison of two faces in terms of the extent of a smiling expression. The performance of all distances was tested and compared in two types of experiments: Classification tasks and a Recognition task. In the Classification tasks, individual facial scans were assigned to one or more population groups of smiling or neutral scans. The property-dependent (D2) component of both Euclidean and Mahalanobis distances performed best in the Classification task, correctly assigning 99.8% of scans to the right population group. The Recognition task tested whether a scan of an individual depicting a smiling/neutral expression could be positively identified when shown a scan of the same person depicting a neutral/smiling expression. ED1 and MD1 performed best, and correctly identified 97.8% and 94.8% of individual scans respectively as belonging to the same person despite differences in facial expression. It was concluded that decomposed components are superior to straightforward distances in achieving positive identifications and present a novel method for quantifying facial similarity.
Additionally, although the undecomposed Mahalanobis distance often used in practice outperformed the undecomposed Euclidean distance, the opposite held for the decomposed distances. Crown Copyright 2010. Published by Elsevier Ireland Ltd. All rights reserved.
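
    The distance measures named in this abstract can be illustrated with a minimal sketch. It assumes that the Mahalanobis distance in PCA space is the Euclidean distance between scores after each score is divided by the standard deviation of its component, and that a "property pathway" can be approximated by a single direction in score space; both are simplifying assumptions, not the authors' implementation. The data, the function names and the `pathway` vector are hypothetical.

```python
import numpy as np

def fit_pca(X):
    """Return the mean, principal axes (rows of Vt) and per-component variances."""
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    variances = s ** 2 / (X.shape[0] - 1)
    return mean, Vt, variances

def pca_distances(a, b, axes, variances, k):
    """Euclidean (ED) and Mahalanobis (MD) distance using the first k components."""
    diff = (a - b) @ axes[:k].T          # the mean cancels in the difference
    ed = float(np.sqrt(np.sum(diff ** 2)))
    md = float(np.sqrt(np.sum(diff ** 2 / variances[:k])))
    return ed, md

def split_along_direction(a, b, axes, k, direction):
    """Assumed decomposition: D2 lies along the direction, D1 is orthogonal to it."""
    diff = (a - b) @ axes[:k].T
    unit = direction / np.linalg.norm(direction)
    along = diff @ unit
    d2 = float(abs(along))
    d1 = float(np.linalg.norm(diff - along * unit))
    return d1, d2

# Hypothetical stand-in for shape data: 50 samples, 6 variables.
rng = np.random.default_rng(0)
scans = rng.normal(size=(50, 6)) * np.array([5.0, 3.0, 2.0, 1.0, 0.5, 0.2])
mean, axes, variances = fit_pca(scans)
ed, md = pca_distances(scans[0], scans[1], axes, variances, k=3)
pathway = np.array([1.0, 0.0, 0.0])      # hypothetical expression direction
d1, d2 = split_along_direction(scans[0], scans[1], axes, 3, pathway)
```

    Under these assumptions the Euclidean parts satisfy ED^2 = D1^2 + D2^2, which is one simple way a distance can be separated into property-independent and property-dependent parts.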

  19. Principal Component and Linkage Analysis of Cardiovascular Risk Traits in the Norfolk Isolate

    PubMed Central

    Cox, Hannah C.; Bellis, Claire; Lea, Rod A.; Quinlan, Sharon; Hughes, Roger; Dyer, Thomas; Charlesworth, Jac; Blangero, John; Griffiths, Lyn R.

    2009-01-01

    Objective(s): An individual's risk of developing cardiovascular disease (CVD) is influenced by genetic factors. This study focussed on mapping genetic loci for CVD-risk traits in a unique population isolate derived from Norfolk Island. Methods: This investigation focussed on 377 individuals descended from the population founders. Principal component analysis was used to extract orthogonal components from 11 cardiovascular risk traits. Multipoint variance component methods, implemented in SOLAR, were used to assess genome-wide linkage to the derived factors. A total of 285 of the 377 related individuals were informative for linkage analysis. Results: A total of 4 principal components accounting for 83% of the total variance were derived. Principal component 1 was loaded with body size indicators; principal component 2 with body size, cholesterol and triglyceride levels; principal component 3 with the blood pressures; and principal component 4 with LDL-cholesterol and total cholesterol levels. Suggestive evidence of linkage for principal component 2 (h2 = 0.35) was observed on chromosome 5q35 (LOD = 1.85; p = 0.0008). Peak regions on chromosome 10p11.2 (LOD = 1.27; p = 0.005) and 12q13 (LOD = 1.63; p = 0.003) were observed to segregate with principal components 1 (h2 = 0.33) and 4 (h2 = 0.42), respectively. Conclusion(s): This study investigated a number of CVD risk traits in a unique isolated population. Findings support the clustering of CVD risk traits and provide interesting evidence of a region on chromosome 5q35 segregating with weight, waist circumference, HDL-c and total triglyceride levels. PMID:19339786

  20. An Automated and Intelligent Medical Decision Support System for Brain MRI Scans Classification.

    PubMed

    Siddiqui, Muhammad Faisal; Reza, Ahmed Wasif; Kanesan, Jeevan

    2015-01-01

    Wide interest has been observed in medical health care applications that interpret neuroimaging scans by machine learning systems. This research proposes an intelligent, automatic, accurate, and robust classification technique to classify human brain magnetic resonance images (MRI) as normal or abnormal, in order to reduce human error in identifying diseases in brain MRIs. In this study, fast discrete wavelet transform (DWT), principal component analysis (PCA), and least squares support vector machine (LS-SVM) are used as basic components. Firstly, fast DWT is employed to extract the salient features of brain MRI, followed by PCA, which reduces the dimensions of the features. These reduced feature vectors also shrink the memory storage consumption by 99.5%. Finally, an advanced classification technique based on LS-SVM is applied to brain MR image classification using the reduced features. To improve efficiency, LS-SVM is used with a non-linear radial basis function (RBF) kernel. The proposed algorithm intelligently determines the optimized values of the hyper-parameters of the RBF kernel and also applies k-fold stratified cross validation to enhance the generalization of the system. The method was tested on benchmark datasets of T1-weighted and T2-weighted scans from 340 patients. From the analysis of experimental results and performance comparisons, it is observed that the proposed medical decision support system outperformed all other modern classifiers and achieves a 100% accuracy rate (specificity/sensitivity 100%/100%). Furthermore, in terms of computation time, the proposed technique is significantly faster than the recent well-known methods, and it improves the efficiency by 71%, 3%, and 4% on the feature extraction stage, feature reduction stage, and classification stage, respectively.
These results indicate that the proposed well-trained machine learning system has the potential to make accurate predictions about brain abnormalities in individual subjects; it can therefore be used as a significant tool in clinical practice.
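
    The final stage of the pipeline described above (an LS-SVM with an RBF kernel) can be sketched as follows. This is a minimal sketch under stated assumptions: it uses the common least-squares formulation in which the bias and dual weights are obtained from one linear system with the +1/-1 labels as targets, it omits the DWT and PCA stages, and the values of `gamma` and `sigma` are hypothetical fixed choices rather than the tuned hyper-parameters reported by the authors.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y] for b and alpha."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y.astype(float)))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]      # bias, dual weights

def lssvm_predict(X_train, bias, alpha, X_new, sigma=1.0):
    """Class label is the sign of the kernel expansion plus bias."""
    return np.sign(rbf_kernel(X_new, X_train, sigma) @ alpha + bias)

# Hypothetical reduced feature vectors: two well-separated classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.3, size=(20, 2)),
               rng.normal(2.0, 0.3, size=(20, 2))])
y = np.array([-1] * 20 + [1] * 20)
bias, alpha = lssvm_fit(X, y)
train_accuracy = float(np.mean(lssvm_predict(X, bias, alpha, X) == y))
```

    Unlike a standard SVM, which requires a quadratic programme, this formulation needs only a single linear solve, which is one plausible reason for its use where computation time matters.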

  1. Multivariate analyses of crater parameters and the classification of craters

    NASA Technical Reports Server (NTRS)

    Siegal, B. S.; Griffiths, J. C.

    1974-01-01

    Multivariate analyses were performed on certain linear dimensions of six genetic types of craters. A total of 320 craters, consisting of laboratory fluidization craters, craters formed by chemical and nuclear explosives, terrestrial maars and other volcanic craters, and terrestrial meteorite impact craters, authenticated and probable, were analyzed in the first data set in terms of their mean rim crest diameter, mean interior relief, rim height, and mean exterior rim width. The second data set contained an additional 91 terrestrial craters of which 19 were of experimental percussive impact and 28 of volcanic collapse origin, and which was analyzed in terms of mean rim crest diameter, mean interior relief, and rim height. Principal component analyses were performed on the six genetic types of craters. Ninety per cent of the variation in the variables can be accounted for by two components. Ninety-nine per cent of the variation in the craters formed by chemical and nuclear explosives is explained by the first component alone.
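
    Statements such as "ninety per cent of the variation can be accounted for by two components" refer to the share of total variance carried by the leading eigenvalues of the covariance (or correlation) matrix. A minimal sketch of that calculation follows; the synthetic measurements are hypothetical and are not the crater data, and standardizing the variables first is an assumption about the analysis.

```python
import numpy as np

def explained_variance_ratio(X, standardize=True):
    """Fraction of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)
    if standardize:
        Xc = Xc / X.std(axis=0, ddof=1)   # analyse the correlation matrix
    eigvals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]  # descending
    return eigvals / eigvals.sum()

# Hypothetical data: four measured variables driven by two latent factors.
rng = np.random.default_rng(0)
z1 = rng.normal(size=320)
z2 = rng.normal(size=320)
X = np.column_stack([
    z1,
    0.9 * z1 + 0.1 * rng.normal(size=320),
    z2,
    z2 + 0.1 * rng.normal(size=320),
])
ratios = explained_variance_ratio(X)
cumulative = np.cumsum(ratios)            # cumulative[1] is the two-component share
```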

  2. Discrimination of gender-, speed-, and shoe-dependent movement patterns in runners using full-body kinematics.

    PubMed

    Maurer, Christian; Federolf, Peter; von Tscharner, Vinzenz; Stirling, Lisa; Nigg, Benno M

    2012-05-01

    Changes in gait kinematics have often been analyzed using pattern recognition methods such as principal component analysis (PCA). It is usually just the first few principal components that are analyzed, because they describe the main variability within a dataset and thus represent the main movement patterns. However, while subtle changes in gait pattern (for instance, due to different footwear) may not change main movement patterns, they may affect movements represented by higher principal components. This study was designed to test two hypotheses: (1) speed and gender differences can be observed in the first principal components, and (2) small interventions such as changing footwear change the gait characteristics of higher principal components. Kinematic changes due to different running conditions (speed - 3.1 m/s and 4.9 m/s, gender, and footwear - control shoe and adidas MicroBounce shoe) were investigated by applying PCA and support vector machine (SVM) to a full-body reflective marker setup. Differences in speed changed the basic movement pattern, as was reflected by a change in the time-dependent coefficient derived from the first principal component. Gender was differentiated by using the time-dependent coefficient derived from intermediate principal components. (Intermediate principal components are characterized by limb rotations of the thigh and shank.) Different shoe conditions were identified in higher principal components. This study showed that different interventions can be analyzed using a full-body kinematic approach. Within the well-defined vector space spanned by the data of all subjects, higher principal components should also be considered because these components show the differences that result from small interventions such as footwear changes. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.

  3. Principal Component Relaxation Mode Analysis of an All-Atom Molecular Dynamics Simulation of Human Lysozyme

    NASA Astrophysics Data System (ADS)

    Nagai, Toshiki; Mitsutake, Ayori; Takano, Hiroshi

    2013-02-01

    A new relaxation mode analysis method, which is referred to as the principal component relaxation mode analysis method, has been proposed to handle a large number of degrees of freedom of protein systems. In this method, principal component analysis is carried out first and then relaxation mode analysis is applied to a small number of principal components with large fluctuations. To reduce the contribution of fast relaxation modes in these principal components efficiently, we have also proposed a relaxation mode analysis method using multiple evolution times. The principal component relaxation mode analysis method using two evolution times has been applied to an all-atom molecular dynamics simulation of human lysozyme in aqueous solution. Slow relaxation modes and corresponding relaxation times have been appropriately estimated, demonstrating that the method is applicable to protein systems.

  4. Overlapped Partitioning for Ensemble Classifiers of P300-Based Brain-Computer Interfaces

    PubMed Central

    Onishi, Akinari; Natsume, Kiyohisa

    2014-01-01

    A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance. PMID:24695550

  5. Overlapped partitioning for ensemble classifiers of P300-based brain-computer interfaces.

    PubMed

    Onishi, Akinari; Natsume, Kiyohisa

    2014-01-01

    A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance.

  6. Assessment of geostatistical features for object-based image classification of contrasted landscape vegetation cover

    NASA Astrophysics Data System (ADS)

    de Oliveira Silveira, Eduarda Martiniano; de Menezes, Michele Duarte; Acerbi Júnior, Fausto Weimar; Castro Nunes Santos Terra, Marcela; de Mello, José Márcio

    2017-07-01

    Accurate mapping and monitoring of savanna and semiarid woodland biomes are needed to support the selection of areas of conservation, to provide sustainable land use, and to improve the understanding of vegetation. The potential of geostatistical features, derived from medium spatial resolution satellite imagery, to characterize contrasted landscape vegetation cover and improve object-based image classification is studied. The study site in Brazil includes cerrado sensu stricto, deciduous forest, and palm swamp vegetation cover. Sentinel 2 and Landsat 8 images were acquired and divided into objects, for each of which a semivariogram was calculated using near-infrared (NIR) and normalized difference vegetation index (NDVI) to extract the set of geostatistical features. The features selected by principal component analysis were used as input data to train a random forest algorithm. Tests were conducted, combining spectral and geostatistical features. Change detection evaluation was performed using a confusion matrix and its accuracies. The semivariogram curves were efficient to characterize spatial heterogeneity, with similar results using NIR and NDVI from Sentinel 2 and Landsat 8. Accuracy was significantly greater when combining geostatistical features with spectral data, suggesting that this method can improve image classification results.
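
    The semivariogram used here to derive geostatistical features is, in its empirical form, half the mean squared difference between values separated by a given lag. The sketch below computes it along a regularly spaced one-dimensional transect, which is a simplification of the per-object two-dimensional calculation described in the abstract; the NDVI values are hypothetical.

```python
def semivariogram(values, max_lag):
    """Empirical semivariance for lags 1..max_lag on a regularly spaced transect."""
    result = {}
    for lag in range(1, max_lag + 1):
        pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
        if not pairs:
            break                          # lag longer than the transect
        squared = sum((a - b) ** 2 for a, b in pairs)
        result[lag] = squared / (2 * len(pairs))
    return result

# Hypothetical NDVI values alternating between two cover types.
ndvi = [0.2, 0.4, 0.2, 0.4, 0.2, 0.4]
gamma = semivariogram(ndvi, max_lag=3)
```

    For this alternating pattern the semivariance is high at odd lags and zero at lag 2, which shows how the shape of the curve encodes spatial heterogeneity.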

  7. Robust through-the-wall radar image classification using a target-model alignment procedure.

    PubMed

    Smith, Graeme E; Mobasseri, Bijan G

    2012-02-01

    A through-the-wall radar image (TWRI) bears little resemblance to the equivalent optical image, making it difficult to interpret. To maximize the intelligence that may be obtained, it is desirable to automate the classification of targets in the image to support human operators. This paper presents a technique for classifying stationary targets based on the high-range resolution profile (HRRP) extracted from 3-D TWRIs. The dependence of the image on the target location is discussed using a system point spread function (PSF) approach. It is shown that the position dependence will cause a classifier to fail, unless the image to be classified is aligned to a classifier-training location. A target image alignment technique based on deconvolution of the image with the system PSF is proposed. Comparison of the aligned target images with measured images shows the alignment process introducing normalized mean squared error (NMSE) ≤ 9%. The HRRP extracted from aligned target images are classified using a naive Bayesian classifier supported by principal component analysis. The classifier is tested using a real TWRI of canonical targets behind a concrete wall and shown to obtain correct classification rates ≥ 97%. © 2011 IEEE

  8. Classification and quantitation of milk powder by near-infrared spectroscopy and mutual information-based variable selection and partial least squares

    NASA Astrophysics Data System (ADS)

    Chen, Hui; Tan, Chao; Lin, Zan; Wu, Tong

    2018-01-01

    Milk is among the most popular nutrient sources worldwide and is of great interest due to its beneficial medicinal properties. The feasibility of classifying milk powder samples with respect to their brands and of determining protein concentration is investigated by NIR spectroscopy along with chemometrics. Two datasets were prepared for the experiment. One contains 179 samples of four brands for classification and the other contains 30 samples for quantitative analysis. Principal component analysis (PCA) was used for exploratory analysis. Based on an effective model-independent variable selection method, i.e., minimal-redundancy maximal-relevance (MRMR), only 18 variables were selected to construct a partial least-square discriminant analysis (PLS-DA) model. On the test set, the PLS-DA model based on the selected variable set was compared with the full-spectrum PLS-DA model, both of which achieved 100% accuracy. In quantitative analysis, the partial least-square regression (PLSR) model constructed from the selected subset of 260 variables significantly outperforms the full-spectrum model. It seems that the combination of NIR spectroscopy, MRMR and PLS-DA or PLSR is a powerful tool for classifying different brands of milk and determining the protein content.

  9. Characterization of the diversity in bat biosonar beampatterns with spherical harmonics power spectra.

    PubMed

    Motamedi, Mohammad; Müller, Rolf

    2014-06-01

    The biosonar beampatterns found across different bat species are highly diverse in terms of global and local shape properties such as overall beamwidth or the presence, location, and shape of multiple lobes. It may be hypothesized that some of this variability reflects evolutionary adaptation. To investigate this hypothesis, the present work has searched for patterns in the variability across a set of 283 numerical predictions of emission and reception beampatterns from 88 bat species belonging to four major families (Rhinolophidae, Hipposideridae, Phyllostomidae, Vespertilionidae). This was done using a lossy compression of the beampatterns that utilized real spherical harmonics as basis functions. The resulting vector representations showed differences between the families as well as between emission and reception. These differences existed in the means of the power spectra as well as in their distribution. The distributions were characterized in a low dimensional space found through principal component analysis. The distinctiveness of the beampatterns across the groups was corroborated by pairwise classification experiments that yielded correct classification rates between ~85 and ~98%. Beamwidth was a major factor but not the sole distinguishing feature in these classification experiments. These differences could be seen as an indication of adaptive trends at the beampattern level.

  10. Classification of Parkinsonian syndromes from FDG-PET brain data using decision trees with SSM/PCA features.

    PubMed

    Mudali, D; Teune, L K; Renken, R J; Leenders, K L; Roerdink, J B T M

    2015-01-01

    Medical imaging techniques like fluorodeoxyglucose positron emission tomography (FDG-PET) have been used to aid in the differential diagnosis of neurodegenerative brain diseases. In this study, the objective is to classify FDG-PET brain scans of subjects with Parkinsonian syndromes (Parkinson's disease, multiple system atrophy, and progressive supranuclear palsy) compared to healthy controls. The scaled subprofile model/principal component analysis (SSM/PCA) method was applied to FDG-PET brain image data to obtain covariance patterns and corresponding subject scores. The latter were used as features for supervised classification by the C4.5 decision tree method. Leave-one-out cross validation was applied to determine classifier performance. We carried out a comparison with other types of classifiers. The big advantage of decision tree classification is that the results are easy to understand by humans. A visual representation of decision trees strongly supports the interpretation process, which is very important in the context of medical diagnosis. Further improvements are suggested based on enlarging the number of the training data, enhancing the decision tree method by bagging, and adding additional features based on (f)MRI data.
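
    Leave-one-out cross validation, as used above to estimate classifier performance, trains on all subjects but one and tests on the held-out subject, repeating this for every subject. The sketch below shows the loop only. A nearest-centroid rule is swapped in for the C4.5 decision tree to keep the example short, and the subject scores and labels are hypothetical.

```python
def nearest_centroid_fit(X, y):
    """Mean feature vector per class (a stand-in for the C4.5 tree)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def nearest_centroid_predict(model, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(model, key=lambda lab: sum((a - b) ** 2
                                          for a, b in zip(row, model[lab])))

def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and average the hits."""
    correct = 0
    for i in range(len(X)):
        model = nearest_centroid_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += nearest_centroid_predict(model, X[i]) == y[i]
    return correct / len(X)

# Hypothetical subject scores on two covariance patterns.
scores = [[-2.1, 0.3], [-1.8, -0.2], [-2.4, 0.1],
          [1.9, 0.0], [2.2, 0.4], [2.0, -0.3]]
labels = ["control"] * 3 + ["patient"] * 3
accuracy = leave_one_out_accuracy(scores, labels)
```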

  11. Mining Feature of Data Fusion in the Classification of Beer Flavor Information Using E-Tongue and E-Nose

    PubMed Central

    Men, Hong; Shi, Yan; Fu, Songlin; Jiao, Yanan; Qiao, Yu; Liu, Jingjing

    2017-01-01

    Multi-sensor data fusion can provide more comprehensive and more accurate analysis results. However, it also brings some redundant information, which is an important issue with respect to finding a feature-mining method for intuitive and efficient analysis. This paper demonstrates a feature-mining method based on variable accumulation to find the best expression form and variables’ behavior affecting beer flavor. First, e-tongue and e-nose were used to gather the taste and olfactory information of beer, respectively. Second, principal component analysis (PCA), genetic algorithm-partial least squares (GA-PLS), and variable importance of projection (VIP) scores were applied to select feature variables of the original fusion set. Finally, the classification models based on support vector machine (SVM), random forests (RF), and extreme learning machine (ELM) were established to evaluate the efficiency of the feature-mining method. The result shows that the feature-mining method based on variable accumulation obtains the main feature affecting beer flavor information, and the best classification performance for the SVM, RF, and ELM models with 96.67%, 94.44%, and 98.33% prediction accuracy, respectively. PMID:28753917

  12. An iterated Laplacian based semi-supervised dimensionality reduction for classification of breast cancer on ultrasound images.

    PubMed

    Liu, Xiao; Shi, Jun; Zhou, Shichong; Lu, Minhua

    2014-01-01

    Dimensionality reduction is an important step in ultrasound image based computer-aided diagnosis (CAD) for breast cancer. A newly proposed l2,1 regularized correntropy algorithm for robust feature selection (CRFS) has achieved good performance for noise corrupted data. Therefore, it has the potential to reduce the dimensions of ultrasound image features. However, in clinical practice, the collection of labeled instances is usually expensive and time-consuming, while it is relatively easy to acquire unlabeled or undetermined instances. Therefore, semi-supervised learning is very suitable for clinical CAD. The iterated Laplacian regularization (Iter-LR) is a new regularization method, which has been proved to outperform the traditional graph Laplacian regularization in semi-supervised classification and ranking. In this study, to augment the classification accuracy of the breast ultrasound CAD based on texture features, we propose an Iter-LR-based semi-supervised CRFS (Iter-LR-CRFS) algorithm, and then apply it to reduce the feature dimensions of ultrasound images for breast CAD. We compared the Iter-LR-CRFS with LR-CRFS, original supervised CRFS, and principal component analysis. The experimental results indicate that the proposed Iter-LR-CRFS significantly outperforms all other algorithms.

  13. Monitoring and evaluation of the water quality of Budeasa Reservoir-Arges River, Romania.

    PubMed

    Ion, Antoanela; Vladescu, Luminita; Badea, Irinel Adriana; Comanescu, Laura

    2016-09-01

    The purpose of this study was to monitor and record the specific characteristics and properties of the Arges River water in the Budeasa Reservoir (the principal source of municipal tap water for the large Romanian city of Pitesti and the surrounding area) for a period of 5 years (2005-2009). The monitored physical and chemical parameters were turbidity, pH, electrical conductivity, chemical oxygen demand, 5-day biochemical oxygen demand, free dissolved oxygen, nitrite, nitrate, ammonia nitrogen, chloride, total dissolved iron ions, sulfate, manganese, phosphate, total alkalinity, and total hardness. The results were discussed in correlation with the precipitation values during the study. Monthly and annual values of each parameter determined in the period January 2005-December 2009 were used as a basis for the classification of Budeasa Reservoir water, according to the European legislation, as well as for assessing its quality as a drinking water supply. Principal component analysis and Pearson correlation coefficients were used as statistical procedures in order to evaluate the data obtained during this study.

  14. The Research on Dryland Crop Classification Based on the Fusion of SENTINEL-1A SAR and Optical Images

    NASA Astrophysics Data System (ADS)

    Liu, F.; Chen, T.; He, J.; Wen, Q.; Yu, F.; Gu, X.; Wang, Z.

    2018-04-01

    In recent years, the rapid upgrading and improvement of SAR sensors has provided a beneficial complement to traditional optical remote sensing in terms of theory, technology and data. In this paper, Sentinel-1A SAR data and GF-1 optical data were selected for image fusion, with emphasis on dryland crop classification under a complex crop planting structure, taking corn and cotton as the research objects. Considering the differences among various data fusion methods, the principal component analysis (PCA), Gram-Schmidt (GS), Brovey and wavelet transform (WT) methods were compared with each other, and the GS and Brovey methods proved to be more applicable in the study area. The classification was then conducted using an object-oriented process. For the GS and Brovey fusion images and the GF-1 optical image, the nearest neighbour algorithm was adopted to perform supervised classification with the same training samples. The accuracy assessment was subsequently conducted based on the sample plots in the study area. The overall accuracy and kappa coefficient of the fusion images were all higher than those of the GF-1 optical image, and the GS method performed better than the Brovey method. In particular, the overall accuracy of the GS fusion image was 79.8%, and the kappa coefficient was 0.644. Thus, the results showed that the GS and Brovey fusion images were superior to the optical image for dryland crop classification. This study suggests that the fusion of SAR and optical images is reliable for dryland crop classification under a complex crop planting structure.
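
    Overall accuracy and the kappa coefficient, the two measures reported in this abstract, are both derived from a confusion matrix: accuracy is the observed agreement, and kappa corrects it for the agreement expected by chance. The sketch below shows the standard Cohen's kappa calculation; the two-class matrix is hypothetical and is not taken from the study.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Chance agreement: product of the marginal proportions, summed over classes.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical counts: rows are reference classes, columns are predicted classes.
confusion = [[40, 10],
             [10, 40]]
overall_accuracy, kappa = accuracy_and_kappa(confusion)
```

    With these hypothetical counts the observed agreement is 0.8 and the chance agreement is 0.5, so kappa is 0.6; kappa is always lower than accuracy whenever chance agreement is non-zero and accuracy is below 1.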

  15. A novel chemometric classification for FTIR spectra of mycotoxin-contaminated maize and peanuts at regulatory limits.

    PubMed

    Kos, Gregor; Sieger, Markus; McMullin, David; Zahradnik, Celine; Sulyok, Michael; Öner, Tuba; Mizaikoff, Boris; Krska, Rudolf

    2016-10-01

    The rapid identification of mycotoxins such as deoxynivalenol and aflatoxin B1 in agricultural commodities is an ongoing concern for food importers and processors. While sophisticated chromatography-based methods are well established for regulatory testing by food safety authorities, few techniques exist to provide a rapid assessment for traders. This study advances the development of a mid-infrared spectroscopic method, recording spectra with little sample preparation. Spectral data were classified using a bootstrap-aggregated (bagged) decision tree method, evaluating the protein and carbohydrate absorption regions of the spectrum. The method was able to classify 79% of 110 maize samples at the European Union regulatory limit for deoxynivalenol of 1750 µg kg⁻¹ and, for the first time, 77% of 92 peanut samples at 8 µg kg⁻¹ of aflatoxin B1. A subset model revealed a dependency on variety and type of fungal infection. The employed CRC and SBL maize varieties could be pooled in the model with a reduction of classification accuracy from 90% to 79%. Samples infected with Fusarium verticillioides were removed, leaving samples infected with F. graminearum and F. culmorum in the dataset improving classification accuracy from 73% to 79%. A 500 µg kg⁻¹ classification threshold for deoxynivalenol in maize performed even better with 85% accuracy. This is assumed to be due to a larger number of samples around the threshold increasing representativity. Comparison with established principal component analysis classification, which consistently showed overlapping clusters, confirmed the superior performance of bagged decision tree classification.
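
    Bootstrap aggregation (bagging), the technique named in this abstract, trains each model on a resample of the training set drawn with replacement and combines the models by majority vote. The sketch below illustrates the idea with one-split decision stumps in place of full decision trees, which is a deliberate simplification; the two-feature data, the 0/1 labels and all function names are hypothetical.

```python
import random

def fit_stump(X, y):
    """Best single-feature threshold split by training error (labels 0/1)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for above in (0, 1):           # label predicted when row[j] > t
                errors = sum((above if row[j] > t else 1 - above) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, above)
    return best[1:]

def predict_stump(stump, row):
    j, t, above = stump
    return above if row[j] > t else 1 - above

def fit_bagged(X, y, n_models=25, seed=0):
    """Train each stump on a bootstrap resample drawn with replacement."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_bagged(models, row):
    """Majority vote over the ensemble."""
    votes = sum(predict_stump(m, row) for m in models)
    return 1 if 2 * votes > len(models) else 0

# Hypothetical absorbance features; label 1 marks samples above a limit.
X = [[0.1, 0.5], [0.2, 0.9], [0.3, 0.1], [0.4, 0.7],
     [0.6, 0.4], [0.7, 0.8], [0.8, 0.2], [0.9, 0.6]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
ensemble = fit_bagged(X, y)
low = predict_bagged(ensemble, [0.05, 0.5])
high = predict_bagged(ensemble, [0.95, 0.5])
```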

  16. Cognition, culture and utility: plant classification by Paraguayan immigrant farmers in Misiones, Argentina.

    PubMed

    Kujawska, Monika; Jiménez-Escobar, N David; Nolan, Justin M; Arias-Mutis, Daniel

    2017-07-25

    This study was conducted in three rural communities of small farmers of Paraguayan origin living in the province of Misiones, Argentina. These Criollos (Mestizos) hail chiefly from departments located in the east of Paraguay, where the climate and flora have similar characteristics as those in Misiones. These ecological features contribute to the continuation and maintenance of knowledge and practices related to the use of plants. Fieldwork was conducted between September 2014 and August 2015. Forty five informants from three rural localities situated along the Parana River participated in an ethno-classification task. For the classification event, photographs of 30 medicinal and edible plants were chosen, specifically those yielding the highest frequency of mention among the members of that community (based on data obtained in the first stage of research in 2014). Variation in local plant classifications was examined and compared using principal component analysis and cluster analysis. We found that people classify plants according to application or use (primarily medicinal, to a lesser extent as edible). Morphology is rarely taken into account, even for very similar and closely-related species such as varieties of palms. In light of our findings, we highlight a dominant functionality model at work in the process of plant cognition and classification among farmers of Paraguayan origin. Salient cultural beliefs and practices associated with rural Paraguayan plant-based medicine are described. Additionally, the manner by which residents' concepts of plants articulate with local folk epistemology is discussed. Culturally constructed use patterns ultimately override morphological variables in rural Paraguayans' ethnobotanical classification.

  17. Influence of nuclei segmentation on breast cancer malignancy classification

    NASA Astrophysics Data System (ADS)

    Jelen, Lukasz; Fevens, Thomas; Krzyzak, Adam

    2009-02-01

    Breast cancer is one of the most deadly cancers affecting middle-aged women. Accurate diagnosis and prognosis are crucial to reduce the high death rate. Nowadays there are numerous diagnostic tools for breast cancer diagnosis. In this paper we discuss the role of nuclear segmentation from fine needle aspiration biopsy (FNA) slides and its influence on malignancy classification. Classification of malignancy plays a very important role during the diagnosis process of breast cancer. Out of all cancer diagnostic tools, FNA slides provide the most valuable information about the cancer malignancy grade, which helps to choose an appropriate treatment. This process involves assessing numerous nuclear features and therefore precise segmentation of nuclei is very important. In this work we compare three powerful segmentation approaches and test their impact on the classification of breast cancer malignancy. The studied approaches involve level set segmentation, fuzzy c-means segmentation and textural segmentation based on co-occurrence matrix. Segmented nuclei were used to extract nuclear features for malignancy classification. For classification purposes four different classifiers were trained and tested with previously extracted features. The compared classifiers are Multilayer Perceptron (MLP), Self-Organizing Maps (SOM), Principal Component-based Neural Network (PCA) and Support Vector Machines (SVM). The presented results show that level set segmentation yields the best results over the three compared approaches and leads to a good feature extraction with the lowest average error rate of 6.51% over four different classifiers. The best performance was recorded for multilayer perceptron with an error rate of 3.07% using fuzzy c-means segmentation.

  18. Functional principal component analysis of glomerular filtration rate curves after kidney transplant.

    PubMed

    Dong, Jianghu J; Wang, Liangliang; Gill, Jagbir; Cao, Jiguo

    2017-01-01

    This article is motivated by some longitudinal clinical data of kidney transplant recipients, where kidney function progression is recorded as the estimated glomerular filtration rates at multiple time points post kidney transplantation. We propose to use the functional principal component analysis method to explore the major source of variations of glomerular filtration rate curves. We find that the estimated functional principal component scores can be used to cluster glomerular filtration rate curves. Ordering functional principal component scores can detect abnormal glomerular filtration rate curves. Finally, functional principal component analysis can effectively estimate missing glomerular filtration rate values and predict future glomerular filtration rate values.

  19. Discriminant analysis of resting-state functional connectivity patterns on the Grassmann manifold

    NASA Astrophysics Data System (ADS)

    Fan, Yong; Liu, Yong; Jiang, Tianzi; Liu, Zhening; Hao, Yihui; Liu, Haihong

    2010-03-01

    The functional networks, extracted from fMRI images using independent component analysis, have been shown to be informative for distinguishing brain states of cognitive functions and neurological diseases. In this paper, we propose a novel algorithm for discriminant analysis of functional networks encoded by spatial independent components. The functional networks of each individual are used as bases for a linear subspace, referred to as a functional connectivity pattern, which facilitates a comprehensive characterization of temporal signals of fMRI data. The functional connectivity patterns of different individuals are analyzed on the Grassmann manifold by adopting a principal angle based subspace distance. In conjunction with a support vector machine classifier, a forward component selection technique is proposed to select independent components for constructing the most discriminative functional connectivity pattern. The discriminant analysis method has been applied to an fMRI based schizophrenia study with 31 schizophrenia patients and 31 healthy individuals. The experimental results demonstrate that the proposed method not only achieves a promising classification performance for distinguishing schizophrenia patients from healthy controls, but also identifies discriminative functional networks that are informative for schizophrenia diagnosis.

  20. Using statistical text classification to identify health information technology incidents

    PubMed Central

    Chai, Kevin E K; Anthony, Stephen; Coiera, Enrico; Magrabi, Farah

    2013-01-01

    Objective: To examine the feasibility of using statistical text classification to automatically identify health information technology (HIT) incidents in the USA Food and Drug Administration (FDA) Manufacturer and User Facility Device Experience (MAUDE) database. Design: We used a subset of 570 272 incidents including 1534 HIT incidents reported to MAUDE between 1 January 2008 and 1 July 2010. Text classifiers using regularized logistic regression were evaluated with both ‘balanced’ (50% HIT) and ‘stratified’ (0.297% HIT) datasets for training, validation, and testing. Dataset preparation, feature extraction, feature selection, cross-validation, classification, performance evaluation, and error analysis were performed iteratively to further improve the classifiers. Feature-selection techniques such as removing short words and stop words, stemming, lemmatization, and principal component analysis were examined. Measurements: κ statistic, F1 score, precision and recall. Results: Classification performance was similar on both the stratified (0.954 F1 score) and balanced (0.995 F1 score) datasets. Stemming was the most effective technique, reducing the feature set size to 79% while maintaining comparable performance. Training with balanced datasets improved recall (0.989) but reduced precision (0.165). Conclusions: Statistical text classification appears to be a feasible method for identifying HIT reports within large databases of incidents. Automated identification should enable more HIT problems to be detected, analyzed, and addressed in a timely manner. Semi-supervised learning may be necessary when applying machine learning to big data analysis of patient safety incidents and requires further investigation. PMID:23666777
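
    The four measurements named in this record (κ statistic, F1 score, precision and recall) can all be derived from a 2x2 confusion matrix. The following is a minimal sketch, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical, and it assumes every denominator is non-zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (precision, recall, F1, Cohen's kappa) for a 2x2 confusion matrix.

    Assumes at least one predicted positive, one actual positive,
    and that chance agreement is below 1 (illustrative sketch only).
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Observed agreement between predictions and true labels.
    observed = (tp + tn) / total
    # Agreement expected by chance, from the marginal totals of each class.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return precision, recall, f1, kappa


# Hypothetical counts, chosen only to exercise the formulas.
precision, recall, f1, kappa = binary_metrics(tp=90, fp=10, fn=10, tn=890)
```

    Because F1 is the harmonic mean of precision and recall, a classifier with very high recall but low precision still receives a low F1 score.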

  1. Voxel-Based Neighborhood for Spatial Shape Pattern Classification of Lidar Point Clouds with Supervised Learning

    PubMed Central

    Plaza-Leiva, Victoria; Gomez-Ruiz, Jose Antonio; Mandow, Anthony; García-Cerezo, Alfonso

    2017-01-01

    Improving the effectiveness of spatial shape features classification from 3D lidar data is very relevant because it is largely used as a fundamental step towards higher level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhood for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN) method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM), Gaussian processes (GP), and Gaussian mixture models (GMM). A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl). Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhood. PMID:28294963
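
    The voxel-based neighborhood idea described above (points in each non-overlapping voxel of a regular grid share one class) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' framework: the function names are hypothetical, points are assumed to be `(x, y, z)` tuples, and `classify_voxel` stands in for any classifier that maps the points of one voxel to a label.

```python
import math
from collections import defaultdict


def group_by_voxel(points, voxel_size):
    """Map integer voxel indices to the list of points falling in that voxel."""
    voxels = defaultdict(list)
    for x, y, z in points:
        # floor() keeps negative coordinates in the correct cell.
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        voxels[key].append((x, y, z))
    return dict(voxels)


def label_points_by_voxel(points, voxel_size, classify_voxel):
    """Classify once per voxel and give every point in it the same label."""
    labels = {}
    for members in group_by_voxel(points, voxel_size).values():
        label = classify_voxel(members)  # hypothetical classifier callback
        for point in members:
            labels[point] = label
    return labels
```

    Classifying once per voxel rather than once per point is what removes the per-point neighborhood search that the abstract identifies as costly.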

  2. Rapid fingerprinting and classification of extra virgin olive oil by microjet sampling and extractive electrospray ionization mass spectrometry.

    PubMed

    Law, Wai Siang; Chen, Huan Wen; Balabin, Roman; Berchtold, Christian; Meier, Lukas; Zenobi, Renato

    2010-04-01

    Microjet sampling in combination with extractive electrospray ionization (EESI) mass spectrometry (MS) was applied to the rapid characterization and classification of extra virgin olive oil (EVOO) without any sample pretreatment. When modifying the composition of the primary ESI spray solvent, mass spectra of an identical EVOO sample showed differences. This demonstrates the capability of this technique to extract molecules with varying polarities, hence generating rich molecular information of the EVOO. Moreover, with the aid of microjet sampling, compounds of different volatilities (e.g. E-2-hexenal, trans-trans-2,4-heptadienal, tyrosol and caffeic acid) could be sampled simultaneously. EVOO data was also compared with that of other edible oils. Principal Component Analysis (PCA) was performed to discriminate EVOO and EVOO adulterated with edible oils. Microjet sampling EESI-MS was found to be a simple, rapid (less than 2 min analysis time per sample) and powerful method to obtain MS fingerprints of EVOO without requiring any complicated sample pretreatment steps.

  3. A time-frequency classifier for human gait recognition

    NASA Astrophysics Data System (ADS)

    Mobasseri, Bijan G.; Amin, Moeness G.

    2009-05-01

    Radar has established itself as an effective all-weather, day or night sensor. Radar signals can penetrate walls and provide information on moving targets. Recently, radar has been used as an effective biometric sensor for classification of gait. The return from a coherent radar system contains a frequency offset in the carrier frequency, known as the Doppler effect. The movements of arms and legs give rise to micro-Doppler, which can be clearly detailed in the time-frequency domain using traditional or modern time-frequency signal representation. In this paper we propose a gait classifier based on subspace learning using principal components analysis (PCA). The training set consists of feature vectors defined as either time or frequency snapshots taken from the spectrogram of radar backscatter. We show that the gait signature is captured effectively in feature vectors. Feature vectors are then used in training a minimum distance classifier based on the Mahalanobis distance metric. Results show that gait classification with high accuracy and a short observation window is achievable using the proposed classifier.
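
    A minimum distance classifier based on the Mahalanobis distance assigns a feature vector to the class whose mean is nearest once each class's covariance is taken into account. The sketch below is an assumption-laden illustration, not the authors' classifier: it is restricted to two-dimensional feature vectors (for example, two PCA scores) so that the covariance matrix can be inverted in closed form, and all function names are hypothetical.

```python
def mean_and_cov_2d(samples):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D feature vectors."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of x from a class with the given mean and covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2  # assumed non-zero (non-degenerate class)
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Uses the closed-form inverse of [[sxx, sxy], [sxy, syy]].
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def classify(x, class_stats):
    """Return the label whose class has the smallest Mahalanobis distance to x."""
    return min(class_stats, key=lambda label: mahalanobis_sq(x, *class_stats[label]))
```

    Here `class_stats` is a dictionary mapping each label to the `(mean, cov)` pair returned by `mean_and_cov_2d` for that class's training vectors.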

  4. An Initial Analysis of LANDSAT-4 Thematic Mapper Data for the Discrimination of Agricultural, Forested Wetlands, and Urban Land Cover. [Poinsett County, Arkansas; and Reelfoot Lake and Union City, Tennessee]

    NASA Technical Reports Server (NTRS)

    Quattrochi, D. A.

    1985-01-01

    The capabilities of TM data for discriminating land covers within three particular cultural and ecological realms were assessed. The agricultural investigation in Poinsett County, Arkansas illustrates that TM data can successfully be used to discriminate a variety of crop cover types within the study area. The single-date TM classification produced results that were significantly better than those developed from multitemporal MSS data. For the Reelfoot Lake area of Tennessee, TM data, processed using unsupervised signature development techniques, produced a detailed classification of forested wetlands with excellent accuracy. Even in a small city of approximately 15,000 people (Union City, Tennessee), TM data can successfully be used to spectrally distinguish specific urban classes. Furthermore, the principal components analysis evaluation of the data shows that through photointerpretation, it is possible to distinguish individual buildings and roof responses with the TM.

  5. Analysis of lard in meatball broth using Fourier transform infrared spectroscopy and chemometrics.

    PubMed

    Kurniawati, Endah; Rohman, Abdul; Triyana, Kuwat

    2014-01-01

    Meatball is one of the favorite foods in Indonesia. For economic reasons (due to the price difference), the substitution of beef with pork can occur. In this study, FTIR spectroscopy in combination with the chemometric techniques of partial least squares (PLS) and principal component analysis (PCA) was used for analysis of pork fat (lard) in meatball broth. Lard in meatball broth was quantitatively determined in the wavenumber region of 1018-1284 cm⁻¹. The coefficient of determination (R²) and root mean square error of calibration (RMSEC) values obtained were 0.9975 and 1.34% (v/v), respectively. Furthermore, the classification of lard and beef fat in meatball broth as well as in commercial samples was performed in the wavenumber region of 1200-1000 cm⁻¹. The results showed that FTIR spectroscopy coupled with chemometrics can be used for quantitative analysis and classification of lard in meatball broth for Halal verification studies. The developed method is simple in operation, rapid and does not involve extensive sample preparation. © 2013.

  6. Attenuated Total Reflection Mid-Infrared (ATR-MIR) Spectroscopy and Chemometrics for the Identification and Classification of Commercial Tannins.

    PubMed

    Ricci, Arianna; Parpinello, Giuseppina P; Olejar, Kenneth J; Kilmartin, Paul A; Versari, Andrea

    2015-11-01

    Attenuated total reflection Fourier transform infrared (FT-IR) spectroscopy was used to characterize 40 commercial tannins, including condensed and hydrolyzable chemical classes, provided as powder extracts from suppliers. Spectral data were processed to detect typical molecular vibrations of tannins bearing different chemical groups and of varying botanical origin (univariate qualitative analysis). The mid-infrared region between 4000 and 520 cm⁻¹ was analyzed, with a particular emphasis on the vibrational modes in the fingerprint region (1800-520 cm⁻¹), which provide detailed information about skeletal structures and specific substituents. The region 1800-1500 cm⁻¹ contained signals due to hydrolyzable structures, while bands due to condensed tannins appeared at 1300-900 cm⁻¹ and exhibited specific hydroxylation patterns useful to elucidate the structure of the flavonoid monomeric units. The spectra were investigated further using principal component analysis for discriminative purposes, to enhance the ability of infrared spectroscopy in the classification and quality control of commercial dried extracts and to enhance their industrial exploitation.

  7. Emerging approach for analytical characterization and geographical classification of Moroccan and French honeys by means of a voltammetric electronic tongue.

    PubMed

    El Alami El Hassani, Nadia; Tahri, Khalid; Llobet, Eduard; Bouchikhi, Benachir; Errachid, Abdelhamid; Zine, Nadia; El Bari, Nezha

    2018-03-15

    Moroccan and French honeys from different geographical areas were classified and characterized by applying a voltammetric electronic tongue (VE-tongue) coupled to analytical methods. The studied parameters include color intensity, free lactonic and total acidity, proteins, phenols, hydroxymethylfurfural content (HMF), sucrose, reducing and total sugars. The geographical classification of different honeys was developed through three-pattern recognition techniques: principal component analysis (PCA), support vector machines (SVMs) and hierarchical cluster analysis (HCA). Honey characterization was achieved by partial least squares modeling (PLS). All the PLS models developed were able to accurately estimate the correct values of the parameters analyzed using as input the voltammetric experimental data (i.e. r>0.9). This confirms the potential ability of the VE-tongue for performing a rapid characterization of honeys via PLS in which an uncomplicated, cost-effective sample preparation process that does not require the use of additional chemicals is implemented. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Study of archaeological coins of different dynasties using LIBS coupled with multivariate analysis

    NASA Astrophysics Data System (ADS)

    Awasthi, Shikha; Kumar, Rohit; Rai, G. K.; Rai, A. K.

    2016-04-01

    Laser Induced Breakdown Spectroscopy (LIBS) is an atomic emission spectroscopic technique with the unique capability of serving as an in-situ monitoring tool for the detection and quantification of elements present in different artifacts. Archaeological coins collected from the G.R. Sharma Memorial Museum, University of Allahabad, India, have been analyzed using the LIBS technique. These coins were obtained from the excavation of Kausambi, Uttar Pradesh, India. A LIBS system assembled in the laboratory (laser Nd:YAG 532 nm, 4 ns pulse width FWHM with Ocean Optics LIBS 2000+ spectrometer) is employed for spectral acquisition. The spectral lines of Ag, Cu, Ca, Sn, Si, Fe and Mg are identified in the LIBS spectra of different coins. LIBS along with multivariate analysis plays an effective role in classification and in assessing the contribution of spectral lines in different coins. The discrimination between five coins of archaeological interest has been carried out using Principal Component Analysis (PCA). The results show the potential relevance of the methodology used in the elemental identification and classification of artifacts with high accuracy and robustness.

  9. Classification of Ilex species based on metabolomic fingerprinting using nuclear magnetic resonance and multivariate data analysis.

    PubMed

    Choi, Young Hae; Sertic, Sarah; Kim, Hye Kyong; Wilson, Erica G; Michopoulos, Filippos; Lefeber, Alfons W M; Erkelens, Cornelis; Prat Kricun, Sergio D; Verpoorte, Robert

    2005-02-23

    The metabolomic analysis of 11 Ilex species, I. argentina, I. brasiliensis, I. brevicuspis, I. dumosa var. dumosa, I. dumosa var. guaranina, I. integerrima, I. microdonta, I. paraguariensis var. paraguariensis, I. pseudobuxus, I. taubertiana, and I. theezans, was carried out by NMR spectroscopy and multivariate data analysis. The analysis using principal component analysis and classification of the ¹H NMR spectra showed a clear discrimination of those samples based on the metabolites present in the organic and aqueous fractions. The major metabolites that contribute to the discrimination are arbutin, caffeine, phenylpropanoids, and theobromine. Among those metabolites, arbutin, which has not been reported yet as a constituent of Ilex species, was found to be a biomarker for I. argentina, I. brasiliensis, I. brevicuspis, I. integerrima, I. microdonta, I. pseudobuxus, I. taubertiana, and I. theezans. This reliable method based on the determination of a large number of metabolites makes the chemotaxonomical analysis of Ilex species possible.

  10. Detection of sugar adulterants in apple juice using Fourier transform infrared spectroscopy and chemometrics.

    PubMed

    Kelly, J F Daniel; Downey, Gerard

    2005-05-04

    Fourier transform infrared spectroscopy and attenuated total reflection sampling have been used to detect adulteration of single strength apple juice samples. The sample set comprised 224 authentic apple juices and 480 adulterated samples. Adulterants used included partially inverted cane syrup (PICS), beet sucrose (BS), high fructose corn syrup (HFCS), and a synthetic solution of fructose, glucose, and sucrose (FGS). Adulteration was carried out on individual apple juice samples at levels of 10, 20, 30, and 40% w/w. Spectral data were compressed by principal component analysis and analyzed using k-nearest neighbors and partial least squares regression techniques. Prediction results for the best classification models achieved an overall (authentic plus adulterated) correct classification rate of 96.5, 93.9, 92.2, and 82.4% for PICS, BS, HFCS, and FGS adulterants, respectively. This method shows promise as a rapid screening technique for the detection of a broad range of potential adulterants in apple juice.
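
    Once spectra have been compressed to a few principal component scores, a k-nearest neighbors rule classifies a new sample by majority vote among its closest training samples. The following is a minimal sketch of that voting step only, with hypothetical function and label names; it assumes the PCA scores have already been computed and uses Euclidean distance.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest neighbours.

    `train` is a list of (score_vector, label) pairs, where each score vector
    is assumed to hold the principal component scores of one sample.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical two-component scores, for illustration only.
training_scores = [
    ((0.0, 0.0), "authentic"),
    ((0.0, 1.0), "authentic"),
    ((1.0, 0.0), "authentic"),
    ((5.0, 5.0), "adulterated"),
    ((5.0, 6.0), "adulterated"),
    ((6.0, 5.0), "adulterated"),
]
```

    An odd value of `k` avoids tied votes in a two-class problem such as authentic versus adulterated.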

  11. What’s Wrong with the Murals at the Mogao Grottoes: A Near-Infrared Hyperspectral Imaging Method

    PubMed Central

    Sun, Meijun; Zhang, Dong; Wang, Zheng; Ren, Jinchang; Chai, Bolong; Sun, Jizhou

    2015-01-01

    Although a significant amount of work has been performed to preserve the ancient murals in the Mogao Grottoes by Dunhuang Cultural Research, non-contact methods need to be developed to effectively evaluate the degree of flaking of the murals. In this study, we propose to evaluate the flaking by automatically analyzing hyperspectral images that were scanned at the site. Murals with various degrees of flaking were scanned in the 126th cave using a near-infrared (NIR) hyperspectral camera with a spectral range of approximately 900 to 1700 nm. The regions of interest (ROIs) of the murals were manually labeled and grouped into four levels: normal, slight, moderate, and severe. The average spectral data from each ROI and its group label were used to train our classification model. To predict the degree of flaking, we adopted four algorithms: deep belief networks (DBNs), partial least squares regression (PLSR), principal component analysis with a support vector machine (PCA + SVM) and principal component analysis with an artificial neural network (PCA + ANN). The experimental results show the effectiveness of our method. In particular, better results are obtained using DBNs when the training data contain a significant amount of striping noise. PMID:26394926

  12. The Bolivian "Altiplano" and "Valle" sheep are two different peripatric breeds.

    PubMed

    Parés-Casanova, Pere M; Pérezgrovas Garza, Raúl

    2014-06-01

    Forty-nine sheep from the Andean Altiplano region ("Altiplano") and 30 from the lowland regions of Bolivia ("Valle"), aged 1 to 4 years, were wool sampled to determine the extent of difference between these local breeds. Fibre length and the percentage of each type of fibre (long-thick, short-thin and kemp), yield and fibre diameter were measured. There was a highly significant difference between the two sheep populations, although they were not clearly separated in the first two principal components (PCs) of a principal components analysis; the first PC explained 67.1 % and the second PC explained 26.6 % of the total variation. The variables that contributed most to the separation of the sheep populations were the percentage of long-thick and short-thin fibres in the first PC and yield in the second PC. A discriminant analysis, which was used to classify individuals with respect to their breeding, achieved an accurate classification rate of 84.2 %. Thus, the Altiplano and Valle sheep must be viewed as two closely peripatric breeds rather than different "ecotypes", as more than 80 % could be correctly assigned to one of the breeds; however, the differences are based on composition of long-thick and short-thin fibres and yield after alcohol scouring.
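
    The share of total variation explained by each principal component is its eigenvalue divided by the sum of all eigenvalues, and the shares of successive components add up. The sketch below illustrates this with hypothetical helper names; the only figures taken from the record are the two reported shares (67.1 % and 26.6 %), whose sum shows how much variation the first two components retain together.

```python
def explained_variance_ratio(eigenvalues):
    """Fraction of total variance carried by each principal component."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


def cumulative(shares):
    """Running total of the explained-variance shares."""
    running, totals = 0.0, []
    for share in shares:
        running += share
        totals.append(running)
    return totals


# Shares reported in the abstract for the first and second PC (percent).
reported_shares = [67.1, 26.6]
print(round(cumulative(reported_shares)[-1], 1))  # 93.7
```

    On these figures, the first two components together account for 93.7 % of the total variation.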

  13. Development of a multimetric index for integrated assessment of salt marsh ecosystem condition

    USGS Publications Warehouse

    Nagel, Jessica L.; Neckles, Hilary A.; Guntenspergen, Glenn R.; Rocks, Erika N.; Schoolmaster, Donald; Grace, James B.; Skidds, Dennis; Stevens, Sara

    2018-01-01

    Tools for assessing and communicating salt marsh condition are essential to guide decisions aimed at maintaining or restoring ecosystem integrity and services. Multimetric indices (MMIs) are increasingly used to provide integrated assessments of ecosystem condition. We employed a theory-based approach that considers the multivariate relationship of metrics with human disturbance to construct a salt marsh MMI for five National Parks in the northeastern USA. We quantified the degree of human disturbance for each marsh using the first principal component score from a principal components analysis of physical, chemical, and land use stressors. We then applied a metric selection algorithm to different combinations of about 45 vegetation and nekton metrics (e.g., species abundance, species richness, and ecological and functional classifications) derived from multi-year monitoring data. While MMIs derived from nekton or vegetation metrics alone were strongly correlated with human disturbance (r values from −0.80 to −0.93), an MMI derived from both vegetation and nekton metrics yielded an exceptionally strong correlation with disturbance (r = −0.96). Individual MMIs included from one to five metrics. The metric-assembly algorithm yielded parsimonious MMIs that exhibit the greatest possible correlations with disturbance in a way that is objective, efficient, and reproducible.
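
    Scoring each marsh by its first principal component over several stressor variables, and then correlating a candidate index with that score, can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the first component is found by plain power iteration on the covariance matrix, the starting vector is assumed not to be orthogonal to that component, and in practice stressors measured in different units would usually be standardized first. The sign of a principal component is arbitrary.

```python
import math


def first_pc_scores(rows, iterations=200):
    """Project each row onto the first principal component (power iteration)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    v = [1.0] * p  # assumed not orthogonal to the leading eigenvector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(p)) for c in centered]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

    A strongly negative value of `pearson(disturbance_scores, index_values)` corresponds to the pattern reported above, where index values fall as disturbance rises.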

  14. In-TFT-array-process micro defect inspection using nonlinear principal component analysis.

    PubMed

    Liu, Yi-Hung; Wang, Chi-Kai; Ting, Yung; Lin, Wei-Zhi; Kang, Zhi-Hao; Chen, Ching-Shun; Hwang, Jih-Shang

    2009-11-20

    Defect inspection plays a critical role in thin film transistor liquid crystal display (TFT-LCD) manufacture, and has received much attention in the field of automatic optical inspection (AOI). Previously, most focus was put on the problems of macro-scale Mura-defect detection in the cell process, but it has recently been found that the defects which substantially influence the yield rate of LCD panels are actually those in the TFT array process, which is the first process in TFT-LCD manufacturing. Defect inspection in the TFT array process is therefore considered a difficult task. This paper presents a novel inspection scheme based on the kernel principal component analysis (KPCA) algorithm, which is a nonlinear version of the well-known PCA algorithm. The inspection scheme can not only detect the defects from the images captured from the surface of LCD panels, but also recognize the types of the detected defects automatically. Results, based on real images provided by an LCD manufacturer in Taiwan, indicate that the KPCA-based defect inspection scheme is able to achieve a defect detection rate of over 99% and a high defect classification rate of over 96% when the imbalanced support vector machine (ISVM) with 2-norm soft margin is employed as the classifier. More importantly, the inspection time is less than 1 s per input image.
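
    Kernel PCA replaces the covariance matrix of ordinary PCA with a centered kernel matrix, whose eigenvectors then give the nonlinear components. The sketch below covers only the first two steps (building and centering the kernel matrix) and omits the eigendecomposition; the RBF kernel, the `gamma` parameter and the function names are assumptions for illustration, since the record does not state which kernel the authors used.

```python
import math


def rbf_kernel_matrix(points, gamma):
    """Gram matrix K[i][j] = exp(-gamma * ||p_i - p_j||^2) (assumed RBF kernel)."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q))) for q in points]
        for p in points
    ]


def center_kernel(K):
    """Center a symmetric kernel matrix in feature space.

    Equivalent to subtracting the feature-space mean from every mapped point:
    Kc[i][j] = K[i][j] - mean(row i) - mean(row j) + mean(K).
    """
    n = len(K)
    row_means = [sum(row) / n for row in K]
    total_mean = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]
```

    Every row of the centered matrix sums to zero, which is a quick check that the centering step was applied correctly.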

  15. Blind deconvolution with principal components analysis for wide-field and small-aperture telescopes

    NASA Astrophysics Data System (ADS)

    Jia, Peng; Sun, Rongyu; Wang, Weinan; Cai, Dongmei; Liu, Huigen

    2017-09-01

    Telescopes with a wide field of view (greater than 1°) and small apertures (less than 2 m) are workhorses for observations such as sky surveys and fast-moving object detection, and play an important role in time-domain astronomy. However, images captured by these telescopes are contaminated by optical system aberrations, atmospheric turbulence, tracking errors and wind shear. To increase the quality of images and maximize their scientific output, we propose a new blind deconvolution algorithm based on statistical properties of the point spread functions (PSFs) of these telescopes. In this new algorithm, we first construct the PSF feature space through principal component analysis, and then classify PSFs from a different position and time using a self-organizing map. According to the classification results, we divide images of the same PSF types and select these PSFs to construct a prior PSF. The prior PSF is then used to restore these images. To investigate the improvement that this algorithm provides for data reduction, we process images of space debris captured by our small-aperture wide-field telescopes. Comparing the reduced results of the original images and the images processed with the standard Richardson-Lucy method, our method shows a promising improvement in astrometry accuracy.

  16. Identification of fungal phytopathogens using Fourier transform infrared-attenuated total reflection spectroscopy and advanced statistical methods

    NASA Astrophysics Data System (ADS)

    Salman, Ahmad; Lapidot, Itshak; Pomerantz, Ami; Tsror, Leah; Shufan, Elad; Moreh, Raymond; Mordechai, Shaul; Huleihel, Mahmoud

    2012-01-01

    The early diagnosis of phytopathogens is of great importance; it could save large economic losses due to crops damaged by fungal diseases, and prevent unnecessary soil fumigation or the use of fungicides and bactericides and thus prevent considerable environmental pollution. In this study, 18 isolates of three different fungal genera were investigated: six isolates of Colletotrichum coccodes, six isolates of Verticillium dahliae and six isolates of Fusarium oxysporum. Our main goal was to differentiate these fungal samples on the level of isolates, based on their infrared absorption spectra obtained using the Fourier transform infrared-attenuated total reflection (FTIR-ATR) sampling technique. Advanced statistical and mathematical methods, namely principal component analysis (PCA), linear discriminant analysis (LDA), and k-means, were applied to the spectra after manipulation. Our results showed significant spectral differences between the various fungal genera examined. The use of k-means enabled classification between the genera with a 94.5% accuracy, whereas the use of PCA [3 principal components (PCs)] and LDA achieved a 99.7% success rate. However, on the level of isolates, the best differentiation results were obtained using PCA (9 PCs) and LDA for the lower wavenumber region (800-1775 cm⁻¹), with identification success rates of 87%, 85.5%, and 94.5% for Colletotrichum, Fusarium, and Verticillium strains, respectively.

  17. The Relation between Factor Score Estimates, Image Scores, and Principal Component Scores

    ERIC Educational Resources Information Center

    Velicer, Wayne F.

    1976-01-01

    Investigates the relation between factor score estimates, principal component scores, and image scores. The three methods compared are maximum likelihood factor analysis, principal component analysis, and a variant of rescaled image analysis. (RC)

  18. The Butterflies of Principal Components: A Case of Ultrafine-Grained Polyphase Units

    NASA Astrophysics Data System (ADS)

    Rietmeijer, F. J. M.

    1996-03-01

    Dusts in the accretion regions of chondritic interplanetary dust particles [IDPs] consisted of three principal components: carbonaceous units [CUs], carbon-bearing chondritic units [GUs] and carbon-free silicate units [PUs]. Among others, differences among chondritic IDP morphologies and variable bulk C/Si ratios reflect variable mixtures of principal components. The spherical shapes of the initially amorphous principal components remain visible in many chondritic porous IDPs but fusion was documented for CUs, GUs and PUs. The PUs occur as coarse- and ultrafine-grained units that include so called GEMS. Spherical principal components preserved in an IDP as recognisable textural units have unique properties with important implications for their petrological evolution from pre-accretion processing to protoplanet alteration and dynamic pyrometamorphism. Throughout their lifetime the units behaved as closed-systems without chemical exchange with other units. This behaviour is reflected in their mineralogies while the bulk compositions of principal components define the environments wherein they were formed.

  19. Identification of different bacterial species in biofilms using confocal Raman microscopy

    NASA Astrophysics Data System (ADS)

    Beier, Brooke D.; Quivey, Robert G.; Berger, Andrew J.

    2010-11-01

    Confocal Raman microspectroscopy is used to discriminate between different species of bacteria grown in biofilms. Tests are performed using two bacterial species, Streptococcus sanguinis and Streptococcus mutans, which are major components of oral plaque and of particular interest due to their association with healthy and cariogenic plaque, respectively. Dehydrated biofilms of these species are studied as a simplified model of dental plaque. A prediction model based on principal component analysis and logistic regression is calibrated using pure biofilms of each species and validated on pure biofilms grown months later, achieving 96% accuracy in prospective classification. When biofilms of the two species are partially mixed together, Raman-based identifications are achieved within ~2 μm of the boundaries between species with 97% accuracy. This combination of spatial resolution and prediction accuracy should be suitable for forming images of species distributions within intact two-species biofilms.
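
    In a prediction model of this kind, logistic regression is fitted to the principal component scores of the calibration spectra and then applied to the scores of new spectra. The following is a minimal sketch of the logistic regression step only, using batch gradient descent; the function names, learning rate, epoch count and example scores are all hypothetical, and the PCA scores are assumed to have been computed beforehand.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w = [0.0] * len(xs[0])
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(x, w, b):
    """Return class 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) >= 0.5 else 0


# Hypothetical one-component scores for two classes (0 and 1).
calibration_scores = [(-2.0,), (-1.5,), (-1.0,), (1.0,), (1.5,), (2.0,)]
calibration_labels = [0, 0, 0, 1, 1, 1]
weights, bias = fit_logistic(calibration_scores, calibration_labels)
```

    Calibrating on one batch and predicting on scores from a later batch mirrors the prospective validation described in the abstract.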

  20. A study of fuzzy logic ensemble system performance on face recognition problem

    NASA Astrophysics Data System (ADS)

    Polyakova, A.; Lipinskiy, L.

    2017-02-01

    Some problems are difficult to solve by using a single intelligent information technology (IIT). An ensemble of data mining (DM) techniques is a set of models, each of which can solve the problem by itself, but whose combination increases the efficiency of the system as a whole. Using IIT ensembles can improve the reliability and efficiency of the final decision, since it emphasizes the diversity of its components. A new method of designing intelligent information technology ensembles is considered in this paper. It is based on fuzzy logic and is designed to solve classification and regression problems. The ensemble consists of several data mining algorithms: artificial neural network, support vector machine and decision trees. These algorithms and their ensemble have been tested on face recognition problems. Principal components analysis (PCA) is used for feature selection.

  1. Principles for ecological classification

    Treesearch

    Dennis H. Grossman; Patrick Bourgeron; Wolf-Dieter N. Busch; David T. Cleland; William Platts; G. Ray; C. Robins; Gary Roloff

    1999-01-01

    The principal purpose of any classification is to relate common properties among different entities to facilitate understanding of evolutionary and adaptive processes. In the context of this volume, it is to facilitate ecosystem stewardship, i.e., to help support ecosystem conservation and management objectives.

  2. Rapid discrimination of sea buckthorn berries from different H. rhamnoides subspecies by multi-step IR spectroscopy coupled with multivariate data analysis

    NASA Astrophysics Data System (ADS)

    Liu, Yue; Zhang, Ying; Zhang, Jing; Fan, Gang; Tu, Ya; Sun, Suqin; Shen, Xudong; Li, Qingzhu; Zhang, Yi

    2018-03-01

    As an important ethnic medicine, sea buckthorn has been widely used to prevent and treat various diseases due to its nutritional and medicinal properties. According to the Chinese Pharmacopoeia, sea buckthorn originates from H. rhamnoides, which includes five subspecies distributed in China. Confusion and misidentification often occur due to their similar morphology, especially in dried and powdered forms. Additionally, these five subspecies have vital differences in quality and physiological efficacy. This paper focused on a quick classification and identification method for sea buckthorn berry powders from five H. rhamnoides subspecies using multi-step IR spectroscopy coupled with multivariate data analysis. The holistic chemical compositions revealed by the FT-IR spectra demonstrated that flavonoids, fatty acids and sugars were the main chemical components. Further, the differences in FT-IR spectra regarding their peaks, positions and intensities were used to identify H. rhamnoides subspecies samples. The discrimination was achieved using principal component analysis (PCA) and partial least square-discriminant analysis (PLS-DA). The results showed that the combination of multi-step IR spectroscopy and chemometric analysis offered a simple, fast and reliable method for the classification and identification of the sea buckthorn berry powders from different H. rhamnoides subspecies.

  3. Space weathering trends on carbonaceous asteroids: A possible explanation for Bennu's blue slope?

    NASA Astrophysics Data System (ADS)

    Lantz, C.; Binzel, R. P.; DeMeo, F. E.

    2018-03-01

    We compare primitive near-Earth asteroid spectral properties to the irradiated carbonaceous chondrite samples of Lantz et al. (2017) in order to assess how space weathering processes might influence taxonomic classification. Using the same eigenvectors from the asteroid taxonomy by DeMeo et al. (2009), we calculate the principal components for fresh and irradiated meteorites and find that change in spectral slope (blueing or reddening) causes a corresponding shift in the two first principal components along the same line that the C- and X-complexes track. Using a sample of B-, C-, X-, and D-type NEOs with visible and near-infrared spectral data, we further investigated the correlation between principal components and the spectral curvature for the primitive asteroids. We find that space weathering effects are not just slope and albedo, but also include spectral curvature. We show how, through space weathering, surfaces having an original "C-type" reflectance can thus turn into a redder P-type or a bluer B-type, and that space weathering can also decrease (and disguise) the D-type population. Finally we take a look at the case of OSIRIS-REx target (101955) Bennu and propose an explanation for the blue and possibly red spectra that were previously observed on different locations of its surface: parts of Bennu's surface could have become blue due to space weathering, while fresher areas are redder. No clear prediction can be made on Hayabusa-2 target (162173) Ryugu.
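
The projection step mentioned in this abstract (reusing a fixed set of eigenvectors to compute principal component scores for new spectra) can be sketched roughly as follows. This is only an illustration: the function name `pc_scores`, the mean spectrum, the eigenvectors and the spectra are hypothetical toy values, not the DeMeo et al. (2009) taxonomy data, and any preprocessing used in that taxonomy is not reproduced here.

```python
def pc_scores(spectrum, mean_spectrum, eigenvectors):
    """Project one spectrum onto fixed eigenvectors (one score per eigenvector)."""
    # Center the spectrum with the mean of the reference data set
    centered = [s - m for s, m in zip(spectrum, mean_spectrum)]
    # Each score is the dot product of the centered spectrum with one eigenvector
    return [sum(c * e for c, e in zip(centered, vec)) for vec in eigenvectors]


# Hypothetical toy values (three reflectance channels, two eigenvectors)
mean_spectrum = [1.0, 1.0, 1.0]
eigenvectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

fresh = pc_scores([2.0, 3.0, 4.0], mean_spectrum, eigenvectors)
irradiated = pc_scores([2.5, 3.0, 4.0], mean_spectrum, eigenvectors)

# Shift of the sample in principal component space after irradiation
shift = [after - before for before, after in zip(fresh, irradiated)]
print(fresh, irradiated, shift)
```

Because the eigenvectors are held fixed, scores of fresh and irradiated samples live in the same space and their difference can be read as a displacement in that space.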

  4. A data fusion-based drought index

    NASA Astrophysics Data System (ADS)

    Azmi, Mohammad; Rüdiger, Christoph; Walker, Jeffrey P.

    2016-03-01

    Drought and water stress monitoring plays an important role in the management of water resources, especially during periods of extreme climate conditions. Here, a data fusion-based drought index (DFDI) has been developed and analyzed for three different locations of varying land use and climate regimes in Australia. The proposed index comprehensively considers all types of drought through a selection of indices and proxies associated with each drought type. In deriving the proposed index, weekly data from three different data sources (OzFlux Network, Asia-Pacific Water Monitor, and MODIS-Terra satellite) were employed to first derive commonly used individual standardized drought indices (SDIs), which were then grouped using an advanced clustering method. Next, three different multivariate methods (principal component analysis, factor analysis, and independent component analysis) were utilized to aggregate the SDIs located within each group. For the two clusters in which the grouped SDIs best reflected the water availability and vegetation conditions, the variables were aggregated based on an averaging between the standardized first principal components of the different multivariate methods. Then, considering those two aggregated indices as well as the classifications of months (dry/wet months and active/non-active months), the proposed DFDI was developed. Finally, the symbolic regression method was used to derive mathematical equations for the proposed DFDI. The results presented here show that the proposed index has revealed new aspects in water stress monitoring which previous indices were not able to, by simultaneously considering both hydrometeorological and ecological concepts to define the real water stress of the study areas.

  5. Regional prioritisation of flood risk in mountainous areas

    NASA Astrophysics Data System (ADS)

    Rogelis, M. C.; Werner, M.; Obregón, N.; Wright, G.

    2015-07-01

    A regional analysis of flood risk was carried out in the mountainous area surrounding the city of Bogotá (Colombia). Vulnerability at regional level was assessed on the basis of a principal component analysis carried out with variables recognised in literature to contribute to vulnerability; using watersheds as the unit of analysis. The area exposed was obtained from a simplified flood analysis at regional level to provide a mask where vulnerability variables were extracted. The vulnerability indicator obtained from the principal component analysis was combined with an existing susceptibility indicator, thus providing an index that allows the watersheds to be prioritised in support of flood risk management at regional level. Results show that the components of vulnerability can be expressed in terms of four constituent indicators; socio-economic fragility, which is composed of demography and lack of well-being; lack of resilience, which is composed of education, preparedness and response capacity, rescue capacity, social cohesion and participation; and physical exposure is composed of exposed infrastructure and exposed population. A sensitivity analysis shows that the classification of vulnerability is robust for watersheds with low and high values of the vulnerability indicator, while some watersheds with intermediate values of the indicator are sensitive to shifting between medium and high vulnerability. The complex interaction between vulnerability and hazard is evidenced in the case study. Environmental degradation in vulnerable watersheds shows the influence that vulnerability exerts on hazard and vice versa, thus establishing a cycle that builds up risk conditions.

  6. Acoustic mapping and classification of benthic habitat using unsupervised learning in artificial reef water

    NASA Astrophysics Data System (ADS)

    Li, Dong; Tang, Cheng; Xia, Chunlei; Zhang, Hua

    2017-02-01

    Artificial reefs (ARs) are effective means to maintain fishery resources and to restore ecological environment in coastal waters. ARs have been widely constructed along the Chinese coast. However, understanding of benthic habitats in the vicinity of ARs is limited, hindering effective fisheries and aquacultural management. Multibeam echosounder (MBES) is an advanced acoustic instrument capable of efficiently generating large-scale maps of benthic environments at fine resolutions. The objective of this study is to develop a technical approach to characterize, classify, and map shallow coastal areas with ARs using an MBES. An automated classification method is designed and tested to process bathymetric and backscatter data from MBES and transform the variables into simple, easily visualized maps. To reduce the redundancy in acoustic variables, a principal component analysis (PCA) is used to condense the highly collinear dataset. An acoustic benthic map of bottom sediments is classified using an iterative self-organizing data analysis technique (ISODATA). The approach is tested with MBES surveys in a 1.15 km² fish farm with a high density of ARs off the Yantai coast in northern China. Using this method, 3 basic benthic habitats (sandy bottom, muddy sediments, and ARs) are distinguished. The results of the classification are validated using sediment samples and underwater surveys. Our study shows that the use of MBES is an effective method for acoustic mapping and classification of ARs.

  7. Physicochemical properties of honey from Marche, Central Italy: classification of unifloral and multifloral honeys by multivariate analysis.

    PubMed

    Truzzi, Cristina; Illuminati, Silvia; Annibaldia, Anna; Finale, Carolina; Rossetti, Monica; Scarponi, Giuseppe

    2014-11-01

    The purpose of this study was the physicochemical characterization and classification of Italian honey from Marche Region with a chemometric approach. A total of 135 honeys of different botanical origins [acacia (Robinia pseudoacacia L.), chestnut (Castanea sativa), coriander (Coriandrum sativum L.), lime (Tilia spp.), sunflower (Helianthus annuus L.), Metcalfa honeydew and multifloral honey] were considered. The average results of electrical conductivity (0.14-1.45 mS cm(-1)), pH (3.89-5.42), free acidity (10.9-39.0 meq(NaOH) kg(-1)), lactones (2.4-4.5 meq(NaOH) kg(-1)), total acidity (14.5-40.9 meq(NaOH) kg(-1)), proline (229-665 mg kg(-1)) and 5-(hydroxy-methyl)-2-furaldehyde (0.6-3.9 mg kg(-1)) content show wide variability among the analysed honey types, with statistically significant differences between the different honey types. Pattern recognition methods such as principal component analysis and discriminant analysis were performed in order to find a relationship between variables and types of honey and to classify honey on the basis of its physicochemical properties. The variables of electrical conductivity, acidity (free, lactones), pH and proline content exhibited higher discriminant power and provided enough information for the classification and distinction of unifloral honey types, but not for the classification of multifloral honey (100% and 85% of samples correctly classified, respectively).

  8. A hierarchical classification approach for recognition of low-density (LDPE) and high-density polyethylene (HDPE) in mixed plastic waste based on short-wave infrared (SWIR) hyperspectral imaging.

    PubMed

    Bonifazi, Giuseppe; Capobianco, Giuseppe; Serranti, Silvia

    2018-06-05

    The aim of this work was to recognize different polymer flakes from mixed plastic waste through an innovative hierarchical classification strategy based on hyperspectral imaging, with particular reference to low-density polyethylene (LDPE) and high-density polyethylene (HDPE). A plastic waste composition assessment, including LDPE and HDPE identification, may help to define optimal recycling strategies for product quality control. Correct handling of plastic waste is essential for its further "sustainable" recovery, maximizing the sorting performance in particular for plastics with similar characteristics, such as LDPE and HDPE. Five different plastic waste samples were chosen for the investigation: polypropylene (PP), LDPE, HDPE, polystyrene (PS) and polyvinyl chloride (PVC). A calibration dataset was built using the corresponding virgin polymers. Hyperspectral imaging in the short-wave infrared range (1000-2500 nm) was thus applied to evaluate the different plastic spectral attributes in order to perform their recognition/classification. After exploring polymer spectral differences by principal component analysis (PCA), a hierarchical partial least squares discriminant analysis (PLS-DA) model was built allowing the five different polymers to be recognized. The proposed methodology, based on hierarchical classification, is very powerful and fast, allowing recognition of the five different polymers in a single step. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. Evaluation of linear discriminant analysis for automated Raman histological mapping of esophageal high-grade dysplasia

    NASA Astrophysics Data System (ADS)

    Hutchings, Joanne; Kendall, Catherine; Shepherd, Neil; Barr, Hugh; Stone, Nicholas

    2010-11-01

    Rapid Raman mapping has the potential to be used for automated histopathology diagnosis, providing an adjunct technique to histology diagnosis. The aim of this work is to evaluate the feasibility of automated and objective pathology classification of Raman maps using linear discriminant analysis. Raman maps of esophageal tissue sections are acquired. Principal component (PC)-fed linear discriminant analysis (LDA) is carried out using subsets of the Raman map data (6483 spectra). An overall (validated) training classification model performance of 97.7% (sensitivity 95.0 to 100% and specificity 98.6 to 100%) is obtained. The remainder of the map spectra (131,672 spectra) are projected onto the classification model resulting in Raman images, demonstrating good correlation with contiguous hematoxylin and eosin (HE) sections. Initial results suggest that LDA has the potential to automate pathology diagnosis of esophageal Raman images, but since the classification of test spectra is forced into existing training groups, further work is required to optimize the training model. A small pixel size is advantageous for developing the training datasets using mapping data, despite lengthy mapping times, due to additional morphological information gained, and could facilitate differentiation of further tissue groups, such as the basal cells/lamina propria, in the future, but larger pixels sizes (and faster mapping) may be more feasible for clinical application.

  10. Fast clustering algorithm for large ECG data sets based on CS theory in combination with PCA and K-NN methods.

    PubMed

    Balouchestani, Mohammadreza; Krishnan, Sridhar

    2014-01-01

    Long-term recording of Electrocardiogram (ECG) signals plays an important role in health care systems for diagnostic and treatment purposes of heart diseases. Clustering and classification of collected data are essential for detecting concealed information of P-QRS-T waves in long-term ECG recordings. Currently used algorithms have their share of drawbacks: 1) clustering and classification cannot be done in real time; 2) they suffer from huge energy consumption and sampling load. These drawbacks motivated us to develop a novel optimized clustering algorithm which can easily scan large ECG datasets for establishing low-power long-term ECG recording. In this paper, we present an advanced K-means clustering algorithm based on Compressed Sensing (CS) theory as a random sampling procedure. Then, two dimensionality reduction methods, Principal Component Analysis (PCA) and Linear Correlation Coefficient (LCC), followed by sorting of the data using the K-Nearest Neighbours (K-NN) and Probabilistic Neural Network (PNN) classifiers, are applied to the proposed algorithm. We show that our algorithm based on PCA features in combination with the K-NN classifier performs better than other methods. The proposed algorithm outperforms existing algorithms by increasing classification accuracy by 11%. In addition, the proposed algorithm illustrates classification accuracy for K-NN and PNN classifiers, and a Receiver Operating Characteristics (ROC) area of 99.98%, 99.83%, and 99.75% respectively.
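
The PCA-plus-K-NN stage named in this abstract can be illustrated with a small standard-library sketch. It covers only that stage (no compressed sensing, K-means, LCC or PNN), keeps a single principal component found by power iteration, and uses invented toy "beats" and labels; all function names are hypothetical and this is not the authors' implementation.

```python
import math


def fit_first_pc(rows, iters=200):
    """Return (mean, unit vector) of the leading principal component."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Assumed start vector; it must not be orthogonal to the leading component
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):  # power iteration on the covariance matrix
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def pc_score(row, mean, v):
    """Score of one sample on the leading principal component."""
    return sum((x - m) * e for x, m, e in zip(row, mean, v))


def knn_predict(train_scores, train_labels, query_score, k=3):
    """Majority vote among the k training scores closest to the query."""
    pairs = sorted(zip(train_scores, train_labels),
                   key=lambda p: abs(p[0] - query_score))
    votes = {}
    for _, label in pairs[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


# Invented toy feature vectors standing in for ECG beats
train = [[1.0, 1.1, 0.9], [1.2, 1.0, 1.1], [0.9, 1.0, 1.0],
         [3.0, 3.1, 2.9], [3.2, 3.0, 3.1], [2.9, 3.0, 3.0]]
labels = ["normal"] * 3 + ["abnormal"] * 3

mean, v = fit_first_pc(train)
scores = [pc_score(r, mean, v) for r in train]
pred_low = knn_predict(scores, labels, pc_score([1.0, 1.0, 1.0], mean, v))
pred_high = knn_predict(scores, labels, pc_score([3.1, 3.0, 3.0], mean, v))
print(pred_low, pred_high)
```

Classifying in the reduced score space rather than on the raw samples is the design choice the abstract credits for its efficiency; a real pipeline would normally keep more than one component.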

  11. Classification of Partial Discharge Measured under Different Levels of Noise Contamination

    PubMed Central

    2017-01-01

    Cable joint insulation breakdown may cause a huge loss to power companies. Therefore, it is vital to diagnose the insulation quality to detect early signs of insulation failure. It is well known that there is a correlation between Partial discharge (PD) and the insulation quality. Although many works have been done on PD pattern recognition, it is usually performed in a noise free environment. Also, works on PD pattern recognition in actual cable joint are less likely to be found in literature. Therefore, in this work, classifications of actual cable joint defect types from partial discharge data contaminated by noise were performed. Five cross-linked polyethylene (XLPE) cable joints with artificially created defects were prepared based on the defects commonly encountered on site. Three different types of input feature were extracted from the PD pattern under artificially created noisy environment. These include statistical features, fractal features and principal component analysis (PCA) features. These input features were used to train the classifiers to classify each PD defect types. Classifications were performed using three different artificial intelligence classifiers, which include Artificial Neural Networks (ANN), Adaptive Neuro-Fuzzy Inference System (ANFIS) and Support Vector Machine (SVM). It was found that the classification accuracy decreases with higher noise level but PCA features used in SVM and ANN showed the strongest tolerance against noise contamination. PMID:28085953

  12. A hybrid LIBS-Raman system combined with chemometrics: an efficient tool for plastic identification and sorting.

    PubMed

    Shameem, K M Muhammed; Choudhari, Khoobaram S; Bankapur, Aseefhali; Kulkarni, Suresh D; Unnikrishnan, V K; George, Sajan D; Kartha, V B; Santhosh, C

    2017-05-01

    Classification of plastics is of great importance in the recycling industry as the littering of plastic wastes increases day by day as a result of its extensive use. In this paper, we demonstrate the efficacy of a combined laser-induced breakdown spectroscopy (LIBS)-Raman system for the rapid identification and classification of post-consumer plastics. The atomic information and molecular information of polyethylene terephthalate, polyethylene, polypropylene, and polystyrene were studied using plasma emission spectra and scattered signal obtained in the LIBS and Raman technique, respectively. The collected spectral features of the samples were analyzed using statistical tools (principal component analysis, Mahalanobis distance) to categorize the plastics. The analyses of the data clearly show that elemental information and molecular information obtained from these techniques are efficient for classification of plastics. In addition, the molecular information collected via Raman spectroscopy exhibits clearly distinct features for the transparent plastics (100% discrimination), whereas the LIBS technique shows better spectral feature differences for the colored samples. The study shows that the information obtained from these complementary techniques allows the complete classification of the plastic samples, irrespective of the color or additives. This work further throws some light on the fact that the potential limitations of any of these techniques for sample identification can be overcome by the complementarity of these two techniques.
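
One way the Mahalanobis distance mentioned in this abstract can be used for categorization is to assign a sample to the class whose mean is nearest in that metric. The sketch below assumes two-dimensional principal component scores and per-class means and covariances; the class statistics, the sample point and the function names are hypothetical toy values, not results from the paper.

```python
import math


def mahalanobis_2d(point, mean, cov):
    """Mahalanobis distance of a 2-D point from a class mean."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form (x - m)^T cov^-1 (x - m) using the explicit 2x2 inverse
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)


def classify(point, classes):
    """Return the class label with the smallest Mahalanobis distance."""
    return min(classes, key=lambda name: mahalanobis_2d(point, *classes[name]))


# Hypothetical class statistics in principal component score space
classes = {
    "PET": ((0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))),
    "PS": ((5.0, 5.0), ((1.0, 0.0), (0.0, 1.0))),
}
label = classify((1.0, 0.5), classes)
print(label)
```

Unlike the Euclidean distance, this metric scales each direction by the spread of the class, so a point two units away along a high-variance axis can still be close to that class.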

  13. A Novel Anti-classification Approach for Knowledge Protection.

    PubMed

    Lin, Chen-Yi; Chen, Tung-Shou; Tsai, Hui-Fang; Lee, Wei-Bin; Hsu, Tien-Yu; Kao, Yuan-Hung

    2015-10-01

    Classification is the problem of identifying a set of categories where new data belong, on the basis of a set of training data whose category membership is known. Its applications are widespread, for example in the medical science domain. The protection of classification knowledge has received increasing attention in recent years because of the popularity of cloud environments. In the paper, we propose a Shaking Sorted-Sampling (triple-S) algorithm for protecting the classification knowledge of a dataset. The triple-S algorithm sorts the data of an original dataset according to the projection results of the principal components analysis so that the features of the adjacent data are similar. Then, we generate noise data with incorrect classes and add those data to the original dataset. In addition, we develop an effective positioning strategy, determining the added positions of noise data in the original dataset, to ensure the restoration of the original dataset after removing those noise data. The experimental results show that the disturbance effect of the triple-S algorithm on the CLC, MySVM, and LibSVM classifiers increases when the noise data ratio increases. In addition, compared with existing methods, the disturbance effect of the triple-S algorithm is more significant on MySVM and LibSVM when a certain amount of the noise data added to the original dataset is reached.

  14. Identification and classification of failure modes in laminated composites by using a multivariate statistical analysis of wavelet coefficients

    NASA Astrophysics Data System (ADS)

    Baccar, D.; Söffker, D.

    2017-11-01

    Acoustic Emission (AE) is a suitable method to monitor the health of composite structures in real-time. However, AE-based failure mode identification and classification are still complex to apply due to the fact that AE waves are generally released simultaneously from all AE-emitting damage sources. Hence, the use of advanced signal processing techniques in combination with pattern recognition approaches is required. In this paper, AE signals generated from laminated carbon fiber reinforced polymer (CFRP) subjected to indentation test are examined and analyzed. A new pattern recognition approach involving a number of processing steps able to be implemented in real-time is developed. Unlike common classification approaches, here only CWT coefficients are extracted as relevant features. Firstly, Continuous Wavelet Transform (CWT) is applied to the AE signals. Furthermore, dimensionality reduction process using Principal Component Analysis (PCA) is carried out on the coefficient matrices. The PCA-based feature distribution is analyzed using Kernel Density Estimation (KDE) allowing the determination of a specific pattern for each fault-specific AE signal. Moreover, waveform and frequency content of AE signals are in depth examined and compared with fundamental assumptions reported in this field. A correlation between the identified patterns and failure modes is achieved. The introduced method improves the damage classification and can be used as a non-destructive evaluation tool.

  15. The influence of iliotibial band syndrome history on running biomechanics examined via principal components analysis.

    PubMed

    Foch, Eric; Milner, Clare E

    2014-01-03

    Iliotibial band syndrome (ITBS) is a common knee overuse injury among female runners. Atypical discrete trunk and lower extremity biomechanics during running may be associated with the etiology of ITBS. Examining discrete data points limits the interpretation of a waveform to a single value. Characterizing entire kinematic and kinetic waveforms may provide additional insight into biomechanical factors associated with ITBS. Therefore, the purpose of this cross-sectional investigation was to determine whether female runners with previous ITBS exhibited differences in kinematics and kinetics compared to controls using a principal components analysis (PCA) approach. Forty participants comprised two groups: previous ITBS and controls. Principal component scores were retained for the first three principal components and were analyzed using independent t-tests. The retained principal components accounted for 93-99% of the total variance within each waveform. Runners with previous ITBS exhibited low principal component one scores for frontal plane hip angle. Principal component one accounted for the overall magnitude in hip adduction which indicated that runners with previous ITBS assumed less hip adduction throughout stance. No differences in the remaining retained principal component scores for the waveforms were detected among groups. A smaller hip adduction angle throughout the stance phase of running may be a compensatory strategy to limit iliotibial band strain. This running strategy may have persisted after ITBS symptoms subsided. © 2013 Published by Elsevier Ltd.
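
The group comparison described in this abstract (principal component scores compared between two groups with independent t-tests) can be sketched with the standard library. The pooled-variance Student form is assumed here, and the score values and the function name `independent_t` are invented for illustration; they are not data from the study.

```python
import math
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance Student t statistic and degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Invented principal component one scores for two small groups
previous_itbs = [1.0, 2.0, 3.0]
controls = [4.0, 5.0, 6.0]
t_stat, dof = independent_t(previous_itbs, controls)
print(round(t_stat, 4), dof)
```

A negative statistic here means the first group has the lower mean score; the p-value would then be read from a t distribution with the returned degrees of freedom.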

  16. Detection of Abnormal Events via Optical Flow Feature Analysis

    PubMed Central

    Wang, Tian; Snoussi, Hichem

    2015-01-01

    In this paper, a novel algorithm is proposed to detect abnormal events in video streams. The algorithm is based on the histogram of the optical flow orientation descriptor and the classification method. The details of the histogram of the optical flow orientation descriptor are illustrated for describing movement information of the global video frame or foreground frame. By combining one-class support vector machine and kernel principal component analysis methods, the abnormal events in the current frame can be detected after a learning period characterizing normal behaviors. The different abnormal detection results are analyzed and explained. The proposed detection method is tested on benchmark datasets, and the experimental results show the effectiveness of the algorithm. PMID:25811227

  17. The Raman spectrum character of skin tumor induced by UVB

    NASA Astrophysics Data System (ADS)

    Wu, Shulian; Hu, Liangjun; Wang, Yunxia; Li, Yongzeng

    2016-03-01

    In our study, the skin canceration processes induced by UVB were analyzed from the perspective of tissue spectrum. A home-made Raman spectral system with a millimeter-order excitation laser spot size, combined with multivariate statistical analysis, was used to monitor the skin changes induced by UVB irradiation, and its discrimination performance was evaluated. Raman scattering signals of the SCC and normal skin were acquired. Spectral differences in Raman spectra were revealed. Linear discriminant analysis (LDA) based on principal component analysis (PCA) was employed to generate diagnostic algorithms for the classification of skin SCC and normal skin. The results indicated that Raman spectroscopy combined with PCA-LDA demonstrated good potential for improving the diagnosis of skin cancers.

  18. Metabolic profiling using HPLC allows classification of drugs according to their mechanisms of action in HL-1 cardiomyocytes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strigun, Alexander; Wahrheit, Judith; Beckers, Simone

    Along with hepatotoxicity, cardiotoxic side effects remain one of the major reasons for drug withdrawals and boxed warnings. Prediction methods for cardiotoxicity are insufficient. High content screening comprising not only electrophysiological characterization but also cellular molecular alterations is expected to improve the cardiotoxicity prediction potential. Metabolomic approaches recently have become an important focus of research in pharmacological testing and prediction. In this study, the culture medium supernatants from HL-1 cardiomyocytes after exposure to drugs from different classes (analgesics, antimetabolites, anthracyclines, antihistamines, channel blockers) were analyzed to determine specific metabolic footprints in response to the tested drugs. Since most drugs influence energy metabolism in cardiac cells, the metabolite 'sub-profile' consisting of glucose, lactate, pyruvate and amino acids was considered. These metabolites were quantified using HPLC in samples after exposure of cells to test compounds of the respective drug groups. The studied drug concentrations were selected from concentration response curves for each drug. The metabolite profiles were randomly split into training/validation and test sets, and then analysed using multivariate statistics (principal component analysis and discriminant analysis). Discriminant analysis resulted in clustering of drugs according to their modes of action. After cross validation and cross model validation, the underlying training data were able to predict 50%-80% of conditions to the correct classification group. We show that HPLC based characterisation of known cell culture medium components is sufficient to predict a drug's potential classification according to its mode of action.

  19. Deep-Learning Convolutional Neural Networks Accurately Classify Genetic Mutations in Gliomas.

    PubMed

    Chang, P; Grinband, J; Weinberg, B D; Bardis, M; Khy, M; Cadena, G; Su, M-Y; Cha, S; Filippi, C G; Bota, D; Baldi, P; Poisson, L M; Jain, R; Chow, D

    2018-05-10

    The World Health Organization has recently placed new emphasis on the integration of genetic information for gliomas. While tissue sampling remains the criterion standard, noninvasive imaging techniques may provide complementary insight into clinically relevant genetic mutations. Our aim was to train a convolutional neural network to independently predict underlying molecular genetic mutation status in gliomas with high accuracy and identify the most predictive imaging features for each mutation. MR imaging data and molecular information were retrospectively obtained from The Cancer Imaging Archives for 259 patients with either low- or high-grade gliomas. A convolutional neural network was trained to classify isocitrate dehydrogenase 1 (IDH1) mutation status, 1p/19q codeletion, and O6-methylguanine-DNA methyltransferase (MGMT) promoter methylation status. Principal component analysis of the final convolutional neural network layer was used to extract the key imaging features critical for successful classification. Classification had high accuracy: IDH1 mutation status, 94%; 1p/19q codeletion, 92%; and MGMT promoter methylation status, 83%. Each genetic category was also associated with distinctive imaging features such as definition of tumor margins, T1 and FLAIR suppression, extent of edema, extent of necrosis, and textural features. Our results indicate that for The Cancer Imaging Archives dataset, machine-learning approaches allow classification of individual genetic mutations of both low- and high-grade gliomas. We show that relevant MR imaging features acquired from an added dimensionality-reduction technique demonstrate that neural networks are capable of learning key imaging components without prior feature selection or human-directed training. © 2018 by American Journal of Neuroradiology.

  20. Quantitative determination and classification of energy drinks using near-infrared spectroscopy.

    PubMed

    Rácz, Anita; Héberger, Károly; Fodor, Marietta

    2016-09-01

    Almost a hundred commercially available energy drink samples from Hungary, Slovakia, and Greece were collected for the quantitative determination of their caffeine and sugar content with FT-NIR spectroscopy and high-performance liquid chromatography (HPLC). Calibration models were built with partial least-squares regression (PLSR). An HPLC-UV method was used to measure the reference values for caffeine content, while sugar contents were measured with the Schoorl method. Both the nominal sugar content (as indicated on the cans) and the measured sugar concentration were used as references. Although the Schoorl method has larger error and bias, appropriate models could be developed using both references. The validation of the models was based on sevenfold cross-validation and external validation. FT-NIR analysis is a good candidate to replace the HPLC-UV method, because it is much cheaper than any chromatographic method, while it is also more time-efficient. The combination of FT-NIR with multidimensional chemometric techniques like PLSR can be a good option for the detection of low caffeine concentrations in energy drinks. Moreover, three types of energy drinks that contain (i) taurine, (ii) arginine, and (iii) neither of these two components were classified correctly using principal component analysis and linear discriminant analysis. Such classifications are important for the detection of adulterated samples and for quality control, as well. In this case, more than a hundred samples were used for the evaluation. The classification was validated with cross-validation and several randomization tests (X-scrambling).

  1. Novel algorithm for simultaneous component detection and pseudo-molecular ion characterization in liquid chromatography-mass spectrometry.

    PubMed

    Zhang, Yufeng; Wang, Xiaoan; Wo, Siukwan; Ho, Hingman; Han, Quanbin; Fan, Xiaohui; Zuo, Zhong

    2015-01-01

    Resolving components and determining their pseudo-molecular ions (PMIs) are crucial steps in identifying complex herbal mixtures by liquid chromatography-mass spectrometry. To tackle such labor-intensive steps, we present here a novel algorithm for simultaneous detection of components and their PMIs. Our method consists of three steps: (1) obtaining a simplified dataset containing only mono-isotopic masses by removal of background noise and isotopic cluster ions based on the isotopic distribution model derived from all the reported natural compounds in dictionary of natural products; (2) stepwise resolving and removing all features of the highest abundant component from current simplified dataset and calculating PMI of each component according to an adduct-ion model, in which all non-fragment ions in a mass spectrum are considered as PMI plus one or several neutral species; (3) visual classification of detected components by principal component analysis (PCA) to exclude possible non-natural compounds (such as pharmaceutical excipients). This algorithm has been successfully applied to a standard mixture and three herbal extract/preparations. It indicated that our algorithm could detect components' features as a whole and report their PMI with an accuracy of more than 98%. Furthermore, components originated from excipients/contaminants could be easily separated from those natural components in the bi-plots of PCA. Copyright © 2014 Elsevier B.V. All rights reserved.

  2. A new approach for computing a flood vulnerability index using cluster analysis

    NASA Astrophysics Data System (ADS)

    Fernandez, Paulo; Mourato, Sandra; Moreira, Madalena; Pereira, Luísa

    2016-08-01

    A Flood Vulnerability Index (FloodVI) was developed using Principal Component Analysis (PCA) and a new aggregation method based on Cluster Analysis (CA). PCA simplifies a large number of variables into a few uncorrelated factors representing the social, economic, physical and environmental dimensions of vulnerability. CA groups areas that have the same characteristics in terms of vulnerability into vulnerability classes. The grouping of the areas determines their classification contrary to other aggregation methods in which the areas' classification determines their grouping. While other aggregation methods distribute the areas into classes, in an artificial manner, by imposing a certain probability for an area to belong to a certain class, as determined by the assumption that the aggregation measure used is normally distributed, CA does not constrain the distribution of the areas by the classes. FloodVI was designed at the neighbourhood level and was applied to the Portuguese municipality of Vila Nova de Gaia where several flood events have taken place in the recent past. The FloodVI sensitivity was assessed using three different aggregation methods: the sum of component scores, the first component score and the weighted sum of component scores. The results highlight the sensitivity of the FloodVI to different aggregation methods. Both sum of component scores and weighted sum of component scores have shown similar results. The first component score aggregation method classifies almost all areas as having medium vulnerability and finally the results obtained using the CA show a distinct differentiation of the vulnerability where hot spots can be clearly identified. The information provided by records of previous flood events corroborate the results obtained with CA, because the inundated areas with greater damages are those that are identified as high and very high vulnerability areas by CA. This supports the fact that CA provides a reliable FloodVI.
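
    The three score-based aggregation methods named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the indicator matrix is random placeholder data, and the names `pca_scores` and `aggregate` are hypothetical. Weighting by explained-variance ratio is an assumption about how the weighted sum is formed, and the cluster-analysis step is not shown.

```python
import numpy as np

def pca_scores(X, n_components):
    """Standardize X, then return PCA component scores and explained-variance ratios."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]

def aggregate(scores, explained):
    """Combine component scores into one index per area, three different ways."""
    return {
        "sum": scores.sum(axis=1),            # sum of component scores
        "first": scores[:, 0],                # first component score only
        "weighted_sum": scores @ explained,   # weights assumed to be variance ratios
    }

# Placeholder data: 50 hypothetical neighbourhoods described by 6 indicators.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
scores, explained = pca_scores(X, n_components=3)
indices = aggregate(scores, explained)
```

    Note that the sign of each principal component is arbitrary, so in practice each component would need to be oriented (higher score meaning higher vulnerability) before the scores are summed.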

  3. Comparing drug classification systems.

    PubMed

    Mahoney, Anne; Evans, Jonathan

    2008-11-06

    An essential quality of drug classification systems is the ability to assign medications to a structured hierarchy for categories such as mechanism of action, physiological effects, and therapeutic indications. No single classification system can meet all of these needs; however, there should be consistency among those that group by the same underlying principles. We discovered discrepancies in how drugs with multiple therapeutic indications are classified among four widely used schemas.

  4. Using the Landsat Archive to Monitor Gully Erosion Development, in SE Nigeria, as a Response to Land-use Classification and Environmental Variability.

    NASA Astrophysics Data System (ADS)

    Brolly, M.; Iro, S.

    2016-12-01

    This study presents novel low budget methodologies for mapping and monitoring gully erosion development in South-East Nigeria. The unabated way gullies develop, and the lack of control measures in the SE Nigeria study area, motivates this work. The Landsat archive is used to determine change in land-use/cover classification over a 30-year period (1986-2015) in a region measuring 70km x 70km. Multi-resolution segmentation is enabled through Object Based Image Analysis (OBIA) and Pixel based classification techniques (supervised/unsupervised) using an initial dataset including 40 ground validated gully sites within the region. Detected increases in gully area are positively correlated with land clearance, manifested by associated vegetation reduction and anthropogenic encroachment with r values reported of -0.94 (p<0.05) and -0.97 (p<0.05) for the Pixel and OBIA classification approaches respectively. Within the study region 14 specific gullies are further vectorised and quantified in terms of extent and rates of change. Local and regional results are then examined in regard to land-use and environmental variables, such as meteorology, soil and rock geology, and topographical/landscape parameters. Of the 14 specific sites, the maximum reported erosion rates are 232010m2 per year for the largest gully (4123765m2) and -501m2 per year for the smallest (2749m2), representing year on year % increases of 9% and -0.15% respectively. These erosion rates were exhibited in 1988 and 2007. Analysis of topography across the region at 30m resolution reveals 90% of the 40 observed gullies develop on concave slopes with high values of 4 plan curvatures and greater than 15° inclines with highest erosion rates exhibited on ferralsols soil type. 
Principal Component Analysis reveals inter-variable similarities, via component 1, between Slope (58%), Elevation (50%) and Gully Area (62%), while, Vegetation loss (14%), Soil structure (8%) and Rate of gully change (3%) are better defined by the second component, showing their similarities.

  5. LiDAR point classification based on sparse representation

    NASA Astrophysics Data System (ADS)

    Li, Nan; Pfeifer, Norbert; Liu, Chun

    2017-04-01

    To combine the initial spatial structure and features of LiDAR data for accurate classification, the LiDAR data is represented as a 4-order tensor, and the sparse representation for classification (SRC) method is used for LiDAR tensor classification. SRC needs only a few training samples from each class, yet achieves good classification results. Multiple features are extracted from raw LiDAR points to generate a high-dimensional vector at each point. The LiDAR tensor is then built from the spatial distribution and feature vectors of the point neighborhood. The entries of the LiDAR tensor are accessed via four indexes. Each index is called a mode: three spatial modes in directions X, Y, Z and one feature mode. The sparsity algorithm finds the best representation of the test sample as a sparse linear combination of training samples from a dictionary. To explore the sparsity of the LiDAR tensor, the Tucker decomposition is used. It decomposes a tensor into a core tensor multiplied by a matrix along each mode. Those matrices can be considered the principal components in each mode. The entries of the core tensor show the level of interaction between the different components. Therefore, the LiDAR tensor can be approximately represented by a sparse tensor multiplied by a matrix selected from a dictionary along each mode. The matrices decomposed from training samples are arranged as initial elements in the dictionary. By dictionary learning, a reconstructive and discriminative structure dictionary along each mode is built. The overall structure dictionary is composed of class-specific sub-dictionaries. Then the sparse core tensor is calculated by the tensor OMP (Orthogonal Matching Pursuit) method based on the dictionaries along each mode. It is expected that the original tensor should be well recovered by the sub-dictionary associated with the relevant class, while entries in the sparse tensor associated with other classes should be nearly zero. Therefore, SRC uses the reconstruction error associated with each class to classify the data. A section of airborne LiDAR points of Vienna city is used and classified into 6 classes: ground, roofs, vegetation, covered ground, walls and other points. Only 6 training samples from each class are taken. For the final classification result, ground and covered ground are merged into one class (ground). The classification accuracy for ground is 94.60%, roof is 95.47%, vegetation is 85.55%, wall is 76.17%, other object is 20.39%.

  6. Class D Management Implementation Approach of the First Orbital Mission of the Earth Venture Series

    NASA Technical Reports Server (NTRS)

    Wells, James E.; Scherrer, John; Law, Richard; Bonniksen, Chris

    2013-01-01

    A key element of the National Research Council's Earth Science and Applications Decadal Survey called for the creation of the Venture Class line of low-cost research and application missions within NASA (National Aeronautics and Space Administration). One key component of the architecture chosen by NASA within the Earth Venture line is a series of self-contained stand-alone spaceflight science missions called "EV-Mission". The first mission chosen for this competitively selected, cost and schedule capped, Principal Investigator-led opportunity is the CYclone Global Navigation Satellite System (CYGNSS). As specified in the defining Announcement of Opportunity, the Principal Investigator is held responsible for successfully achieving the science objectives of the selected mission and the management approach that he/she chooses to obtain those results has a significant amount of freedom as long as it meets the intent of key NASA guidance like NPR 7120.5 and 7123. CYGNSS is classified under NPR 7120.5E guidance as a Category 3 (low priority, low cost) mission and carries a Class D risk classification (low priority, high risk) per NPR 8705.4. As defined in the NPR guidance, Class D risk classification allows for a relatively broad range of implementation strategies. The management approach that will be utilized on CYGNSS is a streamlined implementation that starts with a higher risk tolerance posture at NASA and that philosophy flows all the way down to the individual part level.

  7. Classification of EEG Signals Based on Pattern Recognition Approach.

    PubMed

    Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed

    2017-01-01

    Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a "pattern recognition" approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90-7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11-89.63% and 91.60-81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.
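
    The normalization and Fisher's discriminant ratio (FDR) steps mentioned above can be illustrated with a short sketch. It assumes the common two-class definition of FDR, (mu0 - mu1)^2 / (var0 + var1), computed per feature; the abstract does not state the exact formula used. The data are synthetic and the function names are hypothetical.

```python
import numpy as np

def zscore(X):
    """Scale every feature (column) to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def fisher_ratio(X, y):
    """Per-feature FDR for two classes: (mu0 - mu1)^2 / (var0 + var1)."""
    X0, X1 = X[y == 0], X[y == 1]
    mean_gap = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    return mean_gap / (X0.var(axis=0) + X1.var(axis=0))

# Synthetic stand-in for relative wavelet energy features:
# 200 trials, 5 features, two conditions (0 = baseline, 1 = task).
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 5))
X[y == 1, 2] += 3.0  # by construction, only feature 2 separates the conditions

Z = zscore(X)
fdr = fisher_ratio(Z, y)
ranking = np.argsort(fdr)[::-1]  # feature indices, most discriminative first
```

    Features ranked this way could then be passed to PCA and a classifier; those later stages are not shown here.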

  8. Classification of EEG Signals Based on Pattern Recognition Approach

    PubMed Central

    Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed

    2017-01-01

    Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90–7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy. 
PMID:29209190

  9. Markerless gating for lung cancer radiotherapy based on machine learning techniques

    NASA Astrophysics Data System (ADS)

    Lin, Tong; Li, Ruijiang; Tang, Xiaoli; Dy, Jennifer G.; Jiang, Steve B.

    2009-03-01

    In lung cancer radiotherapy, radiation to a mobile target can be delivered by respiratory gating, for which we need to know whether the target is inside or outside a predefined gating window at any time point during the treatment. This can be achieved by tracking one or more fiducial markers implanted inside or near the target, either fluoroscopically or electromagnetically. However, the clinical implementation of marker tracking is limited for lung cancer radiotherapy mainly due to the risk of pneumothorax. Therefore, gating without implanted fiducial markers is a promising clinical direction. We have developed several template-matching methods for fluoroscopic marker-less gating. Recently, we have modeled the gating problem as a binary pattern classification problem, in which principal component analysis (PCA) and support vector machine (SVM) are combined to perform the classification task. Following the same framework, we investigated different combinations of dimensionality reduction techniques (PCA and four nonlinear manifold learning methods) and two machine learning classification methods (artificial neural networks—ANN and SVM). Performance was evaluated on ten fluoroscopic image sequences of nine lung cancer patients. We found that among all combinations of dimensionality reduction techniques and classification methods, PCA combined with either ANN or SVM achieved a better performance than the other nonlinear manifold learning methods. ANN when combined with PCA achieves a better performance than SVM in terms of classification accuracy and recall rate, although the target coverage is similar for the two classification methods. Furthermore, the running time for both ANN and SVM with PCA is within tolerance for real-time applications. Overall, ANN combined with PCA is a better candidate than other combinations we investigated in this work for real-time gated radiotherapy.

  10. An Improved Cloud Classification Algorithm for China’s FY-2C Multi-Channel Images Using Artificial Neural Network

    PubMed Central

    Liu, Yu; Xia, Jun; Shi, Chun-Xiang; Hong, Yang

    2009-01-01

    The crowning objective of this research was to identify a better cloud classification method to upgrade the current window-based clustering algorithm used operationally for China’s first operational geostationary meteorological satellite FengYun-2C (FY-2C) data. First, the capabilities of six widely-used Artificial Neural Network (ANN) methods are analyzed, together with the comparison of two other methods: Principal Component Analysis (PCA) and a Support Vector Machine (SVM), using 2864 cloud samples manually collected by meteorologists in June, July, and August in 2007 from three FY-2C channel (IR1, 10.3–11.3 μm; IR2, 11.5–12.5 μm and WV 6.3–7.6 μm) imagery. The result shows that: (1) ANN approaches, in general, outperformed the PCA and the SVM given sufficient training samples and (2) among the six ANN networks, higher cloud classification accuracy was obtained with the Self-Organizing Map (SOM) and Probabilistic Neural Network (PNN). Second, to compare the ANN methods to the present FY-2C operational algorithm, this study implemented SOM, one of the best ANN network identified from this study, as an automated cloud classification system for the FY-2C multi-channel data. It shows that SOM method has improved the results greatly not only in pixel-level accuracy but also in cloud patch-level classification by more accurately identifying cloud types such as cumulonimbus, cirrus and clouds in high latitude. Findings of this study suggest that the ANN-based classifiers, in particular the SOM, can be potentially used as an improved Automated Cloud Classification Algorithm to upgrade the current window-based clustering method for the FY-2C operational products. PMID:22346714

  11. An Improved Cloud Classification Algorithm for China's FY-2C Multi-Channel Images Using Artificial Neural Network.

    PubMed

    Liu, Yu; Xia, Jun; Shi, Chun-Xiang; Hong, Yang

    2009-01-01

    The crowning objective of this research was to identify a better cloud classification method to upgrade the current window-based clustering algorithm used operationally for China's first operational geostationary meteorological satellite FengYun-2C (FY-2C) data. First, the capabilities of six widely-used Artificial Neural Network (ANN) methods are analyzed, together with the comparison of two other methods: Principal Component Analysis (PCA) and a Support Vector Machine (SVM), using 2864 cloud samples manually collected by meteorologists in June, July, and August in 2007 from three FY-2C channel (IR1, 10.3-11.3 μm; IR2, 11.5-12.5 μm and WV 6.3-7.6 μm) imagery. The result shows that: (1) ANN approaches, in general, outperformed the PCA and the SVM given sufficient training samples and (2) among the six ANN networks, higher cloud classification accuracy was obtained with the Self-Organizing Map (SOM) and Probabilistic Neural Network (PNN). Second, to compare the ANN methods to the present FY-2C operational algorithm, this study implemented SOM, one of the best ANN network identified from this study, as an automated cloud classification system for the FY-2C multi-channel data. It shows that SOM method has improved the results greatly not only in pixel-level accuracy but also in cloud patch-level classification by more accurately identifying cloud types such as cumulonimbus, cirrus and clouds in high latitude. Findings of this study suggest that the ANN-based classifiers, in particular the SOM, can be potentially used as an improved Automated Cloud Classification Algorithm to upgrade the current window-based clustering method for the FY-2C operational products.

  12. Parallel exploitation of a spatial-spectral classification approach for hyperspectral images on RVC-CAL

    NASA Astrophysics Data System (ADS)

    Lazcano, R.; Madroñal, D.; Fabelo, H.; Ortega, S.; Salvador, R.; Callicó, G. M.; Juárez, E.; Sanz, C.

    2017-10-01

    Hyperspectral Imaging (HI) assembles high resolution spectral information from hundreds of narrow bands across the electromagnetic spectrum, thus generating 3D data cubes in which each pixel gathers the spectral information of the reflectance of every spatial pixel. As a result, each image is composed of large volumes of data, which turns its processing into a challenge, as performance requirements have been continuously tightened. For instance, new HI applications demand real-time responses. Hence, parallel processing becomes a necessity to achieve this requirement, so the intrinsic parallelism of the algorithms must be exploited. In this paper, a spatial-spectral classification approach has been implemented using a dataflow language known as RVC-CAL. This language represents a system as a set of functional units, and its main advantage is that it simplifies the parallelization process by mapping the different blocks over different processing units. The spatial-spectral classification approach aims at refining the classification results previously obtained by using a K-Nearest Neighbors (KNN) filtering process, in which both the pixel spectral value and the spatial coordinates are considered. To do so, KNN needs two inputs: a one-band representation of the hyperspectral image and the classification results provided by a pixel-wise classifier. Thus, the spatial-spectral classification algorithm is divided into three different stages: a Principal Component Analysis (PCA) algorithm for computing the one-band representation of the image, a Support Vector Machine (SVM) classifier, and the KNN-based filtering algorithm. The parallelization of these algorithms shows promising results in terms of computational time, as mapping them over different cores presents a speedup of 2.69x when using 3 cores. Consequently, experimental results demonstrate that real-time processing of hyperspectral images is achievable.

  13. Application of PCA and SIMCA statistical analysis of FT-IR spectra for the classification and identification of different slag types with environmental origin.

    PubMed

    Stumpe, B; Engel, T; Steinweg, B; Marschner, B

    2012-04-03

    In the past, different slag materials were often used for landscaping and construction purposes or simply dumped. Nowadays German environmental laws strictly control the use of slags, but a remaining 35% is still dumped uncontrolled in landfills. Since some slags have high heavy metal contents and different slag types have typical chemical and physical properties that will influence the risk potential and other characteristics of the deposits, an identification of the slag types is needed. We developed an FT-IR-based statistical method to identify different slag classes. Slag samples were collected at different sites throughout various cities within the industrial Ruhr area. Then, spectra of 35 samples from four different slag classes, ladle furnace (LF), blast furnace (BF), oxygen furnace steel (OF), and zinc furnace slags (ZF), were determined in the mid-infrared region (4000-400 cm⁻¹). The spectral data sets were subjected to statistical classification methods to separate the spectral data of different slag classes. Principal component analysis (PCA) models for each slag class were developed and further used for soft independent modeling of class analogy (SIMCA). Precise classification of slag samples into four different slag classes was achieved using two different SIMCA models stepwise. At first, SIMCA 1 was used for classification of ZF as well as OF slags over the total spectral range. If no correct classification was found, then the spectrum was analyzed with SIMCA 2 at reduced wavenumbers for the classification of LF as well as BF spectra. As a result, we provide a time- and cost-efficient method based on FT-IR spectroscopy for processing and identifying large numbers of environmental slag samples.
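
    The core idea behind SIMCA, one PCA model per class with assignment based on how well each model reconstructs a sample, can be sketched as below. This is a deliberately simplified illustration: full SIMCA applies statistical limits to the residuals and can reject a sample from every class, whereas this sketch simply picks the class with the smallest residual. The "spectra" are synthetic, and the class labels and function names are hypothetical.

```python
import numpy as np

def fit_class_model(X, n_components):
    """Fit a PCA model (mean plus leading principal axes) to one class."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def residual(x, model):
    """Distance from x to the class subspace spanned by the PCA model."""
    mean, components = model
    centered = x - mean
    reconstructed = components.T @ (components @ centered)
    return np.linalg.norm(centered - reconstructed)

def classify(x, models):
    """Assign x to the class whose PCA model leaves the smallest residual."""
    return min(models, key=lambda label: residual(x, models[label]))

def make_class(rng, mean, basis, n):
    """Synthetic samples near a 2-dimensional subspace, with small noise."""
    coeffs = rng.normal(size=(n, basis.shape[0]))
    noise = 0.01 * rng.normal(size=(n, basis.shape[1]))
    return mean + coeffs @ basis + noise

rng = np.random.default_rng(0)
dim = 20
mean_a, mean_b = np.zeros(dim), np.full(dim, 5.0)
basis_a, basis_b = rng.normal(size=(2, dim)), rng.normal(size=(2, dim))

models = {
    "class_a": fit_class_model(make_class(rng, mean_a, basis_a, 30), 2),
    "class_b": fit_class_model(make_class(rng, mean_b, basis_b, 30), 2),
}
new_sample = make_class(rng, mean_a, basis_a, 1)[0]
predicted = classify(new_sample, models)
```

    A stepwise scheme like the one described in the abstract could chain two such model sets, falling back to the second (fitted on a reduced wavenumber range) when the first gives no acceptable match.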

  14. Classification of hydrological parameter sensitivity and evaluation of parameter transferability across 431 US MOPEX basins

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ren, Huiying; Hou, Zhangshuan; Huang, Maoyi

    The Community Land Model (CLM) represents physical, chemical, and biological processes of the terrestrial ecosystems that interact with climate across a range of spatial and temporal scales. As CLM includes numerous sub-models and associated parameters, the high-dimensional parameter space presents a formidable challenge for quantifying uncertainty and improving Earth system predictions needed to assess environmental changes and risks. This study aims to evaluate the potential of transferring hydrologic model parameters in CLM through sensitivity analyses and classification across watersheds from the Model Parameter Estimation Experiment (MOPEX) in the United States. The sensitivity of CLM-simulated water and energy fluxes to hydrological parameters across 431 MOPEX basins is first examined using an efficient stochastic sampling-based sensitivity analysis approach. Linear, interaction, and high-order nonlinear impacts are all identified via statistical tests and stepwise backward removal parameter screening. The basins are then classified according to their parameter sensitivity patterns (internal attributes), as well as their hydrologic indices/attributes (external hydrologic factors) separately, using a principal component analysis (PCA) and expectation-maximization (EM)-based clustering approach. Similarities and differences among the parameter sensitivity-based classification system (S-Class), the hydrologic indices-based classification (H-Class), and the Koppen climate classification systems (K-Class) are discussed. Within each S-class with similar parameter sensitivity characteristics, similar inversion modeling setups can be used for parameter calibration, and the parameters and their contribution or significance to water and energy cycling may also be more transferrable. This classification study provides guidance on identifiable parameters, and on parameterization and inverse model design for CLM, but the methodology is applicable to other models. Inverting parameters at representative sites belonging to the same class can significantly reduce parameter calibration efforts.

  15. Nonlinear Principal Components Analysis: Introduction and Application

    ERIC Educational Resources Information Center

    Linting, Marielle; Meulman, Jacqueline J.; Groenen, Patrick J. F.; van der Kooij, Anita J.

    2007-01-01

    The authors provide a didactic treatment of nonlinear (categorical) principal components analysis (PCA). This method is the nonlinear equivalent of standard PCA and reduces the observed variables to a number of uncorrelated principal components. The most important advantages of nonlinear over linear PCA are that it incorporates nominal and ordinal…

  16. Selective principal component regression analysis of fluorescence hyperspectral image to assess aflatoxin contamination in corn

    USDA-ARS's Scientific Manuscript database

    Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...

  17. Similarities between principal components of protein dynamics and random diffusion

    NASA Astrophysics Data System (ADS)

    Hess, Berk

    2000-12-01

    Principal component analysis, also called essential dynamics, is a powerful tool for finding global, correlated motions in atomic simulations of macromolecules. It has become an established technique for analyzing molecular dynamics simulations of proteins. The first few principal components of simulations of large proteins often resemble cosines. We derive the principal components for high-dimensional random diffusion, which are almost perfect cosines. This resemblance between protein simulations and noise implies that for many proteins the time scales of current simulations are too short to obtain convergence of collective motions.
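
    The cosine result can be reproduced numerically with a small sketch. It simulates an unbiased random walk in many dimensions as a stand-in for "high-dimensional random diffusion", performs PCA over time via an SVD, and compares the time course of each principal component with a half-period-based cosine. The step count, dimension, and the helper name `cosine_match` are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, n_dims = 200, 2000

# Random diffusion: each coordinate is an independent Gaussian random walk.
trajectory = np.cumsum(rng.normal(size=(n_steps, n_dims)), axis=0)

# PCA over the trajectory: remove the time average, then take the SVD.
# Column k of U is the (normalized) time course of the projection onto
# principal component k.
centered = trajectory - trajectory.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

def cosine_match(k):
    """Absolute correlation between PC k's time course and cos((k+1)*pi*t/T)."""
    t = (np.arange(n_steps) + 0.5) / n_steps
    cosine = np.cos((k + 1) * np.pi * t)
    return abs(np.corrcoef(U[:, k], cosine)[0, 1])

first_pc_match = cosine_match(0)
```

    A correlation close to 1 for the first few components of a real simulation would therefore be a warning sign that the sampled motion is hard to distinguish from diffusion, which is the convergence concern the abstract raises.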

  18. Directly Reconstructing Principal Components of Heterogeneous Particles from Cryo-EM Images

    PubMed Central

    Tagare, Hemant D.; Kucukelbir, Alp; Sigworth, Fred J.; Wang, Hongwei; Rao, Murali

    2015-01-01

    Structural heterogeneity of particles can be investigated by their three-dimensional principal components. This paper addresses the question of whether, and with what algorithm, the three-dimensional principal components can be directly recovered from cryo-EM images. The first part of the paper extends the Fourier slice theorem to covariance functions showing that the three-dimensional covariance, and hence the principal components, of a heterogeneous particle can indeed be recovered from two-dimensional cryo-EM images. The second part of the paper proposes a practical algorithm for reconstructing the principal components directly from cryo-EM images without the intermediate step of calculating covariances. This algorithm is based on maximizing the (posterior) likelihood using the Expectation-Maximization algorithm. The last part of the paper applies this algorithm to simulated data and to two real cryo-EM data sets: a data set of the 70S ribosome with and without Elongation Factor-G (EF-G), and a data set of the influenza virus RNA-dependent RNA Polymerase (RdRP). The first principal component of the 70S ribosome data set reveals the expected conformational changes of the ribosome as the EF-G binds and unbinds. The first principal component of the RdRP data set reveals a conformational change in the two dimers of the RdRP. PMID:26049077

  19. Identification of spilled oils by NIR spectroscopy technology based on KPCA and LSSVM

    NASA Astrophysics Data System (ADS)

    Tan, Ailing; Bi, Weihong

    2011-08-01

    Oil spills on the sea surface have become relatively common as offshore petroleum exploitation and marine transportation have developed. They pose a great threat to the marine environment and ecosystem, so oil pollution in the ocean has become an urgent topic in environmental protection. To support oil spill accident treatment programs and to trace the source of spilled oils, a novel qualitative identification method combining Kernel Principal Component Analysis (KPCA) and Least Square Support Vector Machine (LSSVM) is proposed. The method uses a Fourier transform NIR spectrophotometer to collect NIR spectral data from simulated gasoline, diesel fuel and kerosene oil spill samples and applies some pretreatments to the original spectra. KPCA, an extension of Principal Component Analysis (PCA) using kernel methods, is used to extract nonlinear features of the preprocessed spectra. Support Vector Machines (SVM) are a powerful methodology for solving spectral classification tasks in chemometrics. LSSVMs are reformulations of standard SVMs that lead to solving a system of linear equations. An LSSVM multiclass classification model was therefore designed using the Error Correcting Output Code (ECOC) method, which borrows the idea of error-correcting codes used for correcting bit errors in transmission channels. The most common and reliable approach to parameter selection is to decide on parameter ranges and then perform a grid search over the parameter space to find the optimal model parameters. To test the proposed method, 375 spilled oil samples of unknown type were selected for study. The optimal model has the best identification capability, with an accuracy of 97.8%. Experimental results show that the proposed KPCA plus LSSVM qualitative analysis method for near-infrared spectroscopy gives good recognition results and could serve as a new method for rapid identification of spilled oils.
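
    The ECOC idea mentioned in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' model: the 5-bit codebook below is hypothetical, and the bit vector passed to the decoder stands in for the outputs of five binary classifiers (LSSVMs in the paper). Each class is assigned a codeword, and a sample is labeled with the class whose codeword is nearest in Hamming distance.

```python
# Hypothetical codebook: one 5-bit codeword per oil type. The pairwise
# Hamming distances are at least 3, so one wrong bit can be corrected.
CODEBOOK = {
    "gasoline": (1, 1, 1, 0, 0),
    "diesel": (0, 0, 1, 1, 1),
    "kerosene": (1, 0, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(bits, codebook=CODEBOOK):
    """Return the class whose codeword is closest to the predicted bits."""
    if any(len(bits) != len(code) for code in codebook.values()):
        raise ValueError("bit vector length must match the codeword length")
    return min(codebook, key=lambda label: hamming(bits, codebook[label]))


if __name__ == "__main__":
    # The last binary classifier is assumed to have made an error here.
    print(ecoc_decode((1, 1, 1, 0, 1)))
```

    With a minimum codeword distance of 3, a single misfiring binary classifier still yields the intended class, which is the error-correcting property the method borrows from transmission codes.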

  20. Generative statistical modeling of left atrial appendage appearance to substantiate clinical paradigms for stroke risk stratification

    NASA Astrophysics Data System (ADS)

    Sanatkhani, Soroosh; Menon, Prahlad G.

    2018-03-01

    Left atrial appendage (LAA) is the source of 91% of the thrombi in patients with atrial arrhythmias (2.3 million US adults), turning this region into a potential threat for stroke. LAA geometries have been clinically categorized into four appearance groups viz. Cauliflower, Cactus, Chicken-Wing and WindSock, based on visual appearance in 3D volume visualizations of contrast-enhanced computed tomography (CT) imaging, and have further been correlated with stroke risk by considering clinical mortality statistics. However, such classification from visual appearance is limited by human subjectivity and is not sophisticated enough to address all the characteristics of the geometries. Quantification of LAA geometry metrics can reveal a more repeatable and reliable estimate of the characteristics of the LAA which correspond with stasis risk, and in turn cardioembolic risk. We present an approach to quantify the appearance of the LAA in patients with atrial fibrillation (AF) using a weighted set of baseline eigen-modes of LAA appearance variation, as a means to objectify classification of patient-specific LAAs into the four accepted clinical appearance groups. Clinical images of 16 patients (4 per LAA appearance category) with AF were identified and visualized as volume images. All the volume images were rigidly reoriented in order to be spatially co-registered, normalized in terms of intensity, resampled and finally reshaped appropriately to carry out principal component analysis (PCA), in order to parametrize the LAA region's appearance based on principal components (PCs/eigen-modes) of greyscale appearance, generating 16 eigen-modes of appearance variation. Our pilot studies show that the most dominant LAA appearance (i.e. reconstructable using the fewest eigen-modes) resembles the Chicken-Wing class, which is known to have the lowest stroke risk per clinical mortality statistics. Our findings indicate the possibility that LAA geometries with high risk of stroke are higher-order statistical variants of underlying lower risk shapes.

  1. Prediction, time variance, and classification of hydraulic response to recharge in two karst aquifers

    USGS Publications Warehouse

    Long, Andrew J.; Mahler, Barbara J.

    2013-01-01

    Many karst aquifers are rapidly filled and depleted and therefore are likely to be susceptible to changes in short-term climate variability. Here we explore methods that could be applied to model site-specific hydraulic responses, with the intent of simulating these responses to different climate scenarios from high-resolution climate models. We compare hydraulic responses (spring flow, groundwater level, stream base flow, and cave drip) at several sites in two karst aquifers: the Edwards aquifer (Texas, USA) and the Madison aquifer (South Dakota, USA). A lumped-parameter model simulates nonlinear soil moisture changes for estimation of recharge, and a time-variant convolution model simulates the aquifer response to this recharge. Model fit to data is 2.4% better for calibration periods than for validation periods according to the Nash–Sutcliffe coefficient of efficiency, which ranges from 0.53 to 0.94 for validation periods. We use metrics that describe the shapes of the impulse-response functions (IRFs) obtained from convolution modeling to make comparisons in the distribution of response times among sites and between aquifers. Time-variant IRFs were applied to 62% of the sites. Principal component analysis (PCA) of metrics describing the shapes of the IRFs indicates three principal components that together account for 84% of the variability in IRF shape: the first is related to IRF skewness and temporal spread and accounts for 51% of the variability; the second and third largely are related to time-variant properties and together account for 33% of the variability. Sites with IRFs that dominantly comprise exponential curves are separated geographically from those dominantly comprising lognormal curves in both aquifers as a result of spatial heterogeneity. The use of multiple IRF metrics in PCA is a novel method to characterize, compare, and classify the way in which different sites and aquifers respond to recharge. As convolution models are developed for additional aquifers, they could contribute to an IRF database and a general classification system for karst aquifers.
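
    For readers unfamiliar with the fit statistic quoted in this abstract, the Nash–Sutcliffe coefficient of efficiency is commonly defined as one minus the ratio of the squared model error to the variance of the observations about their mean. The sketch below uses that standard definition; the function name and the sample series are illustrative assumptions, not data from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    if variance_sum == 0:
        raise ValueError("observations are constant; the coefficient is undefined")
    return 1.0 - squared_error / variance_sum


if __name__ == "__main__":
    # Made-up spring-flow values for illustration only.
    obs = [1.0, 2.0, 3.0, 4.0]
    sim = [1.1, 1.9, 3.2, 3.8]
    print(round(nash_sutcliffe(obs, sim), 3))
```

    A value of 1 indicates a perfect fit, and a value of 0 means the model predicts no better than the mean of the observations, which gives context for the 0.53 to 0.94 range reported for validation periods.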

  2. Identification of Reliable Components in Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS): a Data-Driven Approach across Metabolic Processes.

    PubMed

    Motegi, Hiromi; Tsuboi, Yuuri; Saga, Ayako; Kagami, Tomoko; Inoue, Maki; Toki, Hideaki; Minowa, Osamu; Noda, Tetsuo; Kikuchi, Jun

    2015-11-04

    There is an increasing need to use multivariate statistical methods for understanding biological functions, identifying the mechanisms of diseases, and exploring biomarkers. In addition to classical analyses such as hierarchical cluster analysis, principal component analysis, and partial least squares discriminant analysis, various multivariate strategies, including independent component analysis, non-negative matrix factorization, and multivariate curve resolution, have recently been proposed. However, determining the number of components is problematic. Despite the proposal of several different methods, no satisfactory approach has yet been reported. To resolve this problem, we implemented a new idea: classifying a component as "reliable" or "unreliable" based on the reproducibility of its appearance, regardless of the number of components in the calculation. Using the clustering method for classification, we applied this idea to multivariate curve resolution-alternating least squares (MCR-ALS). Comparisons between conventional and modified methods applied to proton nuclear magnetic resonance (¹H-NMR) spectral datasets derived from known standard mixtures and biological mixtures (urine and feces of mice) revealed that more plausible results are obtained by the modified method. In particular, clusters containing little information were detected with reliability. This strategy, named "cluster-aided MCR-ALS," will facilitate the attainment of more reliable results in the metabolomics datasets.

  3. [Research on Rapid Discrimination of Edible Oil by ATR Infrared Spectroscopy].

    PubMed

    Ma, Xiao; Yuan, Hong-fu; Song, Chun-feng; Hu, Ai-qin; Li, Xiao-yu; Zhao, Zhong; Li, Xiu-qin; Guo, Zhen; Zhu, Zhi-qiang

    2015-07-01

    A rapid discrimination method for edible oils, the KL-BP model, based on attenuated total reflectance infrared spectroscopy is proposed. The model uses KL to extract classification features from the source data while reducing the data dimension. A neural network model is then built with the new data as its input. Eighty-four edible oil samples, including sesame oil, corn oil, canola oil, blend oil, sunflower oil, peanut oil, olive oil, soybean oil and tea seed oil, were collected and their infrared spectra were measured using an ATR FT-IR spectrometer. To compare method performance, a principal component analysis (PCA) direct-classification model, a KL direct-classification model, a PLS-DA model, a PCA-BP model and a KL-BP model were constructed. The results show that the recognition rates of PCA, PCA-BP, KL, PLS-DA and KL-BP are 59.1%, 68.2%, 77.3%, 77.3% and 90.9%, respectively, for discriminating the 9 kinds of edible oils. KL extracts the eigenvectors that maximize the ratio of the distance between classes to the distance within each class, so the method captures much more classification information than PCA. The BP neural network can effectively enhance classification ability and accuracy. By combining the advantage of KL in extracting more category information during dimension reduction with the self-learning, adaptive and nonlinear features of the BP neural network, the KL-BP method has the best classification ability and recognition accuracy and is of great importance for rapidly recognizing edible oils in practice.

  4. Acoustic signature recognition technique for Human-Object Interactions (HOI) in persistent surveillance systems

    NASA Astrophysics Data System (ADS)

    Alkilani, Amjad; Shirkhodaie, Amir

    2013-05-01

    Handling, manipulation, and placement of objects in the environment, hereafter called Human-Object Interaction (HOI), generate sounds. Such sounds are readily identifiable by human hearing. However, in the presence of background environmental noise, recognition of minute HOI sounds is challenging, though vital for improving multi-modality sensor data fusion in Persistent Surveillance Systems (PSS). Identified HOI sound signatures can be used as precursors to the detection of pertinent threats that other sensor modalities may fail to detect. In this paper, we present a robust method for detection and classification of HOI events via clustering of features extracted from training HOI acoustic sound waves. In this approach, salient sound events are first identified and segmented from the background via a sound energy tracking method. After this segmentation, the frequency spectral pattern of each sound event is modeled and its features are extracted to form a feature vector for training. To reduce the dimensionality of the training feature space, a Principal Component Analysis (PCA) technique is employed. To expedite classification of test feature vectors, kd-tree and Random Forest classifiers are trained for rapid classification of training sound waves. Each classifier employs a different similarity distance matching technique for classification. The performance of the classifiers is compared for classification of a batch of training HOI acoustic signatures. Furthermore, to facilitate semantic annotation of acoustic sound events, a scheme based on Transducer Markup Language (TML) is proposed. The results demonstrate that the proposed approach is both reliable and effective, and can be extended to future PSS applications.

  5. Comparative study of SVM methods combined with voxel selection for object category classification on fMRI data.

    PubMed

    Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li

    2011-02-16

    Support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit of non-linear (polynomial kernel) SVM over linear SVM. Here, a more effective non-linear SVM using a radial basis function (RBF) kernel is compared with linear SVM. Unlike traditional studies, which focused merely on evaluating either different types of SVM or voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes, in terms of classification accuracy and computation time. Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM significantly outperformed linear SVM; in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) considering classification accuracy and computation time together, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) achieved better accuracy in less time. The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy is of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels are two suggested solutions; if computation time is the greater concern, RBF SVM with a relatively small set of voxels, with part of the principal components kept as features, is the better choice.

  6. Metabolite profiling in retinoblastoma identifies novel clinicopathological subgroups

    PubMed Central

    Kohe, Sarah; Brundler, Marie-Anne; Jenkinson, Helen; Parulekar, Manoj; Wilson, Martin; Peet, Andrew C; McConville, Carmel M

    2015-01-01

    Background: Tumour classification, based on histopathology or molecular pathology, is of value to predict tumour behaviour and to select appropriate treatment. In retinoblastoma, pathology information is not available at diagnosis and only exists for enucleated tumours. Alternative methods of tumour classification, using noninvasive techniques such as magnetic resonance spectroscopy, are urgently required to guide treatment decisions at the time of diagnosis. Methods: High-resolution magic-angle spinning magnetic resonance spectroscopy (HR-MAS MRS) was undertaken on enucleated retinoblastomas. Principal component analysis and cluster analysis of the HR-MAS MRS data was used to identify tumour subgroups. Individual metabolite concentrations were determined and were correlated with histopathological risk factors for each group. Results: Multivariate analysis identified three metabolic subgroups of retinoblastoma, with the most discriminatory metabolites being taurine, hypotaurine, total-choline and creatine. Metabolite concentrations correlated with specific histopathological features: taurine was correlated with differentiation, total-choline and phosphocholine with retrolaminar optic nerve invasion, and total lipids with necrosis. Conclusions: We have demonstrated that a metabolite-based classification of retinoblastoma can be obtained using ex vivo magnetic resonance spectroscopy, and that the subgroups identified correlate with histopathological features. This result justifies future studies to validate the clinical relevance of these subgroups and highlights the potential of in vivo MRS as a noninvasive diagnostic tool for retinoblastoma patient stratification. PMID:26348444

  7. Bamboo Classification Using WorldView-2 Imagery of Giant Panda Habitat in a Large Shaded Area in Wolong, Sichuan Province, China.

    PubMed

    Tang, Yunwei; Jing, Linhai; Li, Hui; Liu, Qingjie; Yan, Qi; Li, Xiuxia

    2016-11-22

    This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large area of this place is covered by shadows in the image, and only a few sampled points derived were useful. In order to identify bamboos based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using the principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k-nearest neighbor (k-NN) method produced the greatest accuracy. A geostatistically-weighted k-NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved 82.65% and 93.10% of the producer's and user's accuracies respectively for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboos given limited known samples, and the resulting bamboo distribution facilitates the assessments of the habitats of giant pandas.

  8. A Novel Hybrid Dimension Reduction Technique for Undersized High Dimensional Gene Expression Data Sets Using Information Complexity Criterion for Cancer Classification

    PubMed Central

    Pamukçu, Esra; Bozdogan, Hamparsum; Çalık, Sinan

    2015-01-01

    Gene expression data typically are large, complex, and highly noisy. Their dimension is high with several thousand genes (i.e., features) but with only a limited number of observations (i.e., samples). Although the classical principal component analysis (PCA) method is widely used as a first standard step in dimension reduction and in supervised and unsupervised classification, it suffers from several shortcomings in the case of data sets involving undersized samples, since the sample covariance matrix degenerates and becomes singular. In this paper we address these limitations within the context of probabilistic PCA (PPCA) by introducing and developing a new and novel approach using maximum entropy covariance matrix and its hybridized smoothed covariance estimators. To reduce the dimensionality of the data and to choose the number of probabilistic PCs (PPCs) to be retained, we further introduce and develop celebrated Akaike's information criterion (AIC), consistent Akaike's information criterion (CAIC), and the information theoretic measure of complexity (ICOMP) criterion of Bozdogan. Six publicly available undersized benchmark data sets were analyzed to show the utility, flexibility, and versatility of our approach with hybridized smoothed covariance matrix estimators, which do not degenerate to perform the PPCA to reduce the dimension and to carry out supervised classification of cancer groups in high dimensions. PMID:25838836

  9. Oral cancer screening: serum Raman spectroscopic approach

    NASA Astrophysics Data System (ADS)

    Sahu, Aditi K.; Dhoot, Suyash; Singh, Amandeep; Sawant, Sharada S.; Nandakumar, Nikhila; Talathi-Desai, Sneha; Garud, Mandavi; Pagare, Sandeep; Srivastava, Sanjeeva; Nair, Sudhir; Chaturvedi, Pankaj; Murali Krishna, C.

    2015-11-01

    Serum Raman spectroscopy (RS) has previously shown potential in oral cancer diagnosis and recurrence prediction. To evaluate the potential of serum RS in oral cancer screening, premalignant and cancer-specific detection was explored in the present study using 328 subjects belonging to healthy controls, premalignant, disease controls, and oral cancer groups. Spectra were acquired using a Raman microprobe. Spectral findings suggest changes in amino acids, lipids, protein, DNA, and β-carotene across the groups. A patient-wise approach was employed for data analysis using principal component linear discriminant analysis. In the first step, the classification among premalignant, disease control (nonoral cancer), oral cancer, and normal samples was evaluated in binary classification models. Thereafter, two screening-friendly classification approaches were explored to further evaluate the clinical utility of serum RS: a single four-group model and normal versus abnormal followed by determining the type of abnormality model. Results demonstrate the feasibility of premalignant and specific cancer detection. The normal versus abnormal model yields better sensitivity and specificity rates of 64 and 80%; these rates are comparable to standard screening approaches. Prospectively, as the current screening procedure of visual inspection is useful mainly for high-risk populations, serum RS may serve as a useful adjunct for early and specific detection of oral precancers and cancer.

  10. Centered Kernel Alignment Enhancing Neural Network Pretraining for MRI-Based Dementia Diagnosis

    PubMed Central

    Cárdenas-Peña, David; Collazos-Huertas, Diego; Castellanos-Dominguez, German

    2016-01-01

    Dementia is a growing problem that affects elderly people worldwide. More accurate evaluation of dementia diagnosis can help during the medical examination. Several methods for computer-aided dementia diagnosis have been proposed using resonance imaging scans to discriminate between patients with Alzheimer's disease (AD) or mild cognitive impairment (MCI) and healthy controls (NC). Nonetheless, the computer-aided diagnosis is especially challenging because of the heterogeneous and intermediate nature of MCI. We address the automated dementia diagnosis by introducing a novel supervised pretraining approach that takes advantage of the artificial neural network (ANN) for complex classification tasks. The proposal initializes an ANN based on linear projections to achieve more discriminating spaces. Such projections are estimated by maximizing the centered kernel alignment criterion that assesses the affinity between the resonance imaging data kernel matrix and the label target matrix. As a result, the performed linear embedding allows accounting for features that contribute the most to the MCI class discrimination. We compare the supervised pretraining approach to two unsupervised initialization methods (autoencoders and Principal Component Analysis) and against the best four performing classification methods of the 2014 CADDementia challenge. As a result, our proposal outperforms all the baselines (7% of classification accuracy and area under the receiver-operating-characteristic curve) while at the same time reducing the class bias. PMID:27148392

  11. Application of multispectral imaging to determine quality attributes and ripeness stage in strawberry fruit.

    PubMed

    Liu, Changhong; Liu, Wei; Lu, Xuzhong; Ma, Fei; Chen, Wei; Yang, Jianbo; Zheng, Lei

    2014-01-01

    Multispectral imaging with 19 wavelengths in the range of 405-970 nm has been evaluated for nondestructive determination of firmness, total soluble solids (TSS) content and ripeness stage in strawberry fruit. Several analysis approaches, including partial least squares (PLS), support vector machine (SVM) and back propagation neural network (BPNN), were applied to develop theoretical models for predicting the firmness and TSS of intact strawberry fruit. Compared with PLS and SVM, BPNN considerably improved the performance of multispectral imaging for predicting firmness and total soluble solids content with the correlation coefficient (r) of 0.94 and 0.83, SEP of 0.375 and 0.573, and bias of 0.035 and 0.056, respectively. Subsequently, the ability of multispectral imaging technology to classify fruit based on ripeness stage was tested using SVM and principal component analysis-back propagation neural network (PCA-BPNN) models. The higher classification accuracy, 100%, was achieved using the SVM model. Moreover, the results of all these models demonstrated that the VIS parts of the spectra were the main contributor to the determination of firmness, TSS content estimation and classification of ripeness stage in strawberry fruit. These results suggest that multispectral imaging, together with a suitable analysis model, is a promising technology for rapid estimation of quality attributes and classification of ripeness stage in strawberry fruit.

  12. An arrhythmia classification algorithm using a dedicated wavelet adapted to different subjects.

    PubMed

    Kim, Jinkwon; Min, Se Dong; Lee, Myoungho

    2011-06-27

    Numerous studies have been conducted regarding a heartbeat classification algorithm over the past several decades. However, many algorithms have also been studied to acquire robust performance, as biosignals have a large amount of variation among individuals. Various methods have been proposed to reduce the differences coming from personal characteristics, but these expand the differences caused by arrhythmia. In this paper, an arrhythmia classification algorithm using a dedicated wavelet adapted to individual subjects is proposed. We reduced the performance variation using dedicated wavelets, as in the ECG morphologies of the subjects. The proposed algorithm utilizes morphological filtering and a continuous wavelet transform with a dedicated wavelet. A principal component analysis and linear discriminant analysis were utilized to compress the morphological data transformed by the dedicated wavelets. An extreme learning machine was used as a classifier in the proposed algorithm. A performance evaluation was conducted with the MIT-BIH arrhythmia database. The results showed a high sensitivity of 97.51%, specificity of 85.07%, accuracy of 97.94%, and a positive predictive value of 97.26%. The proposed algorithm achieves better accuracy than other state-of-the-art algorithms with no intrasubject between the training and evaluation datasets. And it significantly reduces the amount of intervention needed by physicians.
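
    The four figures reported in this abstract follow from the counts of a confusion matrix. The sketch below uses the standard binary definitions; the function name and the example counts are illustrative assumptions and are not taken from the MIT-BIH evaluation, which may aggregate over several beat classes.

```python
def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy and positive predictive value."""
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("a denominator is zero; the metrics are undefined")
    return {
        "sensitivity": tp / (tp + fn),  # true positives among actual positives
        "specificity": tn / (tn + fp),  # true negatives among actual negatives
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),  # true positives among predicted positives
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    for name, value in binary_metrics(tp=90, fp=20, tn=80, fn=10).items():
        print(f"{name}: {value:.4f}")
```

    Reporting all four together matters because a classifier can reach high accuracy on an imbalanced data set while its specificity remains noticeably lower.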

  13. An arrhythmia classification algorithm using a dedicated wavelet adapted to different subjects

    PubMed Central

    2011-01-01

    Background Numerous studies have been conducted regarding a heartbeat classification algorithm over the past several decades. However, many algorithms have also been studied to acquire robust performance, as biosignals have a large amount of variation among individuals. Various methods have been proposed to reduce the differences coming from personal characteristics, but these expand the differences caused by arrhythmia. Methods In this paper, an arrhythmia classification algorithm using a dedicated wavelet adapted to individual subjects is proposed. We reduced the performance variation using dedicated wavelets, as in the ECG morphologies of the subjects. The proposed algorithm utilizes morphological filtering and a continuous wavelet transform with a dedicated wavelet. A principal component analysis and linear discriminant analysis were utilized to compress the morphological data transformed by the dedicated wavelets. An extreme learning machine was used as a classifier in the proposed algorithm. Results A performance evaluation was conducted with the MIT-BIH arrhythmia database. The results showed a high sensitivity of 97.51%, specificity of 85.07%, accuracy of 97.94%, and a positive predictive value of 97.26%. Conclusions The proposed algorithm achieves better accuracy than other state-of-the-art algorithms with no intrasubject between the training and evaluation datasets. And it significantly reduces the amount of intervention needed by physicians. PMID:21707989

  14. Spectral Data Reduction via Wavelet Decomposition

    NASA Technical Reports Server (NTRS)

    Kaewpijit, S.; LeMoigne, J.; El-Ghazawi, T.; Rood, Richard (Technical Monitor)

    2002-01-01

    The greatest advantage gained from hyperspectral imagery is that narrow spectral features can be used to give more information about materials than was previously possible with broad-band multispectral imagery. For many applications, the new larger data volumes from such hyperspectral sensors, however, present a challenge for traditional processing techniques. For example, the actual identification of each ground surface pixel by its corresponding reflecting spectral signature is still one of the most difficult challenges in the exploitation of this advanced technology, because of the immense volume of data collected. Therefore, conventional classification methods require a preprocessing step of dimension reduction to conquer the so-called "curse of dimensionality." Spectral data reduction using wavelet decomposition could be useful, as it does not only reduce the data volume, but also preserves the distinctions between spectral signatures. This characteristic is related to the intrinsic property of wavelet transforms that preserves high- and low-frequency features during the signal decomposition, therefore preserving peaks and valleys found in typical spectra. When comparing to the most widespread dimension reduction technique, the Principal Component Analysis (PCA), and looking at the same level of compression rate, we show that Wavelet Reduction yields better classification accuracy, for hyperspectral data processed with a conventional supervised classification such as a maximum likelihood method.

  15. Morphometric comparison by the ISAS® CASA-DNAf system of two techniques for the evaluation of DNA fragmentation in human spermatozoa

    PubMed Central

    Sadeghi, Sara; García-Molina, Almudena; Celma, Ferran; Valverde, Anthony; Fereidounfar, Sogol; Soler, Carles

    2016-01-01

    DNA fragmentation has been shown to be one of the causes of male infertility, particularly related to repeated abortions, and different methods have been developed to analyze it. In the present study, two commercial kits based on the SCD technique (Halosperm® and SDFA) were evaluated by the use of the DNA fragmentation module of the ISAS® v1 CASA system. Seven semen samples from volunteers were analyzed. To compare the results between techniques, the Kruskal–Wallis test was used. Data were used for calculation of Principal Components (two PCs were obtained), and subsequent subpopulations were identified using the Halo, Halo/Core Ratio, and PC data. Results from both kits were significantly different (P < 0.001). In each case, four subpopulations were obtained, independently of the classification method used. The distribution of subpopulations differed depending on the kit used. From the PC data, a discriminant analysis matrix was obtained and a good a posteriori classification was obtained (97.1% for Halosperm and 96.6% for SDFA). The present results are the first approach on morphometric evaluation of DNA fragmentation from the SCD technique. This approach could be used for the future definition of a classification matrix surpassing the current subjective evaluation of this important sperm factor. PMID:27678463

  16. Morphometric comparison by the ISAS® CASA-DNAf system of two techniques for the evaluation of DNA fragmentation in human spermatozoa.

    PubMed

    Sadeghi, Sara; García-Molina, Almudena; Celma, Ferran; Valverde, Anthony; Fereidounfar, Sogol; Soler, Carles

    2016-01-01

    DNA fragmentation has been shown to be one of the causes of male infertility, particularly related to repeated abortions, and different methods have been developed to analyze it. In the present study, two commercial kits based on the SCD technique (Halosperm® and SDFA) were evaluated by the use of the DNA fragmentation module of the ISAS® v1 CASA system. Seven semen samples from volunteers were analyzed. To compare the results between techniques, the Kruskal-Wallis test was used. Data were used for calculation of Principal Components (two PCs were obtained), and subsequent subpopulations were identified using the Halo, Halo/Core Ratio, and PC data. Results from both kits were significantly different (P < 0.001). In each case, four subpopulations were obtained, independently of the classification method used. The distribution of subpopulations differed depending on the kit used. From the PC data, a discriminant analysis matrix was obtained and a good a posteriori classification was obtained (97.1% for Halosperm and 96.6% for SDFA). The present results are the first approach on morphometric evaluation of DNA fragmentation from the SCD technique. This approach could be used for the future definition of a classification matrix surpassing the current subjective evaluation of this important sperm factor.

  17. Development of algorithms for detecting citrus canker based on hyperspectral reflectance imaging.

    PubMed

    Li, Jiangbo; Rao, Xiuqin; Ying, Yibin

    2012-01-15

    Automated discrimination of fruits with canker from other fruit with normal surface and different types of peel defects has become a helpful task to enhance the competitiveness and profitability of the citrus industry. Over the last several years, hyperspectral imaging technology has received increasing attention in the agricultural products inspection field. This paper studied the feasibility of classification of citrus canker from other peel conditions including normal surface and nine peel defects by hyperspectral imaging. A combination algorithm based on principal component analysis and the two-band ratio (Q(687/630)) method was proposed. Since fewer wavelengths were desired in order to develop a rapid multispectral imaging system, the canker classification performance of the two-band ratio (Q(687/630)) method alone was also evaluated. The proposed combination approach and two-band ratio method alone resulted in overall classification accuracy for training set samples and test set samples of 99.5%, 84.5% and 98.2%, 82.9%, respectively. The proposed combination approach was more efficient for classifying canker against various conditions under reflectance hyperspectral imagery. However, the two-band ratio (Q(687/630)) method alone also demonstrated effectiveness in discriminating citrus canker from normal fruit and other peel diseases except for copper burn and anthracnose. Copyright © 2011 Society of Chemical Industry.
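
    The two-band ratio named in this abstract can be sketched as follows, assuming Q(687/630) denotes reflectance at 687 nm divided by reflectance at 630 nm. The function names, the threshold value, and the choice to flag pixels above (rather than below) the threshold are hypothetical; the abstract does not state the decision rule.

```python
def band_ratio(r_687, r_630, eps=1e-12):
    """Return the two-band ratio Q = R(687 nm) / R(630 nm).

    `eps` guards against division by zero for dark pixels; it is an
    implementation assumption, not part of the published method.
    """
    return r_687 / max(r_630, eps)


def flag_pixels(pairs, threshold):
    """Flag each (R687, R630) pair whose ratio exceeds `threshold`.

    Which side of the threshold corresponds to canker is hypothetical here.
    """
    return [band_ratio(r_687, r_630) > threshold for r_687, r_630 in pairs]


# Hypothetical reflectance pairs and threshold, for illustration only.
flags = flag_pixels([(0.5, 0.25), (0.2, 0.4)], threshold=1.0)
```

    A ratio of two bands needs only two wavelengths per pixel, which is consistent with the abstract's stated motivation of a rapid multispectral system.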

  18. Non-targeted 1H NMR fingerprinting and multivariate statistical analyses for the characterisation of the geographical origin of Italian sweet cherries.

    PubMed

    Longobardi, F; Ventrella, A; Bianco, A; Catucci, L; Cafagna, I; Gallo, V; Mastrorilli, P; Agostiano, A

    2013-12-01

    In this study, non-targeted 1H NMR fingerprinting was used in combination with multivariate statistical techniques for the classification of Italian sweet cherries based on their different geographical origins (Emilia Romagna and Puglia). As classification techniques, Soft Independent Modelling of Class Analogy (SIMCA), Partial Least Squares Discriminant Analysis (PLS-DA), and Linear Discriminant Analysis (LDA) were carried out and the results were compared. For LDA, before performing a refined selection of the number/combination of variables, two different strategies for a preliminary reduction of the variable number were tested. The best average recognition and CV prediction abilities (both 100.0%) were obtained for all the LDA models, although PLS-DA also showed remarkable performances (94.6%). All the statistical models were validated by observing the prediction abilities with respect to an external set of cherry samples. The best result (94.9%) was obtained with LDA by performing a best subset selection procedure on a set of 30 principal components previously selected by a stepwise decorrelation. The metabolites that contributed most to the classification performance of this LDA model were found to be malate, glucose, fructose, glutamine and succinate. Copyright © 2013 Elsevier Ltd. All rights reserved.

  19. 42 CFR 412.60 - DRG classification and weighting factors.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 42 Public Health 2 2013-10-01 2013-10-01 false DRG classification and weighting factors. 412.60... discharge is based, as appropriate, on the patient's age, sex, principal diagnosis (that is, the diagnosis...), secondary diagnoses, procedures performed, and discharge status. (2) Each discharge is assigned to only one...

  20. 42 CFR 412.60 - DRG classification and weighting factors.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 42 Public Health 2 2012-10-01 2012-10-01 false DRG classification and weighting factors. 412.60... discharge is based, as appropriate, on the patient's age, sex, principal diagnosis (that is, the diagnosis...), secondary diagnoses, procedures performed, and discharge status. (2) Each discharge is assigned to only one...
