Sample records for component analysis pca-based

  1. Principal component and spatial correlation analysis of spectroscopic-imaging data in scanning probe microscopy.

    PubMed

    Jesse, Stephen; Kalinin, Sergei V

    2009-02-25

    An approach for the analysis of multi-dimensional, spectroscopic-imaging data based on principal component analysis (PCA) is explored. PCA selects and ranks relevant response components based on variance within the data. It is shown that for examples with small relative variations between spectra, the first few PCA components closely coincide with results obtained using model fitting, and this is achieved at rates approximately four orders of magnitude faster. For cases with strong response variations, PCA allows an effective approach to rapidly process, de-noise, and compress data. The prospects for PCA combined with correlation function analysis of component maps as a universal tool for data analysis and representation in microscopy are discussed.

  2. Priority of VHS Development Based in Potential Area using Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Meirawan, D.; Ana, A.; Saripudin, S.

    2018-02-01

    The current condition of VHS is still inadequate in quality, quantity and relevance. The purpose of this research is to analyse the development of VHS based on the development of regional potential by using principal component analysis (PCA) in Bandung, Indonesia. This study used descriptive qualitative data analysis using the principle of secondary data reduction component. The method used is Principal Component Analysis (PCA) analysis with Minitab Statistics Software tool. The results of this study indicate the value of the lowest requirement is a priority of the construction of development VHS with a program of majors in accordance with the development of regional potential. Based on the PCA score found that the main priority in the development of VHS in Bandung is in Saguling, which has the lowest PCA value of 416.92 in area 1, Cihampelas with the lowest PCA value in region 2 and Padalarang with the lowest PCA value.

  3. Analysis of the principal component algorithm in phase-shifting interferometry.

    PubMed

    Vargas, J; Quiroga, J Antonio; Belenguer, T

    2011-06-15

    We recently presented a new asynchronous demodulation method for phase-sampling interferometry. The method is based in the principal component analysis (PCA) technique. In the former work, the PCA method was derived heuristically. In this work, we present an in-depth analysis of the PCA demodulation method.

  4. A stable systemic risk ranking in China's banking sector: Based on principal component analysis

    NASA Astrophysics Data System (ADS)

    Fang, Libing; Xiao, Binqing; Yu, Honghai; You, Qixing

    2018-02-01

    In this paper, we compare five popular systemic risk rankings, and apply principal component analysis (PCA) model to provide a stable systemic risk ranking for the Chinese banking sector. Our empirical results indicate that five methods suggest vastly different systemic risk rankings for the same bank, while the combined systemic risk measure based on PCA provides a reliable ranking. Furthermore, according to factor loadings of the first component, PCA combined ranking is mainly based on fundamentals instead of market price data. We clearly find that price-based rankings are not as practical a method as fundamentals-based ones. This PCA combined ranking directly shows systemic risk contributions of each bank for banking supervision purpose and reminds banks to prevent and cope with the financial crisis in advance.

  5. A stock market forecasting model combining two-directional two-dimensional principal component analysis and radial basis function neural network.

    PubMed

    Guo, Zhiqiang; Wang, Huaiqing; Yang, Jie; Miller, David J

    2015-01-01

    In this paper, we propose and implement a hybrid model combining two-directional two-dimensional principal component analysis ((2D)2PCA) and a Radial Basis Function Neural Network (RBFNN) to forecast stock market behavior. First, 36 stock market technical variables are selected as the input features, and a sliding window is used to obtain the input data of the model. Next, (2D)2PCA is utilized to reduce the dimension of the data and extract its intrinsic features. Finally, an RBFNN accepts the data processed by (2D)2PCA to forecast the next day's stock price or movement. The proposed model is used on the Shanghai stock market index, and the experiments show that the model achieves a good level of fitness. The proposed model is then compared with one that uses the traditional dimension reduction method principal component analysis (PCA) and independent component analysis (ICA). The empirical results show that the proposed model outperforms the PCA-based model, as well as alternative models based on ICA and on the multilayer perceptron.

  6. A Stock Market Forecasting Model Combining Two-Directional Two-Dimensional Principal Component Analysis and Radial Basis Function Neural Network

    PubMed Central

    Guo, Zhiqiang; Wang, Huaiqing; Yang, Jie; Miller, David J.

    2015-01-01

    In this paper, we propose and implement a hybrid model combining two-directional two-dimensional principal component analysis ((2D)2PCA) and a Radial Basis Function Neural Network (RBFNN) to forecast stock market behavior. First, 36 stock market technical variables are selected as the input features, and a sliding window is used to obtain the input data of the model. Next, (2D)2PCA is utilized to reduce the dimension of the data and extract its intrinsic features. Finally, an RBFNN accepts the data processed by (2D)2PCA to forecast the next day's stock price or movement. The proposed model is used on the Shanghai stock market index, and the experiments show that the model achieves a good level of fitness. The proposed model is then compared with one that uses the traditional dimension reduction method principal component analysis (PCA) and independent component analysis (ICA). The empirical results show that the proposed model outperforms the PCA-based model, as well as alternative models based on ICA and on the multilayer perceptron. PMID:25849483

  7. PCA-LBG-based algorithms for VQ codebook generation

    NASA Astrophysics Data System (ADS)

    Tsai, Jinn-Tsong; Yang, Po-Yuan

    2015-04-01

    Vector quantisation (VQ) codebooks are generated by combining principal component analysis (PCA) algorithms with Linde-Buzo-Gray (LBG) algorithms. All training vectors are grouped according to the projected values of the principal components. The PCA-LBG-based algorithms include (1) PCA-LBG-Median, which selects the median vector of each group, (2) PCA-LBG-Centroid, which adopts the centroid vector of each group, and (3) PCA-LBG-Random, which randomly selects a vector of each group. The LBG algorithm finds a codebook based on the better vectors sent to an initial codebook by the PCA. The PCA performs an orthogonal transformation to convert a set of potentially correlated variables into a set of variables that are not linearly correlated. Because the orthogonal transformation efficiently distinguishes test image vectors, the proposed PCA-LBG-based algorithm is expected to outperform conventional algorithms in designing VQ codebooks. The experimental results confirm that the proposed PCA-LBG-based algorithms indeed obtain better results compared to existing methods reported in the literature.

  8. Experimental Researches on the Durability Indicators and the Physiological Comfort of Fabrics using the Principal Component Analysis (PCA) Method

    NASA Astrophysics Data System (ADS)

    Hristian, L.; Ostafe, M. M.; Manea, L. R.; Apostol, L. L.

    2017-06-01

    The work pursued the distribution of combed wool fabrics destined to manufacturing of external articles of clothing in terms of the values of durability and physiological comfort indices, using the mathematical model of Principal Component Analysis (PCA). Principal Components Analysis (PCA) applied in this study is a descriptive method of the multivariate analysis/multi-dimensional data, and aims to reduce, under control, the number of variables (columns) of the matrix data as much as possible to two or three. Therefore, based on the information about each group/assortment of fabrics, it is desired that, instead of nine inter-correlated variables, to have only two or three new variables called components. The PCA target is to extract the smallest number of components which recover the most of the total information contained in the initial data.

  9. Application of principal component analysis to distinguish patients with schizophrenia from healthy controls based on fractional anisotropy measurements.

    PubMed

    Caprihan, A; Pearlson, G D; Calhoun, V D

    2008-08-15

    Principal component analysis (PCA) is often used to reduce the dimension of data before applying more sophisticated data analysis methods such as non-linear classification algorithms or independent component analysis. This practice is based on selecting components corresponding to the largest eigenvalues. If the ultimate goal is separation of data in two groups, then these set of components need not have the most discriminatory power. We measured the distance between two such populations using Mahalanobis distance and chose the eigenvectors to maximize it, a modified PCA method, which we call the discriminant PCA (DPCA). DPCA was applied to diffusion tensor-based fractional anisotropy images to distinguish age-matched schizophrenia subjects from healthy controls. The performance of the proposed method was evaluated by the one-leave-out method. We show that for this fractional anisotropy data set, the classification error with 60 components was close to the minimum error and that the Mahalanobis distance was twice as large with DPCA, than with PCA. Finally, by masking the discriminant function with the white matter tracts of the Johns Hopkins University atlas, we identified left superior longitudinal fasciculus as the tract which gave the least classification error. In addition, with six optimally chosen tracts the classification error was zero.

  10. Identification and classification of upper limb motions using PCA.

    PubMed

    Veer, Karan; Vig, Renu

    2018-03-28

    This paper describes the utility of principal component analysis (PCA) in classifying upper limb signals. PCA is a powerful tool for analyzing data of high dimension. Here, two different input strategies were explored. The first method uses upper arm dual-position-based myoelectric signal acquisition and the other solely uses PCA for classifying surface electromyogram (SEMG) signals. SEMG data from the biceps and the triceps brachii muscles and four independent muscle activities of the upper arm were measured in seven subjects (total dataset=56). The datasets used for the analysis are rotated by class-specific principal component matrices to decorrelate the measured data prior to feature extraction.

  11. Developing and Evaluating Creativity Gamification Rehabilitation System: The Application of PCA-ANFIS Based Emotions Model

    ERIC Educational Resources Information Center

    Su, Chung-Ho; Cheng, Ching-Hsue

    2016-01-01

    This study aims to explore the factors in a patient's rehabilitation achievement after a total knee replacement (TKR) patient exercises, using a PCA-ANFIS emotion model-based game rehabilitation system, which combines virtual reality (VR) and motion capture technology. The researchers combine a principal component analysis (PCA) and an adaptive…

  12. 2L-PCA: a two-level principal component analyzer for quantitative drug design and its applications.

    PubMed

    Du, Qi-Shi; Wang, Shu-Qing; Xie, Neng-Zhong; Wang, Qing-Yan; Huang, Ri-Bo; Chou, Kuo-Chen

    2017-09-19

    A two-level principal component predictor (2L-PCA) was proposed based on the principal component analysis (PCA) approach. It can be used to quantitatively analyze various compounds and peptides about their functions or potentials to become useful drugs. One level is for dealing with the physicochemical properties of drug molecules, while the other level is for dealing with their structural fragments. The predictor has the self-learning and feedback features to automatically improve its accuracy. It is anticipated that 2L-PCA will become a very useful tool for timely providing various useful clues during the process of drug development.

  13. A new statistical PCA-ICA algorithm for location of R-peaks in ECG.

    PubMed

    Chawla, M P S; Verma, H K; Kumar, Vinod

    2008-09-16

    The success of ICA to separate the independent components from the mixture depends on the properties of the electrocardiogram (ECG) recordings. This paper discusses some of the conditions of independent component analysis (ICA) that could affect the reliability of the separation and evaluation of issues related to the properties of the signals and number of sources. Principal component analysis (PCA) scatter plots are plotted to indicate the diagnostic features in the presence and absence of base-line wander in interpreting the ECG signals. In this analysis, a newly developed statistical algorithm by authors, based on the use of combined PCA-ICA for two correlated channels of 12-channel ECG data is proposed. ICA technique has been successfully implemented in identifying and removal of noise and artifacts from ECG signals. Cleaned ECG signals are obtained using statistical measures like kurtosis and variance of variance after ICA processing. This analysis also paper deals with the detection of QRS complexes in electrocardiograms using combined PCA-ICA algorithm. The efficacy of the combined PCA-ICA algorithm lies in the fact that the location of the R-peaks is bounded from above and below by the location of the cross-over points, hence none of the peaks are ignored or missed.

  14. Stationary Wavelet-based Two-directional Two-dimensional Principal Component Analysis for EMG Signal Classification

    NASA Astrophysics Data System (ADS)

    Ji, Yi; Sun, Shanlin; Xie, Hong-Bo

    2017-06-01

    Discrete wavelet transform (WT) followed by principal component analysis (PCA) has been a powerful approach for the analysis of biomedical signals. Wavelet coefficients at various scales and channels were usually transformed into a one-dimensional array, causing issues such as the curse of dimensionality dilemma and small sample size problem. In addition, lack of time-shift invariance of WT coefficients can be modeled as noise and degrades the classifier performance. In this study, we present a stationary wavelet-based two-directional two-dimensional principal component analysis (SW2D2PCA) method for the efficient and effective extraction of essential feature information from signals. Time-invariant multi-scale matrices are constructed in the first step. The two-directional two-dimensional principal component analysis then operates on the multi-scale matrices to reduce the dimension, rather than vectors in conventional PCA. Results are presented from an experiment to classify eight hand motions using 4-channel electromyographic (EMG) signals recorded in healthy subjects and amputees, which illustrates the efficiency and effectiveness of the proposed method for biomedical signal analysis.

  15. PCA based clustering for brain tumor segmentation of T1w MRI images.

    PubMed

    Kaya, Irem Ersöz; Pehlivanlı, Ayça Çakmak; Sekizkardeş, Emine Gezmez; Ibrikci, Turgay

    2017-03-01

    Medical images are huge collections of information that are difficult to store and process consuming extensive computing time. Therefore, the reduction techniques are commonly used as a data pre-processing step to make the image data less complex so that a high-dimensional data can be identified by an appropriate low-dimensional representation. PCA is one of the most popular multivariate methods for data reduction. This paper is focused on T1-weighted MRI images clustering for brain tumor segmentation with dimension reduction by different common Principle Component Analysis (PCA) algorithms. Our primary aim is to present a comparison between different variations of PCA algorithms on MRIs for two cluster methods. Five most common PCA algorithms; namely the conventional PCA, Probabilistic Principal Component Analysis (PPCA), Expectation Maximization Based Principal Component Analysis (EM-PCA), Generalize Hebbian Algorithm (GHA), and Adaptive Principal Component Extraction (APEX) were applied to reduce dimensionality in advance of two clustering algorithms, K-Means and Fuzzy C-Means. In the study, the T1-weighted MRI images of the human brain with brain tumor were used for clustering. In addition to the original size of 512 lines and 512 pixels per line, three more different sizes, 256 × 256, 128 × 128 and 64 × 64, were included in the study to examine their effect on the methods. The obtained results were compared in terms of both the reconstruction errors and the Euclidean distance errors among the clustered images containing the same number of principle components. According to the findings, the PPCA obtained the best results among all others. Furthermore, the EM-PCA and the PPCA assisted K-Means algorithm to accomplish the best clustering performance in the majority as well as achieving significant results with both clustering algorithms for all size of T1w MRI images. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  16. Contact- and distance-based principal component analysis of protein dynamics.

    PubMed

    Ernst, Matthias; Sittel, Florian; Stock, Gerhard

    2015-12-28

    To interpret molecular dynamics simulations of complex systems, systematic dimensionality reduction methods such as principal component analysis (PCA) represent a well-established and popular approach. Apart from Cartesian coordinates, internal coordinates, e.g., backbone dihedral angles or various kinds of distances, may be used as input data in a PCA. Adopting two well-known model problems, folding of villin headpiece and the functional dynamics of BPTI, a systematic study of PCA using distance-based measures is presented which employs distances between Cα-atoms as well as distances between inter-residue contacts including side chains. While this approach seems prohibitive for larger systems due to the quadratic scaling of the number of distances with the size of the molecule, it is shown that it is sufficient (and sometimes even better) to include only relatively few selected distances in the analysis. The quality of the PCA is assessed by considering the resolution of the resulting free energy landscape (to identify metastable conformational states and barriers) and the decay behavior of the corresponding autocorrelation functions (to test the time scale separation of the PCA). By comparing results obtained with distance-based, dihedral angle, and Cartesian coordinates, the study shows that the choice of input variables may drastically influence the outcome of a PCA.

  17. Contact- and distance-based principal component analysis of protein dynamics

    NASA Astrophysics Data System (ADS)

    Ernst, Matthias; Sittel, Florian; Stock, Gerhard

    2015-12-01

    To interpret molecular dynamics simulations of complex systems, systematic dimensionality reduction methods such as principal component analysis (PCA) represent a well-established and popular approach. Apart from Cartesian coordinates, internal coordinates, e.g., backbone dihedral angles or various kinds of distances, may be used as input data in a PCA. Adopting two well-known model problems, folding of villin headpiece and the functional dynamics of BPTI, a systematic study of PCA using distance-based measures is presented which employs distances between Cα-atoms as well as distances between inter-residue contacts including side chains. While this approach seems prohibitive for larger systems due to the quadratic scaling of the number of distances with the size of the molecule, it is shown that it is sufficient (and sometimes even better) to include only relatively few selected distances in the analysis. The quality of the PCA is assessed by considering the resolution of the resulting free energy landscape (to identify metastable conformational states and barriers) and the decay behavior of the corresponding autocorrelation functions (to test the time scale separation of the PCA). By comparing results obtained with distance-based, dihedral angle, and Cartesian coordinates, the study shows that the choice of input variables may drastically influence the outcome of a PCA.

  18. An Intelligent Architecture Based on Field Programmable Gate Arrays Designed to Detect Moving Objects by Using Principal Component Analysis

    PubMed Central

    Bravo, Ignacio; Mazo, Manuel; Lázaro, José L.; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel

    2010-01-01

    This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices. PMID:22163406

  19. An intelligent architecture based on Field Programmable Gate Arrays designed to detect moving objects by using Principal Component Analysis.

    PubMed

    Bravo, Ignacio; Mazo, Manuel; Lázaro, José L; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel

    2010-01-01

    This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices.

  20. A measure for objects clustering in principal component analysis biplot: A case study in inter-city buses maintenance cost data

    NASA Astrophysics Data System (ADS)

    Ginanjar, Irlandia; Pasaribu, Udjianna S.; Indratno, Sapto W.

    2017-03-01

    This article presents the application of the principal component analysis (PCA) biplot for the needs of data mining. This article aims to simplify and objectify the methods for objects clustering in PCA biplot. The novelty of this paper is to get a measure that can be used to objectify the objects clustering in PCA biplot. Orthonormal eigenvectors, which are the coefficients of a principal component model representing an association between principal components and initial variables. The existence of the association is a valid ground to objects clustering based on principal axes value, thus if m principal axes used in the PCA, then the objects can be classified into 2m clusters. The inter-city buses are clustered based on maintenance costs data by using two principal axes PCA biplot. The buses are clustered into four groups. The first group is the buses with high maintenance costs, especially for lube, and brake canvass. The second group is the buses with high maintenance costs, especially for tire, and filter. The third group is the buses with low maintenance costs, especially for lube, and brake canvass. The fourth group is buses with low maintenance costs, especially for tire, and filter.

  1. From measurements to metrics: PCA-based indicators of cyber anomaly

    NASA Astrophysics Data System (ADS)

    Ahmed, Farid; Johnson, Tommy; Tsui, Sonia

    2012-06-01

    We present a framework of the application of Principal Component Analysis (PCA) to automatically obtain meaningful metrics from intrusion detection measurements. In particular, we report the progress made in applying PCA to analyze the behavioral measurements of malware and provide some preliminary results in selecting dominant attributes from an arbitrary number of malware attributes. The results will be useful in formulating an optimal detection threshold in the principal component space, which can both validate and augment existing malware classifiers.

  2. A diffusion-matched principal component analysis (DM-PCA) based two-channel denoising procedure for high-resolution diffusion-weighted MRI

    PubMed Central

    Chang, Hing-Chiu; Bilgin, Ali; Bernstein, Adam; Trouard, Theodore P.

    2018-01-01

    Over the past several years, significant efforts have been made to improve the spatial resolution of diffusion-weighted imaging (DWI), aiming at better detecting subtle lesions and more reliably resolving white-matter fiber tracts. A major concern with high-resolution DWI is the limited signal-to-noise ratio (SNR), which may significantly offset the advantages of high spatial resolution. Although the SNR of DWI data can be improved by denoising in post-processing, existing denoising procedures may potentially reduce the anatomic resolvability of high-resolution imaging data. Additionally, non-Gaussian noise induced signal bias in low-SNR DWI data may not always be corrected with existing denoising approaches. Here we report an improved denoising procedure, termed diffusion-matched principal component analysis (DM-PCA), which comprises 1) identifying a group of (not necessarily neighboring) voxels that demonstrate very similar magnitude signal variation patterns along the diffusion dimension, 2) correcting low-frequency phase variations in complex-valued DWI data, 3) performing PCA along the diffusion dimension for real- and imaginary-components (in two separate channels) of phase-corrected DWI voxels with matched diffusion properties, 4) suppressing the noisy PCA components in real- and imaginary-components, separately, of phase-corrected DWI data, and 5) combining real- and imaginary-components of denoised DWI data. Our data show that the new two-channel (i.e., for real- and imaginary-components) DM-PCA denoising procedure performs reliably without noticeably compromising anatomic resolvability. Non-Gaussian noise induced signal bias could also be reduced with the new denoising method. The DM-PCA based denoising procedure should prove highly valuable for high-resolution DWI studies in research and clinical uses. PMID:29694400

  3. Application of principal component analysis for improvement of X-ray fluorescence images obtained by polycapillary-based micro-XRF technique

    NASA Astrophysics Data System (ADS)

    Aida, S.; Matsuno, T.; Hasegawa, T.; Tsuji, K.

    2017-07-01

    Micro X-ray fluorescence (micro-XRF) analysis is repeated as a means of producing elemental maps. In some cases, however, the XRF images of trace elements that are obtained are not clear due to high background intensity. To solve this problem, we applied principal component analysis (PCA) to XRF spectra. We focused on improving the quality of XRF images by applying PCA. XRF images of the dried residue of standard solution on the glass substrate were taken. The XRF intensities for the dried residue were analyzed before and after PCA. Standard deviations of XRF intensities in the PCA-filtered images were improved, leading to clear contrast of the images. This improvement of the XRF images was effective in cases where the XRF intensity was weak.

  4. Research on distributed heterogeneous data PCA algorithm based on cloud platform

    NASA Astrophysics Data System (ADS)

    Zhang, Jin; Huang, Gang

    2018-05-01

    Principal component analysis (PCA) of heterogeneous data sets can solve the problem that centralized data scalability is limited. In order to reduce the generation of intermediate data and error components of distributed heterogeneous data sets, a principal component analysis algorithm based on heterogeneous data sets under cloud platform is proposed. The algorithm performs eigenvalue processing by using Householder tridiagonalization and QR factorization to calculate the error component of the heterogeneous database associated with the public key to obtain the intermediate data set and the lost information. Experiments on distributed DBM heterogeneous datasets show that the model method has the feasibility and reliability in terms of execution time and accuracy.

  5. Fast principal component analysis for stacking seismic data

    NASA Astrophysics Data System (ADS)

    Wu, Juan; Bai, Min

    2018-04-01

    Stacking seismic data plays an indispensable role in many steps of the seismic data processing and imaging workflow. Optimal stacking of seismic data can help mitigate seismic noise and enhance the principal components to a great extent. Traditional average-based seismic stacking methods cannot obtain optimal performance when the ambient noise is extremely strong. We propose a principal component analysis (PCA) algorithm for stacking seismic data without being sensitive to noise level. Considering the computational bottleneck of the classic PCA algorithm in processing massive seismic data, we propose an efficient PCA algorithm to make the proposed method readily applicable for industrial applications. Two numerically designed examples and one real seismic data are used to demonstrate the performance of the presented method.

  6. Nonlinear Principal Components Analysis: Introduction and Application

    ERIC Educational Resources Information Center

    Linting, Marielle; Meulman, Jacqueline J.; Groenen, Patrick J. F.; van der Koojj, Anita J.

    2007-01-01

    The authors provide a didactic treatment of nonlinear (categorical) principal components analysis (PCA). This method is the nonlinear equivalent of standard PCA and reduces the observed variables to a number of uncorrelated principal components. The most important advantages of nonlinear over linear PCA are that it incorporates nominal and ordinal…

  7. IMPROVED SEARCH OF PRINCIPAL COMPONENT ANALYSIS DATABASES FOR SPECTRO-POLARIMETRIC INVERSION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Casini, R.; Lites, B. W.; Ramos, A. Asensio

    2013-08-20

    We describe a simple technique for the acceleration of spectro-polarimetric inversions based on principal component analysis (PCA) of Stokes profiles. This technique involves the indexing of the database models based on the sign of the projections (PCA coefficients) of the first few relevant orders of principal components of the four Stokes parameters. In this way, each model in the database can be attributed a distinctive binary number of 2{sup 4n} bits, where n is the number of PCA orders used for the indexing. Each of these binary numbers (indices) identifies a group of ''compatible'' models for the inversion of amore » given set of observed Stokes profiles sharing the same index. The complete set of the binary numbers so constructed evidently determines a partition of the database. The search of the database for the PCA inversion of spectro-polarimetric data can profit greatly from this indexing. In practical cases it becomes possible to approach the ideal acceleration factor of 2{sup 4n} as compared to the systematic search of a non-indexed database for a traditional PCA inversion. This indexing method relies on the existence of a physical meaning in the sign of the PCA coefficients of a model. For this reason, the presence of model ambiguities and of spectro-polarimetric noise in the observations limits in practice the number n of relevant PCA orders that can be used for the indexing.« less

  8. Image restoration for three-dimensional fluorescence microscopy using an orthonormal basis for efficient representation of depth-variant point-spread functions

    PubMed Central

    Patwary, Nurmohammed; Preza, Chrysanthe

    2015-01-01

    A depth-variant (DV) image restoration algorithm for wide field fluorescence microscopy, using an orthonormal basis decomposition of DV point-spread functions (PSFs), is investigated in this study. The efficient PSF representation is based on a previously developed principal component analysis (PCA), which is computationally intensive. We present an approach developed to reduce the number of DV PSFs required for the PCA computation, thereby making the PCA-based approach computationally tractable for thick samples. Restoration results from both synthetic and experimental images show consistency and that the proposed algorithm addresses efficiently depth-induced aberration using a small number of principal components. Comparison of the PCA-based algorithm with a previously-developed strata-based DV restoration algorithm demonstrates that the proposed method improves performance by 50% in terms of accuracy and simultaneously reduces the processing time by 64% using comparable computational resources. PMID:26504634

  9. Exploring patterns enriched in a dataset with contrastive principal component analysis.

    PubMed

    Abid, Abubakar; Zhang, Martin J; Bagaria, Vivek K; Zou, James

    2018-05-30

    Visualization and exploration of high-dimensional data is a ubiquitous challenge across disciplines. Widely used techniques such as principal component analysis (PCA) aim to identify dominant trends in one dataset. However, in many settings we have datasets collected under different conditions, e.g., a treatment and a control experiment, and we are interested in visualizing and exploring patterns that are specific to one dataset. This paper proposes a method, contrastive principal component analysis (cPCA), which identifies low-dimensional structures that are enriched in a dataset relative to comparison data. In a wide variety of experiments, we demonstrate that cPCA with a background dataset enables us to visualize dataset-specific patterns missed by PCA and other standard methods. We further provide a geometric interpretation of cPCA and strong mathematical guarantees. An implementation of cPCA is publicly available, and can be used for exploratory data analysis in many applications where PCA is currently used.

  10. Short-term PV/T module temperature prediction based on PCA-RBF neural network

    NASA Astrophysics Data System (ADS)

    Li, Jiyong; Zhao, Zhendong; Li, Yisheng; Xiao, Jing; Tang, Yunfeng

    2018-02-01

    Aiming at the non-linearity and large inertia of temperature control in PV/T system, short-term temperature prediction of PV/T module is proposed, to make the PV/T system controller run forward according to the short-term forecasting situation to optimize control effect. Based on the analysis of the correlation between PV/T module temperature and meteorological factors, and the temperature of adjacent time series, the principal component analysis (PCA) method is used to pre-process the original input sample data. Combined with the RBF neural network theory, the simulation results show that the PCA method makes the prediction accuracy of the network model higher and the generalization performance stronger than that of the RBF neural network without the main component extraction.

  11. Cluster and principal component analysis based on SSR markers of Amomum tsao-ko in Jinping County of Yunnan Province

    NASA Astrophysics Data System (ADS)

    Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Xianwang, Zhou; Lu, Bingyue

    2017-08-01

    Amomum tsao-ko is a commercial plant that used for various purposes in medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Clusters analysis and 2-dimensional principal component analysis (PCA) was used to represent the genetic relations among Amomum tsao-ko by using simple sequence repeat (SSR) markers. Clustering analysis clearly distinguished the samples groups. Two major clusters were formed; first (Cluster I) consisted of 34 individuals, the second (Cluster II) consisted of 10 individuals, Cluster I as the main group contained multiple sub-clusters. PCA also showed 2 groups: PCA Group 1 included 29 individuals, PCA Group 2 included 12 individuals, consistent with the results of cluster analysis. The purpose of the present investigation was to provide information on genetic relationship of Amomum tsao-ko germplasm resources in main producing areas, also provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.

  12. Applying robust variant of Principal Component Analysis as a damage detector in the presence of outliers

    NASA Astrophysics Data System (ADS)

    Gharibnezhad, Fahit; Mujica, Luis E.; Rodellar, José

    2015-01-01

    Using Principal Component Analysis (PCA) for Structural Health Monitoring (SHM) has received considerable attention over the past few years. PCA has been used not only as a direct method to identify, classify and localize damages but also as a significant primary step for other methods. Despite several positive specifications that PCA conveys, it is very sensitive to outliers. Outliers are anomalous observations that can affect the variance and the covariance as vital parts of PCA method. Therefore, the results based on PCA in the presence of outliers are not fully satisfactory. As a main contribution, this work suggests the use of robust variant of PCA not sensitive to outliers, as an effective way to deal with this problem in SHM field. In addition, the robust PCA is compared with the classical PCA in the sense of detecting probable damages. The comparison between the results shows that robust PCA can distinguish the damages much better than using classical one, and even in many cases allows the detection where classic PCA is not able to discern between damaged and non-damaged structures. Moreover, different types of robust PCA are compared with each other as well as with classical counterpart in the term of damage detection. All the results are obtained through experiments with an aircraft turbine blade using piezoelectric transducers as sensors and actuators and adding simulated damages.

  13. Q-mode versus R-mode principal component analysis for linear discriminant analysis (LDA)

    NASA Astrophysics Data System (ADS)

    Lee, Loong Chuen; Liong, Choong-Yeun; Jemain, Abdul Aziz

    2017-05-01

    Many literature apply Principal Component Analysis (PCA) as either preliminary visualization or variable con-struction methods or both. Focus of PCA can be on the samples (R-mode PCA) or variables (Q-mode PCA). Traditionally, R-mode PCA has been the usual approach to reduce high-dimensionality data before the application of Linear Discriminant Analysis (LDA), to solve classification problems. Output from PCA composed of two new matrices known as loadings and scores matrices. Each matrix can then be used to produce a plot, i.e. loadings plot aids identification of important variables whereas scores plot presents spatial distribution of samples on new axes that are also known as Principal Components (PCs). Fundamentally, the scores matrix always be the input variables for building classification model. A recent paper uses Q-mode PCA but the focus of analysis was not on the variables but instead on the samples. As a result, the authors have exchanged the use of both loadings and scores plots in which clustering of samples was studied using loadings plot whereas scores plot has been used to identify important manifest variables. Therefore, the aim of this study is to statistically validate the proposed practice. Evaluation is based on performance of external error obtained from LDA models according to number of PCs. On top of that, bootstrapping was also conducted to evaluate the external error of each of the LDA models. Results show that LDA models produced by PCs from R-mode PCA give logical performance and the matched external error are also unbiased whereas the ones produced with Q-mode PCA show the opposites. With that, we concluded that PCs produced from Q-mode is not statistically stable and thus should not be applied to problems of classifying samples, but variables. We hope this paper will provide some insights on the disputable issues.

  14. Principal component analysis of indocyanine green fluorescence dynamics for diagnosis of vascular diseases

    NASA Astrophysics Data System (ADS)

    Seo, Jihye; An, Yuri; Lee, Jungsul; Choi, Chulhee

    2015-03-01

    Indocyanine green (ICG), a near-infrared fluorophore, has been used in visualization of vascular structure and non-invasive diagnosis of vascular disease. Although many imaging techniques have been developed, there are still limitations in diagnosis of vascular diseases. We have recently developed a minimally invasive diagnostics system based on ICG fluorescence imaging for sensitive detection of vascular insufficiency. In this study, we used principal component analysis (PCA) to examine ICG spatiotemporal profile and to obtain pathophysiological information from ICG dynamics. Here we demonstrated that principal components of ICG dynamics in both feet showed significant differences between normal control and diabetic patients with vascula complications. We extracted the PCA time courses of the first three components and found distinct pattern in diabetic patient. We propose that PCA of ICG dynamics reveal better classification performance compared to fluorescence intensity analysis. We anticipate that specific feature of spatiotemporal ICG dynamics can be useful in diagnosis of various vascular diseases.

  15. How Many Separable Sources? Model Selection In Independent Components Analysis

    PubMed Central

    Woods, Roger P.; Hansen, Lars Kai; Strother, Stephen

    2015-01-01

    Unlike mixtures consisting solely of non-Gaussian sources, mixtures including two or more Gaussian components cannot be separated using standard independent components analysis methods that are based on higher order statistics and independent observations. The mixed Independent Components Analysis/Principal Components Analysis (mixed ICA/PCA) model described here accommodates one or more Gaussian components in the independent components analysis model and uses principal components analysis to characterize contributions from this inseparable Gaussian subspace. Information theory can then be used to select from among potential model categories with differing numbers of Gaussian components. Based on simulation studies, the assumptions and approximations underlying the Akaike Information Criterion do not hold in this setting, even with a very large number of observations. Cross-validation is a suitable, though computationally intensive alternative for model selection. Application of the algorithm is illustrated using Fisher's iris data set and Howells' craniometric data set. Mixed ICA/PCA is of potential interest in any field of scientific investigation where the authenticity of blindly separated non-Gaussian sources might otherwise be questionable. Failure of the Akaike Information Criterion in model selection also has relevance in traditional independent components analysis where all sources are assumed non-Gaussian. PMID:25811988

  16. An Exploratory Study on Using Principal-Component Analysis and Confirmatory Factor Analysis to Identify Bolt-On Dimensions: The EQ-5D Case Study.

    PubMed

    Finch, Aureliano Paolo; Brazier, John Edward; Mukuria, Clara; Bjorner, Jakob Bue

    2017-12-01

    Generic preference-based measures such as the EuroQol five-dimensional questionnaire (EQ-5D) are used in economic evaluation, but may not be appropriate for all conditions. When this happens, a possible solution is adding bolt-ons to expand their descriptive systems. Using review-based methods, studies published to date claimed the relevance of bolt-ons in the presence of poor psychometric results. This approach does not identify the specific dimensions missing from the Generic preference-based measure core descriptive system, and is inappropriate for identifying dimensions that might improve the measure generically. This study explores the use of principal-component analysis (PCA) and confirmatory factor analysis (CFA) for bolt-on identification in the EQ-5D. Data were drawn from the international Multi-Instrument Comparison study, which is an online survey on health and well-being measures in five countries. Analysis was based on a pool of 92 items from nine instruments. Initial content analysis provided a theoretical framework for PCA results interpretation and CFA model development. PCA was used to investigate the underlining dimensional structure and whether EQ-5D items were represented in the identified constructs. CFA was used to confirm the structure. CFA was cross-validated in random halves of the sample. PCA suggested a nine-component solution, which was confirmed by CFA. This included psychological symptoms, physical functioning, and pain, which were covered by the EQ-5D, and satisfaction, speech/cognition,relationships, hearing, vision, and energy/sleep which were not. These latter factors may represent relevant candidate bolt-ons. PCA and CFA appear useful methods for identifying potential bolt-ons dimensions for an instrument such as the EQ-5D. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  17. Using principal component analysis and annual seasonal trend analysis to assess karst rocky desertification in southwestern China.

    PubMed

    Zhang, Zhiming; Ouyang, Zhiyun; Xiao, Yi; Xiao, Yang; Xu, Weihua

    2017-06-01

    Increasing exploitation of karst resources is causing severe environmental degradation because of the fragility and vulnerability of karst areas. By integrating principal component analysis (PCA) with annual seasonal trend analysis (ASTA), this study assessed karst rocky desertification (KRD) within a spatial context. We first produced fractional vegetation cover (FVC) data from a moderate-resolution imaging spectroradiometer normalized difference vegetation index using a dimidiate pixel model. Then, we generated three main components of the annual FVC data using PCA. Subsequently, we generated the slope image of the annual seasonal trends of FVC using median trend analysis. Finally, we combined the three PCA components and annual seasonal trends of FVC with the incidence of KRD for each type of carbonate rock to classify KRD into one of four categories based on K-means cluster analysis: high, moderate, low, and none. The results of accuracy assessments indicated that this combination approach produced greater accuracy and more reasonable KRD mapping than the average FVC based on the vegetation coverage standard. The KRD map for 2010 indicated that the total area of KRD was 78.76 × 10 3  km 2 , which constitutes about 4.06% of the eight southwest provinces of China. The largest KRD areas were found in Yunnan province. The combined PCA and ASTA approach was demonstrated to be an easily implemented, robust, and flexible method for the mapping and assessment of KRD, which can be used to enhance regional KRD management schemes or to address assessment of other environmental issues.

  18. Decision tree and PCA-based fault diagnosis of rotating machinery

    NASA Astrophysics Data System (ADS)

    Sun, Weixiang; Chen, Jin; Li, Jiaqing

    2007-04-01

    After analysing the flaws of conventional fault diagnosis methods, data mining technology is introduced to fault diagnosis field, and a new method based on C4.5 decision tree and principal component analysis (PCA) is proposed. In this method, PCA is used to reduce features after data collection, preprocessing and feature extraction. Then, C4.5 is trained by using the samples to generate a decision tree model with diagnosis knowledge. At last the tree model is used to make diagnosis analysis. To validate the method proposed, six kinds of running states (normal or without any defect, unbalance, rotor radial rub, oil whirl, shaft crack and a simultaneous state of unbalance and radial rub), are simulated on Bently Rotor Kit RK4 to test C4.5 and PCA-based method and back-propagation neural network (BPNN). The result shows that C4.5 and PCA-based diagnosis method has higher accuracy and needs less training time than BPNN.

  19. [Determination of the Plant Origin of Licorice Oil Extract, a Natural Food Additive, by Principal Component Analysis Based on Chemical Components].

    PubMed

    Tada, Atsuko; Ishizuki, Kyoko; Sugimoto, Naoki; Yoshimatsu, Kayo; Kawahara, Nobuo; Suematsu, Takako; Arifuku, Kazunori; Fukai, Toshio; Tamura, Yukiyoshi; Ohtsuki, Takashi; Tahara, Maiko; Yamazaki, Takeshi; Akiyama, Hiroshi

    2015-01-01

    "Licorice oil extract" (LOE) (antioxidant agent) is described in the notice of Japanese food additive regulations as a material obtained from the roots and/or rhizomes of Glycyrrhiza uralensis, G. inflata or G. glabra. In this study, we aimed to identify the original Glycyrrhiza species of eight food additive products using LC/MS. Glabridin, a characteristic compound in G. glabra, was specifically detected in seven products, and licochalcone A, a characteristic compound in G. inflata, was detected in one product. In addition, Principal Component Analysis (PCA) (a kind of multivariate analysis) using the data of LC/MS or (1)H-NMR analysis was performed. The data of thirty-one samples, including LOE products used as food additives, ethanol extracts of various Glycyrrhiza species and commercially available Glycyrrhiza species-derived products were assessed. Based on the PCA results, the majority of LOE products was confirmed to be derived from G. glabra. This study suggests that PCA using (1)H-NMR analysis data is a simple and useful method to identify the plant species of origin of natural food additive products.

  20. Comparison of common components analysis with principal components analysis and independent components analysis: Application to SPME-GC-MS volatolomic signatures.

    PubMed

    Bouhlel, Jihéne; Jouan-Rimbaud Bouveresse, Delphine; Abouelkaram, Said; Baéza, Elisabeth; Jondreville, Catherine; Travel, Angélique; Ratel, Jérémy; Engel, Erwan; Rutledge, Douglas N

    2018-02-01

    The aim of this work is to compare a novel exploratory chemometrics method, Common Components Analysis (CCA), with Principal Components Analysis (PCA) and Independent Components Analysis (ICA). CCA consists in adapting the multi-block statistical method known as Common Components and Specific Weights Analysis (CCSWA or ComDim) by applying it to a single data matrix, with one variable per block. As an application, the three methods were applied to SPME-GC-MS volatolomic signatures of livers in an attempt to reveal volatile organic compounds (VOCs) markers of chicken exposure to different types of micropollutants. An application of CCA to the initial SPME-GC-MS data revealed a drift in the sample Scores along CC2, as a function of injection order, probably resulting from time-related evolution in the instrument. This drift was eliminated by orthogonalization of the data set with respect to CC2, and the resulting data are used as the orthogonalized data input into each of the three methods. Since the first step in CCA is to norm-scale all the variables, preliminary data scaling has no effect on the results, so that CCA was applied only to orthogonalized SPME-GC-MS data, while, PCA and ICA were applied to the "orthogonalized", "orthogonalized and Pareto-scaled", and "orthogonalized and autoscaled" data. The comparison showed that PCA results were highly dependent on the scaling of variables, contrary to ICA where the data scaling did not have a strong influence. Nevertheless, for both PCA and ICA the clearest separations of exposed groups were obtained after autoscaling of variables. The main part of this work was to compare the CCA results using the orthogonalized data with those obtained with PCA and ICA applied to orthogonalized and autoscaled variables. The clearest separations of exposed chicken groups were obtained by CCA. CCA Loadings also clearly identified the variables contributing most to the Common Components giving separations. The PCA Loadings did not highlight the most influencing variables for each separation, whereas the ICA Loadings highlighted the same variables as did CCA. This study shows the potential of CCA for the extraction of pertinent information from a data matrix, using a procedure based on an original optimisation criterion, to produce results that are complementary, and in some cases may be superior, to those of PCA and ICA. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Stability of Nonlinear Principal Components Analysis: An Empirical Study Using the Balanced Bootstrap

    ERIC Educational Resources Information Center

    Linting, Marielle; Meulman, Jacqueline J.; Groenen, Patrick J. F.; van der Kooij, Anita J.

    2007-01-01

    Principal components analysis (PCA) is used to explore the structure of data sets containing linearly related numeric variables. Alternatively, nonlinear PCA can handle possibly nonlinearly related numeric as well as nonnumeric variables. For linear PCA, the stability of its solution can be established under the assumption of multivariate…

  2. The fractal characteristic of facial anthropometric data for developing PCA fit test panels for youth born in central China.

    PubMed

    Yang, Lei; Wei, Ran; Shen, Henggen

    2017-01-01

    New principal component analysis (PCA) respirator fit test panels had been developed for current American and Chinese civilian workers based on anthropometric surveys. The PCA panels used the first two principal components (PCs) obtained from a set of 10 facial dimensions. Although the PCA panels for American and Chinese subjects adopted the bivairate framework with two PCs, the number of the PCs retained in the PCA analysis was different between Chinese subjects and Americans. For the Chinese youth group, the third PC should be retained in the PCA analysis for developing new fit test panels. In this article, an additional number label (ANL) is used to explain the third PC in PCA analysis when the first two PCs are used to construct the PCA half-facepiece respirator fit test panel for Chinese group. The three-dimensional box-counting method is proposed to estimate the ANLs by calculating fractal dimensions of the facial anthropometric data of the Chinese youth. The linear regression coefficients of scale-free range R 2 are all over 0.960, which demonstrates that the facial anthropometric data of the Chinese youth has fractal characteristic. The youth subjects born in Henan province has an ANL of 2.002, which is lower than the composite facial anthropometric data of Chinese subjects born in many provinces. Hence, Henan youth subjects have the self-similar facial anthropometric characteristic and should use the particular ANL (2.002) as the important tool along with using the PCA panel. The ANL method proposed in this article not only provides a new methodology in quantifying the characteristics of facial anthropometric dimensions for any ethnic/racial group, but also extends the scope of PCA panel studies to higher dimensions.

  3. TARGETED PRINCIPLE COMPONENT ANALYSIS: A NEW MOTION ARTIFACT CORRECTION APPROACH FOR NEAR-INFRARED SPECTROSCOPY

    PubMed Central

    YÜCEL, MERYEM A.; SELB, JULIETTE; COOPER, ROBERT J.; BOAS, DAVID A.

    2014-01-01

    As near-infrared spectroscopy (NIRS) broadens its application area to different age and disease groups, motion artifacts in the NIRS signal due to subject movement is becoming an important challenge. Motion artifacts generally produce signal fluctuations that are larger than physiological NIRS signals, thus it is crucial to correct for them before obtaining an estimate of stimulus evoked hemodynamic responses. There are various methods for correction such as principle component analysis (PCA), wavelet-based filtering and spline interpolation. Here, we introduce a new approach to motion artifact correction, targeted principle component analysis (tPCA), which incorporates a PCA filter only on the segments of data identified as motion artifacts. It is expected that this will overcome the issues of filtering desired signals that plagues standard PCA filtering of entire data sets. We compared the new approach with the most effective motion artifact correction algorithms on a set of data acquired simultaneously with a collodion-fixed probe (low motion artifact content) and a standard Velcro probe (high motion artifact content). Our results show that tPCA gives statistically better results in recovering hemodynamic response function (HRF) as compared to wavelet-based filtering and spline interpolation for the Velcro probe. It results in a significant reduction in mean-squared error (MSE) and significant enhancement in Pearson’s correlation coefficient to the true HRF. The collodion-fixed fiber probe with no motion correction performed better than the Velcro probe corrected for motion artifacts in terms of MSE and Pearson’s correlation coefficient. Thus, if the experimental study permits, the use of a collodion-fixed fiber probe may be desirable. If the use of a collodion-fixed probe is not feasible, then we suggest the use of tPCA in the processing of motion artifact contaminated data. PMID:25360181

  4. Independent components analysis to increase efficiency of discriminant analysis methods (FDA and LDA): Application to NMR fingerprinting of wine.

    PubMed

    Monakhova, Yulia B; Godelmann, Rolf; Kuballa, Thomas; Mushtakova, Svetlana P; Rutledge, Douglas N

    2015-08-15

    Discriminant analysis (DA) methods, such as linear discriminant analysis (LDA) or factorial discriminant analysis (FDA), are well-known chemometric approaches for solving classification problems in chemistry. In most applications, principle components analysis (PCA) is used as the first step to generate orthogonal eigenvectors and the corresponding sample scores are utilized to generate discriminant features for the discrimination. Independent components analysis (ICA) based on the minimization of mutual information can be used as an alternative to PCA as a preprocessing tool for LDA and FDA classification. To illustrate the performance of this ICA/DA methodology, four representative nuclear magnetic resonance (NMR) data sets of wine samples were used. The classification was performed regarding grape variety, year of vintage and geographical origin. The average increase for ICA/DA in comparison with PCA/DA in the percentage of correct classification varied between 6±1% and 8±2%. The maximum increase in classification efficiency of 11±2% was observed for discrimination of the year of vintage (ICA/FDA) and geographical origin (ICA/LDA). The procedure to determine the number of extracted features (PCs, ICs) for the optimum DA models was discussed. The use of independent components (ICs) instead of principle components (PCs) resulted in improved classification performance of DA methods. The ICA/LDA method is preferable to ICA/FDA for recognition tasks based on NMR spectroscopic measurements. Copyright © 2015 Elsevier B.V. All rights reserved.

  5. Free energy landscape of a biomolecule in dihedral principal component space: sampling convergence and correspondence between structures and minima.

    PubMed

    Maisuradze, Gia G; Leitner, David M

    2007-05-15

    Dihedral principal component analysis (dPCA) has recently been developed and shown to display complex features of the free energy landscape of a biomolecule that may be absent in the free energy landscape plotted in principal component space due to mixing of internal and overall rotational motion that can occur in principal component analysis (PCA) [Mu et al., Proteins: Struct Funct Bioinfo 2005;58:45-52]. Another difficulty in the implementation of PCA is sampling convergence, which we address here for both dPCA and PCA using a tetrapeptide as an example. We find that for both methods the sampling convergence can be reached over a similar time. Minima in the free energy landscape in the space of the two largest dihedral principal components often correspond to unique structures, though we also find some distinct minima to correspond to the same structure. 2007 Wiley-Liss, Inc.

  6. Subject order-independent group ICA (SOI-GICA) for functional MRI data analysis.

    PubMed

    Zhang, Han; Zuo, Xi-Nian; Ma, Shuang-Ye; Zang, Yu-Feng; Milham, Michael P; Zhu, Chao-Zhe

    2010-07-15

    Independent component analysis (ICA) is a data-driven approach to study functional magnetic resonance imaging (fMRI) data. Particularly, for group analysis on multiple subjects, temporally concatenation group ICA (TC-GICA) is intensively used. However, due to the usually limited computational capability, data reduction with principal component analysis (PCA: a standard preprocessing step of ICA decomposition) is difficult to achieve for a large dataset. To overcome this, TC-GICA employs multiple-stage PCA data reduction. Such multiple-stage PCA data reduction, however, leads to variable outputs due to different subject concatenation orders. Consequently, the ICA algorithm uses the variable multiple-stage PCA outputs and generates variable decompositions. In this study, a rigorous theoretical analysis was conducted to prove the existence of such variability. Simulated and real fMRI experiments were used to demonstrate the subject-order-induced variability of TC-GICA results using multiple PCA data reductions. To solve this problem, we propose a new subject order-independent group ICA (SOI-GICA). Both simulated and real fMRI data experiments demonstrated the high robustness and accuracy of the SOI-GICA results compared to those of traditional TC-GICA. Accordingly, we recommend SOI-GICA for group ICA-based fMRI studies, especially those with large data sets. Copyright 2010 Elsevier Inc. All rights reserved.

  7. Esophageal cancer detection based on tissue surface-enhanced Raman spectroscopy and multivariate analysis

    NASA Astrophysics Data System (ADS)

    Feng, Shangyuan; Lin, Juqiang; Huang, Zufang; Chen, Guannan; Chen, Weisheng; Wang, Yue; Chen, Rong; Zeng, Haishan

    2013-01-01

    The capability of using silver nanoparticle based near-infrared surface enhanced Raman scattering (SERS) spectroscopy combined with principal component analysis (PCA) and linear discriminate analysis (LDA) to differentiate esophageal cancer tissue from normal tissue was presented. Significant differences in Raman intensities of prominent SERS bands were observed between normal and cancer tissues. PCA-LDA multivariate analysis of the measured tissue SERS spectra achieved diagnostic sensitivity of 90.9% and specificity of 97.8%. This exploratory study demonstrated great potential for developing label-free tissue SERS analysis into a clinical tool for esophageal cancer detection.

  8. Comparison of multi-subject ICA methods for analysis of fMRI data

    PubMed Central

    Erhardt, Erik Barry; Rachakonda, Srinivas; Bedrick, Edward; Allen, Elena; Adali, Tülay; Calhoun, Vince D.

    2010-01-01

    Spatial independent component analysis (ICA) applied to functional magnetic resonance imaging (fMRI) data identifies functionally connected networks by estimating spatially independent patterns from their linearly mixed fMRI signals. Several multi-subject ICA approaches estimating subject-specific time courses (TCs) and spatial maps (SMs) have been developed, however there has not yet been a full comparison of the implications of their use. Here, we provide extensive comparisons of four multi-subject ICA approaches in combination with data reduction methods for simulated and fMRI task data. For multi-subject ICA, the data first undergo reduction at the subject and group levels using principal component analysis (PCA). Comparisons of subject-specific, spatial concatenation, and group data mean subject-level reduction strategies using PCA and probabilistic PCA (PPCA) show that computationally intensive PPCA is equivalent to PCA, and that subject-specific and group data mean subject-level PCA are preferred because of well-estimated TCs and SMs. Second, aggregate independent components are estimated using either noise free ICA or probabilistic ICA (PICA). Third, subject-specific SMs and TCs are estimated using back-reconstruction. We compare several direct group ICA (GICA) back-reconstruction approaches (GICA1-GICA3) and an indirect back-reconstruction approach, spatio-temporal regression (STR, or dual regression). Results show the earlier group ICA (GICA1) approximates STR, however STR has contradictory assumptions and may show mixed-component artifacts in estimated SMs. Our evidence-based recommendation is to use GICA3, introduced here, with subject-specific PCA and noise-free ICA, providing the most robust and accurate estimated SMs and TCs in addition to offering an intuitive interpretation. PMID:21162045

  9. Method of Real-Time Principal-Component Analysis

    NASA Technical Reports Server (NTRS)

    Duong, Tuan; Duong, Vu

    2005-01-01

    Dominant-element-based gradient descent and dynamic initial learning rate (DOGEDYN) is a method of sequential principal-component analysis (PCA) that is well suited for such applications as data compression and extraction of features from sets of data. In comparison with a prior method of gradient-descent-based sequential PCA, this method offers a greater rate of learning convergence. Like the prior method, DOGEDYN can be implemented in software. However, the main advantage of DOGEDYN over the prior method lies in the facts that it requires less computation and can be implemented in simpler hardware. It should be possible to implement DOGEDYN in compact, low-power, very-large-scale integrated (VLSI) circuitry that could process data in real time.

  10. [Vis-NIR spectroscopic pattern recognition combined with SG smoothing applied to breed screening of transgenic sugarcane].

    PubMed

    Liu, Gui-Song; Guo, Hao-Song; Pan, Tao; Wang, Ji-Hua; Cao, Gan

    2014-10-01

    Based on Savitzky-Golay (SG) smoothing screening, principal component analysis (PCA) combined with separately supervised linear discriminant analysis (LDA) and unsupervised hierarchical clustering analysis (HCA) were used for non-destructive visible and near-infrared (Vis-NIR) detection for breed screening of transgenic sugarcane. A random and stability-dependent framework of calibration, prediction, and validation was proposed. A total of 456 samples of sugarcane leaves planting in the elongating stage were collected from the field, which was composed of 306 transgenic (positive) samples containing Bt and Bar gene and 150 non-transgenic (negative) samples. A total of 156 samples (negative 50 and positive 106) were randomly selected as the validation set; the remaining samples (negative 100 and positive 200, a total of 300 samples) were used as the modeling set, and then the modeling set was subdivided into calibration (negative 50 and positive 100, a total of 150 samples) and prediction sets (negative 50 and positive 100, a total of 150 samples) for 50 times. The number of SG smoothing points was ex- panded, while some modes of higher derivative were removed because of small absolute value, and a total of 264 smoothing modes were used for screening. The pairwise combinations of first three principal components were used, and then the optimal combination of principal components was selected according to the model effect. Based on all divisions of calibration and prediction sets and all SG smoothing modes, the SG-PCA-LDA and SG-PCA-HCA models were established, the model parameters were optimized based on the average prediction effect for all divisions to produce modeling stability. Finally, the model validation was performed by validation set. With SG smoothing, the modeling accuracy and stability of PCA-LDA, PCA-HCA were signif- icantly improved. For the optimal SG-PCA-LDA model, the recognition rate of positive and negative validation samples were 94.3%, 96.0%; and were 92.5%, 98.0% for the optimal SG-PCA-LDA model, respectively. Vis-NIR spectro- scopic pattern recognition combined with SG smoothing could be used for accurate recognition of transgenic sugarcane leaves, and provided a convenient screening method for transgenic sugarcane breeding.

  11. Perturbational formulation of principal component analysis in molecular dynamics simulation.

    PubMed

    Koyama, Yohei M; Kobayashi, Tetsuya J; Tomoda, Shuji; Ueda, Hiroki R

    2008-10-01

    Conformational fluctuations of a molecule are important to its function since such intrinsic fluctuations enable the molecule to respond to the external environmental perturbations. For extracting large conformational fluctuations, which predict the primary conformational change by the perturbation, principal component analysis (PCA) has been used in molecular dynamics simulations. However, several versions of PCA, such as Cartesian coordinate PCA and dihedral angle PCA (dPCA), are limited to use with molecules with a single dominant state or proteins where the dihedral angle represents an important internal coordinate. Other PCAs with general applicability, such as the PCA using pairwise atomic distances, do not represent the physical meaning clearly. Therefore, a formulation that provides general applicability and clearly represents the physical meaning is yet to be developed. For developing such a formulation, we consider the conformational distribution change by the perturbation with arbitrary linearly independent perturbation functions. Within the second order approximation of the Kullback-Leibler divergence by the perturbation, the PCA can be naturally interpreted as a method for (1) decomposing a given perturbation into perturbations that independently contribute to the conformational distribution change or (2) successively finding the perturbation that induces the largest conformational distribution change. In this perturbational formulation of PCA, (i) the eigenvalue measures the Kullback-Leibler divergence from the unperturbed to perturbed distributions, (ii) the eigenvector identifies the combination of the perturbation functions, and (iii) the principal component determines the probability change induced by the perturbation. Based on this formulation, we propose a PCA using potential energy terms, and we designate it as potential energy PCA (PEPCA). The PEPCA provides both general applicability and clear physical meaning. For demonstrating its power, we apply the PEPCA to an alanine dipeptide molecule in vacuum as a minimal model of a nonsingle dominant conformational biomolecule. The first and second principal components clearly characterize two stable states and the transition state between them. Positive and negative components with larger absolute values of the first and second eigenvectors identify the electrostatic interactions, which stabilize or destabilize each stable state and the transition state. Our result therefore indicates that PCA can be applied, by carefully selecting the perturbation functions, not only to identify the molecular conformational fluctuation but also to predict the conformational distribution change by the perturbation beyond the limitation of the previous methods.

  12. Perturbational formulation of principal component analysis in molecular dynamics simulation

    NASA Astrophysics Data System (ADS)

    Koyama, Yohei M.; Kobayashi, Tetsuya J.; Tomoda, Shuji; Ueda, Hiroki R.

    2008-10-01

    Conformational fluctuations of a molecule are important to its function since such intrinsic fluctuations enable the molecule to respond to the external environmental perturbations. For extracting large conformational fluctuations, which predict the primary conformational change by the perturbation, principal component analysis (PCA) has been used in molecular dynamics simulations. However, several versions of PCA, such as Cartesian coordinate PCA and dihedral angle PCA (dPCA), are limited to use with molecules with a single dominant state or proteins where the dihedral angle represents an important internal coordinate. Other PCAs with general applicability, such as the PCA using pairwise atomic distances, do not represent the physical meaning clearly. Therefore, a formulation that provides general applicability and clearly represents the physical meaning is yet to be developed. For developing such a formulation, we consider the conformational distribution change by the perturbation with arbitrary linearly independent perturbation functions. Within the second order approximation of the Kullback-Leibler divergence by the perturbation, the PCA can be naturally interpreted as a method for (1) decomposing a given perturbation into perturbations that independently contribute to the conformational distribution change or (2) successively finding the perturbation that induces the largest conformational distribution change. In this perturbational formulation of PCA, (i) the eigenvalue measures the Kullback-Leibler divergence from the unperturbed to perturbed distributions, (ii) the eigenvector identifies the combination of the perturbation functions, and (iii) the principal component determines the probability change induced by the perturbation. Based on this formulation, we propose a PCA using potential energy terms, and we designate it as potential energy PCA (PEPCA). The PEPCA provides both general applicability and clear physical meaning. For demonstrating its power, we apply the PEPCA to an alanine dipeptide molecule in vacuum as a minimal model of a nonsingle dominant conformational biomolecule. The first and second principal components clearly characterize two stable states and the transition state between them. Positive and negative components with larger absolute values of the first and second eigenvectors identify the electrostatic interactions, which stabilize or destabilize each stable state and the transition state. Our result therefore indicates that PCA can be applied, by carefully selecting the perturbation functions, not only to identify the molecular conformational fluctuation but also to predict the conformational distribution change by the perturbation beyond the limitation of the previous methods.

  13. Comparison of water extraction methods in Tibet based on GF-1 data

    NASA Astrophysics Data System (ADS)

    Jia, Lingjun; Shang, Kun; Liu, Jing; Sun, Zhongqing

    2018-03-01

    In this study, we compared four different water extraction methods with GF-1 data according to different water types in Tibet, including Support Vector Machine (SVM), Principal Component Analysis (PCA), Decision Tree Classifier based on False Normalized Difference Water Index (FNDWI-DTC), and PCA-SVM. The results show that all of the four methods can extract large area water body, but only SVM and PCA-SVM can obtain satisfying extraction results for small size water body. The methods were evaluated by both overall accuracy (OAA) and Kappa coefficient (KC). The OAA of PCA-SVM, SVM, FNDWI-DTC, PCA are 96.68%, 94.23%, 93.99%, 93.01%, and the KCs are 0.9308, 0.8995, 0.8962, 0.8842, respectively, in consistent with visual inspection. In summary, SVM is better for narrow rivers extraction and PCA-SVM is suitable for water extraction of various types. As for dark blue lakes, the methods using PCA can extract more quickly and accurately.

  14. Facilitating in vivo tumor localization by principal component analysis based on dynamic fluorescence molecular imaging

    NASA Astrophysics Data System (ADS)

    Gao, Yang; Chen, Maomao; Wu, Junyu; Zhou, Yuan; Cai, Chuangjian; Wang, Daliang; Luo, Jianwen

    2017-09-01

    Fluorescence molecular imaging has been used to target tumors in mice with xenograft tumors. However, tumor imaging is largely distorted by the aggregation of fluorescent probes in the liver. A principal component analysis (PCA)-based strategy was applied on the in vivo dynamic fluorescence imaging results of three mice with xenograft tumors to facilitate tumor imaging, with the help of a tumor-specific fluorescent probe. Tumor-relevant features were extracted from the original images by PCA and represented by the principal component (PC) maps. The second principal component (PC2) map represented the tumor-related features, and the first principal component (PC1) map retained the original pharmacokinetic profiles, especially of the liver. The distribution patterns of the PC2 map of the tumor-bearing mice were in good agreement with the actual tumor location. The tumor-to-liver ratio and contrast-to-noise ratio were significantly higher on the PC2 map than on the original images, thus distinguishing the tumor from its nearby fluorescence noise of liver. The results suggest that the PC2 map could serve as a bioimaging marker to facilitate in vivo tumor localization, and dynamic fluorescence molecular imaging with PCA could be a valuable tool for future studies of in vivo tumor metabolism and progression.

  15. Comparative Analysis of a Principal Component Analysis-Based and an Artificial Neural Network-Based Method for Baseline Removal.

    PubMed

    Carvajal, Roberto C; Arias, Luis E; Garces, Hugo O; Sbarbaro, Daniel G

    2016-04-01

    This work presents a non-parametric method based on a principal component analysis (PCA) and a parametric one based on artificial neural networks (ANN) to remove continuous baseline features from spectra. The non-parametric method estimates the baseline based on a set of sampled basis vectors obtained from PCA applied over a previously composed continuous spectra learning matrix. The parametric method, however, uses an ANN to filter out the baseline. Previous studies have demonstrated that this method is one of the most effective for baseline removal. The evaluation of both methods was carried out by using a synthetic database designed for benchmarking baseline removal algorithms, containing 100 synthetic composed spectra at different signal-to-baseline ratio (SBR), signal-to-noise ratio (SNR), and baseline slopes. In addition to deomonstrating the utility of the proposed methods and to compare them in a real application, a spectral data set measured from a flame radiation process was used. Several performance metrics such as correlation coefficient, chi-square value, and goodness-of-fit coefficient were calculated to quantify and compare both algorithms. Results demonstrate that the PCA-based method outperforms the one based on ANN both in terms of performance and simplicity. © The Author(s) 2016.

  16. Temporal Processing of Dynamic Positron Emission Tomography via Principal Component Analysis in the Sinogram Domain

    NASA Astrophysics Data System (ADS)

    Chen, Zhe; Parker, B. J.; Feng, D. D.; Fulton, R.

    2004-10-01

    In this paper, we compare various temporal analysis schemes applied to dynamic PET for improved quantification, image quality and temporal compression purposes. We compare an optimal sampling schedule (OSS) design, principal component analysis (PCA) applied in the image domain, and principal component analysis applied in the sinogram domain; for region-of-interest quantification, sinogram-domain PCA is combined with the Huesman algorithm to quantify from the sinograms directly without requiring reconstruction of all PCA channels. Using a simulated phantom FDG brain study and three clinical studies, we evaluate the fidelity of the compressed data for estimation of local cerebral metabolic rate of glucose by a four-compartment model. Our results show that using a noise-normalized PCA in the sinogram domain gives similar compression ratio and quantitative accuracy to OSS, but with substantially better precision. These results indicate that sinogram-domain PCA for dynamic PET can be a useful preprocessing stage for PET compression and quantification applications.

  17. Multi-Centrality Graph Spectral Decompositions and Their Application to Cyber Intrusion Detection

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Pin-Yu; Choudhury, Sutanay; Hero, Alfred

    Many modern datasets can be represented as graphs and hence spectral decompositions such as graph principal component analysis (PCA) can be useful. Distinct from previous graph decomposition approaches based on subspace projection of a single topological feature, e.g., the centered graph adjacency matrix (graph Laplacian), we propose spectral decomposition approaches to graph PCA and graph dictionary learning that integrate multiple features, including graph walk statistics, centrality measures and graph distances to reference nodes. In this paper we propose a new PCA method for single graph analysis, called multi-centrality graph PCA (MC-GPCA), and a new dictionary learning method for ensembles ofmore » graphs, called multi-centrality graph dictionary learning (MC-GDL), both based on spectral decomposition of multi-centrality matrices. As an application to cyber intrusion detection, MC-GPCA can be an effective indicator of anomalous connectivity pattern and MC-GDL can provide discriminative basis for attack classification.« less

  18. Mapping brain activity in gradient-echo functional MRI using principal component analysis

    NASA Astrophysics Data System (ADS)

    Khosla, Deepak; Singh, Manbir; Don, Manuel

    1997-05-01

    The detection of sites of brain activation in functional MRI has been a topic of immense research interest and many technique shave been proposed to this end. Recently, principal component analysis (PCA) has been applied to extract the activated regions and their time course of activation. This method is based on the assumption that the activation is orthogonal to other signal variations such as brain motion, physiological oscillations and other uncorrelated noises. A distinct advantage of this method is that it does not require any knowledge of the time course of the true stimulus paradigm. This technique is well suited to EPI image sequences where the sampling rate is high enough to capture the effects of physiological oscillations. In this work, we propose and apply tow methods that are based on PCA to conventional gradient-echo images and investigate their usefulness as tools to extract reliable information on brain activation. The first method is a conventional technique where a single image sequence with alternating on and off stages is subject to a principal component analysis. The second method is a PCA-based approach called the common spatial factor analysis technique (CSF). As the name suggests, this method relies on common spatial factors between the above fMRI image sequence and a background fMRI. We have applied these methods to identify active brain ares during visual stimulation and motor tasks. The results from these methods are compared to those obtained by using the standard cross-correlation technique. We found good agreement in the areas identified as active across all three techniques. The results suggest that PCA and CSF methods have good potential in detecting the true stimulus correlated changes in the presence of other interfering signals.

  19. Incorporating biological information in sparse principal component analysis with application to genomic data.

    PubMed

    Li, Ziyi; Safo, Sandra E; Long, Qi

    2017-07-11

    Sparse principal component analysis (PCA) is a popular tool for dimensionality reduction, pattern recognition, and visualization of high dimensional data. It has been recognized that complex biological mechanisms occur through concerted relationships of multiple genes working in networks that are often represented by graphs. Recent work has shown that incorporating such biological information improves feature selection and prediction performance in regression analysis, but there has been limited work on extending this approach to PCA. In this article, we propose two new sparse PCA methods called Fused and Grouped sparse PCA that enable incorporation of prior biological information in variable selection. Our simulation studies suggest that, compared to existing sparse PCA methods, the proposed methods achieve higher sensitivity and specificity when the graph structure is correctly specified, and are fairly robust to misspecified graph structures. Application to a glioblastoma gene expression dataset identified pathways that are suggested in the literature to be related with glioblastoma. The proposed sparse PCA methods Fused and Grouped sparse PCA can effectively incorporate prior biological information in variable selection, leading to improved feature selection and more interpretable principal component loadings and potentially providing insights on molecular underpinnings of complex diseases.

  20. Identification and apportionment of hazardous elements in the sediments in the Yangtze River estuary.

    PubMed

    Wang, Jiawei; Liu, Ruimin; Wang, Haotian; Yu, Wenwen; Xu, Fei; Shen, Zhenyao

    2015-12-01

    In this study, positive matrix factorization (PMF) and principal components analysis (PCA) were combined to identify and apportion pollution-based sources of hazardous elements in the surface sediments in the Yangtze River estuary (YRE). Source identification analysis indicated that PC1, including Al, Fe, Mn, Cr, Ni, As, Cu, and Zn, can be defined as a sewage component; PC2, including Pb and Sb, can be considered as an atmospheric deposition component; and PC3, containing Cd and Hg, can be considered as an agricultural nonpoint component. To better identify the sources and quantitatively apportion the concentrations to their sources, eight sources were identified with PMF: agricultural/industrial sewage mixed (18.6 %), mining wastewater (15.9 %), agricultural fertilizer (14.5 %), atmospheric deposition (12.8 %), agricultural nonpoint (10.6 %), industrial wastewater (9.8 %), marine activity (9.0 %), and nickel plating industry (8.8 %). Overall, the hazardous element content seems to be more connected to anthropogenic activity instead of natural sources. The PCA results laid the foundation for the PMF analysis by providing a general classification of sources. PMF resolves more factors with a higher explained variance than PCA; PMF provided both the internal analysis and the quantitative analysis. The combination of the two methods can provide more reasonable and reliable results.

  1. Discrimination of healthy and osteoarthritic articular cartilage by Fourier transform infrared imaging and Fisher’s discriminant analysis

    PubMed Central

    Mao, Zhi-Hua; Yin, Jian-Hua; Zhang, Xue-Xi; Wang, Xiao; Xia, Yang

    2016-01-01

    Fourier transform infrared spectroscopic imaging (FTIRI) technique can be used to obtain the quantitative information of content and spatial distribution of principal components in cartilage by combining with chemometrics methods. In this study, FTIRI combining with principal component analysis (PCA) and Fisher’s discriminant analysis (FDA) was applied to identify the healthy and osteoarthritic (OA) articular cartilage samples. Ten 10-μm thick sections of canine cartilages were imaged at 6.25μm/pixel in FTIRI. The infrared spectra extracted from the FTIR images were imported into SPSS software for PCA and FDA. Based on the PCA result of 2 principal components, the healthy and OA cartilage samples were effectively discriminated by the FDA with high accuracy of 94% for the initial samples (training set) and cross validation, as well as 86.67% for the prediction group. The study showed that cartilage degeneration became gradually weak with the increase of the depth. FTIRI combined with chemometrics may become an effective method for distinguishing healthy and OA cartilages in future. PMID:26977354

  2. [Identification of varieties of cashmere by Vis/NIR spectroscopy technology based on PCA-SVM].

    PubMed

    Wu, Gui-Fang; He, Yong

    2009-06-01

    One mixed algorithm was presented to discriminate cashmere varieties with principal component analysis (PCA) and support vector machine (SVM). Cashmere fiber has such characteristics as threadlike, softness, glossiness and high tensile strength. The quality characters and economic value of each breed of cashmere are very different. In order to safeguard the consumer's rights and guarantee the quality of cashmere product, quickly, efficiently and correctly identifying cashmere has significant meaning to the production and transaction of cashmere material. The present research adopts Vis/NIRS spectroscopy diffuse techniques to collect the spectral data of cashmere. The near infrared fingerprint of cashmere was acquired by principal component analysis (PCA), and support vector machine (SVM) methods were used to further identify the cashmere material. The result of PCA indicated that the score map made by the scores of PC1, PC2 and PC3 was used, and 10 principal components (PCs) were selected as the input of support vector machine (SVM) based on the reliabilities of PCs of 99.99%. One hundred cashmere samples were used for calibration and the remaining 75 cashmere samples were used for validation. A one-against-all multi-class SVM model was built, the capabilities of SVM with different kernel function were comparatively analyzed, and the result showed that SVM possessing with the Gaussian kernel function has the best identification capabilities with the accuracy of 100%. This research indicated that the data mining method of PCA-SVM has a good identification effect, and can work as a new method for rapid identification of cashmere material varieties.

  3. Medical diagnosis of atherosclerosis from Carotid Artery Doppler Signals using principal component analysis (PCA), k-NN based weighting pre-processing and Artificial Immune Recognition System (AIRS).

    PubMed

    Latifoğlu, Fatma; Polat, Kemal; Kara, Sadik; Güneş, Salih

    2008-02-01

    In this study, we proposed a new medical diagnosis system based on principal component analysis (PCA), k-NN based weighting pre-processing, and Artificial Immune Recognition System (AIRS) for diagnosis of atherosclerosis from Carotid Artery Doppler Signals. The suggested system consists of four stages. First, in the feature extraction stage, we have obtained the features related with atherosclerosis disease using Fast Fourier Transformation (FFT) modeling and by calculating of maximum frequency envelope of sonograms. Second, in the dimensionality reduction stage, the 61 features of atherosclerosis disease have been reduced to 4 features using PCA. Third, in the pre-processing stage, we have weighted these 4 features using different values of k in a new weighting scheme based on k-NN based weighting pre-processing. Finally, in the classification stage, AIRS classifier has been used to classify subjects as healthy or having atherosclerosis. Hundred percent of classification accuracy has been obtained by the proposed system using 10-fold cross validation. This success shows that the proposed system is a robust and effective system in diagnosis of atherosclerosis disease.

  4. SU-F-J-138: An Extension of PCA-Based Respiratory Deformation Modeling Via Multi-Linear Decomposition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Iliopoulos, AS; Sun, X; Pitsianis, N

    Purpose: To address and lift the limited degree of freedom (DoF) of globally bilinear motion components such as those based on principal components analysis (PCA), for encoding and modeling volumetric deformation motion. Methods: We provide a systematic approach to obtaining a multi-linear decomposition (MLD) and associated motion model from deformation vector field (DVF) data. We had previously introduced MLD for capturing multi-way relationships between DVF variables, without being restricted by the bilinear component format of PCA-based models. PCA-based modeling is commonly used for encoding patient-specific deformation as per planning 4D-CT images, and aiding on-board motion estimation during radiotherapy. However, themore » bilinear space-time decomposition inherently limits the DoF of such models by the small number of respiratory phases. While this limit is not reached in model studies using analytical or digital phantoms with low-rank motion, it compromises modeling power in the presence of relative motion, asymmetries and hysteresis, etc, which are often observed in patient data. Specifically, a low-DoF model will spuriously couple incoherent motion components, compromising its adaptability to on-board deformation changes. By the multi-linear format of extracted motion components, MLD-based models can encode higher-DoF deformation structure. Results: We conduct mathematical and experimental comparisons between PCA- and MLD-based models. A set of temporally-sampled analytical trajectories provides a synthetic, high-rank DVF; trajectories correspond to respiratory and cardiac motion factors, including different relative frequencies and spatial variations. Additionally, a digital XCAT phantom is used to simulate a lung lesion deforming incoherently with respect to the body, which adheres to a simple respiratory trend. In both cases, coupling of incoherent motion components due to a low model DoF is clearly demonstrated. Conclusion: Multi-linear decomposition can enable decoupling of distinct motion factors in high-rank DVF measurements. This may improve motion model expressiveness and adaptability to on-board deformation, aiding model-based image reconstruction for target verification. NIH Grant No. R01-184173.« less

  5. [Identification of varieties of textile fibers by using Vis/NIR infrared spectroscopy technique].

    PubMed

    Wu, Gui-Fang; He, Yong

    2010-02-01

    The aim of the present paper was to provide new insight into Vis/NIR spectroscopic analysis of textile fibers. In order to achieve rapid identification of the varieties of fibers, the authors selected 5 kinds of fibers of cotton, flax, wool, silk and tencel to do a study with Vis/NIR spectroscopy. Firstly, the spectra of each kind of fiber were scanned by spectrometer, and principal component analysis (PCA) method was used to analyze the characteristics of the pattern of Vis/NIR spectra. Principal component scores scatter plot (PC1 x PC2 x PC3) of fiber indicated the classification effect of five varieties of fibers. The former 6 principal components (PCs) were selected according to the quantity and size of PCs. The PCA classification model was optimized by using the least-squares support vector machines (LS-SVM) method. The authors used the 6 PCs extracted by PCA as the inputs of LS-SVM, and PCA-LS-SVM model was built to achieve varieties validation as well as mathematical model building and optimization analysis. Two hundred samples (40 samples for each variety of fibers) of five varieties of fibers were used for calibration of PCA-LS-SVM model, and the other 50 samples (10 samples for each variety of fibers) were used for validation. The result of validation showed that Vis/NIR spectroscopy technique based on PCA-LS-SVM had a powerful classification capability. It provides a new method for identifying varieties of fibers rapidly and real time, so it has important significance for protecting the rights of consumers, ensuring the quality of textiles, and implementing rationalization production and transaction of textile materials and its production.

  6. Empirical evaluation of grouping of lower urinary tract symptoms: principal component analysis of Tampere Ageing Male Urological Study data.

    PubMed

    Pöyhönen, Antti; Häkkinen, Jukka T; Koskimäki, Juha; Hakama, Matti; Tammela, Teuvo L J; Auvinen, Anssi

    2013-03-01

    WHAT'S KNOWN ON THE SUBJECT? AND WHAT DOES THE STUDY ADD?: The ICS has divided LUTS into three groups: storage, voiding and post-micturition symptoms. The classification is based on anatomical, physiological and urodynamic considerations of a theoretical nature. We used principal component analysis (PCA) to determine the inter-correlations of various LUTS, which is a novel approach to research and can strengthen existing knowledge of the phenomenology of LUTS. After we had completed our analyses, another study was published that used a similar approach and results were very similar to those of the present study. We evaluated the constellation of LUTS using PCA of the data from a population-based study that included >4000 men. In our analysis, three components emerged from the 12 LUTS: voiding, storage and incontinence components. Our results indicated that incontinence may be separate from the other storage symptoms and post-micturition symptoms should perhaps be regarded as voiding symptoms. To determine how lower urinary tract symptoms (LUTS) relate to each other and assess if the classification proposed by the International Continence Society (ICS) is consistent with empirical findings. The information on urinary symptoms for this population-based study was collected using a self-administered postal questionnaire in 2004. The questionnaire was sent to 7470 men, aged 30-80 years, from Pirkanmaa County (Finland), of whom 4384 (58.7%) returned the questionnaire. The Danish Prostatic Symptom Score-1 questionnaire was used to evaluate urinary symptoms. Principal component analysis (PCA) was used to evaluate the inter-correlations among various urinary symptoms. The PCA produced a grouping of 12 LUTS into three categories consisting of voiding, storage and incontinence symptoms. Post-micturition symptoms were related to voiding symptoms, but incontinence symptoms were separate from storage symptoms. In the analyses by age group, similar categorization was found at ages 40, 50, 60 and 80 years, but only two groups of symptoms emerged among men aged 70 years. The prevalence among men aged 30 was too low for meaningful analysis. This population-based study suggests that LUTS can be divided into three subgroups consisting of voiding, storage and incontinence symptoms based on their inter-correlations. Our empirical findings suggest an alternative grouping of LUTS. The potential utility of such an approach requires careful consideration. © 2012 BJU International.

  7. Descriptive Characteristics of Surface Water Quality in Hong Kong by a Self-Organising Map

    PubMed Central

    An, Yan; Zou, Zhihong; Li, Ranran

    2016-01-01

    In this study, principal component analysis (PCA) and a self-organising map (SOM) were used to analyse a complex dataset obtained from the river water monitoring stations in the Tolo Harbor and Channel Water Control Zone (Hong Kong), covering the period of 2009–2011. PCA was initially applied to identify the principal components (PCs) among the nonlinear and complex surface water quality parameters. SOM followed PCA, and was implemented to analyze the complex relationships and behaviors of the parameters. The results reveal that PCA reduced the multidimensional parameters to four significant PCs which are combinations of the original ones. The positive and inverse relationships of the parameters were shown explicitly by pattern analysis in the component planes. It was found that PCA and SOM are efficient tools to capture and analyze the behavior of multivariable, complex, and nonlinear related surface water quality data. PMID:26761018

  8. Descriptive Characteristics of Surface Water Quality in Hong Kong by a Self-Organising Map.

    PubMed

    An, Yan; Zou, Zhihong; Li, Ranran

    2016-01-08

    In this study, principal component analysis (PCA) and a self-organising map (SOM) were used to analyse a complex dataset obtained from the river water monitoring stations in the Tolo Harbor and Channel Water Control Zone (Hong Kong), covering the period of 2009-2011. PCA was initially applied to identify the principal components (PCs) among the nonlinear and complex surface water quality parameters. SOM followed PCA, and was implemented to analyze the complex relationships and behaviors of the parameters. The results reveal that PCA reduced the multidimensional parameters to four significant PCs which are combinations of the original ones. The positive and inverse relationships of the parameters were shown explicitly by pattern analysis in the component planes. It was found that PCA and SOM are efficient tools to capture and analyze the behavior of multivariable, complex, and nonlinear related surface water quality data.

  9. Exploring functional data analysis and wavelet principal component analysis on ecstasy (MDMA) wastewater data.

    PubMed

    Salvatore, Stefania; Bramness, Jørgen G; Røislien, Jo

    2016-07-12

    Wastewater-based epidemiology (WBE) is a novel approach in drug use epidemiology which aims to monitor the extent of use of various drugs in a community. In this study, we investigate functional principal component analysis (FPCA) as a tool for analysing WBE data and compare it to traditional principal component analysis (PCA) and to wavelet principal component analysis (WPCA) which is more flexible temporally. We analysed temporal wastewater data from 42 European cities collected daily over one week in March 2013. The main temporal features of ecstasy (MDMA) were extracted using FPCA using both Fourier and B-spline basis functions with three different smoothing parameters, along with PCA and WPCA with different mother wavelets and shrinkage rules. The stability of FPCA was explored through bootstrapping and analysis of sensitivity to missing data. The first three principal components (PCs), functional principal components (FPCs) and wavelet principal components (WPCs) explained 87.5-99.6 % of the temporal variation between cities, depending on the choice of basis and smoothing. The extracted temporal features from PCA, FPCA and WPCA were consistent. FPCA using Fourier basis and common-optimal smoothing was the most stable and least sensitive to missing data. FPCA is a flexible and analytically tractable method for analysing temporal changes in wastewater data, and is robust to missing data. WPCA did not reveal any rapid temporal changes in the data not captured by FPCA. Overall the results suggest FPCA with Fourier basis functions and common-optimal smoothing parameter as the most accurate approach when analysing WBE data.

  10. Principal components analysis in clinical studies.

    PubMed

    Zhang, Zhongheng; Castelló, Adela

    2017-09-01

    In multivariate analysis, independent variables are usually correlated to each other which can introduce multicollinearity in the regression models. One approach to solve this problem is to apply principal components analysis (PCA) over these variables. This method uses orthogonal transformation to represent sets of potentially correlated variables with principal components (PC) that are linearly uncorrelated. PCs are ordered so that the first PC has the largest possible variance and only some components are selected to represent the correlated variables. As a result, the dimension of the variable space is reduced. This tutorial illustrates how to perform PCA in R environment, the example is a simulated dataset in which two PCs are responsible for the majority of the variance in the data. Furthermore, the visualization of PCA is highlighted.

  11. Investigation of domain walls in PPLN by confocal raman microscopy and PCA analysis

    NASA Astrophysics Data System (ADS)

    Shur, Vladimir Ya.; Zelenovskiy, Pavel; Bourson, Patrice

    2017-07-01

    Confocal Raman microscopy (CRM) is a powerful tool for investigation of ferroelectric domains. Mechanical stresses and electric fields existed in the vicinity of neutral and charged domain walls modify frequency, intensity and width of spectral lines [1], thus allowing to visualize micro- and nanodomain structures both at the surface and in the bulk of the crystal [2,3]. Stresses and fields are naturally coupled in ferroelectrics due to inverse piezoelectric effect and hardly can be separated in Raman spectra. PCA is a powerful statistical method for analysis of large data matrix providing a set of orthogonal variables, called principal components (PCs). PCA is widely used for classification of experimental data, for example, in crystallization experiments, for detection of small amounts of components in solid mixtures etc. [4,5]. In Raman spectroscopy PCA was applied for analysis of phase transitions and provided critical pressure with good accuracy [6]. In the present work we for the first time applied Principal Component Analysis (PCA) method for analysis of Raman spectra measured in periodically poled lithium niobate (PPLN). We found that principal components demonstrate different sensitivity to mechanical stresses and electric fields in the vicinity of the domain walls. This allowed us to separately visualize spatial distribution of fields and electric fields at the surface and in the bulk of PPLN.

  12. Liquid chromatography tandem mass spectrometry determination of chemical markers and principal component analysis of Vitex agnus-castus L. fruits (Verbenaceae) and derived food supplements.

    PubMed

    Mari, Angela; Montoro, Paola; Pizza, Cosimo; Piacente, Sonia

    2012-11-01

    A validated analytical method for the quantitative determination of seven chemical markers occurring in a hydroalcoholic extract of Vitex agnus-castus fruits by liquid chromatography electrospray triple quadrupole tandem mass spectrometry (LC/ESI/(QqQ)MSMS) is reported. To carry out a comparative study, five commercial food supplements corresponding to hydroalcoholic extracts of V. agnus-castus fruits were analysed under the same chromatographic conditions of the crude extract. Principal component analysis (PCA), based only on the variation of the amount of the seven chemical markers, was applied in order to find similarities between the hydroalcoholic extract and the food supplements. A second PCA analysis was carried out considering the whole spectroscopic data deriving from liquid chromatography electrospray linear ion trap mass spectrometry (LC/ESI/(LIT)MS) analysis. High similarity between the two PCA was observed, showing the possibility to select one of these two approaches for future applications in the field of comparative analysis of food supplements and quality control procedures. Copyright © 2012 Elsevier B.V. All rights reserved.

  13. A feasibility study on age-related factors of wrist pulse using principal component analysis.

    PubMed

    Jang-Han Bae; Young Ju Jeon; Sanghun Lee; Jaeuk U Kim

    2016-08-01

    Various analysis methods for examining wrist pulse characteristics are needed for accurate pulse diagnosis. In this feasibility study, principal component analysis (PCA) was performed to observe age-related factors of wrist pulse from various analysis parameters. Forty subjects in the age group of 20s and 40s were participated, and their wrist pulse signal and respiration signal were acquired with the pulse tonometric device. After pre-processing of the signals, twenty analysis parameters which have been regarded as values reflecting pulse characteristics were calculated and PCA was performed. As a results, we could reduce complex parameters to lower dimension and age-related factors of wrist pulse were observed by combining-new analysis parameter derived from PCA. These results demonstrate that PCA can be useful tool for analyzing wrist pulse signal.

  14. Rotation of EOFs by the Independent Component Analysis: Towards A Solution of the Mixing Problem in the Decomposition of Geophysical Time Series

    NASA Technical Reports Server (NTRS)

    Aires, Filipe; Rossow, William B.; Chedin, Alain; Hansen, James E. (Technical Monitor)

    2001-01-01

    The Independent Component Analysis is a recently developed technique for component extraction. This new method requires the statistical independence of the extracted components, a stronger constraint that uses higher-order statistics, instead of the classical decorrelation, a weaker constraint that uses only second-order statistics. This technique has been used recently for the analysis of geophysical time series with the goal of investigating the causes of variability in observed data (i.e. exploratory approach). We demonstrate with a data simulation experiment that, if initialized with a Principal Component Analysis, the Independent Component Analysis performs a rotation of the classical PCA (or EOF) solution. This rotation uses no localization criterion like other Rotation Techniques (RT), only the global generalization of decorrelation by statistical independence is used. This rotation of the PCA solution seems to be able to solve the tendency of PCA to mix several physical phenomena, even when the signal is just their linear sum.

  15. Coarse-to-fine markerless gait analysis based on PCA and Gauss-Laguerre decomposition

    NASA Astrophysics Data System (ADS)

    Goffredo, Michela; Schmid, Maurizio; Conforto, Silvia; Carli, Marco; Neri, Alessandro; D'Alessio, Tommaso

    2005-04-01

    Human movement analysis is generally performed through the utilization of marker-based systems, which allow reconstructing, with high levels of accuracy, the trajectories of markers allocated on specific points of the human body. Marker based systems, however, show some drawbacks that can be overcome by the use of video systems applying markerless techniques. In this paper, a specifically designed computer vision technique for the detection and tracking of relevant body points is presented. It is based on the Gauss-Laguerre Decomposition, and a Principal Component Analysis Technique (PCA) is used to circumscribe the region of interest. Results obtained on both synthetic and experimental tests provide significant reduction of the computational costs, with no significant reduction of the tracking accuracy.

  16. Revealing the ultrafast outflow in IRAS 13224-3809 through spectral variability

    NASA Astrophysics Data System (ADS)

    Parker, M. L.; Alston, W. N.; Buisson, D. J. K.; Fabian, A. C.; Jiang, J.; Kara, E.; Lohfink, A.; Pinto, C.; Reynolds, C. S.

    2017-08-01

    We present an analysis of the long-term X-ray variability of the extreme narrow-line Seyfert 1 galaxy IRAS 13224-3809 using principal component analysis (PCA) and fractional excess variability (Fvar) spectra to identify model-independent spectral components. We identify a series of variability peaks in both the first PCA component and Fvar spectrum which correspond to the strongest predicted absorption lines from the ultrafast outflow (UFO) discovered by Parker et al. (2017). We also find higher order PCA components, which correspond to variability of the soft excess and reflection features. The subtle differences between RMS and PCA results argue that the observed flux-dependence of the absorption is due to increased ionization of the gas, rather than changes in column density or covering fraction. This result demonstrates that we can detect outflows from variability alone and that variability studies of UFOs are an extremely promising avenue for future research.

  17. The Influence Function of Principal Component Analysis by Self-Organizing Rule.

    PubMed

    Higuchi; Eguchi

    1998-07-28

    This article is concerned with a neural network approach to principal component analysis (PCA). An algorithm for PCA by the self-organizing rule has been proposed and its robustness observed through the simulation study by Xu and Yuille (1995). In this article, the robustness of the algorithm against outliers is investigated by using the theory of influence function. The influence function of the principal component vector is given in an explicit form. Through this expression, the method is shown to be robust against any directions orthogonal to the principal component vector. In addition, a statistic generated by the self-organizing rule is proposed to assess the influence of data in PCA.

  18. Common factor analysis versus principal component analysis: choice for symptom cluster research.

    PubMed

    Kim, Hee-Ju

    2008-03-01

    The purpose of this paper is to examine differences between two factor analytical methods and their relevance for symptom cluster research: common factor analysis (CFA) versus principal component analysis (PCA). Literature was critically reviewed to elucidate the differences between CFA and PCA. A secondary analysis (N = 84) was utilized to show the actual result differences from the two methods. CFA analyzes only the reliable common variance of data, while PCA analyzes all the variance of data. An underlying hypothetical process or construct is involved in CFA but not in PCA. PCA tends to increase factor loadings especially in a study with a small number of variables and/or low estimated communality. Thus, PCA is not appropriate for examining the structure of data. If the study purpose is to explain correlations among variables and to examine the structure of the data (this is usual for most cases in symptom cluster research), CFA provides a more accurate result. If the purpose of a study is to summarize data with a smaller number of variables, PCA is the choice. PCA can also be used as an initial step in CFA because it provides information regarding the maximum number and nature of factors. In using factor analysis for symptom cluster research, several issues need to be considered, including subjectivity of solution, sample size, symptom selection, and level of measure.

  19. Identifying natural and anthropogenic sources of metals in urban and rural soils using GIS-based data, PCA, and spatial interpolation

    PubMed Central

    Davis, Harley T.; Aelion, C. Marjorie; McDermott, Suzanne; Lawson, Andrew B.

    2009-01-01

    Determining sources of neurotoxic metals in rural and urban soils is important for mitigating human exposure. Surface soil from four areas with significant clusters of mental retardation and developmental delay (MR/DD) in children, and one control site were analyzed for nine metals and characterized by soil type, climate, ecological region, land use and industrial facilities using readily-available GIS-based data. Kriging, principal component analysis (PCA) and cluster analysis (CA) were used to identify commonalities of metal distribution. Three MR/DD areas (one rural and two urban) had similar soil types and significantly higher soil metal concentrations. PCA and CA results suggested that Ba, Be and Mn were consistently from natural sources; Pb and Hg from anthropogenic sources; and As, Cr, Cu, and Ni from both sources. Arsenic had low commonality estimates, was highly associated with a third PCA factor, and had a complex distribution, complicating mitigation strategies to minimize concentrations and exposures. PMID:19361902

  20. Evaluation of skin melanoma in spectral range 450-950 nm using principal component analysis

    NASA Astrophysics Data System (ADS)

    Jakovels, D.; Lihacova, I.; Kuzmina, I.; Spigulis, J.

    2013-06-01

    Diagnostic potential of principal component analysis (PCA) of multi-spectral imaging data in the wavelength range 450- 950 nm for distant skin melanoma recognition is discussed. Processing of the measured clinical data by means of PCA resulted in clear separation between malignant melanomas and pigmented nevi.

  1. Principle Component Analysis with Incomplete Data: A simulation of R pcaMethods package in Constructing an Environmental Quality Index with Missing Data

    EPA Science Inventory

    Missing data is a common problem in the application of statistical techniques. In principal component analysis (PCA), a technique for dimensionality reduction, incomplete data points are either discarded or imputed using interpolation methods. Such approaches are less valid when ...

  2. Extracting factors for interest rate scenarios

    NASA Astrophysics Data System (ADS)

    Molgedey, L.; Galic, E.

    2001-04-01

    Factor based interest rate models are widely used for risk managing purposes, for option pricing and for identifying and capturing yield curve anomalies. The movements of a term structure of interest rates are commonly assumed to be driven by a small number of orthogonal factors such as SHIFT, TWIST and BUTTERFLY (BOW). These factors are usually obtained by a Principal Component Analysis (PCA) of historical bond prices (interest rates). Although PCA diagonalizes the covariance matrix of either the interest rates or the interest rate changes, it does not use both covariance matrices simultaneously. Furthermore higher linear and nonlinear correlations are neglected. These correlations as well as the mean reverting properties of the interest rates become crucial, if one is interested in a longer time horizon (infrequent hedging or trading). We will show that Independent Component Analysis (ICA) is a more appropriate tool than PCA, since ICA uses the covariance matrix of the interest rates as well as the covariance matrix of the interest rate changes simultaneously. Additionally higher linear and nonlinear correlations may be easily incorporated. The resulting factors are uncorrelated for various time delays, approximately independent but nonorthogonal. This is in contrast to the factors obtained from the PCA, which are orthogonal and uncorrelated for identical times only. Although factors from the ICA are nonorthogonal, it is sufficient to consider only a few factors in order to explain most of the variation in the original data. Finally we will present examples that ICA based hedges outperforms PCA based hedges specifically if the portfolio is sensitive to structural changes of the yield curve.

  3. Application of principal component analysis (PCA) as a sensory assessment tool for fermented food products.

    PubMed

    Ghosh, Debasree; Chattopadhyay, Parimal

    2012-06-01

    The objective of the work was to use the method of quantitative descriptive analysis (QDA) to describe the sensory attributes of the fermented food products prepared with the incorporation of lactic cultures. Panellists were selected and trained to evaluate various attributes specially color and appearance, body texture, flavor, overall acceptability and acidity of the fermented food products like cow milk curd and soymilk curd, idli, sauerkraut and probiotic ice cream. Principal component analysis (PCA) identified the six significant principal components that accounted for more than 90% of the variance in the sensory attribute data. Overall product quality was modelled as a function of principal components using multiple least squares regression (R (2) = 0.8). The result from PCA was statistically analyzed by analysis of variance (ANOVA). These findings demonstrate the utility of quantitative descriptive analysis for identifying and measuring the fermented food product attributes that are important for consumer acceptability.

  4. An Efficient Data Compression Model Based on Spatial Clustering and Principal Component Analysis in Wireless Sensor Networks.

    PubMed

    Yin, Yihang; Liu, Fengzheng; Zhou, Xiang; Li, Quanzhong

    2015-08-07

    Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.

  5. Ripening-dependent metabolic changes in the volatiles of pineapple (Ananas comosus (L.) Merr.) fruit: II. Multivariate statistical profiling of pineapple aroma compounds based on comprehensive two-dimensional gas chromatography-mass spectrometry.

    PubMed

    Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg

    2015-03-01

    Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and describe the development of volatiles during postharvest storage, respectively. Hereby, substantial chemical markers allowing for class separation were revealed. The workflow permitted the rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. Volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest ripened for 6 days after simulated sea freight export, after PCA with only two principal components. However, PCA considering also the third principal component allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.

  6. Application of Hyperspectral Imaging and Chemometric Calibrations for Variety Discrimination of Maize Seeds

    PubMed Central

    Zhang, Xiaolei; Liu, Fei; He, Yong; Li, Xiaoli

    2012-01-01

    Hyperspectral imaging in the visible and near infrared (VIS-NIR) region was used to develop a novel method for discriminating different varieties of commodity maize seeds. Firstly, hyperspectral images of 330 samples of six varieties of maize seeds were acquired using a hyperspectral imaging system in the 380–1,030 nm wavelength range. Secondly, principal component analysis (PCA) and kernel principal component analysis (KPCA) were used to explore the internal structure of the spectral data. Thirdly, three optimal wavelengths (523, 579 and 863 nm) were selected by implementing PCA directly on each image. Then four textural variables including contrast, homogeneity, energy and correlation were extracted from gray level co-occurrence matrix (GLCM) of each monochromatic image based on the optimal wavelengths. Finally, several models for maize seeds identification were established by least squares-support vector machine (LS-SVM) and back propagation neural network (BPNN) using four different combinations of principal components (PCs), kernel principal components (KPCs) and textural features as input variables, respectively. The recognition accuracy achieved in the PCA-GLCM-LS-SVM model (98.89%) was the most satisfactory one. We conclude that hyperspectral imaging combined with texture analysis can be implemented for fast classification of different varieties of maize seeds. PMID:23235456

  7. An analytics of electricity consumption characteristics based on principal component analysis

    NASA Astrophysics Data System (ADS)

    Feng, Junshu

    2018-02-01

    Abstract . More detailed analysis of the electricity consumption characteristics can make demand side management (DSM) much more targeted. In this paper, an analytics of electricity consumption characteristics based on principal component analysis (PCA) is given, which the PCA method can be used in to extract the main typical characteristics of electricity consumers. Then, electricity consumption characteristics matrix is designed, which can make a comparison of different typical electricity consumption characteristics between different types of consumers, such as industrial consumers, commercial consumers and residents. In our case study, the electricity consumption has been mainly divided into four characteristics: extreme peak using, peak using, peak-shifting using and others. Moreover, it has been found that industrial consumers shift their peak load often, meanwhile commercial and residential consumers have more peak-time consumption. The conclusions can provide decision support of DSM for the government and power providers.

  8. Hyperspectral Image Denoising Using a Nonlocal Spectral Spatial Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Li, D.; Xu, L.; Peng, J.; Ma, J.

    2018-04-01

    Hyperspectral images (HSIs) denoising is a critical research area in image processing duo to its importance in improving the quality of HSIs, which has a negative impact on object detection and classification and so on. In this paper, we develop a noise reduction method based on principal component analysis (PCA) for hyperspectral imagery, which is dependent on the assumption that the noise can be removed by selecting the leading principal components. The main contribution of paper is to introduce the spectral spatial structure and nonlocal similarity of the HSIs into the PCA denoising model. PCA with spectral spatial structure can exploit spectral correlation and spatial correlation of HSI by using 3D blocks instead of 2D patches. Nonlocal similarity means the similarity between the referenced pixel and other pixels in nonlocal area, where Mahalanobis distance algorithm is used to estimate the spatial spectral similarity by calculating the distance in 3D blocks. The proposed method is tested on both simulated and real hyperspectral images, the results demonstrate that the proposed method is superior to several other popular methods in HSI denoising.

  9. Kernel Principal Component Analysis for dimensionality reduction in fMRI-based diagnosis of ADHD.

    PubMed

    Sidhu, Gagan S; Asgarian, Nasimeh; Greiner, Russell; Brown, Matthew R G

    2012-01-01

    This study explored various feature extraction methods for use in automated diagnosis of Attention-Deficit Hyperactivity Disorder (ADHD) from functional Magnetic Resonance Image (fMRI) data. Each participant's data consisted of a resting state fMRI scan as well as phenotypic data (age, gender, handedness, IQ, and site of scanning) from the ADHD-200 dataset. We used machine learning techniques to produce support vector machine (SVM) classifiers that attempted to differentiate between (1) all ADHD patients vs. healthy controls and (2) ADHD combined (ADHD-c) type vs. ADHD inattentive (ADHD-i) type vs. controls. In different tests, we used only the phenotypic data, only the imaging data, or else both the phenotypic and imaging data. For feature extraction on fMRI data, we tested the Fast Fourier Transform (FFT), different variants of Principal Component Analysis (PCA), and combinations of FFT and PCA. PCA variants included PCA over time (PCA-t), PCA over space and time (PCA-st), and kernelized PCA (kPCA-st). Baseline chance accuracy was 64.2% produced by guessing healthy control (the majority class) for all participants. Using only phenotypic data produced 72.9% accuracy on two class diagnosis and 66.8% on three class diagnosis. Diagnosis using only imaging data did not perform as well as phenotypic-only approaches. Using both phenotypic and imaging data with combined FFT and kPCA-st feature extraction yielded accuracies of 76.0% on two class diagnosis and 68.6% on three class diagnosis-better than phenotypic-only approaches. Our results demonstrate the potential of using FFT and kPCA-st with resting-state fMRI data as well as phenotypic data for automated diagnosis of ADHD. These results are encouraging given known challenges of learning ADHD diagnostic classifiers using the ADHD-200 dataset (see Brown et al., 2012).

  10. Gabor-based kernel PCA with fractional power polynomial models for face recognition.

    PubMed

    Liu, Chengjun

    2004-05-01

    This paper presents a novel Gabor-based kernel Principal Component Analysis (PCA) method by integrating the Gabor wavelet representation of face images and the kernel PCA method for face recognition. Gabor wavelets first derive desirable facial features characterized by spatial frequency, spatial locality, and orientation selectivity to cope with the variations due to illumination and facial expression changes. The kernel PCA method is then extended to include fractional power polynomial models for enhanced face recognition performance. A fractional power polynomial, however, does not necessarily define a kernel function, as it might not define a positive semidefinite Gram matrix. Note that the sigmoid kernels, one of the three classes of widely used kernel functions (polynomial kernels, Gaussian kernels, and sigmoid kernels), do not actually define a positive semidefinite Gram matrix either. Nevertheless, the sigmoid kernels have been successfully used in practice, such as in building support vector machines. In order to derive real kernel PCA features, we apply only those kernel PCA eigenvectors that are associated with positive eigenvalues. The feasibility of the Gabor-based kernel PCA method with fractional power polynomial models has been successfully tested on both frontal and pose-angled face recognition, using two data sets from the FERET database and the CMU PIE database, respectively. The FERET data set contains 600 frontal face images of 200 subjects, while the PIE data set consists of 680 images across five poses (left and right profiles, left and right half profiles, and frontal view) with two different facial expressions (neutral and smiling) of 68 subjects. The effectiveness of the Gabor-based kernel PCA method with fractional power polynomial models is shown in terms of both absolute performance indices and comparative performance against the PCA method, the kernel PCA method with polynomial kernels, the kernel PCA method with fractional power polynomial models, the Gabor wavelet-based PCA method, and the Gabor wavelet-based kernel PCA method with polynomial kernels.

  11. Total Electron Content forecast model over Australia

    NASA Astrophysics Data System (ADS)

    Bouya, Zahra; Terkildsen, Michael; Francis, Matthew

    Ionospheric perturbations can cause serious propagation errors in modern radio systems such as Global Navigation Satellite Systems (GNSS). Forecasting ionospheric parameters is helpful to estimate potential degradation of the performance of these systems. Our purpose is to establish an Australian Regional Total Electron Content (TEC) forecast model at IPS. In this work we present an approach based on the combined use of the Principal Component Analysis (PCA) and Artificial Neural Network (ANN) to predict future TEC values. PCA is used to reduce the dimensionality of the original TEC data by mapping it into its eigen-space. In this process the top- 5 eigenvectors are chosen to reflect the directions of the maximum variability. An ANN approach was then used for the multicomponent prediction. We outline the design of the ANN model with its parameters. A number of activation functions along with different spectral ranges and different numbers of Principal Components (PCs) were tested to find the PCA-ANN models reaching the best results. Keywords: GNSS, Space Weather, Regional, Forecast, PCA, ANN.

  12. Morphological analysis of Trichomycterus areolatus Valenciennes, 1846 from southern Chilean rivers using a truss-based system (Siluriformes, Trichomycteridae).

    PubMed

    Colihueque, Nelson; Corrales, Olga; Yáñez, Miguel

    2017-01-01

    Trichomycterus areolatus Valenciennes, 1846 is a small endemic catfish inhabiting the Andean river basins of Chile. In this study, the morphological variability of three T. areolatus populations, collected in two river basins from southern Chile, was assessed with multivariate analyses, including principal component analysis (PCA) and discriminant function analysis (DFA). It is hypothesized that populations must segregate morphologically from each other based on the river basin that they were sampled from, since each basin presents relatively particular hydrological characteristics. Significant morphological differences among the three populations were found with PCA (ANOSIM test, r = 0.552, p < 0.0001) and DFA (Wilks's λ = 0.036, p < 0.01). PCA accounted for a total variation of 56.16% by the first two principal components. The first Principal Component (PC1) and PC2 explained 34.72 and 21.44% of the total variation, respectively. The scatter-plot of the first two discriminant functions (DF1 on DF2) also validated the existence of three different populations. In group classification using DFA, 93.3% of the specimens were correctly-classified into their original populations. Of the total of 22 transformed truss measurements, 17 exhibited highly significant ( p < 0.01) differences among populations. The data support the existence of T. areolatus morphological variation across different rivers in southern Chile, likely reflecting the geographic isolation underlying population structure of the species.

  13. Performance evaluation of PCA-based spike sorting algorithms.

    PubMed

    Adamos, Dimitrios A; Kosmidis, Efstratios K; Theophilidis, George

    2008-09-01

    Deciphering the electrical activity of individual neurons from multi-unit noisy recordings is critical for understanding complex neural systems. A widely used spike sorting algorithm is being evaluated for single-electrode nerve trunk recordings. The algorithm is based on principal component analysis (PCA) for spike feature extraction. In the neuroscience literature it is generally assumed that the use of the first two or most commonly three principal components is sufficient. We estimate the optimum PCA-based feature space by evaluating the algorithm's performance on simulated series of action potentials. A number of modifications are made to the open source nev2lkit software to enable systematic investigation of the parameter space. We introduce a new metric to define clustering error considering over-clustering more favorable than under-clustering as proposed by experimentalists for our data. Both the program patch and the metric are available online. Correlated and white Gaussian noise processes are superimposed to account for biological and artificial jitter in the recordings. We report that the employment of more than three principal components is in general beneficial for all noise cases considered. Finally, we apply our results to experimental data and verify that the sorting process with four principal components is in agreement with a panel of electrophysiology experts.

  14. Study of support vector machine and serum surface-enhanced Raman spectroscopy for noninvasive esophageal cancer detection

    NASA Astrophysics Data System (ADS)

    Li, Shao-Xin; Zeng, Qiu-Yao; Li, Lin-Fang; Zhang, Yan-Jiao; Wan, Ming-Ming; Liu, Zhi-Ming; Xiong, Hong-Lian; Guo, Zhou-Yi; Liu, Song-Hao

    2013-02-01

    The ability of combining serum surface-enhanced Raman spectroscopy (SERS) with support vector machine (SVM) for improving classification esophageal cancer patients from normal volunteers is investigated. Two groups of serum SERS spectra based on silver nanoparticles (AgNPs) are obtained: one group from patients with pathologically confirmed esophageal cancer (n=30) and the other group from healthy volunteers (n=31). Principal components analysis (PCA), conventional SVM (C-SVM) and conventional SVM combination with PCA (PCA-SVM) methods are implemented to classify the same spectral dataset. Results show that a diagnostic accuracy of 77.0% is acquired for PCA technique, while diagnostic accuracies of 83.6% and 85.2% are obtained for C-SVM and PCA-SVM methods based on radial basis functions (RBF) models. The results prove that RBF SVM models are superior to PCA algorithm in classification serum SERS spectra. The study demonstrates that serum SERS in combination with SVM technique has great potential to provide an effective and accurate diagnostic schema for noninvasive detection of esophageal cancer.

  15. Analysis of antique bronze coins by Laser Induced Breakdown Spectroscopy and multivariate analysis

    NASA Astrophysics Data System (ADS)

    Bachler, M. Orlić; Bišćan, M.; Kregar, Z.; Jelovica Badovinac, I.; Dobrinić, J.; Milošević, S.

    2016-09-01

    This work presents a feasibility study of applying the Principal Component Analysis (PCA) to data obtained by Laser-Induced Breakdown Spectroscopy (LIBS) with the aim of determining correlation between different samples. The samples were antique bronze coins coated in silver (follis) dated in the Roman Empire period and were made during different rulers in different mints. While raw LIBS data revealed that in the period from the year 286 to 383 CE content of silver was constantly decreasing, the PCA showed that the samples can be somewhat grouped together based on their place of origin, which could be a useful hint when analysing unknown samples. It was also found that PCA can help in discriminating spectra corresponding to ablation from the surface and from the bulk. Furthermore, Partial Least Squares method (PLS) was used to obtain, based on a set of samples with known composition, an estimation of relative copper concentration in studied ancient coins. This analysis showed that copper concentration in surface layers ranged from 83% to 90%.

  16. Identification of the isomers using principal component analysis (PCA) method

    NASA Astrophysics Data System (ADS)

    Kepceoǧlu, Abdullah; Gündoǧdu, Yasemin; Ledingham, Kenneth William David; Kilic, Hamdi Sukur

    2016-03-01

    In this work, we have carried out a detailed statistical analysis for experimental data of mass spectra from xylene isomers. Principle Component Analysis (PCA) was used to identify the isomers which cannot be distinguished using conventional statistical methods for interpretation of their mass spectra. Experiments have been carried out using a linear TOF-MS coupled to a femtosecond laser system as an energy source for the ionisation processes. We have performed experiments and collected data which has been analysed and interpreted using PCA as a multivariate analysis of these spectra. This demonstrates the strength of the method to get an insight for distinguishing the isomers which cannot be identified using conventional mass analysis obtained through dissociative ionisation processes on these molecules. The PCA results dependending on the laser pulse energy and the background pressure in the spectrometers have been presented in this work.

  17. Sample-space-based feature extraction and class preserving projection for gene expression data.

    PubMed

    Wang, Wenjun

    2013-01-01

    In order to overcome the problems of high computational complexity and serious matrix singularity for feature extraction using Principal Component Analysis (PCA) and Fisher's Linear Discrinimant Analysis (LDA) in high-dimensional data, sample-space-based feature extraction is presented, which transforms the computation procedure of feature extraction from gene space to sample space by representing the optimal transformation vector with the weighted sum of samples. The technique is used in the implementation of PCA, LDA, Class Preserving Projection (CPP) which is a new method for discriminant feature extraction proposed, and the experimental results on gene expression data demonstrate the effectiveness of the method.

  18. Use of Geochemistry Data Collected by the Mars Exploration Rover Spirit in Gusev Crater to Teach Geomorphic Zonation through Principal Components Analysis

    ERIC Educational Resources Information Center

    Rodrigue, Christine M.

    2011-01-01

    This paper presents a laboratory exercise used to teach principal components analysis (PCA) as a means of surface zonation. The lab was built around abundance data for 16 oxides and elements collected by the Mars Exploration Rover Spirit in Gusev Crater between Sol 14 and Sol 470. Students used PCA to reduce 15 of these into 3 components, which,…

  19. Complexity of free energy landscapes of peptides revealed by nonlinear principal component analysis.

    PubMed

    Nguyen, Phuong H

    2006-12-01

    Employing the recently developed hierarchical nonlinear principal component analysis (NLPCA) method of Saegusa et al. (Neurocomputing 2004;61:57-70 and IEICE Trans Inf Syst 2005;E88-D:2242-2248), the complexities of the free energy landscapes of several peptides, including triglycine, hexaalanine, and the C-terminal beta-hairpin of protein G, were studied. First, the performance of this NLPCA method was compared with the standard linear principal component analysis (PCA). In particular, we compared two methods according to (1) the ability of the dimensionality reduction and (2) the efficient representation of peptide conformations in low-dimensional spaces spanned by the first few principal components. The study revealed that NLPCA reduces the dimensionality of the considered systems much better, than did PCA. For example, in order to get the similar error, which is due to representation of the original data of beta-hairpin in low dimensional space, one needs 4 and 21 principal components of NLPCA and PCA, respectively. Second, by representing the free energy landscapes of the considered systems as a function of the first two principal components obtained from PCA, we obtained the relatively well-structured free energy landscapes. In contrast, the free energy landscapes of NLPCA are much more complicated, exhibiting many states which are hidden in the PCA maps, especially in the unfolded regions. Furthermore, the study also showed that many states in the PCA maps are mixed up by several peptide conformations, while those of the NLPCA maps are more pure. This finding suggests that the NLPCA should be used to capture the essential features of the systems. (c) 2006 Wiley-Liss, Inc.

  20. Application of Principal Component Analysis to NIR Spectra of Phyllosilicates: A Technique for Identifying Phyllosilicates on Mars

    NASA Technical Reports Server (NTRS)

    Rampe, E. B.; Lanza, N. L.

    2012-01-01

    Orbital near-infrared (NIR) reflectance spectra of the martian surface from the OMEGA and CRISM instruments have identified a variety of phyllosilicates in Noachian terrains. The types of phyllosilicates present on Mars have important implications for the aqueous environments in which they formed, and, thus, for recognizing locales that may have been habitable. Current identifications of phyllosilicates from martian NIR data are based on the positions of spectral absorptions relative to laboratory data of well-characterized samples and from spectral ratios; however, some phyllosilicates can be difficult to distinguish from one another with these methods (i.e. illite vs. muscovite). Here we employ a multivariate statistical technique, principal component analysis (PCA), to differentiate between spectrally similar phyllosilicate minerals. PCA is commonly used in a variety of industries (pharmaceutical, agricultural, viticultural) to discriminate between samples. Previous work using PCA to analyze raw NIR reflectance data from mineral mixtures has shown that this is a viable technique for identifying mineral types, abundances, and particle sizes. Here, we evaluate PCA of second-derivative NIR reflectance data as a method for classifying phyllosilicates and test whether this method can be used to identify phyllosilicates on Mars.

  1. Characterizing Variability of Modular Brain Connectivity with Constrained Principal Component Analysis

    PubMed Central

    Hirayama, Jun-ichiro; Hyvärinen, Aapo; Kiviniemi, Vesa; Kawanabe, Motoaki; Yamashita, Okito

    2016-01-01

    Characterizing the variability of resting-state functional brain connectivity across subjects and/or over time has recently attracted much attention. Principal component analysis (PCA) serves as a fundamental statistical technique for such analyses. However, performing PCA on high-dimensional connectivity matrices yields complicated “eigenconnectivity” patterns, for which systematic interpretation is a challenging issue. Here, we overcome this issue with a novel constrained PCA method for connectivity matrices by extending the idea of the previously proposed orthogonal connectivity factorization method. Our new method, modular connectivity factorization (MCF), explicitly introduces the modularity of brain networks as a parametric constraint on eigenconnectivity matrices. In particular, MCF analyzes the variability in both intra- and inter-module connectivities, simultaneously finding network modules in a principled, data-driven manner. The parametric constraint provides a compact module-based visualization scheme with which the result can be intuitively interpreted. We develop an optimization algorithm to solve the constrained PCA problem and validate our method in simulation studies and with a resting-state functional connectivity MRI dataset of 986 subjects. The results show that the proposed MCF method successfully reveals the underlying modular eigenconnectivity patterns in more general situations and is a promising alternative to existing methods. PMID:28002474

  2. Spatial assessment of water quality using chemometrics in the Pearl River Estuary, China

    NASA Astrophysics Data System (ADS)

    Wu, Meilin; Wang, Youshao; Dong, Junde; Sun, Fulin; Wang, Yutu; Hong, Yiguo

    2017-03-01

    A cruise was commissioned in the summer of 2009 to evaluate water quality in the Pearl River Estuary (PRE). Chemometrics such as Principal Component Analysis (PCA), Cluster analysis (CA) and Self-Organizing Map (SOM) were employed to identify anthropogenic and natural influences on estuary water quality. The scores of stations in the surface layer in the first principal component (PC1) were related to NH4-N, PO4-P, NO2-N, NO3-N, TP, and Chlorophyll a while salinity, turbidity, and SiO3-Si in the second principal component (PC2). Similarly, the scores of stations in the bottom layers in PC1 were related to PO4-P, NO2-N, NO3-N, and TP, while salinity, Chlorophyll a, NH4-N, and SiO3-Si in PC2. Results of the PCA identified the spatial distribution of the surface and bottom water quality, namely the Guangzhou urban reach, Middle reach, and Lower reach of the estuary. Both cluster analysis and PCA produced the similar results. Self-organizing map delineated the Guangzhou urban reach of the Pearl River that was mainly influenced by human activities. The middle and lower reaches of the PRE were mainly influenced by the waters in the South China Sea. The information extracted by PCA, CA, and SOM would be very useful to regional agencies in developing a strategy to carry out scientific plans for resource use based on marine system functions.

  3. A Study of Wind Turbine Comprehensive Operational Assessment Model Based on EM-PCA Algorithm

    NASA Astrophysics Data System (ADS)

    Zhou, Minqiang; Xu, Bin; Zhan, Yangyan; Ren, Danyuan; Liu, Dexing

    2018-01-01

    To assess wind turbine performance accurately and provide theoretical basis for wind farm management, a hybrid assessment model based on Entropy Method and Principle Component Analysis (EM-PCA) was established, which took most factors of operational performance into consideration and reach to a comprehensive result. To verify the model, six wind turbines were chosen as the research objects, the ranking obtained by the method proposed in the paper were 4#>6#>1#>5#>2#>3#, which are completely in conformity with the theoretical ranking, which indicates that the reliability and effectiveness of the EM-PCA method are high. The method could give guidance for processing unit state comparison among different units and launching wind farm operational assessment.

  4. Power line identification of millimeter wave radar based on PCA-GS-SVM

    NASA Astrophysics Data System (ADS)

    Fang, Fang; Zhang, Guifeng; Cheng, Yansheng

    2017-12-01

    Aiming at the problem that the existing detection method can not effectively solve the security of UAV's ultra low altitude flight caused by power line, a power line recognition method based on grid search (GS) and the principal component analysis and support vector machine (PCA-SVM) is proposed. Firstly, the candidate line of Hough transform is reduced by PCA, and the main feature of candidate line is extracted. Then, upport vector machine (SVM is) optimized by grid search method (GS). Finally, using support vector machine classifier optimized parameters to classify the candidate line. MATLAB simulation results show that this method can effectively identify the power line and noise, and has high recognition accuracy and algorithm efficiency.

  5. EMPCA and Cluster Analysis of Quasar Spectra: Construction and Application to Simulated Spectra

    NASA Astrophysics Data System (ADS)

    Marrs, Adam; Leighly, Karen; Wagner, Cassidy; Macinnis, Francis

    2017-01-01

    Quasars have complex spectra with emission lines influenced by many factors. Therefore, to fully describe the spectrum requires specification of a large number of parameters, such as line equivalent width, blueshift, and ratios. Principal Component Analysis (PCA) aims to construct eigenvectors-or principal components-from the data with the goal of finding a few key parameters that can be used to predict the rest of the spectrum fairly well. Analysis of simulated quasar spectra was used to verify and justify our modified application of PCA.We used a variant of PCA called Weighted Expectation Maximization PCA (EMPCA; Bailey 2012) along with k-means cluster analysis to analyze simulated quasar spectra. Our approach combines both analytical methods to address two known problems with classical PCA. EMPCA uses weights to account for uncertainty and missing points in the spectra. K-means groups similar spectra together to address the nonlinearity of quasar spectra, specifically variance in blueshifts and widths of the emission lines.In producing and analyzing simulations, we first tested the effects of varying equivalent widths and blueshifts on the derived principal components, and explored the differences between standard PCA and EMPCA. We also tested the effects of varying signal-to-noise ratio. Next we used the results of fits to composite quasar spectra (see accompanying poster by Wagner et al.) to construct a set of realistic simulated spectra, and subjected those spectra to the EMPCA /k-means analysis. We concluded that our approach was validated when we found that the mean spectra from our k-means clusters derived from PCA projection coefficients reproduced the trends observed in the composite spectra.Furthermore, our method needed only two eigenvectors to identify both sets of correlations used to construct the simulations, as well as indicating the linear and nonlinear segments. Comparing this to regular PCA, which can require a dozen or more components, or to direct spectral analysis that may need measurement of 20 fit parameters, shows why the dual application of these two techniques is such a powerful tool.

  6. Fault Detection of Bearing Systems through EEMD and Optimization Algorithm

    PubMed Central

    Lee, Dong-Han; Ahn, Jong-Hyo; Koh, Bong-Hwan

    2017-01-01

    This study proposes a fault detection and diagnosis method for bearing systems using ensemble empirical mode decomposition (EEMD) based feature extraction, in conjunction with particle swarm optimization (PSO), principal component analysis (PCA), and Isomap. First, a mathematical model is assumed to generate vibration signals from damaged bearing components, such as the inner-race, outer-race, and rolling elements. The process of decomposing vibration signals into intrinsic mode functions (IMFs) and extracting statistical features is introduced to develop a damage-sensitive parameter vector. Finally, PCA and Isomap algorithm are used to classify and visualize this parameter vector, to separate damage characteristics from healthy bearing components. Moreover, the PSO-based optimization algorithm improves the classification performance by selecting proper weightings for the parameter vector, to maximize the visualization effect of separating and grouping of parameter vectors in three-dimensional space. PMID:29143772

  7. Time-oriented hierarchical method for computation of principal components using subspace learning algorithm.

    PubMed

    Jankovic, Marko; Ogawa, Hidemitsu

    2004-10-01

    Principal Component Analysis (PCA) and Principal Subspace Analysis (PSA) are classic techniques in statistical data analysis, feature extraction and data compression. Given a set of multivariate measurements, PCA and PSA provide a smaller set of "basis vectors" with less redundancy, and a subspace spanned by them, respectively. Artificial neurons and neural networks have been shown to perform PSA and PCA when gradient ascent (descent) learning rules are used, which is related to the constrained maximization (minimization) of statistical objective functions. Due to their low complexity, such algorithms and their implementation in neural networks are potentially useful in cases of tracking slow changes of correlations in the input data or in updating eigenvectors with new samples. In this paper we propose PCA learning algorithm that is fully homogeneous with respect to neurons. The algorithm is obtained by modification of one of the most famous PSA learning algorithms--Subspace Learning Algorithm (SLA). Modification of the algorithm is based on Time-Oriented Hierarchical Method (TOHM). The method uses two distinct time scales. On a faster time scale PSA algorithm is responsible for the "behavior" of all output neurons. On a slower scale, output neurons will compete for fulfillment of their "own interests". On this scale, basis vectors in the principal subspace are rotated toward the principal eigenvectors. At the end of the paper it will be briefly analyzed how (or why) time-oriented hierarchical method can be used for transformation of any of the existing neural network PSA method, into PCA method.

  8. Receptor modeling for source apportionment of polycyclic aromatic hydrocarbons in urban atmosphere.

    PubMed

    Singh, Kunwar P; Malik, Amrita; Kumar, Ranjan; Saxena, Puneet; Sinha, Sarita

    2008-01-01

    This study reports source apportionment of polycyclic aromatic hydrocarbons (PAHs) in particulate depositions on vegetation foliages near highway in the urban environment of Lucknow city (India) using the principal components analysis/absolute principal components scores (PCA/APCS) receptor modeling approach. The multivariate method enables identification of major PAHs sources along with their quantitative contributions with respect to individual PAH. The PCA identified three major sources of PAHs viz. combustion, vehicular emissions, and diesel based activities. The PCA/APCS receptor modeling approach revealed that the combustion sources (natural gas, wood, coal/coke, biomass) contributed 19-97% of various PAHs, vehicular emissions 0-70%, diesel based sources 0-81% and other miscellaneous sources 0-20% of different PAHs. The contributions of major pyrolytic and petrogenic sources to the total PAHs were 56 and 42%, respectively. Further, the combustion related sources contribute major fraction of the carcinogenic PAHs in the study area. High correlation coefficient (R2 > 0.75 for most PAHs) between the measured and predicted concentrations of PAHs suggests for the applicability of the PCA/APCS receptor modeling approach for estimation of source contribution to the PAHs in particulates.

  9. Analyzing brain networks with PCA and conditional Granger causality.

    PubMed

    Zhou, Zhenyu; Chen, Yonghong; Ding, Mingzhou; Wright, Paul; Lu, Zuhong; Liu, Yijun

    2009-07-01

    Identifying directional influences in anatomical and functional circuits presents one of the greatest challenges for understanding neural computations in the brain. Granger causality mapping (GCM) derived from vector autoregressive models of data has been employed for this purpose, revealing complex temporal and spatial dynamics underlying cognitive processes. However, the traditional GCM methods are computationally expensive, as signals from thousands of voxels within selected regions of interest (ROIs) are individually processed, and being based on pairwise Granger causality, they lack the ability to distinguish direct from indirect connectivity among brain regions. In this work a new algorithm called PCA based conditional GCM is proposed to overcome these problems. The algorithm implements the following two procedures: (i) dimensionality reduction in ROIs of interest with principle component analysis (PCA), and (ii) estimation of the direct causal influences in local brain networks, using conditional Granger causality. Our results show that the proposed method achieves greater accuracy in detecting network connectivity than the commonly used pairwise Granger causality method. Furthermore, the use of PCA components in conjunction with conditional GCM greatly reduces the computational cost relative to the use of individual voxel time series. Copyright 2009 Wiley-Liss, Inc

  10. [Research on spectra recognition method for cabbages and weeds based on PCA and SIMCA].

    PubMed

    Zu, Qin; Deng, Wei; Wang, Xiu; Zhao, Chun-Jiang

    2013-10-01

    In order to improve the accuracy and efficiency of weed identification, the difference of spectral reflectance was employed to distinguish between crops and weeds. Firstly, the different combinations of Savitzky-Golay (SG) convolutional derivation and multiplicative scattering correction (MSC) method were applied to preprocess the raw spectral data. Then the clustering analysis of various types of plants was completed by using principal component analysis (PCA) method, and the feature wavelengths which were sensitive for classifying various types of plants were extracted according to the corresponding loading plots of the optimal principal components in PCA results. Finally, setting the feature wavelengths as the input variables, the soft independent modeling of class analogy (SIMCA) classification method was used to identify the various types of plants. The experimental results of classifying cabbages and weeds showed that on the basis of the optimal pretreatment by a synthetic application of MSC and SG convolutional derivation with SG's parameters set as 1rd order derivation, 3th degree polynomial and 51 smoothing points, 23 feature wavelengths were extracted in accordance with the top three principal components in PCA results. When SIMCA method was used for classification while the previously selected 23 feature wavelengths were set as the input variables, the classification rates of the modeling set and the prediction set were respectively up to 98.6% and 100%.

  11. Research on Air Quality Evaluation based on Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Wang, Xing; Wang, Zilin; Guo, Min; Chen, Wei; Zhang, Huan

    2018-01-01

    Economic growth has led to environmental capacity decline and the deterioration of air quality. Air quality evaluation as a fundamental of environmental monitoring and air pollution control has become increasingly important. Based on the principal component analysis (PCA), this paper evaluates the air quality of a large city in Beijing-Tianjin-Hebei Area in recent 10 years and identifies influencing factors, in order to provide reference to air quality management and air pollution control.

  12. Strain Transient Detection Techniques: A Comparison of Source Parameter Inversions of Signals Isolated through Principal Component Analysis (PCA), Non-Linear PCA, and Rotated PCA

    NASA Astrophysics Data System (ADS)

    Lipovsky, B.; Funning, G. J.

    2009-12-01

    We compare several techniques for the analysis of geodetic time series with the ultimate aim to characterize the physical processes which are represented therein. We compare three methods for the analysis of these data: Principal Component Analysis (PCA), Non-Linear PCA (NLPCA), and Rotated PCA (RPCA). We evaluate each method by its ability to isolate signals which may be any combination of low amplitude (near noise level), temporally transient, unaccompanied by seismic emissions, and small scale with respect to the spatial domain. PCA is a powerful tool for extracting structure from large datasets which is traditionally realized through either the solution of an eigenvalue problem or through iterative methods. PCA is an transformation of the coordinate system of our data such that the new "principal" data axes retain maximal variance and minimal reconstruction error (Pearson, 1901; Hotelling, 1933). RPCA is achieved by an orthogonal transformation of the principal axes determined in PCA. In the analysis of meteorological data sets, RPCA has been seen to overcome domain shape dependencies, correct for sampling errors, and to determine principal axes which more closely represent physical processes (e.g., Richman, 1986). NLPCA generalizes PCA such that principal axes are replaced by principal curves (e.g., Hsieh 2004). We achieve NLPCA through an auto-associative feed-forward neural network (Scholz, 2005). We show the geophysical relevance of these techniques by application of each to a synthetic data set. Results are compared by inverting principal axes to determine deformation source parameters. Temporal variability in source parameters, estimated by each method, are also compared.

  13. Structured Sparse Principal Components Analysis With the TV-Elastic Net Penalty.

    PubMed

    de Pierrefeu, Amicie; Lofstedt, Tommy; Hadj-Selem, Fouad; Dubois, Mathieu; Jardri, Renaud; Fovet, Thomas; Ciuciu, Philippe; Frouin, Vincent; Duchesnay, Edouard

    2018-02-01

    Principal component analysis (PCA) is an exploratory tool widely used in data analysis to uncover the dominant patterns of variability within a population. Despite its ability to represent a data set in a low-dimensional space, PCA's interpretability remains limited. Indeed, the components produced by PCA are often noisy or exhibit no visually meaningful patterns. Furthermore, the fact that the components are usually non-sparse may also impede interpretation, unless arbitrary thresholding is applied. However, in neuroimaging, it is essential to uncover clinically interpretable phenotypic markers that would account for the main variability in the brain images of a population. Recently, some alternatives to the standard PCA approach, such as sparse PCA (SPCA), have been proposed, their aim being to limit the density of the components. Nonetheless, sparsity alone does not entirely solve the interpretability problem in neuroimaging, since it may yield scattered and unstable components. We hypothesized that the incorporation of prior information regarding the structure of the data may lead to improved relevance and interpretability of brain patterns. We therefore present a simple extension of the popular PCA framework that adds structured sparsity penalties on the loading vectors in order to identify the few stable regions in the brain images that capture most of the variability. Such structured sparsity can be obtained by combining, e.g., and total variation (TV) penalties, where the TV regularization encodes information on the underlying structure of the data. This paper presents the structured SPCA (denoted SPCA-TV) optimization framework and its resolution. We demonstrate SPCA-TV's effectiveness and versatility on three different data sets. It can be applied to any kind of structured data, such as, e.g., -dimensional array images or meshes of cortical surfaces. The gains of SPCA-TV over unstructured approaches (such as SPCA and ElasticNet PCA) or structured approach (such as GraphNet PCA) are significant, since SPCA-TV reveals the variability within a data set in the form of intelligible brain patterns that are easier to interpret and more stable across different samples.

  14. Geographical classification of Epimedium based on HPLC fingerprint analysis combined with multi-ingredients quantitative analysis.

    PubMed

    Xu, Ning; Zhou, Guofu; Li, Xiaojuan; Lu, Heng; Meng, Fanyun; Zhai, Huaqiang

    2017-05-01

    A reliable and comprehensive method for identifying the origin and assessing the quality of Epimedium has been developed. The method is based on analysis of HPLC fingerprints, combined with similarity analysis, hierarchical cluster analysis (HCA), principal component analysis (PCA) and multi-ingredient quantitative analysis. Nineteen batches of Epimedium, collected from different areas in the western regions of China, were used to establish the fingerprints and 18 peaks were selected for the analysis. Similarity analysis, HCA and PCA all classified the 19 areas into three groups. Simultaneous quantification of the five major bioactive ingredients in the Epimedium samples was also carried out to confirm the consistency of the quality tests. These methods were successfully used to identify the geographical origin of the Epimedium samples and to evaluate their quality. Copyright © 2016 John Wiley & Sons, Ltd.

  15. Improved estimation of parametric images of cerebral glucose metabolic rate from dynamic FDG-PET using volume-wise principle component analysis

    NASA Astrophysics Data System (ADS)

    Dai, Xiaoqian; Tian, Jie; Chen, Zhe

    2010-03-01

    Parametric images can represent both spatial distribution and quantification of the biological and physiological parameters of tracer kinetics. The linear least square (LLS) method is a well-estimated linear regression method for generating parametric images by fitting compartment models with good computational efficiency. However, bias exists in LLS-based parameter estimates, owing to the noise present in tissue time activity curves (TTACs) that propagates as correlated error in the LLS linearized equations. To address this problem, a volume-wise principal component analysis (PCA) based method is proposed. In this method, firstly dynamic PET data are properly pre-transformed to standardize noise variance as PCA is a data driven technique and can not itself separate signals from noise. Secondly, the volume-wise PCA is applied on PET data. The signals can be mostly represented by the first few principle components (PC) and the noise is left in the subsequent PCs. Then the noise-reduced data are obtained using the first few PCs by applying 'inverse PCA'. It should also be transformed back according to the pre-transformation method used in the first step to maintain the scale of the original data set. Finally, the obtained new data set is used to generate parametric images using the linear least squares (LLS) estimation method. Compared with other noise-removal method, the proposed method can achieve high statistical reliability in the generated parametric images. The effectiveness of the method is demonstrated both with computer simulation and with clinical dynamic FDG PET study.

  16. Tool Wear Prediction in Ti-6Al-4V Machining through Multiple Sensor Monitoring and PCA Features Pattern Recognition.

    PubMed

    Caggiano, Alessandra

    2018-03-09

    Machining of titanium alloys is characterised by extremely rapid tool wear due to the high cutting temperature and the strong adhesion at the tool-chip and tool-workpiece interface, caused by the low thermal conductivity and high chemical reactivity of Ti alloys. With the aim to monitor the tool conditions during dry turning of Ti-6Al-4V alloy, a machine learning procedure based on the acquisition and processing of cutting force, acoustic emission and vibration sensor signals during turning is implemented. A number of sensorial features are extracted from the acquired sensor signals in order to feed machine learning paradigms based on artificial neural networks. To reduce the large dimensionality of the sensorial features, an advanced feature extraction methodology based on Principal Component Analysis (PCA) is proposed. PCA allowed to identify a smaller number of features ( k = 2 features), the principal component scores, obtained through linear projection of the original d features into a new space with reduced dimensionality k = 2, sufficient to describe the variance of the data. By feeding artificial neural networks with the PCA features, an accurate diagnosis of tool flank wear ( VB max ) was achieved, with predicted values very close to the measured tool wear values.

  17. Tool Wear Prediction in Ti-6Al-4V Machining through Multiple Sensor Monitoring and PCA Features Pattern Recognition

    PubMed Central

    2018-01-01

    Machining of titanium alloys is characterised by extremely rapid tool wear due to the high cutting temperature and the strong adhesion at the tool-chip and tool-workpiece interface, caused by the low thermal conductivity and high chemical reactivity of Ti alloys. With the aim to monitor the tool conditions during dry turning of Ti-6Al-4V alloy, a machine learning procedure based on the acquisition and processing of cutting force, acoustic emission and vibration sensor signals during turning is implemented. A number of sensorial features are extracted from the acquired sensor signals in order to feed machine learning paradigms based on artificial neural networks. To reduce the large dimensionality of the sensorial features, an advanced feature extraction methodology based on Principal Component Analysis (PCA) is proposed. PCA allowed to identify a smaller number of features (k = 2 features), the principal component scores, obtained through linear projection of the original d features into a new space with reduced dimensionality k = 2, sufficient to describe the variance of the data. By feeding artificial neural networks with the PCA features, an accurate diagnosis of tool flank wear (VBmax) was achieved, with predicted values very close to the measured tool wear values. PMID:29522443

  18. Classification and identification of Rhodobryum roseum Limpr. and its adulterants based on fourier-transform infrared spectroscopy (FTIR) and chemometrics.

    PubMed

    Cao, Zhen; Wang, Zhenjie; Shang, Zhonglin; Zhao, Jiancheng

    2017-01-01

    Fourier-transform infrared spectroscopy (FTIR) with the attenuated total reflectance technique was used to identify Rhodobryum roseum from its four adulterants. The FTIR spectra of six samples in the range from 4000 cm-1 to 600 cm-1 were obtained. The second-derivative transformation test was used to identify the small and nearby absorption peaks. A cluster analysis was performed to classify the spectra in a dendrogram based on the spectral similarity. Principal component analysis (PCA) was used to classify the species of six moss samples. A cluster analysis with PCA was used to identify different genera. However, some species of the same genus exhibited highly similar chemical components and FTIR spectra. Fourier self-deconvolution and discrete wavelet transform (DWT) were used to enhance the differences among the species with similar chemical components and FTIR spectra. Three scales were selected as the feature-extracting space in the DWT domain. The results show that FTIR spectroscopy with chemometrics is suitable for identifying Rhodobryum roseum and its adulterants.

  19. Discrimination of Geographical Origin of Asian Garlic Using Isotopic and Chemical Datasets under Stepwise Principal Component Analysis.

    PubMed

    Liu, Tsang-Sen; Lin, Jhen-Nan; Peng, Tsung-Ren

    2018-01-16

    Isotopic compositions of δ 2 H, δ 18 O, δ 13 C, and δ 15 N and concentrations of 22 trace elements from garlic samples were analyzed and processed with stepwise principal component analysis (PCA) to discriminate garlic's country of origin among Asian regions including South Korea, Vietnam, Taiwan, and China. Results indicate that there is no single trace-element concentration or isotopic composition that can accomplish the study's purpose and the stepwise PCA approach proposed does allow for discrimination between countries on a regional basis. Sequentially, Step-1 PCA distinguishes garlic's country of origin among Taiwanese, South Korean, and Vietnamese samples; Step-2 PCA discriminates Chinese garlic from South Korean garlic; and Step-3 and Step-4 PCA, Chinese garlic from Vietnamese garlic. In model tests, countries of origin of all audit samples were correctly discriminated by stepwise PCA. Consequently, this study demonstrates that stepwise PCA as applied is a simple and effective approach to discriminating country of origin among Asian garlics. © 2018 American Academy of Forensic Sciences.

  20. Visible micro-Raman spectroscopy of single human mammary epithelial cells exposed to x-ray radiation.

    PubMed

    Delfino, Ines; Perna, Giuseppe; Lasalvia, Maria; Capozzi, Vito; Manti, Lorenzo; Camerlingo, Carlo; Lepore, Maria

    2015-03-01

    A micro-Raman spectroscopy investigation has been performed in vitro on single human mammary epithelial cells after irradiation by graded x-ray doses. The analysis by principal component analysis (PCA) and interval-PCA (i-PCA) methods has allowed us to point out the small differences in the Raman spectra induced by irradiation. This experimental approach has enabled us to delineate radiation-induced changes in protein, nucleic acid, lipid, and carbohydrate content. In particular, the dose dependence of PCA and i-PCA components has been analyzed. Our results have confirmed that micro-Raman spectroscopy coupled to properly chosen data analysis methods is a very sensitive technique to detect early molecular changes at the single-cell level following exposure to ionizing radiation. This would help in developing innovative approaches to monitor radiation cancer radiotherapy outcome so as to reduce the overall radiation dose and minimize damage to the surrounding healthy cells, both aspects being of great importance in the field of radiation therapy.

  1. Principal component analysis of dynamic fluorescence images for diagnosis of diabetic vasculopathy

    NASA Astrophysics Data System (ADS)

    Seo, Jihye; An, Yuri; Lee, Jungsul; Ku, Taeyun; Kang, Yujung; Ahn, Chulwoo; Choi, Chulhee

    2016-04-01

    Indocyanine green (ICG) fluorescence imaging has been clinically used for noninvasive visualizations of vascular structures. We have previously developed a diagnostic system based on dynamic ICG fluorescence imaging for sensitive detection of vascular disorders. However, because high-dimensional raw data were used, the analysis of the ICG dynamics proved difficult. We used principal component analysis (PCA) in this study to extract important elements without significant loss of information. We examined ICG spatiotemporal profiles and identified critical features related to vascular disorders. PCA time courses of the first three components showed a distinct pattern in diabetic patients. Among the major components, the second principal component (PC2) represented arterial-like features. The explained variance of PC2 in diabetic patients was significantly lower than in normal controls. To visualize the spatial pattern of PCs, pixels were mapped with red, green, and blue channels. The PC2 score showed an inverse pattern between normal controls and diabetic patients. We propose that PC2 can be used as a representative bioimaging marker for the screening of vascular diseases. It may also be useful in simple extractions of arterial-like features.

  2. Principal component analysis as a tool for library design: a case study investigating natural products, brand-name drugs, natural product-like libraries, and drug-like libraries.

    PubMed

    Wenderski, Todd A; Stratton, Christopher F; Bauer, Renato A; Kopp, Felix; Tan, Derek S

    2015-01-01

    Principal component analysis (PCA) is a useful tool in the design and planning of chemical libraries. PCA can be used to reveal differences in structural and physicochemical parameters between various classes of compounds by displaying them in a convenient graphical format. Herein, we demonstrate the use of PCA to gain insight into structural features that differentiate natural products, synthetic drugs, natural product-like libraries, and drug-like libraries, and show how the results can be used to guide library design.

  3. Principal Component Analysis as a Tool for Library Design: A Case Study Investigating Natural Products, Brand-Name Drugs, Natural Product-Like Libraries, and Drug-Like Libraries

    PubMed Central

    Wenderski, Todd A.; Stratton, Christopher F.; Bauer, Renato A.; Kopp, Felix; Tan, Derek S.

    2015-01-01

    Principal component analysis (PCA) is a useful tool in the design and planning of chemical libraries. PCA can be used to reveal differences in structural and physicochemical parameters between various classes of compounds by displaying them in a convenient graphical format. Herein, we demonstrate the use of PCA to gain insight into structural features that differentiate natural products, synthetic drugs, natural product-like libraries, and drug-like libraries, and show how the results can be used to guide library design. PMID:25618349

  4. Sparse principal component analysis in medical shape modeling

    NASA Astrophysics Data System (ADS)

    Sjöstrand, Karl; Stegmann, Mikkel B.; Larsen, Rasmus

    2006-03-01

    Principal component analysis (PCA) is a widely used tool in medical image analysis for data reduction, model building, and data understanding and exploration. While PCA is a holistic approach where each new variable is a linear combination of all original variables, sparse PCA (SPCA) aims at producing easily interpreted models through sparse loadings, i.e. each new variable is a linear combination of a subset of the original variables. One of the aims of using SPCA is the possible separation of the results into isolated and easily identifiable effects. This article introduces SPCA for shape analysis in medicine. Results for three different data sets are given in relation to standard PCA and sparse PCA by simple thresholding of small loadings. Focus is on a recent algorithm for computing sparse principal components, but a review of other approaches is supplied as well. The SPCA algorithm has been implemented using Matlab and is available for download. The general behavior of the algorithm is investigated, and strengths and weaknesses are discussed. The original report on the SPCA algorithm argues that the ordering of modes is not an issue. We disagree on this point and propose several approaches to establish sensible orderings. A method that orders modes by decreasing variance and maximizes the sum of variances for all modes is presented and investigated in detail.

  5. Assets as a Socioeconomic Status Index: Categorical Principal Components Analysis vs. Latent Class Analysis.

    PubMed

    Sartipi, Majid; Nedjat, Saharnaz; Mansournia, Mohammad Ali; Baigi, Vali; Fotouhi, Akbar

    2016-11-01

    Some variables like Socioeconomic Status (SES) cannot be directly measured, instead, so-called 'latent variables' are measured indirectly through calculating tangible items. There are different methods for measuring latent variables such as data reduction methods e.g. Principal Components Analysis (PCA) and Latent Class Analysis (LCA). The purpose of our study was to measure assets index- as a representative of SES- through two methods of Non-Linear PCA (NLPCA) and LCA, and to compare them for choosing the most appropriate model. This was a cross sectional study in which 1995 respondents filled the questionnaires about their assets in Tehran. The data were analyzed by SPSS 19 (CATPCA command) and SAS 9.2 (PROC LCA command) to estimate their socioeconomic status. The results were compared based on the Intra-class Correlation Coefficient (ICC). The 6 derived classes from LCA based on BIC, were highly consistent with the 6 classes from CATPCA (Categorical PCA) (ICC = 0.87, 95%CI: 0.86 - 0.88). There is no gold standard to measure SES. Therefore, it is not possible to definitely say that a specific method is better than another one. LCA is a complicated method that presents detailed information about latent variables and required one assumption (local independency), while NLPCA is a simple method, which requires more assumptions. Generally, NLPCA seems to be an acceptable method of analysis because of its simplicity and high agreement with LCA.

  6. Face-space architectures: evidence for the use of independent color-based features.

    PubMed

    Nestor, Adrian; Plaut, David C; Behrmann, Marlene

    2013-07-01

    The concept of psychological face space lies at the core of many theories of face recognition and representation. To date, much of the understanding of face space has been based on principal component analysis (PCA); the structure of the psychological space is thought to reflect some important aspects of a physical face space characterized by PCA applications to face images. In the present experiments, we investigated alternative accounts of face space and found that independent component analysis provided the best fit to human judgments of face similarity and identification. Thus, our results challenge an influential approach to the study of human face space and provide evidence for the role of statistically independent features in face encoding. In addition, our findings support the use of color information in the representation of facial identity, and we thus argue for the inclusion of such information in theoretical and computational constructs of face space.

  7. An improved principal component analysis based region matching method for fringe direction estimation

    NASA Astrophysics Data System (ADS)

    He, A.; Quan, C.

    2018-04-01

    The principal component analysis (PCA) and region matching combined method is effective for fringe direction estimation. However, its mask construction algorithm for region matching fails in some circumstances, and the algorithm for conversion of orientation to direction in mask areas is computationally-heavy and non-optimized. We propose an improved PCA based region matching method for the fringe direction estimation, which includes an improved and robust mask construction scheme, and a fast and optimized orientation-direction conversion algorithm for the mask areas. Along with the estimated fringe direction map, filtered fringe pattern by automatic selective reconstruction modification and enhanced fast empirical mode decomposition (ASRm-EFEMD) is used for Hilbert spiral transform (HST) to demodulate the phase. Subsequently, windowed Fourier ridge (WFR) method is used for the refinement of the phase. The robustness and effectiveness of proposed method are demonstrated by both simulated and experimental fringe patterns.

  8. Optimized principal component analysis on coronagraphic images of the fomalhaut system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meshkat, Tiffany; Kenworthy, Matthew A.; Quanz, Sascha P.

    We present the results of a study to optimize the principal component analysis (PCA) algorithm for planet detection, a new algorithm complementing angular differential imaging and locally optimized combination of images (LOCI) for increasing the contrast achievable next to a bright star. The stellar point spread function (PSF) is constructed by removing linear combinations of principal components, allowing the flux from an extrasolar planet to shine through. The number of principal components used determines how well the stellar PSF is globally modeled. Using more principal components may decrease the number of speckles in the final image, but also increases themore » background noise. We apply PCA to Fomalhaut Very Large Telescope NaCo images acquired at 4.05 μm with an apodized phase plate. We do not detect any companions, with a model dependent upper mass limit of 13-18 M {sub Jup} from 4-10 AU. PCA achieves greater sensitivity than the LOCI algorithm for the Fomalhaut coronagraphic data by up to 1 mag. We make several adaptations to the PCA code and determine which of these prove the most effective at maximizing the signal-to-noise from a planet very close to its parent star. We demonstrate that optimizing the number of principal components used in PCA proves most effective for pulling out a planet signal.« less

  9. Generation of Boundary Manikin Anthropometry

    NASA Technical Reports Server (NTRS)

    Young, Karen S.; Margerum, Sarah; Barr, Abbe; Ferrer, Mike A.; Rajulu, Sudhakar

    2008-01-01

    The purpose of this study was to develop 3D digital boundary manikins that are representative of the anthropometry of a unique population. These digital manikins can be used by designers to verify and validate that the components of the spacesuit design satisfy the requirements specified in the Human Systems Integration Requirements (HSIR) document. Currently, the HSIR requires the suit to accommodate the 1st percentile American female to the 99th percentile American male. The manikin anthropometry was derived using two methods: Principal Component Analysis (PCA) and Whole Body Posture Based Analysis (WBPBA). PCA is a statistical method for reducing a multidimensional data set by using eigenvectors and eigenvalues. The goal is to create a reduced data set that encapsulates the majority of the variation in the population. WBPBA is a multivariate analytical approach that was developed by the Anthropometry and Biomechanics Facility (ABF) to identify the extremes of the population for a given body posture. WBPBA is a simulation-based method that finds extremes in a population based on anthropometry and posture whereas PCA is based solely on anthropometry. Both methods yield a list of subjects and their anthropometry from the target population; PCA resulted in 20 female and 22 male subjects anthropometry and WBPBA resulted in 7 subjects' anthropometry representing the extreme subjects in the target population. The subjects anthropometry is then used to 'morph' a baseline digital scan of a person with the same body type to create a 3D digital model that can be used as a tool for designers, the details of which will be discussed in subsequent papers.

  10. Principal component analysis-based pattern analysis of dose-volume histograms and influence on rectal toxicity.

    PubMed

    Söhn, Matthias; Alber, Markus; Yan, Di

    2007-09-01

    The variability of dose-volume histogram (DVH) shapes in a patient population can be quantified using principal component analysis (PCA). We applied this to rectal DVHs of prostate cancer patients and investigated the correlation of the PCA parameters with late bleeding. PCA was applied to the rectal wall DVHs of 262 patients, who had been treated with a four-field box, conformal adaptive radiotherapy technique. The correlated changes in the DVH pattern were revealed as "eigenmodes," which were ordered by their importance to represent data set variability. Each DVH is uniquely characterized by its principal components (PCs). The correlation of the first three PCs and chronic rectal bleeding of Grade 2 or greater was investigated with uni- and multivariate logistic regression analyses. Rectal wall DVHs in four-field conformal RT can primarily be represented by the first two or three PCs, which describe approximately 94% or 96% of the DVH shape variability, respectively. The first eigenmode models the total irradiated rectal volume; thus, PC1 correlates to the mean dose. Mode 2 describes the interpatient differences of the relative rectal volume in the two- or four-field overlap region. Mode 3 reveals correlations of volumes with intermediate doses ( approximately 40-45 Gy) and volumes with doses >70 Gy; thus, PC3 is associated with the maximal dose. According to univariate logistic regression analysis, only PC2 correlated significantly with toxicity. However, multivariate logistic regression analysis with the first two or three PCs revealed an increased probability of bleeding for DVHs with more than one large PC. PCA can reveal the correlation structure of DVHs for a patient population as imposed by the treatment technique and provide information about its relationship to toxicity. It proves useful for augmenting normal tissue complication probability modeling approaches.

  11. Principal component analysis on a torus: Theory and application to protein dynamics.

    PubMed

    Sittel, Florian; Filk, Thomas; Stock, Gerhard

    2017-12-28

    A dimensionality reduction method for high-dimensional circular data is developed, which is based on a principal component analysis (PCA) of data points on a torus. Adopting a geometrical view of PCA, various distance measures on a torus are introduced and the associated problem of projecting data onto the principal subspaces is discussed. The main idea is that the (periodicity-induced) projection error can be minimized by transforming the data such that the maximal gap of the sampling is shifted to the periodic boundary. In a second step, the covariance matrix and its eigendecomposition can be computed in a standard manner. Adopting molecular dynamics simulations of two well-established biomolecular systems (Aib 9 and villin headpiece), the potential of the method to analyze the dynamics of backbone dihedral angles is demonstrated. The new approach allows for a robust and well-defined construction of metastable states and provides low-dimensional reaction coordinates that accurately describe the free energy landscape. Moreover, it offers a direct interpretation of covariances and principal components in terms of the angular variables. Apart from its application to PCA, the method of maximal gap shifting is general and can be applied to any other dimensionality reduction method for circular data.

  12. Principal component analysis on a torus: Theory and application to protein dynamics

    NASA Astrophysics Data System (ADS)

    Sittel, Florian; Filk, Thomas; Stock, Gerhard

    2017-12-01

    A dimensionality reduction method for high-dimensional circular data is developed, which is based on a principal component analysis (PCA) of data points on a torus. Adopting a geometrical view of PCA, various distance measures on a torus are introduced and the associated problem of projecting data onto the principal subspaces is discussed. The main idea is that the (periodicity-induced) projection error can be minimized by transforming the data such that the maximal gap of the sampling is shifted to the periodic boundary. In a second step, the covariance matrix and its eigendecomposition can be computed in a standard manner. Adopting molecular dynamics simulations of two well-established biomolecular systems (Aib9 and villin headpiece), the potential of the method to analyze the dynamics of backbone dihedral angles is demonstrated. The new approach allows for a robust and well-defined construction of metastable states and provides low-dimensional reaction coordinates that accurately describe the free energy landscape. Moreover, it offers a direct interpretation of covariances and principal components in terms of the angular variables. Apart from its application to PCA, the method of maximal gap shifting is general and can be applied to any other dimensionality reduction method for circular data.

  13. A PCA-Based method for determining craniofacial relationship and sexual dimorphism of facial shapes.

    PubMed

    Shui, Wuyang; Zhou, Mingquan; Maddock, Steve; He, Taiping; Wang, Xingce; Deng, Qingqiong

    2017-11-01

    Previous studies have used principal component analysis (PCA) to investigate the craniofacial relationship, as well as sex determination using facial factors. However, few studies have investigated the extent to which the choice of principal components (PCs) affects the analysis of craniofacial relationship and sexual dimorphism. In this paper, we propose a PCA-based method for visual and quantitative analysis, using 140 samples of 3D heads (70 male and 70 female), produced from computed tomography (CT) images. There are two parts to the method. First, skull and facial landmarks are manually marked to guide the model's registration so that dense corresponding vertices occupy the same relative position in every sample. Statistical shape spaces of the skull and face in dense corresponding vertices are constructed using PCA. Variations in these vertices, captured in every principal component (PC), are visualized to observe shape variability. The correlations of skull- and face-based PC scores are analysed, and linear regression is used to fit the craniofacial relationship. We compute the PC coefficients of a face based on this craniofacial relationship and the PC scores of a skull, and apply the coefficients to estimate a 3D face for the skull. To evaluate the accuracy of the computed craniofacial relationship, the mean and standard deviation of every vertex between the two models are computed, where these models are reconstructed using real PC scores and coefficients. Second, each PC in facial space is analysed for sex determination, for which support vector machines (SVMs) are used. We examined the correlation between PCs and sex, and explored the extent to which the choice of PCs affects the expression of sexual dimorphism. Our results suggest that skull- and face-based PCs can be used to describe the craniofacial relationship and that the accuracy of the method can be improved by using an increased number of face-based PCs. The results show that the accuracy of the sex classification is related to the choice of PCs. The highest sex classification rate is 91.43% using our method. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Breast Shape Analysis With Curvature Estimates and Principal Component Analysis for Cosmetic and Reconstructive Breast Surgery.

    PubMed

    Catanuto, Giuseppe; Taher, Wafa; Rocco, Nicola; Catalano, Francesca; Allegra, Dario; Milotta, Filippo Luigi Maria; Stanco, Filippo; Gallo, Giovanni; Nava, Maurizio Bruno

    2018-03-20

    Breast shape is defined utilizing mainly qualitative assessment (full, flat, ptotic) or estimates, such as volume or distances between reference points, that cannot describe it reliably. We will quantitatively describe breast shape with two parameters derived from a statistical methodology denominated principal component analysis (PCA). We created a heterogeneous dataset of breast shapes acquired with a commercial infrared 3-dimensional scanner on which PCA was performed. We plotted on a Cartesian plane the two highest values of PCA for each breast (principal components 1 and 2). Testing of the methodology on a preoperative and postoperative surgical case and test-retest was performed by two operators. The first two principal components derived from PCA are able to characterize the shape of the breast included in the dataset. The test-retest demonstrated that different operators are able to obtain very similar values of PCA. The system is also able to identify major changes in the preoperative and postoperative stages of a two-stage reconstruction. Even minor changes were correctly detected by the system. This methodology can reliably describe the shape of a breast. An expert operator and a newly trained operator can reach similar results in a test/re-testing validation. Once developed and after further validation, this methodology could be employed as a good tool for outcome evaluation, auditing, and benchmarking.

  15. Morphological analysis of Trichomycterus areolatus Valenciennes, 1846 from southern Chilean rivers using a truss-based system (Siluriformes, Trichomycteridae)

    PubMed Central

    Colihueque, Nelson; Corrales, Olga; Yáñez, Miguel

    2017-01-01

    Abstract Trichomycterus areolatus Valenciennes, 1846 is a small endemic catfish inhabiting the Andean river basins of Chile. In this study, the morphological variability of three T. areolatus populations, collected in two river basins from southern Chile, was assessed with multivariate analyses, including principal component analysis (PCA) and discriminant function analysis (DFA). It is hypothesized that populations must segregate morphologically from each other based on the river basin that they were sampled from, since each basin presents relatively particular hydrological characteristics. Significant morphological differences among the three populations were found with PCA (ANOSIM test, r = 0.552, p < 0.0001) and DFA (Wilks’s λ = 0.036, p < 0.01). PCA accounted for a total variation of 56.16% by the first two principal components. The first Principal Component (PC1) and PC2 explained 34.72 and 21.44% of the total variation, respectively. The scatter-plot of the first two discriminant functions (DF1 on DF2) also validated the existence of three different populations. In group classification using DFA, 93.3% of the specimens were correctly-classified into their original populations. Of the total of 22 transformed truss measurements, 17 exhibited highly significant (p < 0.01) differences among populations. The data support the existence of T. areolatus morphological variation across different rivers in southern Chile, likely reflecting the geographic isolation underlying population structure of the species. PMID:29134012

  16. Human Classification Based on Gestural Motions by Using Components of PCA

    NASA Astrophysics Data System (ADS)

    Aziz, Azri A.; Wan, Khairunizam; Za'aba, S. K.; B, Shahriman A.; Adnan, Nazrul H.; H, Asyekin; R, Zuradzman M.

    2013-12-01

    Lately, a study of human capabilities with the aim to be integrated into machine is the famous topic to be discussed. Moreover, human are bless with special abilities that they can hear, see, sense, speak, think and understand each other. Giving such abilities to machine for improvement of human life is researcher's aim for better quality of life in the future. This research was concentrating on human gesture, specifically arm motions for differencing the individuality which lead to the development of the hand gesture database. We try to differentiate the human physical characteristic based on hand gesture represented by arm trajectories. Subjects are selected from different type of the body sizes, and then acquired data undergo resampling process. The results discuss the classification of human based on arm trajectories by using Principle Component Analysis (PCA).

  17. Fast, Exact Bootstrap Principal Component Analysis for p > 1 million

    PubMed Central

    Fisher, Aaron; Caffo, Brian; Schwartz, Brian; Zipunnikov, Vadim

    2015-01-01

    Many have suggested a bootstrap procedure for estimating the sampling variability of principal component analysis (PCA) results. However, when the number of measurements per subject (p) is much larger than the number of subjects (n), calculating and storing the leading principal components from each bootstrap sample can be computationally infeasible. To address this, we outline methods for fast, exact calculation of bootstrap principal components, eigenvalues, and scores. Our methods leverage the fact that all bootstrap samples occupy the same n-dimensional subspace as the original sample. As a result, all bootstrap principal components are limited to the same n-dimensional subspace and can be efficiently represented by their low dimensional coordinates in that subspace. Several uncertainty metrics can be computed solely based on the bootstrap distribution of these low dimensional coordinates, without calculating or storing the p-dimensional bootstrap components. Fast bootstrap PCA is applied to a dataset of sleep electroencephalogram recordings (p = 900, n = 392), and to a dataset of brain magnetic resonance images (MRIs) (p ≈ 3 million, n = 352). For the MRI dataset, our method allows for standard errors for the first 3 principal components based on 1000 bootstrap samples to be calculated on a standard laptop in 47 minutes, as opposed to approximately 4 days with standard methods. PMID:27616801

  18. Stream-based Hebbian eigenfilter for real-time neuronal spike discrimination

    PubMed Central

    2012-01-01

    Background Principal component analysis (PCA) has been widely employed for automatic neuronal spike sorting. Calculating principal components (PCs) is computationally expensive, and requires complex numerical operations and large memory resources. Substantial hardware resources are therefore needed for hardware implementations of PCA. General Hebbian algorithm (GHA) has been proposed for calculating PCs of neuronal spikes in our previous work, which eliminates the needs of computationally expensive covariance analysis and eigenvalue decomposition in conventional PCA algorithms. However, large memory resources are still inherently required for storing a large volume of aligned spikes for training PCs. The large size memory will consume large hardware resources and contribute significant power dissipation, which make GHA difficult to be implemented in portable or implantable multi-channel recording micro-systems. Method In this paper, we present a new algorithm for PCA-based spike sorting based on GHA, namely stream-based Hebbian eigenfilter, which eliminates the inherent memory requirements of GHA while keeping the accuracy of spike sorting by utilizing the pseudo-stationarity of neuronal spikes. Because of the reduction of large hardware storage requirements, the proposed algorithm can lead to ultra-low hardware resources and power consumption of hardware implementations, which is critical for the future multi-channel micro-systems. Both clinical and synthetic neural recording data sets were employed for evaluating the accuracy of the stream-based Hebbian eigenfilter. The performance of spike sorting using stream-based eigenfilter and the computational complexity of the eigenfilter were rigorously evaluated and compared with conventional PCA algorithms. Field programmable logic arrays (FPGAs) were employed to implement the proposed algorithm, evaluate the hardware implementations and demonstrate the reduction in both power consumption and hardware memories achieved by the streaming computing Results and discussion Results demonstrate that the stream-based eigenfilter can achieve the same accuracy and is 10 times more computationally efficient when compared with conventional PCA algorithms. Hardware evaluations show that 90.3% logic resources, 95.1% power consumption and 86.8% computing latency can be reduced by the stream-based eigenfilter when compared with PCA hardware. By utilizing the streaming method, 92% memory resources and 67% power consumption can be saved when compared with the direct implementation of GHA. Conclusion Stream-based Hebbian eigenfilter presents a novel approach to enable real-time spike sorting with reduced computational complexity and hardware costs. This new design can be further utilized for multi-channel neuro-physiological experiments or chronic implants. PMID:22490725

  19. High Accuracy Passive Magnetic Field-Based Localization for Feedback Control Using Principal Component Analysis.

    PubMed

    Foong, Shaohui; Sun, Zhenglong

    2016-08-12

    In this paper, a novel magnetic field-based sensing system employing statistically optimized concurrent multiple sensor outputs for precise field-position association and localization is presented. This method capitalizes on the independence between simultaneous spatial field measurements at multiple locations to induce unique correspondences between field and position. This single-source-multi-sensor configuration is able to achieve accurate and precise localization and tracking of translational motion without contact over large travel distances for feedback control. Principal component analysis (PCA) is used as a pseudo-linear filter to optimally reduce the dimensions of the multi-sensor output space for computationally efficient field-position mapping with artificial neural networks (ANNs). Numerical simulations are employed to investigate the effects of geometric parameters and Gaussian noise corruption on PCA assisted ANN mapping performance. Using a 9-sensor network, the sensing accuracy and closed-loop tracking performance of the proposed optimal field-based sensing system is experimentally evaluated on a linear actuator with a significantly more expensive optical encoder as a comparison.

  20. Low-rank plus sparse decomposition for exoplanet detection in direct-imaging ADI sequences. The LLSG algorithm

    NASA Astrophysics Data System (ADS)

    Gomez Gonzalez, C. A.; Absil, O.; Absil, P.-A.; Van Droogenbroeck, M.; Mawet, D.; Surdej, J.

    2016-05-01

    Context. Data processing constitutes a critical component of high-contrast exoplanet imaging. Its role is almost as important as the choice of a coronagraph or a wavefront control system, and it is intertwined with the chosen observing strategy. Among the data processing techniques for angular differential imaging (ADI), the most recent is the family of principal component analysis (PCA) based algorithms. It is a widely used statistical tool developed during the first half of the past century. PCA serves, in this case, as a subspace projection technique for constructing a reference point spread function (PSF) that can be subtracted from the science data for boosting the detectability of potential companions present in the data. Unfortunately, when building this reference PSF from the science data itself, PCA comes with certain limitations such as the sensitivity of the lower dimensional orthogonal subspace to non-Gaussian noise. Aims: Inspired by recent advances in machine learning algorithms such as robust PCA, we aim to propose a localized subspace projection technique that surpasses current PCA-based post-processing algorithms in terms of the detectability of companions at near real-time speed, a quality that will be useful for future direct imaging surveys. Methods: We used randomized low-rank approximation methods recently proposed in the machine learning literature, coupled with entry-wise thresholding to decompose an ADI image sequence locally into low-rank, sparse, and Gaussian noise components (LLSG). This local three-term decomposition separates the starlight and the associated speckle noise from the planetary signal, which mostly remains in the sparse term. We tested the performance of our new algorithm on a long ADI sequence obtained on β Pictoris with VLT/NACO. Results: Compared to a standard PCA approach, LLSG decomposition reaches a higher signal-to-noise ratio and has an overall better performance in the receiver operating characteristic space. This three-term decomposition brings a detectability boost compared to the full-frame standard PCA approach, especially in the small inner working angle region where complex speckle noise prevents PCA from discerning true companions from noise.

  1. RECENT APPLICATIONS OF SOURCE APPORTIONMENT METHODS AND RELATED NEEDS

    EPA Science Inventory

    Traditional receptor modeling studies have utilized factor analysis (like principal component analysis, PCA) and/or Chemical Mass Balance (CMB) to assess source influences. The limitations with these approaches is that PCA is qualitative and CMB requires the input of source pr...

  2. [Research on fast classification based on LIBS technology and principle component analyses].

    PubMed

    Yu, Qi; Ma, Xiao-Hong; Wang, Rui; Zhao, Hua-Feng

    2014-11-01

    Laser-induced breakdown spectroscopy (LIBS) and the principle component analysis (PCA) were combined to study aluminum alloy classification in the present article. Classification experiments were done on thirteen different kinds of standard samples of aluminum alloy which belong to 4 different types, and the results suggested that the LIBS-PCA method can be used to aluminum alloy fast classification. PCA was used to analyze the spectrum data from LIBS experiments, three principle components were figured out that contribute the most, the principle component scores of the spectrums were calculated, and the scores of the spectrums data in three-dimensional coordinates were plotted. It was found that the spectrum sample points show clear convergence phenomenon according to the type of aluminum alloy they belong to. This result ensured the three principle components and the preliminary aluminum alloy type zoning. In order to verify its accuracy, 20 different aluminum alloy samples were used to do the same experiments to verify the aluminum alloy type zoning. The experimental result showed that the spectrum sample points all located in their corresponding area of the aluminum alloy type, and this proved the correctness of the earlier aluminum alloy standard sample type zoning method. Based on this, the identification of unknown type of aluminum alloy can be done. All the experimental results showed that the accuracy of principle component analyses method based on laser-induced breakdown spectroscopy is more than 97.14%, and it can classify the different type effectively. Compared to commonly used chemical methods, laser-induced breakdown spectroscopy can do the detection of the sample in situ and fast with little sample preparation, therefore, using the method of the combination of LIBS and PCA in the areas such as quality testing and on-line industrial controlling can save a lot of time and cost, and improve the efficiency of detection greatly.

  3. Predicting timing of foot strike during running, independent of striking technique, using principal component analysis of joint angles.

    PubMed

    Osis, Sean T; Hettinga, Blayne A; Leitch, Jessica; Ferber, Reed

    2014-08-22

    As 3-dimensional (3D) motion-capture for clinical gait analysis continues to evolve, new methods must be developed to improve the detection of gait cycle events based on kinematic data. Recently, the application of principal component analysis (PCA) to gait data has shown promise in detecting important biomechanical features. Therefore, the purpose of this study was to define a new foot strike detection method for a continuum of striking techniques, by applying PCA to joint angle waveforms. In accordance with Newtonian mechanics, it was hypothesized that transient features in the sagittal-plane accelerations of the lower extremity would be linked with the impulsive application of force to the foot at foot strike. Kinematic and kinetic data from treadmill running were selected for 154 subjects, from a database of gait biomechanics. Ankle, knee and hip sagittal plane angular acceleration kinematic curves were chained together to form a row input to a PCA matrix. A linear polynomial was calculated based on PCA scores, and a 10-fold cross-validation was performed to evaluate prediction accuracy against gold-standard foot strike as determined by a 10 N rise in the vertical ground reaction force. Results show 89-94% of all predicted foot strikes were within 4 frames (20 ms) of the gold standard with the largest error being 28 ms. It is concluded that this new foot strike detection is an improvement on existing methods and can be applied regardless of whether the runner exhibits a rearfoot, midfoot, or forefoot strike pattern. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. A multifaceted independent performance analysis of facial subspace recognition algorithms.

    PubMed

    Bajwa, Usama Ijaz; Taj, Imtiaz Ahmad; Anwar, Muhammad Waqas; Wang, Xuan

    2013-01-01

    Face recognition has emerged as the fastest growing biometric technology and has expanded a lot in the last few years. Many new algorithms and commercial systems have been proposed and developed. Most of them use Principal Component Analysis (PCA) as a base for their techniques. Different and even conflicting results have been reported by researchers comparing these algorithms. The purpose of this study is to have an independent comparative analysis considering both performance and computational complexity of six appearance based face recognition algorithms namely PCA, 2DPCA, A2DPCA, (2D)(2)PCA, LPP and 2DLPP under equal working conditions. This study was motivated due to the lack of unbiased comprehensive comparative analysis of some recent subspace methods with diverse distance metric combinations. For comparison with other studies, FERET, ORL and YALE databases have been used with evaluation criteria as of FERET evaluations which closely simulate real life scenarios. A comparison of results with previous studies is performed and anomalies are reported. An important contribution of this study is that it presents the suitable performance conditions for each of the algorithms under consideration.

  5. ERS-2 SAR and IRS-1C LISS III data fusion: A PCA approach to improve remote sensing based geological interpretation

    NASA Astrophysics Data System (ADS)

    Pal, S. K.; Majumdar, T. J.; Bhattacharya, Amit K.

    Fusion of optical and synthetic aperture radar data has been attempted in the present study for mapping of various lithologic units over a part of the Singhbhum Shear Zone (SSZ) and its surroundings. ERS-2 SAR data over the study area has been enhanced using Fast Fourier Transformation (FFT) based filtering approach, and also using Frost filtering technique. Both the enhanced SAR imagery have been then separately fused with histogram equalized IRS-1C LISS III image using Principal Component Analysis (PCA) technique. Later, Feature-oriented Principal Components Selection (FPCS) technique has been applied to generate False Color Composite (FCC) images, from which corresponding geological maps have been prepared. Finally, GIS techniques have been successfully used for change detection analysis in the lithological interpretation between the published geological map and the fusion based geological maps. In general, there is good agreement between these maps over a large portion of the study area. Based on the change detection studies, few areas could be identified which need attention for further detailed ground-based geological studies.

  6. PCA as a practical indicator of OPLS-DA model reliability.

    PubMed

    Worley, Bradley; Powers, Robert

    Principal Component Analysis (PCA) and Orthogonal Projections to Latent Structures Discriminant Analysis (OPLS-DA) are powerful statistical modeling tools that provide insights into separations between experimental groups based on high-dimensional spectral measurements from NMR, MS or other analytical instrumentation. However, when used without validation, these tools may lead investigators to statistically unreliable conclusions. This danger is especially real for Partial Least Squares (PLS) and OPLS, which aggressively force separations between experimental groups. As a result, OPLS-DA is often used as an alternative method when PCA fails to expose group separation, but this practice is highly dangerous. Without rigorous validation, OPLS-DA can easily yield statistically unreliable group separation. A Monte Carlo analysis of PCA group separations and OPLS-DA cross-validation metrics was performed on NMR datasets with statistically significant separations in scores-space. A linearly increasing amount of Gaussian noise was added to each data matrix followed by the construction and validation of PCA and OPLS-DA models. With increasing added noise, the PCA scores-space distance between groups rapidly decreased and the OPLS-DA cross-validation statistics simultaneously deteriorated. A decrease in correlation between the estimated loadings (added noise) and the true (original) loadings was also observed. While the validity of the OPLS-DA model diminished with increasing added noise, the group separation in scores-space remained basically unaffected. Supported by the results of Monte Carlo analyses of PCA group separations and OPLS-DA cross-validation metrics, we provide practical guidelines and cross-validatory recommendations for reliable inference from PCA and OPLS-DA models.

  7. Multivariate analysis for scanning tunneling spectroscopy data

    NASA Astrophysics Data System (ADS)

    Yamanishi, Junsuke; Iwase, Shigeru; Ishida, Nobuyuki; Fujita, Daisuke

    2018-01-01

    We applied principal component analysis (PCA) to two-dimensional tunneling spectroscopy (2DTS) data obtained on a Si(111)-(7 × 7) surface to explore the effectiveness of multivariate analysis for interpreting 2DTS data. We demonstrated that several components that originated mainly from specific atoms at the Si(111)-(7 × 7) surface can be extracted by PCA. Furthermore, we showed that hidden components in the tunneling spectra can be decomposed (peak separation), which is difficult to achieve with normal 2DTS analysis without the support of theoretical calculations. Our analysis showed that multivariate analysis can be an additional powerful way to analyze 2DTS data and extract hidden information from a large amount of spectroscopic data.

  8. Principal Component Analysis of Thermographic Data

    NASA Technical Reports Server (NTRS)

    Winfree, William P.; Cramer, K. Elliott; Zalameda, Joseph N.; Howell, Patricia A.; Burke, Eric R.

    2015-01-01

    Principal Component Analysis (PCA) has been shown effective for reducing thermographic NDE data. While a reliable technique for enhancing the visibility of defects in thermal data, PCA can be computationally intense and time consuming when applied to the large data sets typical in thermography. Additionally, PCA can experience problems when very large defects are present (defects that dominate the field-of-view), since the calculation of the eigenvectors is now governed by the presence of the defect, not the "good" material. To increase the processing speed and to minimize the negative effects of large defects, an alternative method of PCA is being pursued where a fixed set of eigenvectors, generated from an analytic model of the thermal response of the material under examination, is used to process the thermal data from composite materials. This method has been applied for characterization of flaws.

  9. Real-time myoelectric control of a multi-fingered hand prosthesis using principal components analysis.

    PubMed

    Matrone, Giulia C; Cipriani, Christian; Carrozza, Maria Chiara; Magenes, Giovanni

    2012-06-15

    In spite of the advances made in the design of dexterous anthropomorphic hand prostheses, these sophisticated devices still lack adequate control interfaces which could allow amputees to operate them in an intuitive and close-to-natural way. In this study, an anthropomorphic five-fingered robotic hand, actuated by six motors, was used as a prosthetic hand emulator to assess the feasibility of a control approach based on Principal Components Analysis (PCA), specifically conceived to address this problem. Since it was demonstrated elsewhere that the first two principal components (PCs) can describe the whole hand configuration space sufficiently well, the controller here employed reverted the PCA algorithm and allowed to drive a multi-DoF hand by combining a two-differential channels EMG input with these two PCs. Hence, the novelty of this approach stood in the PCA application for solving the challenging problem of best mapping the EMG inputs into the degrees of freedom (DoFs) of the prosthesis. A clinically viable two DoFs myoelectric controller, exploiting two differential channels, was developed and twelve able-bodied participants, divided in two groups, volunteered to control the hand in simple grasp trials, using forearm myoelectric signals. Task completion rates and times were measured. The first objective (assessed through one group of subjects) was to understand the effectiveness of the approach; i.e., whether it is possible to drive the hand in real-time, with reasonable performance, in different grasps, also taking advantage of the direct visual feedback of the moving hand. The second objective (assessed through a different group) was to investigate the intuitiveness, and therefore to assess statistical differences in the performance throughout three consecutive days. Subjects performed several grasp, transport and release trials with differently shaped objects, by operating the hand with the myoelectric PCA-based controller. Experimental trials showed that the simultaneous use of the two differential channels paradigm was successful. This work demonstrates that the proposed two-DoFs myoelectric controller based on PCA allows to drive in real-time a prosthetic hand emulator into different prehensile patterns with excellent performance. These results open up promising possibilities for the development of intuitive, effective myoelectric hand controllers.

  10. Finger crease pattern recognition using Legendre moments and principal component analysis

    NASA Astrophysics Data System (ADS)

    Luo, Rongfang; Lin, Tusheng

    2007-03-01

    The finger joint lines defined as finger creases and its distribution can identify a person. In this paper, we propose a new finger crease pattern recognition method based on Legendre moments and principal component analysis (PCA). After obtaining the region of interest (ROI) for each finger image in the pre-processing stage, Legendre moments under Radon transform are applied to construct a moment feature matrix from the ROI, which greatly decreases the dimensionality of ROI and can represent principal components of the finger creases quite well. Then, an approach to finger crease pattern recognition is designed based on Karhunen-Loeve (K-L) transform. The method applies PCA to a moment feature matrix rather than the original image matrix to achieve the feature vector. The proposed method has been tested on a database of 824 images from 103 individuals using the nearest neighbor classifier. The accuracy up to 98.584% has been obtained when using 4 samples per class for training. The experimental results demonstrate that our proposed approach is feasible and effective in biometrics.

  11. An application of principal component analysis to the clavicle and clavicle fixation devices.

    PubMed

    Daruwalla, Zubin J; Courtis, Patrick; Fitzpatrick, Clare; Fitzpatrick, David; Mullett, Hannan

    2010-03-26

    Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories. Twenty-one high-resolution computerized tomography scans of the clavicle were reconstructed and analyzed using a specifically developed statistical software package. After performing statistical shape analysis, PCA was applied to study the factors that account for anatomical variation. The first principal component representing size accounted for 70.5 percent of anatomical variation. The addition of a further three principal components accounted for almost 87 percent. Using statistical shape analysis, clavicles in males have a greater lateral depth and are longer, wider and thicker than in females. However, the sternal angle in females is larger than in males. PCA confirmed these differences between genders but also noted that men exhibit greater variance and classified clavicles into five morphological groups. This unique approach is the first that standardizes a clavicular orientation. It provides information that is useful to both, the biomedical engineer and clinician. Other applications include implant design with regard to modifying current or designing future clavicle fixation devices. Our findings support the need for further development of clavicle fixation devices and the questioning of whether gender-specific devices are necessary.

  12. Multilevel principal component analysis (mPCA) in shape analysis: A feasibility study in medical and dental imaging.

    PubMed

    Farnell, D J J; Popat, H; Richmond, S

    2016-06-01

    Methods used in image processing should reflect any multilevel structures inherent in the image dataset or they run the risk of functioning inadequately. We wish to test the feasibility of multilevel principal components analysis (PCA) to build active shape models (ASMs) for cases relevant to medical and dental imaging. Multilevel PCA was used to carry out model fitting to sets of landmark points and it was compared to the results of "standard" (single-level) PCA. Proof of principle was tested by applying mPCA to model basic peri-oral expressions (happy, neutral, sad) approximated to the junction between the mouth/lips. Monte Carlo simulations were used to create this data which allowed exploration of practical implementation issues such as the number of landmark points, number of images, and number of groups (i.e., "expressions" for this example). To further test the robustness of the method, mPCA was subsequently applied to a dental imaging dataset utilising landmark points (placed by different clinicians) along the boundary of mandibular cortical bone in panoramic radiographs of the face. Changes of expression that varied between groups were modelled correctly at one level of the model and changes in lip width that varied within groups at another for the Monte Carlo dataset. Extreme cases in the test dataset were modelled adequately by mPCA but not by standard PCA. Similarly, variations in the shape of the cortical bone were modelled by one level of mPCA and variations between the experts at another for the panoramic radiographs dataset. Results for mPCA were found to be comparable to those of standard PCA for point-to-point errors via miss-one-out testing for this dataset. These errors reduce with increasing number of eigenvectors/values retained, as expected. We have shown that mPCA can be used in shape models for dental and medical image processing. mPCA was found to provide more control and flexibility when compared to standard "single-level" PCA. Specifically, mPCA is preferable to "standard" PCA when multiple levels occur naturally in the dataset. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  13. Principal component analysis of the CT density histogram to generate parametric response maps of COPD

    NASA Astrophysics Data System (ADS)

    Zha, N.; Capaldi, D. P. I.; Pike, D.; McCormack, D. G.; Cunningham, I. A.; Parraga, G.

    2015-03-01

    Pulmonary x-ray computed tomography (CT) may be used to characterize emphysema and airways disease in patients with chronic obstructive pulmonary disease (COPD). One analysis approach - parametric response mapping (PMR) utilizes registered inspiratory and expiratory CT image volumes and CT-density-histogram thresholds, but there is no consensus regarding the threshold values used, or their clinical meaning. Principal-component-analysis (PCA) of the CT density histogram can be exploited to quantify emphysema using data-driven CT-density-histogram thresholds. Thus, the objective of this proof-of-concept demonstration was to develop a PRM approach using PCA-derived thresholds in COPD patients and ex-smokers without airflow limitation. Methods: Fifteen COPD ex-smokers and 5 normal ex-smokers were evaluated. Thoracic CT images were also acquired at full inspiration and full expiration and these images were non-rigidly co-registered. PCA was performed for the CT density histograms, from which the components with the highest eigenvalues greater than one were summed. Since the values of the principal component curve correlate directly with the variability in the sample, the maximum and minimum points on the curve were used as threshold values for the PCA-adjusted PRM technique. Results: A significant correlation was determined between conventional and PCA-adjusted PRM with 3He MRI apparent diffusion coefficient (p<0.001), with CT RA950 (p<0.0001), as well as with 3He MRI ventilation defect percent, a measurement of both small airways disease (p=0.049 and p=0.06, respectively) and emphysema (p=0.02). Conclusions: PRM generated using PCA thresholds of the CT density histogram showed significant correlations with CT and 3He MRI measurements of emphysema, but not airways disease.

  14. The Raman spectrum character of skin tumor induced by UVB

    NASA Astrophysics Data System (ADS)

    Wu, Shulian; Hu, Liangjun; Wang, Yunxia; Li, Yongzeng

    2016-03-01

    In our study, the skin canceration processes induced by UVB were analyzed from the perspective of tissue spectrum. A home-made Raman spectral system with a millimeter order excitation laser spot size combined with a multivariate statistical analysis for monitoring the skin changed irradiated by UVB was studied and the discrimination were evaluated. Raman scattering signals of the SCC and normal skin were acquired. Spectral differences in Raman spectra were revealed. Linear discriminant analysis (LDA) based on principal component analysis (PCA) were employed to generate diagnostic algorithms for the classification of skin SCC and normal. The results indicated that Raman spectroscopy combined with PCA-LDA demonstrated good potential for improving the diagnosis of skin cancers.

  15. Locally linear embedding: dimension reduction of massive protostellar spectra

    NASA Astrophysics Data System (ADS)

    Ward, J. L.; Lumsden, S. L.

    2016-09-01

    We present the results of the application of locally linear embedding (LLE) to reduce the dimensionality of dereddened and continuum subtracted near-infrared spectra using a combination of models and real spectra of massive protostars selected from the Red MSX Source survey data base. A brief comparison is also made with two other dimension reduction techniques; principal component analysis (PCA) and Isomap using the same set of spectra as well as a more advanced form of LLE, Hessian locally linear embedding. We find that whilst LLE certainly has its limitations, it significantly outperforms both PCA and Isomap in classification of spectra based on the presence/absence of emission lines and provides a valuable tool for classification and analysis of large spectral data sets.

  16. Rapid differentiation of Chinese hop varieties (Humulus lupulus) using volatile fingerprinting by HS-SPME-GC-MS combined with multivariate statistical analysis.

    PubMed

    Liu, Zechang; Wang, Liping; Liu, Yumei

    2018-01-18

    Hops impart flavor to beer, with the volatile components characterizing the various hop varieties and qualities. Fingerprinting, especially flavor fingerprinting, is often used to identify 'flavor products' because inconsistencies in the description of flavor may lead to an incorrect definition of beer quality. Compared to flavor fingerprinting, volatile fingerprinting is simpler and easier. We performed volatile fingerprinting using head space-solid phase micro-extraction gas chromatography-mass spectrometry combined with similarity analysis and principal component analysis (PCA) for evaluating and distinguishing between three major Chinese hops. Eighty-four volatiles were identified, which were classified into seven categories. Volatile fingerprinting based on similarity analysis did not yield any obvious result. By contrast, hop varieties and qualities were identified using volatile fingerprinting based on PCA. The potential variables explained the variance in the three hop varieties. In addition, the dendrogram and principal component score plot described the differences and classifications of hops. Volatile fingerprinting plus multivariate statistical analysis can rapidly differentiate between the different varieties and qualities of the three major Chinese hops. Furthermore, this method can be used as a reference in other fields. © 2018 Society of Chemical Industry. © 2018 Society of Chemical Industry.

  17. Forensic analysis of Salvia divinorum using multivariate statistical procedures. Part I: discrimination from related Salvia species.

    PubMed

    Willard, Melissa A Bodnar; McGuffin, Victoria L; Smith, Ruth Waddell

    2012-01-01

    Salvia divinorum is a hallucinogenic herb that is internationally regulated. In this study, salvinorin A, the active compound in S. divinorum, was extracted from S. divinorum plant leaves using a 5-min extraction with dichloromethane. Four additional Salvia species (Salvia officinalis, Salvia guaranitica, Salvia splendens, and Salvia nemorosa) were extracted using this procedure, and all extracts were analyzed by gas chromatography-mass spectrometry. Differentiation of S. divinorum from other Salvia species was successful based on visual assessment of the resulting chromatograms. To provide a more objective comparison, the total ion chromatograms (TICs) were subjected to principal components analysis (PCA). Prior to PCA, the TICs were subjected to a series of data pretreatment procedures to minimize non-chemical sources of variance in the data set. Successful discrimination of S. divinorum from the other four Salvia species was possible based on visual assessment of the PCA scores plot. To provide a numerical assessment of the discrimination, a series of statistical procedures such as Euclidean distance measurement, hierarchical cluster analysis, Student's t tests, Wilcoxon rank-sum tests, and Pearson product moment correlation were also applied to the PCA scores. The statistical procedures were then compared to determine the advantages and disadvantages for forensic applications.

  18. Principal Component Analysis for Normal-Distribution-Valued Symbolic Data.

    PubMed

    Wang, Huiwen; Chen, Meiling; Shi, Xiaojun; Li, Nan

    2016-02-01

    This paper puts forward a new approach to principal component analysis (PCA) for normal-distribution-valued symbolic data, which has a vast potential of applications in the economic and management field. We derive a full set of numerical characteristics and variance-covariance structure for such data, which forms the foundation for our analytical PCA approach. Our approach is able to use all of the variance information in the original data than the prevailing representative-type approach in the literature which only uses centers, vertices, etc. The paper also provides an accurate approach to constructing the observations in a PC space based on the linear additivity property of normal distribution. The effectiveness of the proposed method is illustrated by simulated numerical experiments. At last, our method is applied to explain the puzzle of risk-return tradeoff in China's stock market.

  19. Potential of non-invasive esophagus cancer detection based on urine surface-enhanced Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Huang, Shaohua; Wang, Lan; Chen, Weisheng; Feng, Shangyuan; Lin, Juqiang; Huang, Zufang; Chen, Guannan; Li, Buhong; Chen, Rong

    2014-11-01

    Non-invasive esophagus cancer detection based on urine surface-enhanced Raman spectroscopy (SERS) analysis was presented. Urine SERS spectra were measured on esophagus cancer patients (n = 56) and healthy volunteers (n = 36) for control analysis. Tentative assignments of the urine SERS spectra indicated some interesting esophagus cancer-specific biomolecular changes, including a decrease in the relative content of urea and an increase in the percentage of uric acid in the urine of esophagus cancer patients compared to that of healthy subjects. Principal component analysis (PCA) combined with linear discriminant analysis (LDA) was employed to analyze and differentiate the SERS spectra between normal and esophagus cancer urine. The diagnostic algorithms utilizing a multivariate analysis method achieved a diagnostic sensitivity of 89.3% and specificity of 83.3% for separating esophagus cancer samples from normal urine samples. These results from the explorative work suggested that silver nano particle-based urine SERS analysis coupled with PCA-LDA multivariate analysis has potential for non-invasive detection of esophagus cancer.

  20. [Analyzing and modeling methods of near infrared spectroscopy for in-situ prediction of oil yield from oil shale].

    PubMed

    Liu, Jie; Zhang, Fu-Dong; Teng, Fei; Li, Jun; Wang, Zhi-Hong

    2014-10-01

    In order to in-situ detect the oil yield of oil shale, based on portable near infrared spectroscopy analytical technology, with 66 rock core samples from No. 2 well drilling of Fuyu oil shale base in Jilin, the modeling and analyzing methods for in-situ detection were researched. By the developed portable spectrometer, 3 data formats (reflectance, absorbance and K-M function) spectra were acquired. With 4 different modeling data optimization methods: principal component-mahalanobis distance (PCA-MD) for eliminating abnormal samples, uninformative variables elimination (UVE) for wavelength selection and their combina- tions: PCA-MD + UVE and UVE + PCA-MD, 2 modeling methods: partial least square (PLS) and back propagation artificial neural network (BPANN), and the same data pre-processing, the modeling and analyzing experiment were performed to determine the optimum analysis model and method. The results show that the data format, modeling data optimization method and modeling method all affect the analysis precision of model. Results show that whether or not using the optimization method, reflectance or K-M function is the proper spectrum format of the modeling database for two modeling methods. Using two different modeling methods and four different data optimization methods, the model precisions of the same modeling database are different. For PLS modeling method, the PCA-MD and UVE + PCA-MD data optimization methods can improve the modeling precision of database using K-M function spectrum data format. For BPANN modeling method, UVE, UVE + PCA-MD and PCA- MD + UVE data optimization methods can improve the modeling precision of database using any of the 3 spectrum data formats. In addition to using the reflectance spectra and PCA-MD data optimization method, modeling precision by BPANN method is better than that by PLS method. And modeling with reflectance spectra, UVE optimization method and BPANN modeling method, the model gets the highest analysis precision, its correlation coefficient (Rp) is 0.92, and its standard error of prediction (SEP) is 0.69%.

  1. Multivariate Analysis of Electron Detachment Dissociation and Infrared Multiphoton Dissociation Mass Spectra of Heparan Sulfate Tetrasaccharides Differing Only in Hexuronic acid Stereochemistry

    NASA Astrophysics Data System (ADS)

    Oh, Han Bin; Leach, Franklin E.; Arungundram, Sailaja; Al-Mafraji, Kanar; Venot, Andre; Boons, Geert-Jan; Amster, I. Jonathan

    2011-03-01

    The structural characterization of glycosaminoglycan (GAG) carbohydrates by mass spectrometry has been a long-standing analytical challenge due to the inherent heterogeneity of these biomolecules, specifically polydispersity, variability in sulfation, and hexuronic acid stereochemistry. Recent advances in tandem mass spectrometry methods employing threshold and electron-based ion activation have resulted in the ability to determine the location of the labile sulfate modification as well as assign the stereochemistry of hexuronic acid residues. To facilitate the analysis of complex electron detachment dissociation (EDD) spectra, principal component analysis (PCA) is employed to differentiate the hexuronic acid stereochemistry of four synthetic GAG epimers whose EDD spectra are nearly identical upon visual inspection. For comparison, PCA is also applied to infrared multiphoton dissociation spectra (IRMPD) of the examined epimers. To assess the applicability of multivariate methods in GAG mixture analysis, PCA is utilized to identify the relative content of two epimers in a binary mixture.

  2. A Dimensionally Reduced Clustering Methodology for Heterogeneous Occupational Medicine Data Mining.

    PubMed

    Saâdaoui, Foued; Bertrand, Pierre R; Boudet, Gil; Rouffiac, Karine; Dutheil, Frédéric; Chamoux, Alain

    2015-10-01

    Clustering is a set of techniques of the statistical learning aimed at finding structures of heterogeneous partitions grouping homogenous data called clusters. There are several fields in which clustering was successfully applied, such as medicine, biology, finance, economics, etc. In this paper, we introduce the notion of clustering in multifactorial data analysis problems. A case study is conducted for an occupational medicine problem with the purpose of analyzing patterns in a population of 813 individuals. To reduce the data set dimensionality, we base our approach on the Principal Component Analysis (PCA), which is the statistical tool most commonly used in factorial analysis. However, the problems in nature, especially in medicine, are often based on heterogeneous-type qualitative-quantitative measurements, whereas PCA only processes quantitative ones. Besides, qualitative data are originally unobservable quantitative responses that are usually binary-coded. Hence, we propose a new set of strategies allowing to simultaneously handle quantitative and qualitative data. The principle of this approach is to perform a projection of the qualitative variables on the subspaces spanned by quantitative ones. Subsequently, an optimal model is allocated to the resulting PCA-regressed subspaces.

  3. Sparse PCA corrects for cell type heterogeneity in epigenome-wide association studies.

    PubMed

    Rahmani, Elior; Zaitlen, Noah; Baran, Yael; Eng, Celeste; Hu, Donglei; Galanter, Joshua; Oh, Sam; Burchard, Esteban G; Eskin, Eleazar; Zou, James; Halperin, Eran

    2016-05-01

    In epigenome-wide association studies (EWAS), different methylation profiles of distinct cell types may lead to false discoveries. We introduce ReFACTor, a method based on principal component analysis (PCA) and designed for the correction of cell type heterogeneity in EWAS. ReFACTor does not require knowledge of cell counts, and it provides improved estimates of cell type composition, resulting in improved power and control for false positives in EWAS. Corresponding software is available at http://www.cs.tau.ac.il/~heran/cozygene/software/refactor.html.

  4. Differential principal component analysis of ChIP-seq.

    PubMed

    Ji, Hongkai; Li, Xia; Wang, Qian-fei; Ning, Yang

    2013-04-23

    We propose differential principal component analysis (dPCA) for analyzing multiple ChIP-sequencing datasets to identify differential protein-DNA interactions between two biological conditions. dPCA integrates unsupervised pattern discovery, dimension reduction, and statistical inference into a single framework. It uses a small number of principal components to summarize concisely the major multiprotein synergistic differential patterns between the two conditions. For each pattern, it detects and prioritizes differential genomic loci by comparing the between-condition differences with the within-condition variation among replicate samples. dPCA provides a unique tool for efficiently analyzing large amounts of ChIP-sequencing data to study dynamic changes of gene regulation across different biological conditions. We demonstrate this approach through analyses of differential chromatin patterns at transcription factor binding sites and promoters as well as allele-specific protein-DNA interactions.

  5. RG-inspired machine learning for lattice field theory

    NASA Astrophysics Data System (ADS)

    Foreman, Sam; Giedt, Joel; Meurice, Yannick; Unmuth-Yockey, Judah

    2018-03-01

    Machine learning has been a fast growing field of research in several areas dealing with large datasets. We report recent attempts to use renormalization group (RG) ideas in the context of machine learning. We examine coarse graining procedures for perceptron models designed to identify the digits of the MNIST data. We discuss the correspondence between principal components analysis (PCA) and RG flows across the transition for worm configurations of the 2D Ising model. Preliminary results regarding the logarithmic divergence of the leading PCA eigenvalue were presented at the conference. More generally, we discuss the relationship between PCA and observables in Monte Carlo simulations and the possibility of reducing the number of learning parameters in supervised learning based on RG inspired hierarchical ansatzes.

  6. A Parallel Product-Convolution approach for representing the depth varying Point Spread Functions in 3D widefield microscopy based on principal component analysis.

    PubMed

    Arigovindan, Muthuvel; Shaevitz, Joshua; McGowan, John; Sedat, John W; Agard, David A

    2010-03-29

    We address the problem of computational representation of image formation in 3D widefield fluorescence microscopy with depth varying spherical aberrations. We first represent 3D depth-dependent point spread functions (PSFs) as a weighted sum of basis functions that are obtained by principal component analysis (PCA) of experimental data. This representation is then used to derive an approximating structure that compactly expresses the depth variant response as a sum of few depth invariant convolutions pre-multiplied by a set of 1D depth functions, where the convolving functions are the PCA-derived basis functions. The model offers an efficient and convenient trade-off between complexity and accuracy. For a given number of approximating PSFs, the proposed method results in a much better accuracy than the strata based approximation scheme that is currently used in the literature. In addition to yielding better accuracy, the proposed methods automatically eliminate the noise in the measured PSFs.

  7. Principal Component Analysis: A Method for Determining the Essential Dynamics of Proteins

    PubMed Central

    David, Charles C.; Jacobs, Donald J.

    2015-01-01

    It has become commonplace to employ principal component analysis to reveal the most important motions in proteins. This method is more commonly known by its acronym, PCA. While most popular molecular dynamics packages inevitably provide PCA tools to analyze protein trajectories, researchers often make inferences of their results without having insight into how to make interpretations, and they are often unaware of limitations and generalizations of such analysis. Here we review best practices for applying standard PCA, describe useful variants, discuss why one may wish to make comparison studies, and describe a set of metrics that make comparisons possible. In practice, one will be forced to make inferences about the essential dynamics of a protein without having the desired amount of samples. Therefore, considerable time is spent on describing how to judge the significance of results, highlighting pitfalls. The topic of PCA is reviewed from the perspective of many practical considerations, and useful recipes are provided. PMID:24061923

  8. Principal component analysis: a method for determining the essential dynamics of proteins.

    PubMed

    David, Charles C; Jacobs, Donald J

    2014-01-01

    It has become commonplace to employ principal component analysis to reveal the most important motions in proteins. This method is more commonly known by its acronym, PCA. While most popular molecular dynamics packages inevitably provide PCA tools to analyze protein trajectories, researchers often make inferences of their results without having insight into how to make interpretations, and they are often unaware of limitations and generalizations of such analysis. Here we review best practices for applying standard PCA, describe useful variants, discuss why one may wish to make comparison studies, and describe a set of metrics that make comparisons possible. In practice, one will be forced to make inferences about the essential dynamics of a protein without having the desired amount of samples. Therefore, considerable time is spent on describing how to judge the significance of results, highlighting pitfalls. The topic of PCA is reviewed from the perspective of many practical considerations, and useful recipes are provided.

  9. Spectral discrimination of bleached and healthy submerged corals based on principal components analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Holden, H.; LeDrew, E.

    1997-06-01

    Remote discrimination of substrate types in relatively shallow coastal waters has been limited by the spatial and spectral resolution of available sensors. An additional limiting factor is the strong attenuating influence of the water column over the substrate. As a result, there have been limited attempts to map submerged ecosystems such as coral reefs based on spectral characteristics. Both healthy and bleached corals were measured at depth with a hand-held spectroradiometer, and their spectra compared. Two separate principal components analyses (PCA) were performed on two sets of spectral data. The PCA revealed that there is indeed a spectral difference basedmore » on health. In the first data set, the first component (healthy coral) explains 46.82%, while the second component (bleached coral) explains 46.35% of the variance. In the second data set, the first component (bleached coral) explained 46.99%; the second component (healthy coral) explained 36.55%; and the third component (healthy coral) explained 15.44 % of the total variance in the original data. These results are encouraging with respect to using an airborne spectroradiometer to identify areas of bleached corals thus enabling accurate monitoring over time.« less

  10. Prediction of BP reactivity to talking using hybrid soft computing approaches.

    PubMed

    Kaur, Gurmanik; Arora, Ajat Shatru; Jain, Vijender Kumar

    2014-01-01

    High blood pressure (BP) is associated with an increased risk of cardiovascular diseases. Therefore, optimal precision in measurement of BP is appropriate in clinical and research studies. In this work, anthropometric characteristics including age, height, weight, body mass index (BMI), and arm circumference (AC) were used as independent predictor variables for the prediction of BP reactivity to talking. Principal component analysis (PCA) was fused with artificial neural network (ANN), adaptive neurofuzzy inference system (ANFIS), and least square-support vector machine (LS-SVM) model to remove the multicollinearity effect among anthropometric predictor variables. The statistical tests in terms of coefficient of determination (R (2)), root mean square error (RMSE), and mean absolute percentage error (MAPE) revealed that PCA based LS-SVM (PCA-LS-SVM) model produced a more efficient prediction of BP reactivity as compared to other models. This assessment presents the importance and advantages posed by PCA fused prediction models for prediction of biological variables.

  11. Discriminating the Mineralogical Composition in Drill Cuttings Based on Absorption Spectra in the Terahertz Range.

    PubMed

    Miao, Xinyang; Li, Hao; Bao, Rima; Feng, Chengjing; Wu, Hang; Zhan, Honglei; Li, Yizhang; Zhao, Kun

    2017-02-01

    Understanding the geological units of a reservoir is essential to the development and management of the resource. In this paper, drill cuttings from several depths from an oilfield were studied using terahertz time domain spectroscopy (THz-TDS). Cluster analysis (CA) and principal component analysis (PCA) were employed to classify and analyze the cuttings. The cuttings were clearly classified based on CA and PCA methods, and the results were in agreement with the lithology. Moreover, calcite and dolomite have stronger absorption of a THz pulse than any other minerals, based on an analysis of the PC1 scores. Quantitative analyses of minor minerals were also realized by building a series of linear and non-linear models between contents and PC2 scores. The results prove THz technology to be a promising means for determining reservoir lithology as well as other properties, which will be a significant supplementary method in oil fields.

  12. Quantication and analysis of respiratory motion from 4D MRI

    NASA Astrophysics Data System (ADS)

    Aizzuddin Abd Rahni, Ashrani; Lewis, Emma; Wells, Kevin

    2014-11-01

    It is well known that respiratory motion affects image acquisition and also external beam radiotherapy (EBRT) treatment planning and delivery. However often the existing approaches for respiratory motion management are based on a generic view of respiratory motion such as the general movement of organ, tissue or fiducials. This paper thus aims to present a more in depth analysis of respiratory motion based on 4D MRI for further integration into motion correction in image acquisition or image based EBRT. Internal and external motion was first analysed separately, on a per-organ basis for internal motion. Principal component analysis (PCA) was then performed on the internal and external motion vectors separately and the relationship between the two PCA spaces was analysed. The motion extracted from 4D MRI on general was found to be consistent with what has been reported in literature.

  13. Decomposition-Based Failure Mode Identification Method for Risk-Free Design of Large Systems

    NASA Technical Reports Server (NTRS)

    Tumer, Irem Y.; Stone, Robert B.; Roberts, Rory A.; Clancy, Daniel (Technical Monitor)

    2002-01-01

    When designing products, it is crucial to assure failure and risk-free operation in the intended operating environment. Failures are typically studied and eliminated as much as possible during the early stages of design. The few failures that go undetected result in unacceptable damage and losses in high-risk applications where public safety is of concern. Published NASA and NTSB accident reports point to a variety of components identified as sources of failures in the reported cases. In previous work, data from these reports were processed and placed in matrix form for all the system components and failure modes encountered, and then manipulated using matrix methods to determine similarities between the different components and failure modes. In this paper, these matrices are represented in the form of a linear combination of failures modes, mathematically formed using Principal Components Analysis (PCA) decomposition. The PCA decomposition results in a low-dimensionality representation of all failure modes and components of interest, represented in a transformed coordinate system. Such a representation opens the way for efficient pattern analysis and prediction of failure modes with highest potential risks on the final product, rather than making decisions based on the large space of component and failure mode data. The mathematics of the proposed method are explained first using a simple example problem. The method is then applied to component failure data gathered from helicopter, accident reports to demonstrate its potential.

  14. [In vitro transdermal delivery of the active fraction of xiangfusiwu decoction based on principal component analysis].

    PubMed

    Li, Zhen-Hao; Liu, Pei; Qian, Da-Wei; Li, Wei; Shang, Er-Xin; Duan, Jin-Ao

    2013-06-01

    The objective of the present study was to establish a method based on principal component analysis (PCA) for the study of transdermal delivery of multiple components in Chinese medicine, and to choose the best penetration enhancers for the active fraction of Xiangfusiwu decoction (BW) with this method. Improved Franz diffusion cells with isolated rat abdomen skins were carried out to experiment on the transdermal delivery of six active components, including ferulic acid, paeoniflorin, albiflorin, protopine, tetrahydropalmatine and tetrahydrocolumbamine. The concentrations of these components were determined by LC-MS/MS, then the total factor scores of the concentrations at different times were calculated using PCA and were employed instead of the concentrations to compute the cumulative amounts and steady fluxes, the latter of which were considered as the indexes for optimizing penetration enhancers. The results showed that compared to the control group, the steady fluxes of the other groups increased significantly and furthermore, 4% azone with 1% propylene glycol manifested the best effect. The six components could penetrate through skin well under the action of penetration enhancers. The method established in this study has been proved to be suitable for the study of transdermal delivery of multiple components, and it provided a scientific basis for preparation research of Xiangfusiwu decoction and moreover, it could be a reference for Chinese medicine research.

  15. Statistical techniques applied to aerial radiometric surveys (STAARS): principal components analysis user's manual. [NURE program

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Koch, C.D.; Pirkle, F.L.; Schmidt, J.S.

    1981-01-01

    A Principal Components Analysis (PCA) has been written to aid in the interpretation of multivariate aerial radiometric data collected by the US Department of Energy (DOE) under the National Uranium Resource Evaluation (NURE) program. The variations exhibited by these data have been reduced and classified into a number of linear combinations by using the PCA program. The PCA program then generates histograms and outlier maps of the individual variates. Black and white plots can be made on a Calcomp plotter by the application of follow-up programs. All programs referred to in this guide were written for a DEC-10. From thismore » analysis a geologist may begin to interpret the data structure. Insight into geological processes underlying the data may be obtained.« less

  16. Standardized processing of MALDI imaging raw data for enhancement of weak analyte signals in mouse models of gastric cancer and Alzheimer's disease.

    PubMed

    Schwartz, Matthias; Meyer, Björn; Wirnitzer, Bernhard; Hopf, Carsten

    2015-03-01

    Conventional mass spectrometry image preprocessing methods used for denoising, such as the Savitzky-Golay smoothing or discrete wavelet transformation, typically do not only remove noise but also weak signals. Recently, memory-efficient principal component analysis (PCA) in conjunction with random projections (RP) has been proposed for reversible compression and analysis of large mass spectrometry imaging datasets. It considers single-pixel spectra in their local context and consequently offers the prospect of using information from the spectra of adjacent pixels for denoising or signal enhancement. However, little systematic analysis of key RP-PCA parameters has been reported so far, and the utility and validity of this method for context-dependent enhancement of known medically or pharmacologically relevant weak analyte signals in linear-mode matrix-assisted laser desorption/ionization (MALDI) mass spectra has not been explored yet. Here, we investigate MALDI imaging datasets from mouse models of Alzheimer's disease and gastric cancer to systematically assess the importance of selecting the right number of random projections k and of principal components (PCs) L for reconstructing reproducibly denoised images after compression. We provide detailed quantitative data for comparison of RP-PCA-denoising with the Savitzky-Golay and wavelet-based denoising in these mouse models as a resource for the mass spectrometry imaging community. Most importantly, we demonstrate that RP-PCA preprocessing can enhance signals of low-intensity amyloid-β peptide isoforms such as Aβ1-26 even in sparsely distributed Alzheimer's β-amyloid plaques and that it enables enhanced imaging of multiply acetylated histone H4 isoforms in response to pharmacological histone deacetylase inhibition in vivo. We conclude that RP-PCA denoising may be a useful preprocessing step in biomarker discovery workflows.

  17. Evaluating motion processing algorithms for use with functional near-infrared spectroscopy data from young children.

    PubMed

    Delgado Reyes, Lourdes M; Bohache, Kevin; Wijeakumar, Sobanawartiny; Spencer, John P

    2018-04-01

    Motion artifacts are often a significant component of the measured signal in functional near-infrared spectroscopy (fNIRS) experiments. A variety of methods have been proposed to address this issue, including principal components analysis (PCA), correlation-based signal improvement (CBSI), wavelet filtering, and spline interpolation. The efficacy of these techniques has been compared using simulated data; however, our understanding of how these techniques fare when dealing with task-based cognitive data is limited. Brigadoi et al. compared motion correction techniques in a sample of adult data measured during a simple cognitive task. Wavelet filtering showed the most promise as an optimal technique for motion correction. Given that fNIRS is often used with infants and young children, it is critical to evaluate the effectiveness of motion correction techniques directly with data from these age groups. This study addresses that problem by evaluating motion correction algorithms implemented in HomER2. The efficacy of each technique was compared quantitatively using objective metrics related to the physiological properties of the hemodynamic response. Results showed that targeted PCA (tPCA), spline, and CBSI retained a higher number of trials. These techniques also performed well in direct head-to-head comparisons with the other approaches using quantitative metrics. The CBSI method corrected many of the artifacts present in our data; however, this approach produced sometimes unstable HRFs. The targeted PCA and spline methods proved to be the most robust, performing well across all comparison metrics. When compared head to head, tPCA consistently outperformed spline. We conclude, therefore, that tPCA is an effective technique for correcting motion artifacts in fNIRS data from young children.

  18. Resolution of coi-dominant phytoplankton species in a eutrophiclake using synchrotron-based Fourier transform infraredspectroscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dean, A.P.; Martin, Michael C.; Sigee, D.C.

    2006-10-09

    Synchrotron-based Fourier-transform infrared (FTIR)microspectroscopy was used to distinguish micropopulations of thecodominant algae Microcystis aeruginosa (Cyanophyceae) and Ceratiumhirundinella (Dinophyceae) in mixed phytoplankton samples taken from thewater column of a stratified eutrophic lake (Rostherne Mere, UK). FTIRspectra of the two algae showed a closely similar sequence of 10 bandsover the wave-number range 4000-900 cm-1. These were assigned to a rangeof vibrationally active chemical groups using published band assignmentsand on the basis of correlation and factor analysis. In both algae,intracellular concentrations of macromolecular components (determined asband intensity) varied considerably within the same population,indicating substantial intraspecific heterogeneity. Interspecificdifferences were separately analysed in relation tomore » discrete bands and bymultivariate analysis of the entire spectral region 1750-900 cm-1. Interms of discrete bands, comparison of individual intensities (normalisedto amide 1) demonstrated significant (99 percent probability level)differences in relation to six bands between the two algal species. Keyinterspecific differences were also noted in relation to the positions ofbands 2, 10 (carbohydrate) and 7 (protein) and in the 3-D plots derivedby principal component analysis (PCA) of the sequence of bandintensities. PCA of entire spectral regions showed clear resolutionofspecies in the PCA plot, with indication of separation on the basis ofprotein (region 1700-1500 cm1) and carbohydrate (region 1150-900 cm1)composition in the loading plot. Hierarchical cluster analysis (Wardalgorithm) of entire spectral regions also showed clear discrimination ofthe two species within the resulting dendrogram.« less

  19. An improved geographically weighted regression model for PM2.5 concentration estimation in large areas

    NASA Astrophysics Data System (ADS)

    Zhai, Liang; Li, Shuang; Zou, Bin; Sang, Huiyong; Fang, Xin; Xu, Shan

    2018-05-01

    Considering the spatial non-stationary contributions of environment variables to PM2.5 variations, the geographically weighted regression (GWR) modeling method has been using to estimate PM2.5 concentrations widely. However, most of the GWR models in reported studies so far were established based on the screened predictors through pretreatment correlation analysis, and this process might cause the omissions of factors really driving PM2.5 variations. This study therefore developed a best subsets regression (BSR) enhanced principal component analysis-GWR (PCA-GWR) modeling approach to estimate PM2.5 concentration by fully considering all the potential variables' contributions simultaneously. The performance comparison experiment between PCA-GWR and regular GWR was conducted in the Beijing-Tianjin-Hebei (BTH) region over a one-year-period. Results indicated that the PCA-GWR modeling outperforms the regular GWR modeling with obvious higher model fitting- and cross-validation based adjusted R2 and lower RMSE. Meanwhile, the distribution map of PM2.5 concentration from PCA-GWR modeling also clearly depicts more spatial variation details in contrast to the one from regular GWR modeling. It can be concluded that the BSR enhanced PCA-GWR modeling could be a reliable way for effective air pollution concentration estimation in the coming future by involving all the potential predictor variables' contributions to PM2.5 variations.

  20. Adaptive online monitoring for ICU patients by combining just-in-time learning and principal component analysis.

    PubMed

    Li, Xuejian; Wang, Youqing

    2016-12-01

    Offline general-type models are widely used for patients' monitoring in intensive care units (ICUs), which are developed by using past collected datasets consisting of thousands of patients. However, these models may fail to adapt to the changing states of ICU patients. Thus, to be more robust and effective, the monitoring models should be adaptable to individual patients. A novel combination of just-in-time learning (JITL) and principal component analysis (PCA), referred to learning-type PCA (L-PCA), was proposed for adaptive online monitoring of patients in ICUs. JITL was used to gather the most relevant data samples for adaptive modeling of complex physiological processes. PCA was used to build an online individual-type model and calculate monitoring statistics, and then to judge whether the patient's status is normal or not. The adaptability of L-PCA lies in the usage of individual data and the continuous updating of the training dataset. Twelve subjects were selected from the Physiobank's Multi-parameter Intelligent Monitoring for Intensive Care II (MIMIC II) database, and five vital signs of each subject were chosen. The proposed method was compared with the traditional PCA and fast moving-window PCA (Fast MWPCA). The experimental results demonstrated that the fault detection rates respectively increased by 20 % and 47 % compared with PCA and Fast MWPCA. L-PCA is first introduced into ICU patients monitoring and achieves the best monitoring performance in terms of adaptability to changes in patient status and sensitivity for abnormality detection.

  1. Dimensionality reduction for the quantitative evaluation of a smartphone-based Timed Up and Go test.

    PubMed

    Palmerini, Luca; Mellone, Sabato; Rocchi, Laura; Chiari, Lorenzo

    2011-01-01

    The Timed Up and Go is a clinical test to assess mobility in the elderly and in Parkinson's disease. Lately instrumented versions of the test are being considered, where inertial sensors assess motion. To improve the pervasiveness, ease of use, and cost, we consider a smartphone's accelerometer as the measurement system. Several parameters (usually highly correlated) can be computed from the signals recorded during the test. To avoid redundancy and obtain the features that are most sensitive to the locomotor performance, a dimensionality reduction was performed through principal component analysis (PCA). Forty-nine healthy subjects of different ages were tested. PCA was performed to extract new features (principal components) which are not redundant combinations of the original parameters and account for most of the data variability. They can be useful for exploratory analysis and outlier detection. Then, a reduced set of the original parameters was selected through correlation analysis with the principal components. This set could be recommended for studies based on healthy adults. The proposed procedure could be used as a first-level feature selection in classification studies (i.e. healthy-Parkinson's disease, fallers-non fallers) and could allow, in the future, a complete system for movement analysis to be incorporated in a smartphone.

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roper, J; Bradshaw, B; Godette, K

    Purpose: To create a knowledge-based algorithm for prostate LDR brachytherapy treatment planning that standardizes plan quality using seed arrangements tailored to individual physician preferences while being fast enough for real-time planning. Methods: A dataset of 130 prior cases was compiled for a physician with an active prostate seed implant practice. Ten cases were randomly selected to test the algorithm. Contours from the 120 library cases were registered to a common reference frame. Contour variations were characterized on a point by point basis using principle component analysis (PCA). A test case was converted to PCA vectors using the same process andmore » then compared with each library case using a Mahalanobis distance to evaluate similarity. Rank order PCA scores were used to select the best-matched library case. The seed arrangement was extracted from the best-matched case and used as a starting point for planning the test case. Computational time was recorded. Any subsequent modifications were recorded that required input from a treatment planner to achieve an acceptable plan. Results: The computational time required to register contours from a test case and evaluate PCA similarity across the library was approximately 10s. Five of the ten test cases did not require any seed additions, deletions, or moves to obtain an acceptable plan. The remaining five test cases required on average 4.2 seed modifications. The time to complete manual plan modifications was less than 30s in all cases. Conclusion: A knowledge-based treatment planning algorithm was developed for prostate LDR brachytherapy based on principle component analysis. Initial results suggest that this approach can be used to quickly create treatment plans that require few if any modifications by the treatment planner. In general, test case plans have seed arrangements which are very similar to prior cases, and thus are inherently tailored to physician preferences.« less

  3. Pepper seed variety identification based on visible/near-infrared spectral technology

    NASA Astrophysics Data System (ADS)

    Li, Cuiling; Wang, Xiu; Meng, Zhijun; Fan, Pengfei; Cai, Jichen

    2016-11-01

    Pepper is a kind of important fruit vegetable, with the expansion of pepper hybrid planting area, detection of pepper seed purity is especially important. This research used visible/near infrared (VIS/NIR) spectral technology to detect the variety of single pepper seed, and chose hybrid pepper seeds "Zhuo Jiao NO.3", "Zhuo Jiao NO.4" and "Zhuo Jiao NO.5" as research sample. VIS/NIR spectral data of 80 "Zhuo Jiao NO.3", 80 "Zhuo Jiao NO.4" and 80 "Zhuo Jiao NO.5" pepper seeds were collected, and the original spectral data was pretreated with standard normal variable (SNV) transform, first derivative (FD), and Savitzky-Golay (SG) convolution smoothing methods. Principal component analysis (PCA) method was adopted to reduce the dimension of the spectral data and extract principal components, according to the distribution of the first principal component (PC1) along with the second principal component(PC2) in the twodimensional plane, similarly, the distribution of PC1 coupled with the third principal component(PC3), and the distribution of PC2 combined with PC3, distribution areas of three varieties of pepper seeds were divided in each twodimensional plane, and the discriminant accuracy of PCA was tested through observing the distribution area of samples' principal components in validation set. This study combined PCA and linear discriminant analysis (LDA) to identify single pepper seed varieties, results showed that with the FD preprocessing method, the discriminant accuracy of pepper seed varieties was 98% for validation set, it concludes that using VIS/NIR spectral technology is feasible for identification of single pepper seed varieties.

  4. Functional Data Analysis in NTCP Modeling: A New Method to Explore the Radiation Dose-Volume Effects

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Benadjaoud, Mohamed Amine, E-mail: mohamedamine.benadjaoud@gustaveroussy.fr; Université Paris sud, Le Kremlin-Bicêtre; Institut Gustave Roussy, Villejuif

    2014-11-01

    Purpose/Objective(s): To describe a novel method to explore radiation dose-volume effects. Functional data analysis is used to investigate the information contained in differential dose-volume histograms. The method is applied to the normal tissue complication probability modeling of rectal bleeding (RB) for patients irradiated in the prostatic bed by 3-dimensional conformal radiation therapy. Methods and Materials: Kernel density estimation was used to estimate the individual probability density functions from each of the 141 rectum differential dose-volume histograms. Functional principal component analysis was performed on the estimated probability density functions to explore the variation modes in the dose distribution. The functional principalmore » components were then tested for association with RB using logistic regression adapted to functional covariates (FLR). For comparison, 3 other normal tissue complication probability models were considered: the Lyman-Kutcher-Burman model, logistic model based on standard dosimetric parameters (LM), and logistic model based on multivariate principal component analysis (PCA). Results: The incidence rate of grade ≥2 RB was 14%. V{sub 65Gy} was the most predictive factor for the LM (P=.058). The best fit for the Lyman-Kutcher-Burman model was obtained with n=0.12, m = 0.17, and TD50 = 72.6 Gy. In PCA and FLR, the components that describe the interdependence between the relative volumes exposed at intermediate and high doses were the most correlated to the complication. The FLR parameter function leads to a better understanding of the volume effect by including the treatment specificity in the delivered mechanistic information. For RB grade ≥2, patients with advanced age are significantly at risk (odds ratio, 1.123; 95% confidence interval, 1.03-1.22), and the fits of the LM, PCA, and functional principal component analysis models are significantly improved by including this clinical factor. Conclusion: Functional data analysis provides an attractive method for flexibly estimating the dose-volume effect for normal tissues in external radiation therapy.« less

  5. [Identification of Pummelo Cultivars Based on Hyperspectral Imaging Technology].

    PubMed

    Li, Xun-lan; Yi, Shi-lai; He, Shao-lan; Lü, Qiang; Xie, Rang-jin; Zheng, Yong-qiang; Deng, Lie

    2015-09-01

    Existing methods for the identification of pummelo cultivars are usually time-consuming and costly, and are therefore inconvenient to be used in cases that a rapid identification is needed. This research was aimed at identifying different pummelo cultivars by hyperspectral imaging technology which can achieve a rapid and highly sensitive measurement. A total of 240 leaf samples, 60 for each of the four cultivars were investigated. Samples were divided into two groups such as calibration set (48 samples of each cultivar) and validation set (12 samples of each cultivar) by a Kennard-Stone-based algorithm. Hyperspectral images of both adaxial and abaxial surfaces of each leaf were obtained, and were segmented into a region of interest (ROI) using a simple threshold. Spectra of leaf samples were extracted from ROI. To remove the absolute noises of the spectra, only the date of spectral range 400~1000 nm was used for analysis. Multiplicative scatter correction (MSC) and standard normal variable (SNV) were utilized for data preprocessing. Principal component analysis (PCA) was used to extract the best principal components, and successive projections algorithm (SPA) was used to extract the effective wavelengths. Least squares support vector machine (LS-SVM) was used to obtain the discrimination model of the four different pummelo cultivars. To find out the optimal values of σ2 and γ which were important parameters in LS-SVM modeling, Grid-search technique and Cross-Validation were applied. The first 10 and 11 principal components were extracted by PCA for the hyperspectral data of adaxial surface and abaxial surface, respectively. There were 31 and 21 effective wavelengths selected by SPA based on the hyperspectral data of adaxial surface and abaxial surface, respectively. The best principal components and the effective wavelengths were used as inputs of LS-SVM models, and then the PCA-LS-SVM model and the SPA-LS-SVM model were built. The results showed that 99.46% and 98.44% of identification accuracy was achieved in the calibration set for the PCA-LS-SVM model and the SPA-LS-SVM model, respectively, and a 95.83% of identification accuracy was achieved in the validation set for both the PCA-LS-SVM and the SPA- LS-SVM models, which were built based on the hyperspectral data of adaxial surface. Comparatively, the results of the PCA-LS-SVM and the SPA-LS-SVM models built based on the hyperspectral data of abaxial surface both achieved identification accuracies of 100% for both calibration set and validation set. The overall results demonstrated that use of hyperspectral data of adaxial and abaxial leaf surfaces coupled with the use of PCA-LS-SVM and the SPA-LS-SVM could achieve an accurate identification of pummelo cultivars. It was feasible to use hyperspectral imaging technology to identify different pummelo cultivars, and hyperspectral imaging technology provided an alternate way of rapid identification of pummelo cultivars. Moreover, the results in this paper demonstrated that the data from the abaxial surface of leaf was more sensitive in identifying pummelo cultivars. This study provided a new method for to the fast discrimination of pummelo cultivars.

  6. Classification of alloys using laser induced breakdown spectroscopy with principle component analysis

    NASA Astrophysics Data System (ADS)

    Syuhada Mangsor, Aneez; Haider Rizvi, Zuhaib; Chaudhary, Kashif; Safwan Aziz, Muhammad

    2018-05-01

    The study of atomic spectroscopy has contributed to a wide range of scientific applications. In principle, laser induced breakdown spectroscopy (LIBS) method has been used to analyse various types of matter regardless of its physical state, either it is solid, liquid or gas because all elements emit light of characteristic frequencies when it is excited to sufficiently high energy. The aim of this work was to analyse the signature spectrums of each element contained in three different types of samples. Metal alloys of Aluminium, Titanium and Brass with the purities of 75%, 80%, 85%, 90% and 95% were used as the manipulated variable and their LIBS spectra were recorded. The characteristic emission lines of main elements were identified from the spectra as well as its corresponding contents. Principal component analysis (PCA) was carried out using the data from LIBS spectra. Three obvious clusters were observed in 3-dimensional PCA plot which corresponding to the different group of alloys. Findings from this study showed that LIBS technology with the help of principle component analysis could conduct the variety discrimination of alloys demonstrating the capability of LIBS-PCA method in field of spectro-analysis. Thus, LIBS-PCA method is believed to be an effective method for classifying alloys with different percentage of purifications, which was high-cost and time-consuming before.

  7. Principal Component Analysis in the Spectral Analysis of the Dynamic Laser Speckle Patterns

    NASA Astrophysics Data System (ADS)

    Ribeiro, K. M.; Braga, R. A., Jr.; Horgan, G. W.; Ferreira, D. D.; Safadi, T.

    2014-02-01

    Dynamic laser speckle is a phenomenon that interprets an optical patterns formed by illuminating a surface under changes with coherent light. Therefore, the dynamic change of the speckle patterns caused by biological material is known as biospeckle. Usually, these patterns of optical interference evolving in time are analyzed by graphical or numerical methods, and the analysis in frequency domain has also been an option, however involving large computational requirements which demands new approaches to filter the images in time. Principal component analysis (PCA) works with the statistical decorrelation of data and it can be used as a data filtering. In this context, the present work evaluated the PCA technique to filter in time the data from the biospeckle images aiming the reduction of time computer consuming and improving the robustness of the filtering. It was used 64 images of biospeckle in time observed in a maize seed. The images were arranged in a data matrix and statistically uncorrelated by PCA technique, and the reconstructed signals were analyzed using the routine graphical and numerical methods to analyze the biospeckle. Results showed the potential of the PCA tool in filtering the dynamic laser speckle data, with the definition of markers of principal components related to the biological phenomena and with the advantage of fast computational processing.

  8. Landslides Identification Using Airborne Laser Scanning Data Derived Topographic Terrain Attributes and Support Vector Machine Classification

    NASA Astrophysics Data System (ADS)

    Pawłuszek, Kamila; Borkowski, Andrzej

    2016-06-01

    Since the availability of high-resolution Airborne Laser Scanning (ALS) data, substantial progress in geomorphological research, especially in landslide analysis, has been carried out. First and second order derivatives of Digital Terrain Model (DTM) have become a popular and powerful tool in landslide inventory mapping. Nevertheless, an automatic landslide mapping based on sophisticated classifiers including Support Vector Machine (SVM), Artificial Neural Network or Random Forests is often computationally time consuming. The objective of this research is to deeply explore topographic information provided by ALS data and overcome computational time limitation. For this reason, an extended set of topographic features and the Principal Component Analysis (PCA) were used to reduce redundant information. The proposed novel approach was tested on a susceptible area affected by more than 50 landslides located on Rożnów Lake in Carpathian Mountains, Poland. The initial seven PCA components with 90% of the total variability in the original topographic attributes were used for SVM classification. Comparing results with landslide inventory map, the average user's accuracy (UA), producer's accuracy (PA), and overall accuracy (OA) were calculated for two models according to the classification results. Thereby, for the PCA-feature-reduced model UA, PA, and OA were found to be 72%, 76%, and 72%, respectively. Similarly, UA, PA, and OA in the non-reduced original topographic model, was 74%, 77% and 74%, respectively. Using the initial seven PCA components instead of the twenty original topographic attributes does not significantly change identification accuracy but reduce computational time.

  9. Guided filter and principal component analysis hybrid method for hyperspectral pansharpening

    NASA Astrophysics Data System (ADS)

    Qu, Jiahui; Li, Yunsong; Dong, Wenqian

    2018-01-01

    Hyperspectral (HS) pansharpening aims to generate a fused HS image with high spectral and spatial resolution through integrating an HS image with a panchromatic (PAN) image. A guided filter (GF) and principal component analysis (PCA) hybrid HS pansharpening method is proposed. First, the HS image is interpolated and the PCA transformation is performed on the interpolated HS image. The first principal component (PC1) channel concentrates on the spatial information of the HS image. Different from the traditional PCA method, the proposed method sharpens the PAN image and utilizes the GF to obtain the spatial information difference between the HS image and the enhanced PAN image. Then, in order to reduce spectral and spatial distortion, an appropriate tradeoff parameter is defined and the spatial information difference is injected into the PC1 channel through multiplying by this tradeoff parameter. Once the new PC1 channel is obtained, the fused image is finally generated by the inverse PCA transformation. Experiments performed on both synthetic and real datasets show that the proposed method outperforms other several state-of-the-art HS pansharpening methods in both subjective and objective evaluations.

  10. Application of principal component analysis to multispectral imaging data for evaluation of pigmented skin lesions

    NASA Astrophysics Data System (ADS)

    Jakovels, Dainis; Lihacova, Ilze; Kuzmina, Ilona; Spigulis, Janis

    2013-11-01

    Non-invasive and fast primary diagnostics of pigmented skin lesions is required due to frequent incidence of skin cancer - melanoma. Diagnostic potential of principal component analysis (PCA) for distant skin melanoma recognition is discussed. Processing of the measured clinical multi-spectral images (31 melanomas and 94 nonmalignant pigmented lesions) in the wavelength range of 450-950 nm by means of PCA resulted in 87 % sensitivity and 78 % specificity for separation between malignant melanomas and pigmented nevi.

  11. Recurrence quantification analysis applied to spatiotemporal pattern analysis in high-density mapping of human atrial fibrillation.

    PubMed

    Zeemering, Stef; Bonizzi, Pietro; Maesen, Bart; Peeters, Ralf; Schotten, Ulrich

    2015-01-01

    Spatiotemporal complexity of atrial fibrillation (AF) patterns is often quantified by annotated intracardiac contact mapping. We introduce a new approach that applies recurrence plot (RP) construction followed by recurrence quantification analysis (RQA) to epicardial atrial electrograms, recorded with a high-density grid of electrodes. In 32 patients with no history of AF (aAF, n=11), paroxysmal AF (PAF, n=12) and persistent AF (persAF, n=9), RPs were constructed using a phase space electrogram embedding dimension equal to the estimated AF cycle length. Spatial information was incorporated by 1) averaging the recurrence over all electrodes, and 2) by applying principal component analysis (PCA) to the matrix of embedded electrograms and selecting the first principal component as a representation of spatial diversity. Standard RQA parameters were computed on the constructed RPs and correlated to the number of fibrillation waves per AF cycle (NW). Averaged RP RQA parameters showed no correlation with NW. Correlations improved when applying PCA, with maximum correlation achieved between RP threshold and NW (RR1%, r=0.68, p <; 0.001) and RP determinism (DET, r=-0.64, p <; 0.001). All studied RQA parameters based on the PCA RP were able to discriminate between persAF and aAF/PAF (DET persAF 0.40 ± 0.11 vs. 0.59 ± 0.14/0.62 ± 0.16, p <; 0.01). RP construction and RQA combined with PCA provide a quick and reliable tool to visualize dynamical behaviour and to assess the complexity of contact mapping patterns in AF.

  12. AlleleCoder: a PERL script for coding codominant polymorphism data for PCA analysis

    USDA-ARS?s Scientific Manuscript database

    A useful biological interpretation of diploid heterozygotes is in terms of the dose of the common allele (0, 1 or 2 copies). We have developed a PERL script that converts FASTA files into coded spreadsheets suitable for Principal Component Analysis (PCA). In combination with R and R Commander, two- ...

  13. Noninvasive detection of nasopharyngeal carcinoma based on saliva proteins using surface-enhanced Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Lin, Xueliang; Lin, Duo; Ge, Xiaosong; Qiu, Sufang; Feng, Shangyuan; Chen, Rong

    2017-10-01

    The present study evaluated the capability of saliva analysis combining membrane protein purification with surface-enhanced Raman spectroscopy (SERS) for noninvasive detection of nasopharyngeal carcinoma (NPC). A rapid and convenient protein purification method based on cellulose acetate membrane was developed. A total of 659 high-quality SERS spectra were acquired from purified proteins extracted from the saliva samples of 170 patients with pathologically confirmed NPC and 71 healthy volunteers. Spectral analysis of those saliva protein SERS spectra revealed specific changes in some biochemical compositions, which were possibly associated with NPC transformation. Furthermore, principal component analysis combined with linear discriminant analysis (PCA-LDA) was utilized to analyze and classify the saliva protein SERS spectra from NPC and healthy subjects. Diagnostic sensitivity of 70.7%, specificity of 70.3%, and diagnostic accuracy of 70.5% could be achieved by PCA-LDA for NPC identification. These results show that this assay based on saliva protein SERS analysis holds promising potential for developing a rapid, noninvasive, and convenient clinical tool for NPC screening.

  14. Combination of PCA and LORETA for sources analysis of ERP data: an emotional processing study

    NASA Astrophysics Data System (ADS)

    Hu, Jin; Tian, Jie; Yang, Lei; Pan, Xiaohong; Liu, Jiangang

    2006-03-01

    The purpose of this paper is to study spatiotemporal patterns of neuronal activity in emotional processing by analysis of ERP data. 108 pictures (categorized as positive, negative and neutral) were presented to 24 healthy, right-handed subjects while 128-channel EEG data were recorded. An analysis of two steps was applied to the ERP data. First, principal component analysis was performed to obtain significant ERP components. Then LORETA was applied to each component to localize their brain sources. The first six principal components were extracted, each of which showed different spatiotemporal patterns of neuronal activity. The results agree with other emotional study by fMRI or PET. The combination of PCA and LORETA can be used to analyze spatiotemporal patterns of ERP data in emotional processing.

  15. Exploring high-affinity binding properties of octamer peptides by principal component analysis of tetramer peptides.

    PubMed

    Kume, Akiko; Kawai, Shun; Kato, Ryuji; Iwata, Shinmei; Shimizu, Kazunori; Honda, Hiroyuki

    2017-02-01

    To investigate the binding properties of a peptide sequence, we conducted principal component analysis (PCA) of the physicochemical features of a tetramer peptide library comprised of 512 peptides, and the variables were reduced to two principal components. We selected IL-2 and IgG as model proteins and the binding affinity to these proteins was assayed using the 512 peptides mentioned above. PCA of binding affinity data showed that 16 and 18 variables were suitable for localizing IL-2 and IgG high-affinity binding peptides, respectively, into a restricted region of the PCA plot. We then investigated whether the binding affinity of octamer peptide libraries could be predicted using the identified region in the tetramer PCA. The results show that octamer high-affinity binding peptides were also concentrated in the tetramer high-affinity binding region of both IL-2 and IgG. The average fluorescence intensity of high-affinity binding peptides was 3.3- and 2.1-fold higher than that of low-affinity binding peptides for IL-2 and IgG, respectively. We conclude that PCA may be used to identify octamer peptides with high- or low-affinity binding properties from data from a tetramer peptide library. Copyright © 2016 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  16. Scalable Robust Principal Component Analysis Using Grassmann Averages.

    PubMed

    Hauberg, Sren; Feragen, Aasa; Enficiaud, Raffi; Black, Michael J

    2016-11-01

    In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average ( GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average ( TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.

  17. Discrimination of selected species of pathogenic bacteria using near-infrared Raman spectroscopy and principal components analysis

    NASA Astrophysics Data System (ADS)

    de Siqueira e Oliveira, Fernanda S.; Giana, Hector E.; Silveira, Landulfo, Jr.

    2012-03-01

    It has been proposed a method based on Raman spectroscopy for identification of different microorganisms involved in bacterial urinary tract infections. Spectra were collected from different bacterial colonies (Gram negative: E. coli, K. pneumoniae, P. mirabilis, P. aeruginosa, E. cloacae and Gram positive: S. aureus and Enterococcus sp.), grown in culture medium (Agar), using a Raman spectrometer with a fiber Raman probe (830 nm). Colonies were scraped from Agar surface placed in an aluminum foil for Raman measurements. After pre-processing, spectra were submitted to a Principal Component Analysis and Mahalanobis distance (PCA/MD) discrimination algorithm. It has been found that the mean Raman spectra of different bacterial species show similar bands, being the S. aureus well characterized by strong bands related to carotenoids. PCA/MD could discriminate Gram positive bacteria with sensitivity and specificity of 100% and Gram negative bacteria with good sensitivity and high specificity.

  18. Chemometric and multivariate statistical analysis of time-of-flight secondary ion mass spectrometry spectra from complex Cu-Fe sulfides.

    PubMed

    Kalegowda, Yogesh; Harmer, Sarah L

    2012-03-20

    Time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra of mineral samples are complex, comprised of large mass ranges and many peaks. Consequently, characterization and classification analysis of these systems is challenging. In this study, different chemometric and statistical data evaluation methods, based on monolayer sensitive TOF-SIMS data, have been tested for the characterization and classification of copper-iron sulfide minerals (chalcopyrite, chalcocite, bornite, and pyrite) at different flotation pulp conditions (feed, conditioned feed, and Eh modified). The complex mass spectral data sets were analyzed using the following chemometric and statistical techniques: principal component analysis (PCA); principal component-discriminant functional analysis (PC-DFA); soft independent modeling of class analogy (SIMCA); and k-Nearest Neighbor (k-NN) classification. PCA was found to be an important first step in multivariate analysis, providing insight into both the relative grouping of samples and the elemental/molecular basis for those groupings. For samples exposed to oxidative conditions (at Eh ~430 mV), each technique (PCA, PC-DFA, SIMCA, and k-NN) was found to produce excellent classification. For samples at reductive conditions (at Eh ~ -200 mV SHE), k-NN and SIMCA produced the most accurate classification. Phase identification of particles that contain the same elements but a different crystal structure in a mixed multimetal mineral system has been achieved.

  19. A graph-Laplacian-based feature extraction algorithm for neural spike sorting.

    PubMed

    Ghanbari, Yasser; Spence, Larry; Papamichalis, Panos

    2009-01-01

    Analysis of extracellular neural spike recordings is highly dependent upon the accuracy of neural waveform classification, commonly referred to as spike sorting. Feature extraction is an important stage of this process because it can limit the quality of clustering which is performed in the feature space. This paper proposes a new feature extraction method (which we call Graph Laplacian Features, GLF) based on minimizing the graph Laplacian and maximizing the weighted variance. The algorithm is compared with Principal Components Analysis (PCA, the most commonly-used feature extraction method) using simulated neural data. The results show that the proposed algorithm produces more compact and well-separated clusters compared to PCA. As an added benefit, tentative cluster centers are output which can be used to initialize a subsequent clustering stage.

  20. Chemometric analysis of minerals in gluten-free products.

    PubMed

    Gliszczyńska-Świgło, Anna; Klimczak, Inga; Rybicka, Iga

    2018-06-01

    Numerous studies indicate mineral deficiencies in people on a gluten-free (GF) diet. These deficiencies may indicate that GF products are a less valuable source of minerals than gluten-containing products. In the study, the nutritional quality of 50 GF products is discussed taking into account the nutritional requirements for minerals expressed as percentage of recommended daily allowance (%RDA) or percentage of adequate intake (%AI) for a model celiac patient. Elements analyzed were calcium, potassium, magnesium, sodium, copper, iron, manganese, and zinc. Analysis of %RDA or %AI was performed using principal component analysis (PCA) and hierarchical cluster analysis (HCA). Using PCA, the differentiation between products based on rice, corn, potato, GF wheat starch and based on buckwheat, chickpea, millet, oats, amaranth, teff, quinoa, chestnut, and acorn was possible. In the HCA, four clusters were created. The main criterion determining the adherence of the sample to the cluster was the content of all minerals included to HCA (K, Mg, Cu, Fe, Mn); however, only the Mn content differentiated four formed groups. GF products made of buckwheat, chickpea, millet, oats, amaranth, teff, quinoa, chestnut, and acorn are better source of minerals than based on other GF raw materials, what was confirmed by PCA and HCA. © 2017 Society of Chemical Industry. © 2017 Society of Chemical Industry.

  1. Principal component analysis-based unsupervised feature extraction applied to in silico drug discovery for posttraumatic stress disorder-mediated heart disease.

    PubMed

    Taguchi, Y-h; Iwadate, Mitsuo; Umeyama, Hideaki

    2015-04-30

    Feature extraction (FE) is difficult, particularly if there are more features than samples, as small sample numbers often result in biased outcomes or overfitting. Furthermore, multiple sample classes often complicate FE because evaluating performance, which is usual in supervised FE, is generally harder than the two-class problem. Developing sample classification independent unsupervised methods would solve many of these problems. Two principal component analysis (PCA)-based FE, specifically, variational Bayes PCA (VBPCA) was extended to perform unsupervised FE, and together with conventional PCA (CPCA)-based unsupervised FE, were tested as sample classification independent unsupervised FE methods. VBPCA- and CPCA-based unsupervised FE both performed well when applied to simulated data, and a posttraumatic stress disorder (PTSD)-mediated heart disease data set that had multiple categorical class observations in mRNA/microRNA expression of stressed mouse heart. A critical set of PTSD miRNAs/mRNAs were identified that show aberrant expression between treatment and control samples, and significant, negative correlation with one another. Moreover, greater stability and biological feasibility than conventional supervised FE was also demonstrated. Based on the results obtained, in silico drug discovery was performed as translational validation of the methods. Our two proposed unsupervised FE methods (CPCA- and VBPCA-based) worked well on simulated data, and outperformed two conventional supervised FE methods on a real data set. Thus, these two methods have suggested equivalence for FE on categorical multiclass data sets, with potential translational utility for in silico drug discovery.

  2. On a PCA-based lung motion model

    NASA Astrophysics Data System (ADS)

    Li, Ruijiang; Lewis, John H.; Jia, Xun; Zhao, Tianyu; Liu, Weifeng; Wuenschel, Sara; Lamb, James; Yang, Deshan; Low, Daniel A.; Jiang, Steve B.

    2011-09-01

    Respiration-induced organ motion is one of the major uncertainties in lung cancer radiotherapy and is crucial to be able to accurately model the lung motion. Most work so far has focused on the study of the motion of a single point (usually the tumor center of mass), and much less work has been done to model the motion of the entire lung. Inspired by the work of Zhang et al (2007 Med. Phys. 34 4772-81), we believe that the spatiotemporal relationship of the entire lung motion can be accurately modeled based on principle component analysis (PCA) and then a sparse subset of the entire lung, such as an implanted marker, can be used to drive the motion of the entire lung (including the tumor). The goal of this work is twofold. First, we aim to understand the underlying reason why PCA is effective for modeling lung motion and find the optimal number of PCA coefficients for accurate lung motion modeling. We attempt to address the above important problems both in a theoretical framework and in the context of real clinical data. Second, we propose a new method to derive the entire lung motion using a single internal marker based on the PCA model. The main results of this work are as follows. We derived an important property which reveals the implicit regularization imposed by the PCA model. We then studied the model using two mathematical respiratory phantoms and 11 clinical 4DCT scans for eight lung cancer patients. For the mathematical phantoms with cosine and an even power (2n) of cosine motion, we proved that 2 and 2n PCA coefficients and eigenvectors will completely represent the lung motion, respectively. Moreover, for the cosine phantom, we derived the equivalence conditions for the PCA motion model and the physiological 5D lung motion model (Low et al 2005 Int. J. Radiat. Oncol. Biol. Phys. 63 921-9). For the clinical 4DCT data, we demonstrated the modeling power and generalization performance of the PCA model. The average 3D modeling error using PCA was within 1 mm (0.7 ± 0.1 mm). When a single artificial internal marker was used to derive the lung motion, the average 3D error was found to be within 2 mm (1.8 ± 0.3 mm) through comprehensive statistical analysis. The optimal number of PCA coefficients needs to be determined on a patient-by-patient basis and two PCA coefficients seem to be sufficient for accurate modeling of the lung motion for most patients. In conclusion, we have presented thorough theoretical analysis and clinical validation of the PCA lung motion model. The feasibility of deriving the entire lung motion using a single marker has also been demonstrated on clinical data using a simulation approach.

  3. Diagnostic potential for gold nanoparticle-based surface-enhanced Raman spectroscopy to provide colorectal cancer screening using blood serum sample

    NASA Astrophysics Data System (ADS)

    Lin, Duo; Feng, Shangyuan; Pan, Jianji; Chen, Yanping; Lin, Juqiang; Sun, Liqing; Chen, Rong

    2011-11-01

    Surface-enhanced Raman spectroscopy (SERS) is a vibrational spectroscopic technique that is capable of probing the biomolecular changes associated with diseased transformation. The objective of our study was to explore gold nanoparticle based SERS to obtain blood serum biochemical information for non-invasive colorectal cancer detection. SERS measurements were performed on two groups of blood serum samples: one group from patients (n = 38) with pathologically confirmed colorectal cancer and the other group from healthy volunteers (control subjects, n = 45). Tentative assignments of the Raman bands in the measured SERS spectra suggested interesting cancer specific biomolecular changes, including an increase in the relative amounts of nucleic acid, a decrease in the percentage of saccharide and proteins contents in the blood serum of colorectal cancer patients as compared to that of healthy subjects. Principal component analysis (PCA) of the measured SERS spectra separated the spectral features of the two groups into two distinct clusters with little overlaps. Linear discriminate analysis (LDA) based on the PCA generated features differentiated the nasopharyngeal cancer SERS spectra from normal SERS spectra with high sensitivity (97.4%) and specificity (100%). The results from this exploratory study demonstrated that gold nanoparticle based SERS serum analysis combined with PCA-LDA has tremendous potential for the non-invasive detection of colorectal cancers.

  4. Diagnostic potential for gold nanoparticle-based surface-enhanced Raman spectroscopy to provide colorectal cancer screening using blood serum sample

    NASA Astrophysics Data System (ADS)

    Lin, Duo; Feng, Shangyuan; Pan, Jianji; Chen, Yanping; Lin, Juqiang; Sun, Liqing; Chen, Rong

    2012-03-01

    Surface-enhanced Raman spectroscopy (SERS) is a vibrational spectroscopic technique that is capable of probing the biomolecular changes associated with diseased transformation. The objective of our study was to explore gold nanoparticle based SERS to obtain blood serum biochemical information for non-invasive colorectal cancer detection. SERS measurements were performed on two groups of blood serum samples: one group from patients (n = 38) with pathologically confirmed colorectal cancer and the other group from healthy volunteers (control subjects, n = 45). Tentative assignments of the Raman bands in the measured SERS spectra suggested interesting cancer specific biomolecular changes, including an increase in the relative amounts of nucleic acid, a decrease in the percentage of saccharide and proteins contents in the blood serum of colorectal cancer patients as compared to that of healthy subjects. Principal component analysis (PCA) of the measured SERS spectra separated the spectral features of the two groups into two distinct clusters with little overlaps. Linear discriminate analysis (LDA) based on the PCA generated features differentiated the nasopharyngeal cancer SERS spectra from normal SERS spectra with high sensitivity (97.4%) and specificity (100%). The results from this exploratory study demonstrated that gold nanoparticle based SERS serum analysis combined with PCA-LDA has tremendous potential for the non-invasive detection of colorectal cancers.

  5. Variability search in M 31 using principal component analysis and the Hubble Source Catalogue

    NASA Astrophysics Data System (ADS)

    Moretti, M. I.; Hatzidimitriou, D.; Karampelas, A.; Sokolovsky, K. V.; Bonanos, A. Z.; Gavras, P.; Yang, M.

    2018-06-01

    Principal component analysis (PCA) is being extensively used in Astronomy but not yet exhaustively exploited for variability search. The aim of this work is to investigate the effectiveness of using the PCA as a method to search for variable stars in large photometric data sets. We apply PCA to variability indices computed for light curves of 18 152 stars in three fields in M 31 extracted from the Hubble Source Catalogue. The projection of the data into the principal components is used as a stellar variability detection and classification tool, capable of distinguishing between RR Lyrae stars, long-period variables (LPVs) and non-variables. This projection recovered more than 90 per cent of the known variables and revealed 38 previously unknown variable stars (about 30 per cent more), all LPVs except for one object of uncertain variability type. We conclude that this methodology can indeed successfully identify candidate variable stars.

  6. Demixed principal component analysis of neural population data.

    PubMed

    Kobak, Dmitry; Brendel, Wieland; Constantinidis, Christos; Feierstein, Claudia E; Kepecs, Adam; Mainen, Zachary F; Qi, Xue-Lian; Romo, Ranulfo; Uchida, Naoshige; Machens, Christian K

    2016-04-12

    Neurons in higher cortical areas, such as the prefrontal cortex, are often tuned to a variety of sensory and motor variables, and are therefore said to display mixed selectivity. This complexity of single neuron responses can obscure what information these areas represent and how it is represented. Here we demonstrate the advantages of a new dimensionality reduction technique, demixed principal component analysis (dPCA), that decomposes population activity into a few components. In addition to systematically capturing the majority of the variance of the data, dPCA also exposes the dependence of the neural representation on task parameters such as stimuli, decisions, or rewards. To illustrate our method we reanalyze population data from four datasets comprising different species, different cortical areas and different experimental tasks. In each case, dPCA provides a concise way of visualizing the data that summarizes the task-dependent features of the population response in a single figure.

  7. A Genealogical Interpretation of Principal Components Analysis

    PubMed Central

    McVean, Gil

    2009-01-01

    Principal components analysis, PCA, is a statistical method commonly used in population genetics to identify structure in the distribution of genetic variation across geographical location and ethnic background. However, while the method is often used to inform about historical demographic processes, little is known about the relationship between fundamental demographic parameters and the projection of samples onto the primary axes. Here I show that for SNP data the projection of samples onto the principal components can be obtained directly from considering the average coalescent times between pairs of haploid genomes. The result provides a framework for interpreting PCA projections in terms of underlying processes, including migration, geographical isolation, and admixture. I also demonstrate a link between PCA and Wright's fst and show that SNP ascertainment has a largely simple and predictable effect on the projection of samples. Using examples from human genetics, I discuss the application of these results to empirical data and the implications for inference. PMID:19834557

  8. Ionospheric total electron content anomalies due to Typhoon Nakri on 29 May 2008: A nonlinear principal component analysis

    NASA Astrophysics Data System (ADS)

    Lin, Jyh-Woei

    2012-09-01

    This paper uses Nonlinear Principal Component Analysis (NLPCA) and Principal Component Analysis (PCA) to determine Total Electron Content (TEC) anomalies in the ionosphere for the Nakri Typhoon on 29 May, 2008 (UTC). NLPCA, PCA and image processing are applied to the global ionospheric map (GIM) with transforms conducted for the time period 12:00-14:00 UT on 29 May 2008 when the wind was most intense. Results show that at a height of approximately 150-200 km the TEC anomaly using NLPCA is more localized; however its intensity increases with height and becomes more widespread. The TEC anomalies are not found by PCA. Potential causes of the results are discussed with emphasis given to vertical acoustic gravity waves. The approximate position of the typhoon's eye can be detected if the GIM is divided into fine enough maps with adequate spatial-resolution at GPS-TEC receivers. This implies that the trace of the typhoon in the regional GIM is caught using NLPCA.

  9. Diurnal global variability of the Earth's magnetic field during geomagnetically quiet conditions

    NASA Astrophysics Data System (ADS)

    Klausner, V.

    2012-12-01

    This work proposes a methodology (or treatment) to establish a representative signal of the global magnetic diurnal variation. It is based on a spatial distribution in both longitude and latitude of a set of magnetic stations as well as their magnetic behavior on a time basis. We apply the Principal Component Analysis (PCA) technique using gapped wavelet transform and wavelet correlation. This new approach was used to describe the characteristics of the magnetic variations at Vassouras (Brazil) and 12 other magnetic stations spread around the terrestrial globe. Using magnetograms from 2007, we have investigated the global dominant pattern of the Sq variation as a function of low solar activity. This year was divided into two seasons for seasonal variation analysis: solstices (June and December) and equinoxes (March and September). We aim to reconstruct the original geomagnetic data series of the H component taking into account only the diurnal variations with periods of 24 hours on geomagnetically quiet days. We advance a proposal to reconstruct the Sq baseline using only the PCA first mode. The first interpretation of the results suggests that PCA/wavelet method could be used to the reconstruction of the Sq baseline.

  10. PCA feature extraction for change detection in multidimensional unlabeled data.

    PubMed

    Kuncheva, Ludmila I; Faithfull, William J

    2014-01-01

    When classifiers are deployed in real-world applications, it is assumed that the distribution of the incoming data matches the distribution of the data used to train the classifier. This assumption is often incorrect, which necessitates some form of change detection or adaptive classification. While there has been a lot of work on change detection based on the classification error monitored over the course of the operation of the classifier, finding changes in multidimensional unlabeled data is still a challenge. Here, we propose to apply principal component analysis (PCA) for feature extraction prior to the change detection. Supported by a theoretical example, we argue that the components with the lowest variance should be retained as the extracted features because they are more likely to be affected by a change. We chose a recently proposed semiparametric log-likelihood change detection criterion that is sensitive to changes in both mean and variance of the multidimensional distribution. An experiment with 35 datasets and an illustration with a simple video segmentation demonstrate the advantage of using extracted features compared to raw data. Further analysis shows that feature extraction through PCA is beneficial, specifically for data with multiple balanced classes.

  11. Efficient principal component analysis for multivariate 3D voxel-based mapping of brain functional imaging data sets as applied to FDG-PET and normal aging.

    PubMed

    Zuendorf, Gerhard; Kerrouche, Nacer; Herholz, Karl; Baron, Jean-Claude

    2003-01-01

    Principal component analysis (PCA) is a well-known technique for reduction of dimensionality of functional imaging data. PCA can be looked at as the projection of the original images onto a new orthogonal coordinate system with lower dimensions. The new axes explain the variance in the images in decreasing order of importance, showing correlations between brain regions. We used an efficient, stable and analytical method to work out the PCA of Positron Emission Tomography (PET) images of 74 normal subjects using [(18)F]fluoro-2-deoxy-D-glucose (FDG) as a tracer. Principal components (PCs) and their relation to age effects were investigated. Correlations between the projections of the images on the new axes and the age of the subjects were carried out. The first two PCs could be identified as being the only PCs significantly correlated to age. The first principal component, which explained 10% of the data set variance, was reduced only in subjects of age 55 or older and was related to loss of signal in and adjacent to ventricles and basal cisterns, reflecting expected age-related brain atrophy with enlarging CSF spaces. The second principal component, which accounted for 8% of the total variance, had high loadings from prefrontal, posterior parietal and posterior cingulate cortices and showed the strongest correlation with age (r = -0.56), entirely consistent with previously documented age-related declines in brain glucose utilization. Thus, our method showed that the effect of aging on brain metabolism has at least two independent dimensions. This method should have widespread applications in multivariate analysis of brain functional images. Copyright 2002 Wiley-Liss, Inc.

  12. Improved medical image fusion based on cascaded PCA and shift invariant wavelet transforms.

    PubMed

    Reena Benjamin, J; Jayasree, T

    2018-02-01

    In the medical field, radiologists need more informative and high-quality medical images to diagnose diseases. Image fusion plays a vital role in the field of biomedical image analysis. It aims to integrate the complementary information from multimodal images, producing a new composite image which is expected to be more informative for visual perception than any of the individual input images. The main objective of this paper is to improve the information, to preserve the edges and to enhance the quality of the fused image using cascaded principal component analysis (PCA) and shift invariant wavelet transforms. A novel image fusion technique based on cascaded PCA and shift invariant wavelet transforms is proposed in this paper. PCA in spatial domain extracts relevant information from the large dataset based on eigenvalue decomposition, and the wavelet transform operating in the complex domain with shift invariant properties brings out more directional and phase details of the image. The significance of maximum fusion rule applied in dual-tree complex wavelet transform domain enhances the average information and morphological details. The input images of the human brain of two different modalities (MRI and CT) are collected from whole brain atlas data distributed by Harvard University. Both MRI and CT images are fused using cascaded PCA and shift invariant wavelet transform method. The proposed method is evaluated based on three main key factors, namely structure preservation, edge preservation, contrast preservation. The experimental results and comparison with other existing fusion methods show the superior performance of the proposed image fusion framework in terms of visual and quantitative evaluations. In this paper, a complex wavelet-based image fusion has been discussed. The experimental results demonstrate that the proposed method enhances the directional features as well as fine edge details. Also, it reduces the redundant details, artifacts, distortions.

  13. Measuring the Indonesian provinces competitiveness by using PCA technique

    NASA Astrophysics Data System (ADS)

    Runita, Ditha; Fajriyah, Rohmatul

    2017-12-01

    Indonesia is a country which has vast teritoty. It has 34 provinces. Building local competitiveness is critical to enhance the long-term national competitiveness especially for a country as diverse as Indonesia. A competitive local government can attract and maintain successful firms and increase living standards for its inhabitants, because investment and skilled workers gravitate from uncompetitive regions to more competitive ones. Altough there are other methods to measuring competitiveness, but here we have demonstrated a simple method using principal component analysis (PCA). It can directly be applied to correlated, multivariate data. The analysis on Indonesian provinces provides 3 clusters based on the competitiveness measurement and the clusters are Bad, Good and Best perform provinces.

  14. Study on elemental fingerprint of traditional marine Chinese medicine oysters from Jiaozhou Bay, China

    NASA Astrophysics Data System (ADS)

    Zheng, Yongjun; Zheng, Kang; Li, Yantuan

    2012-09-01

    In order to investigate the relationship between the trace elements and the characteristics of the oysters, we analyzed the trace elements present in the germplasm of oysters from different producing areas in the Jiaozhou Bay. The element fingerprints were established to reflect the elemental characteristics of the oysters. Concentration patterns of the elements were deciphered by principle component analysis (PCA) and hierarchical cluster analysis (HCA). The six regions were discriminated with accuracy using HCA and PCA based on the concentration of 16 trace elements. The elements were viewed as characteristic elements of the oysters and the fingerprints of these elements could be used to distinguish the quality of the oysters.

  15. Fast Steerable Principal Component Analysis

    PubMed Central

    Zhao, Zhizhen; Shkolnisky, Yoel; Singer, Amit

    2016-01-01

    Cryo-electron microscopy nowadays often requires the analysis of hundreds of thousands of 2-D images as large as a few hundred pixels in each direction. Here, we introduce an algorithm that efficiently and accurately performs principal component analysis (PCA) for a large set of 2-D images, and, for each image, the set of its uniform rotations in the plane and their reflections. For a dataset consisting of n images of size L × L pixels, the computational complexity of our algorithm is O(nL3 + L4), while existing algorithms take O(nL4). The new algorithm computes the expansion coefficients of the images in a Fourier–Bessel basis efficiently using the nonuniform fast Fourier transform. We compare the accuracy and efficiency of the new algorithm with traditional PCA and existing algorithms for steerable PCA. PMID:27570801

  16. Raman signatures of ferroic domain walls captured by principal component analysis.

    PubMed

    Nataf, G F; Barrett, N; Kreisel, J; Guennou, M

    2018-01-24

    Ferroic domain walls are currently investigated by several state-of-the art techniques in order to get a better understanding of their distinct, functional properties. Here, principal component analysis (PCA) of Raman maps is used to study ferroelectric domain walls (DWs) in LiNbO 3 and ferroelastic DWs in NdGaO 3 . It is shown that PCA allows us to quickly and reliably identify small Raman peak variations at ferroelectric DWs and that the value of a peak shift can be deduced-accurately and without a priori-from a first order Taylor expansion of the spectra. The ability of PCA to separate the contribution of ferroelastic domains and DWs to Raman spectra is emphasized. More generally, our results provide a novel route for the statistical analysis of any property mapped across a DW.

  17. Improved Statistical Fault Detection Technique and Application to Biological Phenomena Modeled by S-Systems.

    PubMed

    Mansouri, Majdi; Nounou, Mohamed N; Nounou, Hazem N

    2017-09-01

    In our previous work, we have demonstrated the effectiveness of the linear multiscale principal component analysis (PCA)-based moving window (MW)-generalized likelihood ratio test (GLRT) technique over the classical PCA and multiscale principal component analysis (MSPCA)-based GLRT methods. The developed fault detection algorithm provided optimal properties by maximizing the detection probability for a particular false alarm rate (FAR) with different values of windows, and however, most real systems are nonlinear, which make the linear PCA method not able to tackle the issue of non-linearity to a great extent. Thus, in this paper, first, we apply a nonlinear PCA to obtain an accurate principal component of a set of data and handle a wide range of nonlinearities using the kernel principal component analysis (KPCA) model. The KPCA is among the most popular nonlinear statistical methods. Second, we extend the MW-GLRT technique to one that utilizes exponential weights to residuals in the moving window (instead of equal weightage) as it might be able to further improve fault detection performance by reducing the FAR using exponentially weighed moving average (EWMA). The developed detection method, which is called EWMA-GLRT, provides improved properties, such as smaller missed detection and FARs and smaller average run length. The idea behind the developed EWMA-GLRT is to compute a new GLRT statistic that integrates current and previous data information in a decreasing exponential fashion giving more weight to the more recent data. This provides a more accurate estimation of the GLRT statistic and provides a stronger memory that will enable better decision making with respect to fault detection. Therefore, in this paper, a KPCA-based EWMA-GLRT method is developed and utilized in practice to improve fault detection in biological phenomena modeled by S-systems and to enhance monitoring process mean. The idea behind a KPCA-based EWMA-GLRT fault detection algorithm is to combine the advantages brought forward by the proposed EWMA-GLRT fault detection chart with the KPCA model. Thus, it is used to enhance fault detection of the Cad System in E. coli model through monitoring some of the key variables involved in this model such as enzymes, transport proteins, regulatory proteins, lysine, and cadaverine. The results demonstrate the effectiveness of the proposed KPCA-based EWMA-GLRT method over Q , GLRT, EWMA, Shewhart, and moving window-GLRT methods. The detection performance is assessed and evaluated in terms of FAR, missed detection rates, and average run length (ARL 1 ) values.

  18. A comparison of the usefulness of canonical analysis, principal components analysis, and band selection for extraction of features from TMS data for landcover analysis

    NASA Technical Reports Server (NTRS)

    Boyd, R. K.; Brumfield, J. O.; Campbell, W. J.

    1984-01-01

    Three feature extraction methods, canonical analysis (CA), principal component analysis (PCA), and band selection, have been applied to Thematic Mapper Simulator (TMS) data in order to evaluate the relative performance of the methods. The results obtained show that CA is capable of providing a transformation of TMS data which leads to better classification results than provided by all seven bands, by PCA, or by band selection. A second conclusion drawn from the study is that TMS bands 2, 3, 4, and 7 (thermal) are most important for landcover classification.

  19. Fourier Transform Infrared Spectroscopy (FTIR) and Multivariate Analysis for Identification of Different Vegetable Oils Used in Biodiesel Production

    PubMed Central

    Mueller, Daniela; Ferrão, Marco Flôres; Marder, Luciano; da Costa, Adilson Ben; de Cássia de Souza Schneider, Rosana

    2013-01-01

    The main objective of this study was to use infrared spectroscopy to identify vegetable oils used as raw material for biodiesel production and apply multivariate analysis to the data. Six different vegetable oil sources—canola, cotton, corn, palm, sunflower and soybeans—were used to produce biodiesel batches. The spectra were acquired by Fourier transform infrared spectroscopy using a universal attenuated total reflectance sensor (FTIR-UATR). For the multivariate analysis principal component analysis (PCA), hierarchical cluster analysis (HCA), interval principal component analysis (iPCA) and soft independent modeling of class analogy (SIMCA) were used. The results indicate that is possible to develop a methodology to identify vegetable oils used as raw material in the production of biodiesel by FTIR-UATR applying multivariate analysis. It was also observed that the iPCA found the best spectral range for separation of biodiesel batches using FTIR-UATR data, and with this result, the SIMCA method classified 100% of the soybean biodiesel samples. PMID:23539030

  20. WE-DE-201-04: Cross Validation of Knowledge-Based Treatment Planning for Prostate LDR Brachytherapy Using Principle Component Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roper, J; Ghavidel, B; Godette, K

    Purpose: To validate a knowledge-based algorithm for prostate LDR brachytherapy treatment planning. Methods: A dataset of 100 cases was compiled from an active prostate seed implant service. Cases were randomized into 10 subsets. For each subset, the 90 remaining library cases were registered to a common reference frame and then characterized on a point by point basis using principle component analysis (PCA). Each test case was converted to PCA vectors using the same process and compared with each library case using a Mahalanobis distance to evaluate similarity. Rank order PCA scores were used to select the best-matched library case. Themore » seed arrangement was extracted from the best-matched case and used as a starting point for planning the test case. Any subsequent modifications were recorded that required input from a treatment planner to achieve V100>95%, V150<60%, V200<20%. To simulate operating-room planning constraints, seed activity was held constant, and the seed count could not increase. Results: The computational time required to register test-case contours and evaluate PCA similarity across the library was 10s. Preliminary analysis of 2 subsets shows that 9 of 20 test cases did not require any seed modifications to obtain an acceptable plan. Five test cases required fewer than 10 seed modifications or a grid shift. Another 5 test cases required approximately 20 seed modifications. An acceptable plan was not achieved for 1 outlier, which was substantially larger than its best match. Modifications took between 5s and 6min. Conclusion: A knowledge-based treatment planning algorithm for prostate LDR brachytherapy is being cross validated using 100 prior cases. Preliminary results suggest that for this size library, acceptable plans can be achieved without planner input in about half of the cases while varying amounts of planner input are needed in remaining cases. Computational time and planning time are compatible with clinical practice.« less

  1. Principal component analysis of normalized full spectrum mass spectrometry data in multiMS-toolbox: An effective tool to identify important factors for classification of different metabolic patterns and bacterial strains.

    PubMed

    Cejnar, Pavel; Kuckova, Stepanka; Prochazka, Ales; Karamonova, Ludmila; Svobodova, Barbora

    2018-06-15

    Explorative statistical analysis of mass spectrometry data is still a time-consuming step. We analyzed critical factors for application of principal component analysis (PCA) in mass spectrometry and focused on two whole spectrum based normalization techniques and their application in the analysis of registered peak data and, in comparison, in full spectrum data analysis. We used this technique to identify different metabolic patterns in the bacterial culture of Cronobacter sakazakii, an important foodborne pathogen. Two software utilities, the ms-alone, a python-based utility for mass spectrometry data preprocessing and peak extraction, and the multiMS-toolbox, an R software tool for advanced peak registration and detailed explorative statistical analysis, were implemented. The bacterial culture of Cronobacter sakazakii was cultivated on Enterobacter sakazakii Isolation Agar, Blood Agar Base and Tryptone Soya Agar for 24 h and 48 h and applied by the smear method on an Autoflex speed MALDI-TOF mass spectrometer. For three tested cultivation media only two different metabolic patterns of Cronobacter sakazakii were identified using PCA applied on data normalized by two different normalization techniques. Results from matched peak data and subsequent detailed full spectrum analysis identified only two different metabolic patterns - a cultivation on Enterobacter sakazakii Isolation Agar showed significant differences to the cultivation on the other two tested media. The metabolic patterns for all tested cultivation media also proved the dependence on cultivation time. Both whole spectrum based normalization techniques together with the full spectrum PCA allow identification of important discriminative factors in experiments with several variable condition factors avoiding any problems with improper identification of peaks or emphasis on bellow threshold peak data. The amounts of processed data remain still manageable. Both implemented software utilities are available free of charge from http://uprt.vscht.cz/ms. Copyright © 2018 John Wiley & Sons, Ltd.

  2. Free-energy landscape of RNA hairpins constructed via dihedral angle principal component analysis.

    PubMed

    Riccardi, Laura; Nguyen, Phuong H; Stock, Gerhard

    2009-12-31

    To systematically construct a low-dimensional free-energy landscape of RNA systems from a classical molecular dynamics simulation, various versions of the principal component analysis (PCA) are compared: the cPCA using the Cartesian coordinates of all atoms, the dPCA using the sine/cosine-transformed six backbone dihedral angles as well as the glycosidic torsional angle chi and the pseudorotational angle P, the aPCA which ignores the circularity of the 6 + 2 dihedral angles of the RNA, and the dPCA(etatheta), which approximates the 6 backbone dihedral angles by 2 pseudotorsional angles eta and theta. As representative examples, a 10-nucleotide UUCG hairpin and the 36-nucleotide segment SL1 of the Psi site of HIV-1 are studied by classical molecular dynamics simulation, using the Amber all-atom force field and explicit solvent. It is shown that the conformational heterogeneity of the RNA hairpins can only be resolved by an angular PCA such as the dPCA but not by the cPCA using Cartesian coordinates. Apart from possible artifacts due to the coupling of overall and internal motion, this is because the details of hydrogen bonding and stacking interactions but also of global structural rearrangements of the RNA are better discriminated by dihedral angles. In line with recent experiments, it is found that the free energy landscape of RNA hairpins is quite rugged and contains various metastable conformational states which may serve as an intermediate for unfolding.

  3. Performance analysis of a Principal Component Analysis ensemble classifier for Emotiv headset P300 spellers.

    PubMed

    Elsawy, Amr S; Eldawlatly, Seif; Taher, Mohamed; Aly, Gamal M

    2014-01-01

    The current trend to use Brain-Computer Interfaces (BCIs) with mobile devices mandates the development of efficient EEG data processing methods. In this paper, we demonstrate the performance of a Principal Component Analysis (PCA) ensemble classifier for P300-based spellers. We recorded EEG data from multiple subjects using the Emotiv neuroheadset in the context of a classical oddball P300 speller paradigm. We compare the performance of the proposed ensemble classifier to the performance of traditional feature extraction and classifier methods. Our results demonstrate the capability of the PCA ensemble classifier to classify P300 data recorded using the Emotiv neuroheadset with an average accuracy of 86.29% on cross-validation data. In addition, offline testing of the recorded data reveals an average classification accuracy of 73.3% that is significantly higher than that achieved using traditional methods. Finally, we demonstrate the effect of the parameters of the P300 speller paradigm on the performance of the method.

  4. Fast discrimination of traditional Chinese medicine according to geographical origins with FTIR spectroscopy and advanced pattern recognition techniques

    NASA Astrophysics Data System (ADS)

    Li, Ning; Wang, Yan; Xu, Kexin

    2006-08-01

    Combined with Fourier transform infrared (FTIR) spectroscopy and three kinds of pattern recognition techniques, 53 traditional Chinese medicine danshen samples were rapidly discriminated according to geographical origins. The results showed that it was feasible to discriminate using FTIR spectroscopy ascertained by principal component analysis (PCA). An effective model was built by employing the Soft Independent Modeling of Class Analogy (SIMCA) and PCA, and 82% of the samples were discriminated correctly. Through use of the artificial neural network (ANN)-based back propagation (BP) network, the origins of danshen were completely classified.

  5. Radar fall detection using principal component analysis

    NASA Astrophysics Data System (ADS)

    Jokanovic, Branka; Amin, Moeness; Ahmad, Fauzia; Boashash, Boualem

    2016-05-01

    Falls are a major cause of fatal and nonfatal injuries in people aged 65 years and older. Radar has the potential to become one of the leading technologies for fall detection, thereby enabling the elderly to live independently. Existing techniques for fall detection using radar are based on manual feature extraction and require significant parameter tuning in order to provide successful detections. In this paper, we employ principal component analysis for fall detection, wherein eigen images of observed motions are employed for classification. Using real data, we demonstrate that the PCA based technique provides performance improvement over the conventional feature extraction methods.

  6. Phenotypic Characterization and Multivariate Analysis to Explain Body Conformation in Lesser Known Buffalo (Bubalus bubalis) from North India

    PubMed Central

    Vohra, V.; Niranjan, S. K.; Mishra, A. K.; Jamuna, V.; Chopra, A.; Sharma, Neelesh; Jeong, Dong Kee

    2015-01-01

    Phenotypic characterization and body biometric in 13 traits (height at withers, body length, chest girth, paunch girth, ear length, tail length, length of tail up to switch, face length, face width, horn length, circumference of horn at base, distances between pin bone and hip bone) were recorded in 233 adult Gojri buffaloes from Punjab and Himachal Pradesh states of India. Traits were analysed by using varimax rotated principal component analysis (PCA) with Kaiser Normalization to explain body conformation. PCA revealed four components which explained about 70.9% of the total variation. First component described the general body conformation and explained 31.5% of total variation. It was represented by significant positive high loading of height at wither, body length, heart girth, face length and face width. The communality ranged from 0.83 (hip bone distance) to 0.45 (horn length) and unique factors ranged from 0.16 to 0.55 for all these 13 different biometric traits. Present study suggests that first principal component can be used in the evaluation and comparison of body conformation in buffaloes and thus provides an opportunity to distinguish between early and late maturing to adult, based on a small group of biometric traits to explain body conformation in adult buffaloes. PMID:25656215

  7. Improving the accuracy of Density Functional Theory (DFT) calculation for homolysis bond dissociation energies of Y-NO bond: generalized regression neural network based on grey relational analysis and principal component analysis.

    PubMed

    Li, Hong Zhi; Tao, Wei; Gao, Ting; Li, Hui; Lu, Ying Hua; Su, Zhong Min

    2011-01-01

    We propose a generalized regression neural network (GRNN) approach based on grey relational analysis (GRA) and principal component analysis (PCA) (GP-GRNN) to improve the accuracy of density functional theory (DFT) calculation for homolysis bond dissociation energies (BDE) of Y-NO bond. As a demonstration, this combined quantum chemistry calculation with the GP-GRNN approach has been applied to evaluate the homolysis BDE of 92 Y-NO organic molecules. The results show that the ull-descriptor GRNN without GRA and PCA (F-GRNN) and with GRA (G-GRNN) approaches reduce the root-mean-square (RMS) of the calculated homolysis BDE of 92 organic molecules from 5.31 to 0.49 and 0.39 kcal mol(-1) for the B3LYP/6-31G (d) calculation. Then the newly developed GP-GRNN approach further reduces the RMS to 0.31 kcal mol(-1). Thus, the GP-GRNN correction on top of B3LYP/6-31G (d) can improve the accuracy of calculating the homolysis BDE in quantum chemistry and can predict homolysis BDE which cannot be obtained experimentally.

  8. Spectral discrimination of serum from liver cancer and liver cirrhosis using Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Yang, Tianyue; Li, Xiaozhou; Yu, Ting; Sun, Ruomin; Li, Siqi

    2011-07-01

    In this paper, Raman spectra of human serum were measured using Raman spectroscopy, then the spectra was analyzed by multivariate statistical methods of principal component analysis (PCA). Then linear discriminant analysis (LDA) was utilized to differentiate the loading score of different diseases as the diagnosing algorithm. Artificial neural network (ANN) was used for cross-validation. The diagnosis sensitivity and specificity by PCA-LDA are 88% and 79%, while that of the PCA-ANN are 89% and 95%. It can be seen that modern analyzing method is a useful tool for the analysis of serum spectra for diagnosing diseases.

  9. Apple variety and maturity profiling of base ciders using UV spectroscopy.

    PubMed

    Girschik, Lachlan; Jones, Joanna E; Kerslake, Fiona L; Robertson, Mark; Dambergs, Robert G; Swarts, Nigel D

    2017-08-01

    Varietal base ciders were produced from three varieties of dessert apples ('Pink Lady®', 'Royal Gala' and 'Red Delicious') at pre-commercial, commercial and post-commercial harvest timings. Rapid analytical methods were used to categorise the base ciders, and data analysed using principal component analysis (PCA). The titratable acidity of apple must was significantly higher for the pre-commercial harvest fruit for both the 'Royal Gala' and 'Red Delicious' varieties. The base cider phenolic content was highest in the pre-commercial harvest fruit for all varieties. 'Red Delicious' had the highest total phenolics as determined by spectral analysis and supported by the classification provided by the PCA analysis. The spectral fingerprints of the ciders showed two main peaks at approximately 280nm and 320nm indicating phenolic concentrations. Studies analysing characteristics of dessert apple varieties with relevance for cider production will allow for informed decision making for both apple producers and cider makers. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Selection of solubility parameters for characterization of pharmaceutical excipients.

    PubMed

    Adamska, Katarzyna; Voelkel, Adam; Héberger, Károly

    2007-11-09

    The solubility parameter (delta(2)), corrected solubility parameter (delta(T)) and its components (delta(d), delta(p), delta(h)) were determined for series of pharmaceutical excipients by using inverse gas chromatography (IGC). Principal component analysis (PCA) was applied for the selection of the solubility parameters which assure the complete characterization of examined materials. Application of PCA suggests that complete description of examined materials is achieved with four solubility parameters, i.e. delta(2) and Hansen solubility parameters (delta(d), delta(p), delta(h)). Selection of the excipients through PCA of their solubility parameters data can be used for prediction of their behavior in a multi-component system, e.g. for selection of the best materials to form stable pharmaceutical liquid mixtures or stable coating formulation.

  11. Health status monitoring for ICU patients based on locally weighted principal component analysis.

    PubMed

    Ding, Yangyang; Ma, Xin; Wang, Youqing

    2018-03-01

    Intelligent status monitoring for critically ill patients can help medical stuff quickly discover and assess the changes of disease and then make appropriate treatment strategy. However, general-type monitoring model now widely used is difficult to adapt the changes of intensive care unit (ICU) patients' status due to its fixed pattern, and a more robust, efficient and fast monitoring model should be developed to the individual. A data-driven learning approach combining locally weighted projection regression (LWPR) and principal component analysis (PCA) is firstly proposed and applied to monitor the nonlinear process of patients' health status in ICU. LWPR is used to approximate the complex nonlinear process with local linear models, in which PCA could be further applied to status monitoring, and finally a global weighted statistic will be acquired for detecting the possible abnormalities. Moreover, some improved versions are developed, such as LWPR-MPCA and LWPR-JPCA, which also have superior performance. Eighteen subjects were selected from the Physiobank's Multi-parameter Intelligent Monitoring for Intensive Care II (MIMIC II) database, and two vital signs of each subject were chosen for online monitoring. The proposed method was compared with several existing methods including traditional PCA, Partial least squares (PLS), just in time learning combined with modified PCA (L-PCA), and Kernel PCA (KPCA). The experimental results demonstrated that the mean fault detection rate (FDR) of PCA can be improved by 41.7% after adding LWPR. The mean FDR of LWPR-MPCA was increased by 8.3%, compared with the latest reported method L-PCA. Meanwhile, LWPR spent less training time than others, especially KPCA. LWPR is first introduced into ICU patients monitoring and achieves the best monitoring performance including adaptability to changes in patient status, sensitivity for abnormality detection as well as its fast learning speed and low computational complexity. The algorithm is an excellent approach to establishing a personalized model for patients, which is the mainstream direction of modern medicine in the following development, as well as improving the global monitoring performance. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.

  12. Features of Cross-Correlation Analysis in a Data-Driven Approach for Structural Damage Assessment

    PubMed Central

    Camacho Navarro, Jhonatan; Ruiz, Magda; Villamizar, Rodolfo; Mujica, Luis

    2018-01-01

    This work discusses the advantage of using cross-correlation analysis in a data-driven approach based on principal component analysis (PCA) and piezodiagnostics to obtain successful diagnosis of events in structural health monitoring (SHM). In this sense, the identification of noisy data and outliers, as well as the management of data cleansing stages can be facilitated through the implementation of a preprocessing stage based on cross-correlation functions. Additionally, this work evidences an improvement in damage detection when the cross-correlation is included as part of the whole damage assessment approach. The proposed methodology is validated by processing data measurements from piezoelectric devices (PZT), which are used in a piezodiagnostics approach based on PCA and baseline modeling. Thus, the influence of cross-correlation analysis used in the preprocessing stage is evaluated for damage detection by means of statistical plots and self-organizing maps. Three laboratory specimens were used as test structures in order to demonstrate the validity of the methodology: (i) a carbon steel pipe section with leak and mass damage types, (ii) an aircraft wing specimen, and (iii) a blade of a commercial aircraft turbine, where damages are specified as mass-added. As the main concluding remark, the suitability of cross-correlation features combined with a PCA-based piezodiagnostic approach in order to achieve a more robust damage assessment algorithm is verified for SHM tasks. PMID:29762505

  13. Features of Cross-Correlation Analysis in a Data-Driven Approach for Structural Damage Assessment.

    PubMed

    Camacho Navarro, Jhonatan; Ruiz, Magda; Villamizar, Rodolfo; Mujica, Luis; Quiroga, Jabid

    2018-05-15

    This work discusses the advantage of using cross-correlation analysis in a data-driven approach based on principal component analysis (PCA) and piezodiagnostics to obtain successful diagnosis of events in structural health monitoring (SHM). In this sense, the identification of noisy data and outliers, as well as the management of data cleansing stages can be facilitated through the implementation of a preprocessing stage based on cross-correlation functions. Additionally, this work evidences an improvement in damage detection when the cross-correlation is included as part of the whole damage assessment approach. The proposed methodology is validated by processing data measurements from piezoelectric devices (PZT), which are used in a piezodiagnostics approach based on PCA and baseline modeling. Thus, the influence of cross-correlation analysis used in the preprocessing stage is evaluated for damage detection by means of statistical plots and self-organizing maps. Three laboratory specimens were used as test structures in order to demonstrate the validity of the methodology: (i) a carbon steel pipe section with leak and mass damage types, (ii) an aircraft wing specimen, and (iii) a blade of a commercial aircraft turbine, where damages are specified as mass-added. As the main concluding remark, the suitability of cross-correlation features combined with a PCA-based piezodiagnostic approach in order to achieve a more robust damage assessment algorithm is verified for SHM tasks.

  14. SU-G-BRA-03: PCA Based Imaging Angle Optimization for 2D Cine MRI Based Radiotherapy Guidance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, T; Yue, N; Jabbour, S

    2016-06-15

    Purpose: To develop an imaging angle optimization methodology for orthogonal 2D cine MRI based radiotherapy guidance using Principal Component Analysis (PCA) of target motion retrieved from 4DCT. Methods: We retrospectively analyzed 4DCT of 6 patients with lung tumor. A radiation oncologist manually contoured the target volume at the maximal inhalation phase of the respiratory cycle. An object constrained deformable image registration (DIR) method has been developed to track the target motion along the respiration at ten phases. The motion of the center of the target mass has been analyzed using the PCA to find out the principal motion components thatmore » were uncorrelated with each other. Two orthogonal image planes for cineMRI have been determined using this method to minimize the through plane motion during MRI based radiotherapy guidance. Results: 3D target respiratory motion for all 6 patients has been efficiently retrieved from 4DCT. In this process, the object constrained DIR demonstrated satisfactory accuracy and efficiency to enable the automatic motion tracking for clinical application. The average motion amplitude in the AP, lateral, and longitudinal directions were 3.6mm (min: 1.6mm, max: 5.6mm), 1.7mm (min: 0.6mm, max: 2.7mm), and 5.6mm (min: 1.8mm, max: 16.1mm), respectively. Based on PCA, the optimal orthogonal imaging planes were determined for cineMRI. The average angular difference between the PCA determined imaging planes and the traditional AP and lateral imaging planes were 47 and 31 degrees, respectively. After optimization, the average amplitude of through plane motion reduced from 3.6mm in AP images to 2.5mm (min:1.3mm, max:3.9mm); and from 1.7mm in lateral images to 0.6mm (min: 0.2mm, max:1.5mm), while the principal in plane motion amplitude increased from 5.6mm to 6.5mm (min: 2.8mm, max: 17mm). Conclusion: DIR and PCA can be used to optimize the orthogonal image planes of cineMRI to minimize the through plane motion during radiotherapy guidance.« less

  15. A new statistic for identifying batch effects in high-throughput genomic data that uses guided principal component analysis.

    PubMed

    Reese, Sarah E; Archer, Kellie J; Therneau, Terry M; Atkinson, Elizabeth J; Vachon, Celine M; de Andrade, Mariza; Kocher, Jean-Pierre A; Eckel-Passow, Jeanette E

    2013-11-15

    Batch effects are due to probe-specific systematic variation between groups of samples (batches) resulting from experimental features that are not of biological interest. Principal component analysis (PCA) is commonly used as a visual tool to determine whether batch effects exist after applying a global normalization method. However, PCA yields linear combinations of the variables that contribute maximum variance and thus will not necessarily detect batch effects if they are not the largest source of variability in the data. We present an extension of PCA to quantify the existence of batch effects, called guided PCA (gPCA). We describe a test statistic that uses gPCA to test whether a batch effect exists. We apply our proposed test statistic derived using gPCA to simulated data and to two copy number variation case studies: the first study consisted of 614 samples from a breast cancer family study using Illumina Human 660 bead-chip arrays, whereas the second case study consisted of 703 samples from a family blood pressure study that used Affymetrix SNP Array 6.0. We demonstrate that our statistic has good statistical properties and is able to identify significant batch effects in two copy number variation case studies. We developed a new statistic that uses gPCA to identify whether batch effects exist in high-throughput genomic data. Although our examples pertain to copy number data, gPCA is general and can be used on other data types as well. The gPCA R package (Available via CRAN) provides functionality and data to perform the methods in this article. reesese@vcu.edu

  16. Principal component analysis-based imaging angle determination for 3D motion monitoring using single-slice on-board imaging.

    PubMed

    Chen, Ting; Zhang, Miao; Jabbour, Salma; Wang, Hesheng; Barbee, David; Das, Indra J; Yue, Ning

    2018-04-10

    Through-plane motion introduces uncertainty in three-dimensional (3D) motion monitoring when using single-slice on-board imaging (OBI) modalities such as cine MRI. We propose a principal component analysis (PCA)-based framework to determine the optimal imaging plane to minimize the through-plane motion for single-slice imaging-based motion monitoring. Four-dimensional computed tomography (4DCT) images of eight thoracic cancer patients were retrospectively analyzed. The target volumes were manually delineated at different respiratory phases of 4DCT. We performed automated image registration to establish the 4D respiratory target motion trajectories for all patients. PCA was conducted using the motion information to define the three principal components of the respiratory motion trajectories. Two imaging planes were determined perpendicular to the second and third principal component, respectively, to avoid imaging with the primary principal component of the through-plane motion. Single-slice images were reconstructed from 4DCT in the PCA-derived orthogonal imaging planes and were compared against the traditional AP/Lateral image pairs on through-plane motion, residual error in motion monitoring, absolute motion amplitude error and the similarity between target segmentations at different phases. We evaluated the significance of the proposed motion monitoring improvement using paired t test analysis. The PCA-determined imaging planes had overall less through-plane motion compared against the AP/Lateral image pairs. For all patients, the average through-plane motion was 3.6 mm (range: 1.6-5.6 mm) for the AP view and 1.7 mm (range: 0.6-2.7 mm) for the Lateral view. With PCA optimization, the average through-plane motion was 2.5 mm (range: 1.3-3.9 mm) and 0.6 mm (range: 0.2-1.5 mm) for the two imaging planes, respectively. The absolute residual error of the reconstructed max-exhale-to-inhale motion averaged 0.7 mm (range: 0.4-1.3 mm, 95% CI: 0.4-1.1 mm) using optimized imaging planes, averaged 0.5 mm (range: 0.3-1.0 mm, 95% CI: 0.2-0.8 mm) using an imaging plane perpendicular to the minimal motion component only and averaged 1.3 mm (range: 0.4-2.8 mm, 95% CI: 0.4-2.3 mm) in AP/Lateral orthogonal image pairs. The root-mean-square error of reconstructed displacement was 0.8 mm for optimized imaging planes, 0.6 mm for imaging plane perpendicular to the minimal motion component only, and 1.6 mm for AP/Lateral orthogonal image pairs. When using the optimized imaging planes for motion monitoring, there was no significant absolute amplitude error of the reconstructed motion (P = 0.0988), while AP/Lateral images had significant error (P = 0.0097) with a paired t test. The average surface distance (ASD) between overlaid two-dimensional (2D) tumor segmentation at end-of-inhale and end-of-exhale for all eight patients was 0.6 ± 0.2 mm in optimized imaging planes and 1.4 ± 0.8 mm in AP/Lateral images. The Dice similarity coefficient (DSC) between overlaid 2D tumor segmentation at end-of-inhale and end-of-exhale for all eight patients was 0.96 ± 0.03 in optimized imaging planes and 0.89 ± 0.05 in AP/Lateral images. Both ASD (P = 0.034) and DSC (P = 0.022) were significantly improved in the optimized imaging planes. Motion monitoring using imaging planes determined by the proposed PCA-based framework had significantly improved performance. Single-slice image-based motion tracking can be used for clinical implementations such as MR image-guided radiation therapy (MR-IGRT). © 2018 American Association of Physicists in Medicine.

  17. Collagen-based proteinaceous binder-pigment interaction study under UV ageing conditions by MALDI-TOF-MS and principal component analysis.

    PubMed

    Romero-Pastor, Julia; Navas, Natalia; Kuckova, Stepanka; Rodríguez-Navarro, Alejandro; Cardell, Carolina

    2012-03-01

    This study focuses on acquiring information on the degradation process of proteinaceous binders due to ultra violet (UV) radiation and possible interactions owing to the presence of historical mineral pigments. With this aim, three different paint model samples were prepared according to medieval recipes, using rabbit glue as proteinaceus binders. One of these model samples contained only the binder, and the other two were prepared by mixing each of the pigments (cinnabar or azurite) with the binder (glue tempera model samples). The model samples were studied by applying Principal Component Analysis (PCA) to their mass spectra obtained with Matrix-Assisted Laser Desorption/Ionization-Time of Flight Mass Spectrometry (MALDI-TOF-MS). The complementary use of Fourier Transform Infrared Spectroscopy to study conformational changes of secondary structure of the proteinaceous binder is also proposed. Ageing effects on the model samples after up to 3000 h of UV irradiation were periodically analyzed by the proposed approach. PCA on MS data proved capable of identifying significant changes in the model samples, and the results suggested different aging behavior based on the pigment present. This research represents the first attempt to use this approach (PCA on MALDI-TOF-MS data) in the field of Cultural Heritage and demonstrates the potential benefits in the study of proteinaceous artistic materials for purposes of conservation and restoration. Copyright © 2012 John Wiley & Sons, Ltd.

  18. Principal component analysis acceleration of rovibrational coarse-grain models for internal energy excitation and dissociation

    NASA Astrophysics Data System (ADS)

    Bellemans, Aurélie; Parente, Alessandro; Magin, Thierry

    2018-04-01

    The present work introduces a novel approach for obtaining reduced chemistry representations of large kinetic mechanisms in strong non-equilibrium conditions. The need for accurate reduced-order models arises from compression of large ab initio quantum chemistry databases for their use in fluid codes. The method presented in this paper builds on existing physics-based strategies and proposes a new approach based on the combination of a simple coarse grain model with Principal Component Analysis (PCA). The internal energy levels of the chemical species are regrouped in distinct energy groups with a uniform lumping technique. Following the philosophy of machine learning, PCA is applied on the training data provided by the coarse grain model to find an optimally reduced representation of the full kinetic mechanism. Compared to recently published complex lumping strategies, no expert judgment is required before the application of PCA. In this work, we will demonstrate the benefits of the combined approach, stressing its simplicity, reliability, and accuracy. The technique is demonstrated by reducing the complex quantum N2(g+1Σ) -N(S4u ) database for studying molecular dissociation and excitation in strong non-equilibrium. Starting from detailed kinetics, an accurate reduced model is developed and used to study non-equilibrium properties of the N2(g+1Σ) -N(S4u ) system in shock relaxation simulations.

  19. Coupling of on-column trypsin digestion-peptide mapping and principal component analysis for stability and biosimilarity assessment of recombinant human growth hormone.

    PubMed

    Shatat, Sara M; Eltanany, Basma M; Mohamed, Abeer A; Al-Ghobashy, Medhat A; Fathalla, Faten A; Abbas, Samah S

    2018-01-01

    Peptide mapping (PM) is a vital technique in biopharmaceutical industry. The fingerprint obtained helps to qualitatively confirm host stability as well as verify primary structure, purity and integrity of the target protein. Yet, in-solution digestion followed by tandem mass spectrometry is not suitable as a routine quality control test. It is time consuming and requires sophisticated, expensive instruments and highly skilled operators. In an attempt to enhance the fuctionality of PM and extract multi-dimentional data about various critical quality attributes and comparability of biosimilars, coupling of PM generated using immobilized trypsin followed by HPLC-UV to principal component analysis (PCA) is proposed. Recombinant human growth hormone (rhGH); was selected as a model biopharmaceutical since it is available in the market from different manufacturers and its PM is a well-established pharmacopoeial test. Samples of different rhGH biosimilars as well as degraded samples: deamidated and oxidized were subjected to trypsin digestion followed by RP-HPLC-UV analysis. PCA of the entire chromatograms of test and reference samples was then conducted. Comparison of the scores of samples and investigation of the loadings plots clearly indicated the applicability of PM-PCA for: i) identity testing, ii) biosimilarity assessment and iii) stability evaluation. Hotelling's T 2 and Q statistics were employed at 95% confidence level to measure the variation and to test the conformance of each sample to the PCA model, respectively. Coupling of PM to PCA provided a novel tool to identify peptide fragments responsible for variation between the test and reference samples as well as evaluation of the extent and relative significance of this variability. Transformation of conventional PM that is largely based on subjective visual comparison into an objective statiscally-guided analysis framework should provide a simple and economic tool to help both manufacturers and regulatory authorities in quality and biosimilarity assessment of biopharmaceuticals. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. An improved PCA method with application to boiler leak detection.

    PubMed

    Sun, Xi; Marquez, Horacio J; Chen, Tongwen; Riaz, Muhammad

    2005-07-01

    Principal component analysis (PCA) is a popular fault detection technique. It has been widely used in process industries, especially in the chemical industry. In industrial applications, achieving a sensitive system capable of detecting incipient faults, which maintains the false alarm rate to a minimum, is a crucial issue. Although a lot of research has been focused on these issues for PCA-based fault detection and diagnosis methods, sensitivity of the fault detection scheme versus false alarm rate continues to be an important issue. In this paper, an improved PCA method is proposed to address this problem. In this method, a new data preprocessing scheme and a new fault detection scheme designed for Hotelling's T2 as well as the squared prediction error are developed. A dynamic PCA model is also developed for boiler leak detection. This new method is applied to boiler water/steam leak detection with real data from Syncrude Canada's utility plant in Fort McMurray, Canada. Our results demonstrate that the proposed method can effectively reduce false alarm rate, provide effective and correct leak alarms, and give early warning to operators.

  1. Detecting most influencing courses on students grades using block PCA

    NASA Astrophysics Data System (ADS)

    Othman, Osama H.; Gebril, Rami Salah

    2014-12-01

    One of the modern solutions adopted in dealing with the problem of large number of variables in statistical analyses is the Block Principal Component Analysis (Block PCA). This modified technique can be used to reduce the vertical dimension (variables) of the data matrix Xn×p by selecting a smaller number of variables, (say m) containing most of the statistical information. These selected variables can then be employed in further investigations and analyses. Block PCA is an adapted multistage technique of the original PCA. It involves the application of Cluster Analysis (CA) and variable selection throughout sub principal components scores (PC's). The application of Block PCA in this paper is a modified version of the original work of Liu et al (2002). The main objective was to apply PCA on each group of variables, (established using cluster analysis), instead of involving the whole large pack of variables which was proved to be unreliable. In this work, the Block PCA is used to reduce the size of a huge data matrix ((n = 41) × (p = 251)) consisting of Grade Point Average (GPA) of the students in 251 courses (variables) in the faculty of science in Benghazi University. In other words, we are constructing a smaller analytical data matrix of the GPA's of the students with less variables containing most variation (statistical information) in the original database. By applying the Block PCA, (12) courses were found to `absorb' most of the variation or influence from the original data matrix, and hence worth to be keep for future statistical exploring and analytical studies. In addition, the course Independent Study (Math.) was found to be the most influencing course on students GPA among the 12 selected courses.

  2. Understanding deformation mechanisms during powder compaction using principal component analysis of compression data.

    PubMed

    Roopwani, Rahul; Buckner, Ira S

    2011-10-14

    Principal component analysis (PCA) was applied to pharmaceutical powder compaction. A solid fraction parameter (SF(c/d)) and a mechanical work parameter (W(c/d)) representing irreversible compression behavior were determined as functions of applied load. Multivariate analysis of the compression data was carried out using PCA. The first principal component (PC1) showed loadings for the solid fraction and work values that agreed with changes in the relative significance of plastic deformation to consolidation at different pressures. The PC1 scores showed the same rank order as the relative plasticity ranking derived from the literature for common pharmaceutical materials. The utility of PC1 in understanding deformation was extended to binary mixtures using a subset of the original materials. Combinations of brittle and plastic materials were characterized using the PCA method. The relationships between PC1 scores and the weight fractions of the mixtures were typically linear showing ideal mixing in their deformation behaviors. The mixture consisting of two plastic materials was the only combination to show a consistent positive deviation from ideality. The application of PCA to solid fraction and mechanical work data appears to be an effective means of predicting deformation behavior during compaction of simple powder mixtures. Copyright © 2011 Elsevier B.V. All rights reserved.

  3. A Novel Acoustic Sensor Approach to Classify Seeds Based on Sound Absorption Spectra

    PubMed Central

    Gasso-Tortajada, Vicent; Ward, Alastair J.; Mansur, Hasib; Brøchner, Torben; Sørensen, Claus G.; Green, Ole

    2010-01-01

    A non-destructive and novel in situ acoustic sensor approach based on the sound absorption spectra was developed for identifying and classifying different seed types. The absorption coefficient spectra were determined by using the impedance tube measurement method. Subsequently, a multivariate statistical analysis, i.e., principal component analysis (PCA), was performed as a way to generate a classification of the seeds based on the soft independent modelling of class analogy (SIMCA) method. The results show that the sound absorption coefficient spectra of different seed types present characteristic patterns which are highly dependent on seed size and shape. In general, seed particle size and sphericity were inversely related with the absorption coefficient. PCA presented reliable grouping capabilities within the diverse seed types, since the 95% of the total spectral variance was described by the first two principal components. Furthermore, the SIMCA classification model based on the absorption spectra achieved optimal results as 100% of the evaluation samples were correctly classified. This study contains the initial structuring of an innovative method that will present new possibilities in agriculture and industry for classifying and determining physical properties of seeds and other materials. PMID:22163455

  4. Using Principal Component and Tidal Analysis as a Quality Metric for Detecting Systematic Heading Uncertainty in Long-Term Acoustic Doppler Current Profiler Data

    NASA Astrophysics Data System (ADS)

    Morley, M. G.; Mihaly, S. F.; Dewey, R. K.; Jeffries, M. A.

    2015-12-01

    Ocean Networks Canada (ONC) operates the NEPTUNE and VENUS cabled ocean observatories to collect data on physical, chemical, biological, and geological ocean conditions over multi-year time periods. Researchers can download real-time and historical data from a large variety of instruments to study complex earth and ocean processes from their home laboratories. Ensuring that the users are receiving the most accurate data is a high priority at ONC, requiring quality assurance and quality control (QAQC) procedures to be developed for all data types. While some data types have relatively straightforward QAQC tests, such as scalar data range limits that are based on expected observed values or measurement limits of the instrument, for other data types the QAQC tests are more comprehensive. Long time series of ocean currents from Acoustic Doppler Current Profilers (ADCP), stitched together from multiple deployments over many years is one such data type where systematic data biases are more difficult to identify and correct. Data specialists at ONC are working to quantify systematic compass heading uncertainty in long-term ADCP records at each of the major study sites using the internal compass, remotely operated vehicle bearings, and more analytical tools such as principal component analysis (PCA) to estimate the optimal instrument alignments. In addition to using PCA, some work has been done to estimate the main components of the current at each site using tidal harmonic analysis. This paper describes the key challenges and presents preliminary PCA and tidal analysis approaches used by ONC to improve long-term observatory current measurements.

  5. Once upon Multivariate Analyses: When They Tell Several Stories about Biological Evolution.

    PubMed

    Renaud, Sabrina; Dufour, Anne-Béatrice; Hardouin, Emilie A; Ledevin, Ronan; Auffray, Jean-Christophe

    2015-01-01

    Geometric morphometrics aims to characterize of the geometry of complex traits. It is therefore by essence multivariate. The most popular methods to investigate patterns of differentiation in this context are (1) the Principal Component Analysis (PCA), which is an eigenvalue decomposition of the total variance-covariance matrix among all specimens; (2) the Canonical Variate Analysis (CVA, a.k.a. linear discriminant analysis (LDA) for more than two groups), which aims at separating the groups by maximizing the between-group to within-group variance ratio; (3) the between-group PCA (bgPCA) which investigates patterns of between-group variation, without standardizing by the within-group variance. Standardizing within-group variance, as performed in the CVA, distorts the relationships among groups, an effect that is particularly strong if the variance is similarly oriented in a comparable way in all groups. Such shared direction of main morphological variance may occur and have a biological meaning, for instance corresponding to the most frequent standing genetic variation in a population. Here we undertake a case study of the evolution of house mouse molar shape across various islands, based on the real dataset and simulations. We investigated how patterns of main variance influence the depiction of among-group differentiation according to the interpretation of the PCA, bgPCA and CVA. Without arguing about a method performing 'better' than another, it rather emerges that working on the total or between-group variance (PCA and bgPCA) will tend to put the focus on the role of direction of main variance as line of least resistance to evolution. Standardizing by the within-group variance (CVA), by dampening the expression of this line of least resistance, has the potential to reveal other relevant patterns of differentiation that may otherwise be blurred.

  6. Two worlds collide: Image analysis methods for quantifying structural variation in cluster molecular dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steenbergen, K. G., E-mail: kgsteen@gmail.com; Gaston, N.

    2014-02-14

    Inspired by methods of remote sensing image analysis, we analyze structural variation in cluster molecular dynamics (MD) simulations through a unique application of the principal component analysis (PCA) and Pearson Correlation Coefficient (PCC). The PCA analysis characterizes the geometric shape of the cluster structure at each time step, yielding a detailed and quantitative measure of structural stability and variation at finite temperature. Our PCC analysis captures bond structure variation in MD, which can be used to both supplement the PCA analysis as well as compare bond patterns between different cluster sizes. Relying only on atomic position data, without requirement formore » a priori structural input, PCA and PCC can be used to analyze both classical and ab initio MD simulations for any cluster composition or electronic configuration. Taken together, these statistical tools represent powerful new techniques for quantitative structural characterization and isomer identification in cluster MD.« less

  7. Two worlds collide: image analysis methods for quantifying structural variation in cluster molecular dynamics.

    PubMed

    Steenbergen, K G; Gaston, N

    2014-02-14

    Inspired by methods of remote sensing image analysis, we analyze structural variation in cluster molecular dynamics (MD) simulations through a unique application of the principal component analysis (PCA) and Pearson Correlation Coefficient (PCC). The PCA analysis characterizes the geometric shape of the cluster structure at each time step, yielding a detailed and quantitative measure of structural stability and variation at finite temperature. Our PCC analysis captures bond structure variation in MD, which can be used to both supplement the PCA analysis as well as compare bond patterns between different cluster sizes. Relying only on atomic position data, without requirement for a priori structural input, PCA and PCC can be used to analyze both classical and ab initio MD simulations for any cluster composition or electronic configuration. Taken together, these statistical tools represent powerful new techniques for quantitative structural characterization and isomer identification in cluster MD.

  8. A principal component analysis of the dynamics of subdomains and binding sites in human serum albumin.

    PubMed

    Paris, Guillaume; Ramseyer, Christophe; Enescu, Mironel

    2014-05-01

    The conformational dynamics of human serum albumin (HSA) was investigated by principal component analysis (PCA) applied to three molecular dynamics trajectories of 200 ns each. The overlap of the essential subspaces spanned by the first 10 principal components (PC) of different trajectories was about 0.3 showing that the PCA based on a trajectory length of 200 ns is not completely convergent for this protein. The contributions of the relative motion of subdomains and of the subdomains (internal) distortion to the first 10 PCs were found to be comparable. Based on the distribution of the first 3 PC, 10 protein conformers are identified showing relative root mean square deviations (RMSD) between 2.3 and 4.6 Å. The main PCs are found to be delocalized over the whole protein structure indicating that the motions of different protein subdomains are coupled. This coupling is considered as being related to the allosteric effects observed upon ligand binding to HSA. On the other hand, the first PC of one of the three trajectories describes a conformational transition of the protein domain I that is close to that experimentally observed upon myristate binding. This is a theoretical support for the older hypothesis stating that changes of the protein onformation favorable to binding can precede the ligand complexation. A detailed all atoms PCA performed on the primary Sites 1 and 2 confirms the multiconformational character of the HSA binding sites as well as the significant coupling of their motions. Copyright © 2013 Wiley Periodicals, Inc.

  9. SESNPCA: Principal Component Analysis Applied to Stripped-Envelope Core-Collapse Supernovae

    NASA Astrophysics Data System (ADS)

    Williamson, Marc; Bianco, Federica; Modjaz, Maryam

    2018-01-01

    In the new era of time-domain astronomy, it will become increasingly important to have rigorous, data driven models for classifying transients, including supernovae (SNe). We present the first application of principal component analysis (PCA) to stripped-envelope core-collapse supernovae (SESNe). Previous studies of SNe types Ib, IIb, Ic, and broad-line Ic (Ic-BL) focus only on specific spectral features, while our PCA algorithm uses all of the information contained in each spectrum. We use one of the largest compiled datasets of SESNe, containing over 150 SNe, each with spectra taken at multiple phases. Our work focuses on 49 SNe with spectra taken 15 ± 5 days after maximum V-band light where better distinctions can be made between SNe type Ib and Ic spectra. We find that spectra of SNe type IIb and Ic-BL are separable from the other types in PCA space, indicating that PCA is a promising option for developing a purely data driven model for SESNe classification.

  10. Relationships between NIR spectra and sensory attributes of Thai commercial fish sauces.

    PubMed

    Ritthiruangdej, Pitiporn; Suwonsichon, Thongchai

    2007-07-01

    Twenty Thai commercial fish sauces were characterized by sensory descriptive analysis and near-infrared (NIR) spectroscopy. The main objectives were i) to investigate the relationships between sensory attributes and NIR spectra of samples and ii) to characterize the sensory characteristics of fish sauces based on NIR data. A generic descriptive analysis with 12 trained panels was used to characterize the sensory attributes. These attributes consisted of 15 descriptors: brown color, 5 aromatics (sweet, caramelized, fermented, fishy, and musty), 4 tastes (sweet, salty, bitter, and umami), 3 aftertastes (sweet, salty and bitter) and 2 flavors (caramelized and fishy). The results showed that Thai fish sauce samples exhibited significant differences in all of sensory attribute values (p < 0.05). NIR transflectance spectra were obtained from 1100 to 2500 nm. Prior to investigation of the relationships between sensory attributes and NIR spectra, principal component analysis (PCA) was applied to reduce the dimensionality of the spectral data from 622 wavelengths to two uncorrelated components (NIR1 and NIR2) which explained 92 and 7% of the total variation, respectively. NIR1 was highly correlated with the wavelength regions of 1100 - 1544, 1774 - 2062, 2092 - 2308, and 2358 - 2440 nm, while NIR2 was highly correlated with the wavelength regions of 1742 - 1764, 2066 - 2088, and 2312 - 2354 nm. Subsequently, the relationships among these two components and all sensory attributes were also investigated by PCA. The results showed that the first three principal components (PCs) named as fishy flavor component (PC1), sweet component (PC2) and bitterness component (PC3), respectively, explained a total of 66.86% of the variation. NIR1 was mainly correlated to the sensory attributes of fishy aromatic, fishy flavor and sweet aftertaste on PC1. In addition, the PCA using only the factor loadings of NIR1 and NIR2 could be used to classify samples into three groups which showed high, medium and low degrees of fishy aromatic, fishy flavor and sweet aftertaste.

  11. Visualizing Hyolaryngeal Mechanics in Swallowing Using Dynamic MRI

    PubMed Central

    Pearson, William G.; Zumwalt, Ann C.

    2013-01-01

    Introduction Coordinates of anatomical landmarks are captured using dynamic MRI to explore whether a proposed two-sling mechanism underlies hyolaryngeal elevation in pharyngeal swallowing. A principal components analysis (PCA) is applied to coordinates to determine the covariant function of the proposed mechanism. Methods Dynamic MRI (dMRI) data were acquired from eleven healthy subjects during a repeated swallows task. Coordinates mapping the proposed mechanism are collected from each dynamic (frame) of a dynamic MRI swallowing series of a randomly selected subject in order to demonstrate shape changes in a single subject. Coordinates representing minimum and maximum hyolaryngeal elevation of all 11 subjects were also mapped to demonstrate shape changes of the system among all subjects. MophoJ software was used to perform PCA and determine vectors of shape change (eigenvectors) for elements of the two-sling mechanism of hyolaryngeal elevation. Results For both single subject and group PCAs, hyolaryngeal elevation accounted for the first principal component of variation. For the single subject PCA, the first principal component accounted for 81.5% of the variance. For the between subjects PCA, the first principal component accounted for 58.5% of the variance. Eigenvectors and shape changes associated with this first principal component are reported. Discussion Eigenvectors indicate that two-muscle slings and associated skeletal elements function as components of a covariant mechanism to elevate the hyolaryngeal complex. Morphological analysis is useful to model shape changes in the two-sling mechanism of hyolaryngeal elevation. PMID:25090608

  12. Protein-RNA specificity by high-throughput principal component analysis of NMR spectra.

    PubMed

    Collins, Katherine M; Oregioni, Alain; Robertson, Laura E; Kelly, Geoff; Ramos, Andres

    2015-03-31

    Defining the RNA target selectivity of the proteins regulating mRNA metabolism is a key issue in RNA biology. Here we present a novel use of principal component analysis (PCA) to extract the RNA sequence preference of RNA binding proteins. We show that PCA can be used to compare the changes in the nuclear magnetic resonance (NMR) spectrum of a protein upon binding a set of quasi-degenerate RNAs and define the nucleobase specificity. We couple this application of PCA to an automated NMR spectra recording and processing protocol and obtain an unbiased and high-throughput NMR method for the analysis of nucleobase preference in protein-RNA interactions. We test the method on the RNA binding domains of three important regulators of RNA metabolism. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Derivation of Boundary Manikins: A Principal Component Analysis

    NASA Technical Reports Server (NTRS)

    Young, Karen; Margerum, Sarah; Barr, Abbe; Ferrer, Mike A.; Rajulu, Sudhakar

    2008-01-01

    When designing any human-system interface, it is critical to provide realistic anthropometry to properly represent how a person fits within a given space. This study aimed to identify a minimum number of boundary manikins or representative models of subjects anthropometry from a target population, which would realistically represent the population. The boundary manikin anthropometry was derived using, Principal Component Analysis (PCA). PCA is a statistical approach to reduce a multi-dimensional dataset using eigenvectors and eigenvalues. The measurements used in the PCA were identified as those measurements critical for suit and cockpit design. The PCA yielded a total of 26 manikins per gender, as well as their anthropometry from the target population. Reduction techniques were implemented to reduce this number further with a final result of 20 female and 22 male subjects. The anthropometry of the boundary manikins was then be used to create 3D digital models (to be discussed in subsequent papers) intended for use by designers to test components of their space suit design, to verify that the requirements specified in the Human Systems Integration Requirements (HSIR) document are met. The end-goal is to allow for designers to generate suits which accommodate the diverse anthropometry of the user population.

  14. An algorithm for separation of mixed sparse and Gaussian sources

    PubMed Central

    Akkalkotkar, Ameya

    2017-01-01

    Independent component analysis (ICA) is a ubiquitous method for decomposing complex signal mixtures into a small set of statistically independent source signals. However, in cases in which the signal mixture consists of both nongaussian and Gaussian sources, the Gaussian sources will not be recoverable by ICA and will pollute estimates of the nongaussian sources. Therefore, it is desirable to have methods for mixed ICA/PCA which can separate mixtures of Gaussian and nongaussian sources. For mixtures of purely Gaussian sources, principal component analysis (PCA) can provide a basis for the Gaussian subspace. We introduce a new method for mixed ICA/PCA which we call Mixed ICA/PCA via Reproducibility Stability (MIPReSt). Our method uses a repeated estimations technique to rank sources by reproducibility, combined with decomposition of multiple subsamplings of the original data matrix. These multiple decompositions allow us to assess component stability as the size of the data matrix changes, which can be used to determinine the dimension of the nongaussian subspace in a mixture. We demonstrate the utility of MIPReSt for signal mixtures consisting of simulated sources and real-word (speech) sources, as well as mixture of unknown composition. PMID:28414814

  15. An algorithm for separation of mixed sparse and Gaussian sources.

    PubMed

    Akkalkotkar, Ameya; Brown, Kevin Scott

    2017-01-01

    Independent component analysis (ICA) is a ubiquitous method for decomposing complex signal mixtures into a small set of statistically independent source signals. However, in cases in which the signal mixture consists of both nongaussian and Gaussian sources, the Gaussian sources will not be recoverable by ICA and will pollute estimates of the nongaussian sources. Therefore, it is desirable to have methods for mixed ICA/PCA which can separate mixtures of Gaussian and nongaussian sources. For mixtures of purely Gaussian sources, principal component analysis (PCA) can provide a basis for the Gaussian subspace. We introduce a new method for mixed ICA/PCA which we call Mixed ICA/PCA via Reproducibility Stability (MIPReSt). Our method uses a repeated estimations technique to rank sources by reproducibility, combined with decomposition of multiple subsamplings of the original data matrix. These multiple decompositions allow us to assess component stability as the size of the data matrix changes, which can be used to determinine the dimension of the nongaussian subspace in a mixture. We demonstrate the utility of MIPReSt for signal mixtures consisting of simulated sources and real-word (speech) sources, as well as mixture of unknown composition.

  16. Application of mass spectrometry based electronic nose and chemometrics for fingerprinting radiation treatment

    NASA Astrophysics Data System (ADS)

    Gupta, Sumit; Variyar, Prasad S.; Sharma, Arun

    2015-01-01

    Volatile compounds were isolated from apples and grapes employing solid phase micro extraction (SPME) and subsequently analyzed by GC/MS equipped with a transfer line without stationary phase. Single peak obtained was integrated to obtain total mass spectrum of the volatile fraction of samples. A data matrix having relative abundance of all mass-to-charge ratios was subjected to principal component analysis (PCA) and linear discriminant analysis (LDA) to identify radiation treatment. PCA results suggested that there is sufficient variability between control and irradiated samples to build classification models based on supervised techniques. LDA successfully aided in segregating control from irradiated samples at all doses (0.1, 0.25, 0.5, 1.0, 1.5, 2.0 kGy). SPME-MS with chemometrics was successfully demonstrated as simple screening method for radiation treatment.

  17. Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis.

    PubMed

    You, Zhu-Hong; Lei, Ying-Ke; Zhu, Lin; Xia, Junfeng; Wang, Bing

    2013-01-01

    Protein-protein interactions (PPIs) play crucial roles in the execution of various cellular processes and form the basis of biological mechanisms. Although large amount of PPIs data for different species has been generated by high-throughput experimental techniques, current PPI pairs obtained with experimental methods cover only a fraction of the complete PPI networks, and further, the experimental methods for identifying PPIs are both time-consuming and expensive. Hence, it is urgent and challenging to develop automated computational methods to efficiently and accurately predict PPIs. We present here a novel hierarchical PCA-EELM (principal component analysis-ensemble extreme learning machine) model to predict protein-protein interactions only using the information of protein sequences. In the proposed method, 11188 protein pairs retrieved from the DIP database were encoded into feature vectors by using four kinds of protein sequences information. Focusing on dimension reduction, an effective feature extraction method PCA was then employed to construct the most discriminative new feature set. Finally, multiple extreme learning machines were trained and then aggregated into a consensus classifier by majority voting. The ensembling of extreme learning machine removes the dependence of results on initial random weights and improves the prediction performance. When performed on the PPI data of Saccharomyces cerevisiae, the proposed method achieved 87.00% prediction accuracy with 86.15% sensitivity at the precision of 87.59%. Extensive experiments are performed to compare our method with state-of-the-art techniques Support Vector Machine (SVM). Experimental results demonstrate that proposed PCA-EELM outperforms the SVM method by 5-fold cross-validation. Besides, PCA-EELM performs faster than PCA-SVM based method. Consequently, the proposed approach can be considered as a new promising and powerful tools for predicting PPI with excellent performance and less time.

  18. Quantitative thickness prediction of tectonically deformed coal using Extreme Learning Machine and Principal Component Analysis: a case study

    NASA Astrophysics Data System (ADS)

    Wang, Xin; Li, Yan; Chen, Tongjun; Yan, Qiuyan; Ma, Li

    2017-04-01

    The thickness of tectonically deformed coal (TDC) has positive correlation associations with gas outbursts. In order to predict the TDC thickness of coal beds, we propose a new quantitative predicting method using an extreme learning machine (ELM) algorithm, a principal component analysis (PCA) algorithm, and seismic attributes. At first, we build an ELM prediction model using the PCA attributes of a synthetic seismic section. The results suggest that the ELM model can produce a reliable and accurate prediction of the TDC thickness for synthetic data, preferring Sigmoid activation function and 20 hidden nodes. Then, we analyze the applicability of the ELM model on the thickness prediction of the TDC with real application data. Through the cross validation of near-well traces, the results suggest that the ELM model can produce a reliable and accurate prediction of the TDC. After that, we use 250 near-well traces from 10 wells to build an ELM predicting model and use the model to forecast the TDC thickness of the No. 15 coal in the study area using the PCA attributes as the inputs. Comparing the predicted results, it is noted that the trained ELM model with two selected PCA attributes yields better predication results than those from the other combinations of the attributes. Finally, the trained ELM model with real seismic data have a different number of hidden nodes (10) than the trained ELM model with synthetic seismic data. In summary, it is feasible to use an ELM model to predict the TDC thickness using the calculated PCA attributes as the inputs. However, the input attributes, the activation function and the number of hidden nodes in the ELM model should be selected and tested carefully based on individual application.

  19. DWI-associated entire-tumor histogram analysis for the differentiation of low-grade prostate cancer from intermediate-high-grade prostate cancer.

    PubMed

    Wu, Chen-Jiang; Wang, Qing; Li, Hai; Wang, Xiao-Ning; Liu, Xi-Sheng; Shi, Hai-Bin; Zhang, Yu-Dong

    2015-10-01

    To investigate diagnostic efficiency of DWI using entire-tumor histogram analysis in differentiating the low-grade (LG) prostate cancer (PCa) from intermediate-high-grade (HG) PCa in comparison with conventional ROI-based measurement. DW images (b of 0-1400 s/mm(2)) from 126 pathology-confirmed PCa (diameter >0.5 cm) in 110 patients were retrospectively collected and processed by mono-exponential model. The measurement of tumor apparent diffusion coefficients (ADCs) was performed with using histogram-based and ROI-based approach, respectively. The diagnostic ability of ADCs from two methods for differentiating LG-PCa (Gleason score, GS ≤ 6) from HG-PCa (GS > 6) was determined by ROC regression, and compared by McNemar's test. There were 49 LG-tumor and 77 HG-tumor at pathologic findings. Histogram-based ADCs (mean, median, 10th and 90th) and ROI-based ADCs (mean) showed dominant relationships with ordinal GS of Pca (ρ = -0.225 to -0.406, p < 0.05). All above imaging indices reflected significant difference between LG-PCa and HG-PCa (all p values <0.01). Histogram 10th ADCs had dominantly high Az (0.738), Youden index (0.415), and positive likelihood ratio (LR+, 2.45) in stratifying tumor GS against mean, median and 90th ADCs, and ROI-based ADCs. Histogram mean, median, and 10th ADCs showed higher specificity (65.3%-74.1% vs. 44.9%, p < 0.01), but lower sensitivity (57.1%-71.3% vs. 84.4%, p < 0.05) than ROI-based ADCs in differentiating LG-PCa from HG-PCa. DWI-associated histogram analysis had higher specificity, Az, Youden index, and LR+ for differentiation of PCa Gleason grade than ROI-based approach.

  20. Study of recognizing multiple persons' complicated hand gestures from the video sequence acquired by a moving camera

    NASA Astrophysics Data System (ADS)

    Dan, Luo; Ohya, Jun

    2010-02-01

    Recognizing hand gestures from the video sequence acquired by a dynamic camera could be a useful interface between humans and mobile robots. We develop a state based approach to extract and recognize hand gestures from moving camera images. We improved Human-Following Local Coordinate (HFLC) System, a very simple and stable method for extracting hand motion trajectories, which is obtained from the located human face, body part and hand blob changing factor. Condensation algorithm and PCA-based algorithm was performed to recognize extracted hand trajectories. In last research, this Condensation Algorithm based method only applied for one person's hand gestures. In this paper, we propose a principal component analysis (PCA) based approach to improve the recognition accuracy. For further improvement, temporal changes in the observed hand area changing factor are utilized as new image features to be stored in the database after being analyzed by PCA. Every hand gesture trajectory in the database is classified into either one hand gesture categories, two hand gesture categories, or temporal changes in hand blob changes. We demonstrate the effectiveness of the proposed method by conducting experiments on 45 kinds of sign language based Japanese and American Sign Language gestures obtained from 5 people. Our experimental recognition results show better performance is obtained by PCA based approach than the Condensation algorithm based method.

  1. An inter-comparison of PM10 source apportionment using PCA and PMF receptor models in three European sites.

    PubMed

    Cesari, Daniela; Amato, F; Pandolfi, M; Alastuey, A; Querol, X; Contini, D

    2016-08-01

    Source apportionment of aerosol is an important approach to investigate aerosol formation and transformation processes as well as to assess appropriate mitigation strategies and to investigate causes of non-compliance with air quality standards (Directive 2008/50/CE). Receptor models (RMs) based on chemical composition of aerosol measured at specific sites are a useful, and widely used, tool to perform source apportionment. However, an analysis of available studies in the scientific literature reveals heterogeneities in the approaches used, in terms of "working variables" such as the number of samples in the dataset and the number of chemical species used as well as in the modeling tools used. In this work, an inter-comparison of PM10 source apportionment results obtained at three European measurement sites is presented, using two receptor models: principal component analysis coupled with multi-linear regression analysis (PCA-MLRA) and positive matrix factorization (PMF). The inter-comparison focuses on source identification, quantification of source contribution to PM10, robustness of the results, and how these are influenced by the number of chemical species available in the datasets. Results show very similar component/factor profiles identified by PCA and PMF, with some discrepancies in the number of factors. The PMF model appears to be more suitable to separate secondary sulfate and secondary nitrate with respect to PCA at least in the datasets analyzed. Further, some difficulties have been observed with PCA in separating industrial and heavy oil combustion contributions. Commonly at all sites, the crustal contributions found with PCA were larger than those found with PMF, and the secondary inorganic aerosol contributions found by PCA were lower than those found by PMF. Site-dependent differences were also observed for traffic and marine contributions. The inter-comparison of source apportionment performed on complete datasets (using the full range of available chemical species) and incomplete datasets (with reduced number of chemical species) allowed to investigate the sensitivity of source apportionment (SA) results to the working variables used in the RMs. Results show that, at both sites, the profiles and the contributions of the different sources calculated with PMF are comparable within the estimated uncertainties indicating a good stability and robustness of PMF results. In contrast, PCA outputs are more sensitive to the chemical species present in the datasets. In PCA, the crustal contributions are higher in the incomplete datasets and the traffic contributions are significantly lower for incomplete datasets.

  2. Principal component analysis of TOF-SIMS spectra, images and depth profiles: an industrial perspective

    NASA Astrophysics Data System (ADS)

    Pacholski, Michaeleen L.

    2004-06-01

    Principal component analysis (PCA) has been successfully applied to time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra, images and depth profiles. Although SIMS spectral data sets can be small (in comparison to datasets typically discussed in literature from other analytical techniques such as gas or liquid chromatography), each spectrum has thousands of ions resulting in what can be a difficult comparison of samples. Analysis of industrially-derived samples means the identity of most surface species are unknown a priori and samples must be analyzed rapidly to satisfy customer demands. PCA enables rapid assessment of spectral differences (or lack there of) between samples and identification of chemically different areas on sample surfaces for images. Depth profile analysis helps define interfaces and identify low-level components in the system.

  3. Analysis and Evaluation of the Characteristic Taste Components in Portobello Mushroom.

    PubMed

    Wang, Jinbin; Li, Wen; Li, Zhengpeng; Wu, Wenhui; Tang, Xueming

    2018-05-10

    To identify the characteristic taste components of the common cultivated mushroom (brown; Portobello), Agaricus bisporus, taste components in the stipe and pileus of Portobello mushroom harvested at different growth stages were extracted and identified, and principal component analysis (PCA) and taste active value (TAV) were used to reveal the characteristic taste components during the each of the growth stages of Portobello mushroom. In the stipe and pileus, 20 and 14 different principal taste components were identified, respectively, and they were considered as the principal taste components of Portobello mushroom fruit bodies, which included most amino acids and 5'-nucleotides. Some taste components that were found at high levels, such as lactic acid and citric acid, were not detected as Portobello mushroom principal taste components through PCA. However, due to their high content, Portobello mushroom could be used as a source of organic acids. The PCA and TAV results revealed that 5'-GMP, glutamic acid, malic acid, alanine, proline, leucine, and aspartic acid were the characteristic taste components of Portobello mushroom fruit bodies. Portobello mushroom was also found to be rich in protein and amino acids, so it might also be useful in the formulation of nutraceuticals and functional food. The results in this article could provide a theoretical basis for understanding and regulating the characteristic flavor components synthesis process of Portobello mushroom. © 2018 Institute of Food Technologists®.

  4. Tailored multivariate analysis for modulated enhanced diffraction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Caliandro, Rocco; Guccione, Pietro; Nico, Giovanni

    2015-10-21

    Modulated enhanced diffraction (MED) is a technique allowing the dynamic structural characterization of crystalline materials subjected to an external stimulus, which is particularly suited forin situandoperandostructural investigations at synchrotron sources. Contributions from the (active) part of the crystal system that varies synchronously with the stimulus can be extracted by an offline analysis, which can only be applied in the case of periodic stimuli and linear system responses. In this paper a new decomposition approach based on multivariate analysis is proposed. The standard principal component analysis (PCA) is adapted to treat MED data: specific figures of merit based on their scoresmore » and loadings are found, and the directions of the principal components obtained by PCA are modified to maximize such figures of merit. As a result, a general method to decompose MED data, called optimum constrained components rotation (OCCR), is developed, which produces very precise results on simulated data, even in the case of nonperiodic stimuli and/or nonlinear responses. The multivariate analysis approach is able to supply in one shot both the diffraction pattern related to the active atoms (through the OCCR loadings) and the time dependence of the system response (through the OCCR scores). When applied to real data, OCCR was able to supply only the latter information, as the former was hindered by changes in abundances of different crystal phases, which occurred besides structural variations in the specific case considered. To develop a decomposition procedure able to cope with this combined effect represents the next challenge in MED analysis.« less

  5. [Discrimination of varieties of borneol using terahertz spectra based on principal component analysis and support vector machine].

    PubMed

    Li, Wu; Hu, Bing; Wang, Ming-wei

    2014-12-01

    In the present paper, the terahertz time-domain spectroscopy (THz-TDS) identification model of borneol based on principal component analysis (PCA) and support vector machine (SVM) was established. As one Chinese common agent, borneol needs a rapid, simple and accurate detection and identification method for its different source and being easily confused in the pharmaceutical and trade links. In order to assure the quality of borneol product and guard the consumer's right, quickly, efficiently and correctly identifying borneol has significant meaning to the production and transaction of borneol. Terahertz time-domain spectroscopy is a new spectroscopy approach to characterize material using terahertz pulse. The absorption terahertz spectra of blumea camphor, borneol camphor and synthetic borneol were measured in the range of 0.2 to 2 THz with the transmission THz-TDS. The PCA scores of 2D plots (PC1 X PC2) and 3D plots (PC1 X PC2 X PC3) of three kinds of borneol samples were obtained through PCA analysis, and both of them have good clustering effect on the 3 different kinds of borneol. The value matrix of the first 10 principal components (PCs) was used to replace the original spectrum data, and the 60 samples of the three kinds of borneol were trained and then the unknown 60 samples were identified. Four kinds of support vector machine model of different kernel functions were set up in this way. Results show that the accuracy of identification and classification of SVM RBF kernel function for three kinds of borneol is 100%, and we selected the SVM with the radial basis kernel function to establish the borneol identification model, in addition, in the noisy case, the classification accuracy rates of four SVM kernel function are above 85%, and this indicates that SVM has strong generalization ability. This study shows that PCA with SVM method of borneol terahertz spectroscopy has good classification and identification effects, and provides a new method for species identification of borneol in Chinese medicine.

  6. On a PCA-based lung motion model

    PubMed Central

    Li, Ruijiang; Lewis, John H; Jia, Xun; Zhao, Tianyu; Liu, Weifeng; Wuenschel, Sara; Lamb, James; Yang, Deshan; Low, Daniel A; Jiang, Steve B

    2014-01-01

    Respiration-induced organ motion is one of the major uncertainties in lung cancer radiotherapy and is crucial to be able to accurately model the lung motion. Most work so far has focused on the study of the motion of a single point (usually the tumor center of mass), and much less work has been done to model the motion of the entire lung. Inspired by the work of Zhang et al (2007 Med. Phys. 34 4772–81), we believe that the spatiotemporal relationship of the entire lung motion can be accurately modeled based on principle component analysis (PCA) and then a sparse subset of the entire lung, such as an implanted marker, can be used to drive the motion of the entire lung (including the tumor). The goal of this work is twofold. First, we aim to understand the underlying reason why PCA is effective for modeling lung motion and find the optimal number of PCA coefficients for accurate lung motion modeling. We attempt to address the above important problems both in a theoretical framework and in the context of real clinical data. Second, we propose a new method to derive the entire lung motion using a single internal marker based on the PCA model. The main results of this work are as follows. We derived an important property which reveals the implicit regularization imposed by the PCA model. We then studied the model using two mathematical respiratory phantoms and 11 clinical 4DCT scans for eight lung cancer patients. For the mathematical phantoms with cosine and an even power (2n) of cosine motion, we proved that 2 and 2n PCA coefficients and eigenvectors will completely represent the lung motion, respectively. Moreover, for the cosine phantom, we derived the equivalence conditions for the PCA motion model and the physiological 5D lung motion model (Low et al 2005 Int. J. Radiat. Oncol. Biol. Phys. 63 921–9). For the clinical 4DCT data, we demonstrated the modeling power and generalization performance of the PCA model. The average 3D modeling error using PCA was within 1 mm (0.7 ± 0.1 mm). When a single artificial internal marker was used to derive the lung motion, the average 3D error was found to be within 2 mm (1.8 ± 0.3 mm) through comprehensive statistical analysis. The optimal number of PCA coefficients needs to be determined on a patient-by-patient basis and two PCA coefficients seem to be sufficient for accurate modeling of the lung motion for most patients. In conclusion, we have presented thorough theoretical analysis and clinical validation of the PCA lung motion model. The feasibility of deriving the entire lung motion using a single marker has also been demonstrated on clinical data using a simulation approach. PMID:21865624

  7. A fuzzy-based data transformation for feature extraction to increase classification performance with small medical data sets.

    PubMed

    Li, Der-Chiang; Liu, Chiao-Wen; Hu, Susan C

    2011-05-01

    Medical data sets are usually small and have very high dimensionality. Too many attributes will make the analysis less efficient and will not necessarily increase accuracy, while too few data will decrease the modeling stability. Consequently, the main objective of this study is to extract the optimal subset of features to increase analytical performance when the data set is small. This paper proposes a fuzzy-based non-linear transformation method to extend classification related information from the original data attribute values for a small data set. Based on the new transformed data set, this study applies principal component analysis (PCA) to extract the optimal subset of features. Finally, we use the transformed data with these optimal features as the input data for a learning tool, a support vector machine (SVM). Six medical data sets: Pima Indians' diabetes, Wisconsin diagnostic breast cancer, Parkinson disease, echocardiogram, BUPA liver disorders dataset, and bladder cancer cases in Taiwan, are employed to illustrate the approach presented in this paper. This research uses the t-test to evaluate the classification accuracy for a single data set; and uses the Friedman test to show the proposed method is better than other methods over the multiple data sets. The experiment results indicate that the proposed method has better classification performance than either PCA or kernel principal component analysis (KPCA) when the data set is small, and suggest creating new purpose-related information to improve the analysis performance. This paper has shown that feature extraction is important as a function of feature selection for efficient data analysis. When the data set is small, using the fuzzy-based transformation method presented in this work to increase the information available produces better results than the PCA and KPCA approaches. Copyright © 2011 Elsevier B.V. All rights reserved.

  8. A SPECTRAL GRAPH APPROACH TO DISCOVERING GENETIC ANCESTRY1

    PubMed Central

    Lee, Ann B.; Luca, Diana; Roeder, Kathryn

    2010-01-01

    Mapping human genetic variation is fundamentally interesting in fields such as anthropology and forensic inference. At the same time, patterns of genetic diversity confound efforts to determine the genetic basis of complex disease. Due to technological advances, it is now possible to measure hundreds of thousands of genetic variants per individual across the genome. Principal component analysis (PCA) is routinely used to summarize the genetic similarity between subjects. The eigenvectors are interpreted as dimensions of ancestry. We build on this idea using a spectral graph approach. In the process we draw on connections between multidimensional scaling and spectral kernel methods. Our approach, based on a spectral embedding derived from the normalized Laplacian of a graph, can produce more meaningful delineation of ancestry than by using PCA. The method is stable to outliers and can more easily incorporate different similarity measures of genetic data than PCA. We illustrate a new algorithm for genetic clustering and association analysis on a large, genetically heterogeneous sample. PMID:20689656

  9. Principal component analysis for protein folding dynamics.

    PubMed

    Maisuradze, Gia G; Liwo, Adam; Scheraga, Harold A

    2009-01-09

    Protein folding is considered here by studying the dynamics of the folding of the triple beta-strand WW domain from the Formin-binding protein 28. Starting from the unfolded state and ending either in the native or nonnative conformational states, trajectories are generated with the coarse-grained united residue (UNRES) force field. The effectiveness of principal components analysis (PCA), an already established mathematical technique for finding global, correlated motions in atomic simulations of proteins, is evaluated here for coarse-grained trajectories. The problems related to PCA and their solutions are discussed. The folding and nonfolding of proteins are examined with free-energy landscapes. Detailed analyses of many folding and nonfolding trajectories at different temperatures show that PCA is very efficient for characterizing the general folding and nonfolding features of proteins. It is shown that the first principal component captures and describes in detail the dynamics of a system. Anomalous diffusion in the folding/nonfolding dynamics is examined by the mean-square displacement (MSD) and the fractional diffusion and fractional kinetic equations. The collisionless (or ballistic) behavior of a polypeptide undergoing Brownian motion along the first few principal components is accounted for.

  10. Accurate Structural Correlations from Maximum Likelihood Superpositions

    PubMed Central

    Theobald, Douglas L; Wuttke, Deborah S

    2008-01-01

    The cores of globular proteins are densely packed, resulting in complicated networks of structural interactions. These interactions in turn give rise to dynamic structural correlations over a wide range of time scales. Accurate analysis of these complex correlations is crucial for understanding biomolecular mechanisms and for relating structure to function. Here we report a highly accurate technique for inferring the major modes of structural correlation in macromolecules using likelihood-based statistical analysis of sets of structures. This method is generally applicable to any ensemble of related molecules, including families of nuclear magnetic resonance (NMR) models, different crystal forms of a protein, and structural alignments of homologous proteins, as well as molecular dynamics trajectories. Dominant modes of structural correlation are determined using principal components analysis (PCA) of the maximum likelihood estimate of the correlation matrix. The correlations we identify are inherently independent of the statistical uncertainty and dynamic heterogeneity associated with the structural coordinates. We additionally present an easily interpretable method (“PCA plots”) for displaying these positional correlations by color-coding them onto a macromolecular structure. Maximum likelihood PCA of structural superpositions, and the structural PCA plots that illustrate the results, will facilitate the accurate determination of dynamic structural correlations analyzed in diverse fields of structural biology. PMID:18282091

  11. Extracting spectral contrast in Landsat Thematic Mapper image data using selective principal component analysis

    USGS Publications Warehouse

    Chavez, P.S.; Kwarteng, A.Y.

    1989-01-01

    A challenge encountered with Landsat Thematic Mapper (TM) data, which includes data from size reflective spectral bands, is displaying as much information as possible in a three-image set for color compositing or digital analysis. Principal component analysis (PCA) applied to the six TM bands simultaneously is often used to address this problem. However, two problems that can be encountered using the PCA method are that information of interest might be mathematically mapped to one of the unused components and that a color composite can be difficult to interpret. "Selective' PCA can be used to minimize both of these problems. The spectral contrast among several spectral regions was mapped for a northern Arizona site using Landsat TM data. Field investigations determined that most of the spectral contrast seen in this area was due to one of the following: the amount of iron and hematite in the soils and rocks, vegetation differences, standing and running water, or the presence of gypsum, which has a higher moisture retention capability than do the surrounding soils and rocks. -from Authors

  12. Progress Towards Improved Analysis of TES X-ray Data Using Principal Component Analysis

    NASA Technical Reports Server (NTRS)

    Busch, S. E.; Adams, J. S.; Bandler, S. R.; Chervenak, J. A.; Eckart, M. E.; Finkbeiner, F. M.; Fixsen, D. J.; Kelley, R. L.; Kilbourne, C. A.; Lee, S.-J.; hide

    2015-01-01

    The traditional method of applying a digital optimal filter to measure X-ray pulses from transition-edge sensor (TES) devices does not achieve the best energy resolution when the signals have a highly non-linear response to energy, or the noise is non-stationary during the pulse. We present an implementation of a method to analyze X-ray data from TESs, which is based upon principal component analysis (PCA). Our method separates the X-ray signal pulse into orthogonal components that have the largest variance. We typically recover pulse height, arrival time, differences in pulse shape, and the variation of pulse height with detector temperature. These components can then be combined to form a representation of pulse energy. An added value of this method is that by reporting information on more descriptive parameters (as opposed to a single number representing energy), we generate a much more complete picture of the pulse received. Here we report on progress in developing this technique for future implementation on X-ray telescopes. We used an 55Fe source to characterize Mo/Au TESs. On the same dataset, the PCA method recovers a spectral resolution that is better by a factor of two than achievable with digital optimal filters.

  13. Non-linear principal component analysis applied to Lorenz models and to North Atlantic SLP

    NASA Astrophysics Data System (ADS)

    Russo, A.; Trigo, R. M.

    2003-04-01

    A non-linear generalisation of Principal Component Analysis (PCA), denoted Non-Linear Principal Component Analysis (NLPCA), is introduced and applied to the analysis of three data sets. Non-Linear Principal Component Analysis allows for the detection and characterisation of low-dimensional non-linear structure in multivariate data sets. This method is implemented using a 5-layer feed-forward neural network introduced originally in the chemical engineering literature (Kramer, 1991). The method is described and details of its implementation are addressed. Non-Linear Principal Component Analysis is first applied to a data set sampled from the Lorenz attractor (1963). It is found that the NLPCA approximations are more representative of the data than are the corresponding PCA approximations. The same methodology was applied to the less known Lorenz attractor (1984). However, the results obtained weren't as good as those attained with the famous 'Butterfly' attractor. Further work with this model is underway in order to assess if NLPCA techniques can be more representative of the data characteristics than are the corresponding PCA approximations. The application of NLPCA to relatively 'simple' dynamical systems, such as those proposed by Lorenz, is well understood. However, the application of NLPCA to a large climatic data set is much more challenging. Here, we have applied NLPCA to the sea level pressure (SLP) field for the entire North Atlantic area and the results show a slight imcrement of explained variance associated. Finally, directions for future work are presented.%}

  14. Biochemical and nutritional components of selected honey samples.

    PubMed

    Chua, Lee Suan; Adnan, Nur Ardawati

    2014-01-01

    The purpose of this study was to investigate the relationship of biochemical (enzymes) and nutritional components in the selected honey samples from Malaysia. The relationship is important to estimate the quality of honey based on the concentration of these nutritious components. Such a study is limited for honey samples from tropical countries with heavy rainfall throughout the year. A number of six honey samples that commonly consumed by local people were collected for the study. Both the biochemical and nutritional components were analysed by using standard methods from Association of Official Analytical Chemists (AOAC). Individual monosaccharides, disaccharides and 17 amino acids in honey were determined by using liquid chromatographic method. The results showed that the peroxide activity was positively correlated with moisture content (r = 0.8264), but negatively correlated with carbohydrate content (r = 0.7755) in honey. The chromatographic sugar and free amino acid profiles showed that the honey samples could be clustered based on the type and maturity of honey. Proline explained for 64.9% of the total variance in principle component analysis (PCA). The correlation between honey components and honey quality has been established for the selected honey samples based on their biochemical and nutritional concentrations. PCA results revealed that the ratio of sucrose to maltose could be used to measure honey maturity, whereas proline was the marker compound used to distinguish honey either as floral or honeydew.

  15. Demixed principal component analysis of neural population data

    PubMed Central

    Kobak, Dmitry; Brendel, Wieland; Constantinidis, Christos; Feierstein, Claudia E; Kepecs, Adam; Mainen, Zachary F; Qi, Xue-Lian; Romo, Ranulfo; Uchida, Naoshige; Machens, Christian K

    2016-01-01

    Neurons in higher cortical areas, such as the prefrontal cortex, are often tuned to a variety of sensory and motor variables, and are therefore said to display mixed selectivity. This complexity of single neuron responses can obscure what information these areas represent and how it is represented. Here we demonstrate the advantages of a new dimensionality reduction technique, demixed principal component analysis (dPCA), that decomposes population activity into a few components. In addition to systematically capturing the majority of the variance of the data, dPCA also exposes the dependence of the neural representation on task parameters such as stimuli, decisions, or rewards. To illustrate our method we reanalyze population data from four datasets comprising different species, different cortical areas and different experimental tasks. In each case, dPCA provides a concise way of visualizing the data that summarizes the task-dependent features of the population response in a single figure. DOI: http://dx.doi.org/10.7554/eLife.10989.001 PMID:27067378

  16. Classification of fMRI resting-state maps using machine learning techniques: A comparative study

    NASA Astrophysics Data System (ADS)

    Gallos, Ioannis; Siettos, Constantinos

    2017-11-01

    We compare the efficiency of Principal Component Analysis (PCA) and nonlinear learning manifold algorithms (ISOMAP and Diffusion maps) for classifying brain maps between groups of schizophrenia patients and healthy from fMRI scans during a resting-state experiment. After a standard pre-processing pipeline, we applied spatial Independent component analysis (ICA) to reduce (a) noise and (b) spatial-temporal dimensionality of fMRI maps. On the cross-correlation matrix of the ICA components, we applied PCA, ISOMAP and Diffusion Maps to find an embedded low-dimensional space. Finally, support-vector-machines (SVM) and k-NN algorithms were used to evaluate the performance of the algorithms in classifying between the two groups.

  17. New approach for rapid assessment of trophic status of Yellow Sea and East China Sea using easy-to-measure parameters

    NASA Astrophysics Data System (ADS)

    Kong, Xianyu; Liu, Yanfang; Jian, Huimin; Su, Rongguo; Yao, Qingzhen; Shi, Xiaoyong

    2017-10-01

    To realize potential cost savings in coastal monitoring programs and provide timely advice for marine management, there is an urgent need for efficient evaluation tools based on easily measured variables for the rapid and timely assessment of estuarine and offshore eutrophication. In this study, using parallel factor analysis (PARAFAC), principal component analysis (PCA), and discriminant function analysis (DFA) with the trophic index (TRIX) for reference, we developed an approach for rapidly assessing the eutrophication status of coastal waters using easy-to-measure parameters, including chromophoric dissolved organic matter (CDOM), fluorescence excitation-emission matrices, CDOM UV-Vis absorbance, and other water-quality parameters (turbidity, chlorophyll a, and dissolved oxygen). First, we decomposed CDOM excitation-emission matrices (EEMs) by PARAFAC to identify three components. Then, we applied PCA to simplify the complexity of the relationships between the water-quality parameters. Finally, we used the PCA score values as independent variables in DFA to develop a eutrophication assessment model. The developed model yielded classification accuracy rates of 97.1%, 80.5%, 90.3%, and 89.1% for good, moderate, and poor water qualities, and for the overall data sets, respectively. Our results suggest that these easy-to-measure parameters could be used to develop a simple approach for rapid in-situ assessment and monitoring of the eutrophication of estuarine and offshore areas.

  18. A study of anatomy of distal femur pertaining to total knee replacement: an analysis, conclusions and recommendations.

    PubMed

    Kumar, K; Sharma, D

    2018-04-01

    Multiple landmarks including the transepicondylar axis (TEA), posterior condylar axis (PCA) and anterior trochlear line (TL) have been used to set up the femoral component rotation, but each is faced with its own practical obstacle that limits its usage. Also a common practice is to set the femoral component rotation at 3° external rotation to PCA and valgus resection angle at 5°-7° to anatomical axis of femur. For the reason that the anatomy of each knee is different, it may not be justified to practice such a set protocol in all cases. The aim of the study was to compare the anatomical landmarks used to set up the femoral component rotation and to study the variability in the different anatomical relationships relevant to total knee replacement. The study had 52 patients (94 knees) with grade IV osteoarthritis. Full-length lower limb scanogram and 1 mm cross-sectional cuts of distal femur were taken. aTEA, sTEA, PCL, TL, CTA, PCA, TLA and valgus angles were taken for all knees. aTEA is identifiable in all cases but sTEA in only 59 knees (62.77%). Correspondingly, CTA is calculable in all knees and PCA in 62.77% cases. Mean CTA and mean PCA were 5.4° ± 1.88° SD and 0.71° ± 1.95° SD, respectively. Mean angle between aTEA and sTEA was 4.88. TL is a line difficult to draw because of high incidence of anterior osteophytes, making CTA a more reliable parameter than TLA. Mean TLA was 10.31° ± 3.52° SD. Mean valgus resection angle was 4.86° ± 2.53° SD. Gender- or side-based differences in any of these values were not statistically different. Using aTEA or sTEA can make a big difference in femoral component rotation; therefore, whether aTEA or sTEA should be used needs to be further investigated. CTA, PCA and valgus resection angle need to be individually calculated for each knee. Use of TLA is not recommended.

  19. Principle component analysis (PCA) for investigation of relationship between population dynamics of microbial pathogenesis, chemical and sensory characteristics in beef slices containing Tarragon essential oil.

    PubMed

    Alizadeh Behbahani, Behrooz; Tabatabaei Yazdi, Farideh; Shahidi, Fakhri; Mortazavi, Seyed Ali; Mohebbi, Mohebbat

    2017-04-01

    Principle component analysis (PCA) was employed to examine the effect of the exerted treatments on the beef shelf life as well as discovering the correlations between the studied responses. Considering the variability of the dimensions of the responses, correlation coefficients were applied to form the matrix and extract the eigenvalue. Antimicrobial effect was evaluated on 10 pathogenic microorganisms through the methods of hole-plate diffusion method, disk diffusion method, pour plate method, minimum inhibitory concentration and minimum bactericidal/fungicidal concentration. Antioxidant potential and total phenolic content were examined through the method of 2,2-diphenyl-1-picrylhydrazyl (DPPH) and Folin-Ciocalteu method, respectively. The components were identified through gas chromatography and gas chromatography/mass spectrometry. Barhang seed mucilage (BSM) based edible coating containing 0, 0.5, 1 and 1.5% (w/w) Tarragon (T) essential oil mix were applied on beef slices to control the growth of pathogenic microorganisms. Microbiological (total viable count, psychrotrophic count, Escherichia coli, Staphylococcus aureus and fungi), chemical (thiobarbituric acid, peroxide value and pH) and sensory characteristics (odor, color and overall acceptability) analysis measurements were made during the storage periodically. PCA was employed to examine the effect of the exerted treatments on the beef shelf life as well as discovering the correlations between the studied responses. Considering the variability of the dimensions of the responses, correlation coefficients were applied to form the matrix and extract the eigenvalue. The PCA showed that the properties of the uncoated meat samples on the 9th, 12th, 15th and 18th days of storage are continuously changing independent of the exerted treatments on the other samples. This reveals the effect of the exerted treatments on the samples. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Spectral data compression using weighted principal component analysis with consideration of human visual system and light sources

    NASA Astrophysics Data System (ADS)

    Cao, Qian; Wan, Xiaoxia; Li, Junfeng; Liu, Qiang; Liang, Jingxing; Li, Chan

    2016-10-01

    This paper proposed two weight functions based on principal component analysis (PCA) to reserve more colorimetric information in spectral data compression process. One weight function consisted of the CIE XYZ color-matching functions representing the characteristic of the human visual system, while another was made up of the CIE XYZ color-matching functions of human visual system and relative spectral power distribution of the CIE standard illuminant D65. The improvement obtained from the proposed two methods were tested to compress and reconstruct the reflectance spectra of 1600 glossy Munsell color chips and 1950 Natural Color System color chips as well as six multispectral images. The performance was evaluated by the mean values of color difference under the CIE 1931 standard colorimetric observer and the CIE standard illuminant D65 and A. The mean values of root mean square errors between the original and reconstructed spectra were also calculated. The experimental results show that the proposed two methods significantly outperform the standard PCA and another two weighted PCA in the aspects of colorimetric reconstruction accuracy with very slight degradation in spectral reconstruction accuracy. In addition, weight functions with the CIE standard illuminant D65 can improve the colorimetric reconstruction accuracy compared to weight functions without the CIE standard illuminant D65.

  1. Representation of Probability Density Functions from Orbit Determination using the Particle Filter

    NASA Technical Reports Server (NTRS)

    Mashiku, Alinda K.; Garrison, James; Carpenter, J. Russell

    2012-01-01

    Statistical orbit determination enables us to obtain estimates of the state and the statistical information of its region of uncertainty. In order to obtain an accurate representation of the probability density function (PDF) that incorporates higher order statistical information, we propose the use of nonlinear estimation methods such as the Particle Filter. The Particle Filter (PF) is capable of providing a PDF representation of the state estimates whose accuracy is dependent on the number of particles or samples used. For this method to be applicable to real case scenarios, we need a way of accurately representing the PDF in a compressed manner with little information loss. Hence we propose using the Independent Component Analysis (ICA) as a non-Gaussian dimensional reduction method that is capable of maintaining higher order statistical information obtained using the PF. Methods such as the Principal Component Analysis (PCA) are based on utilizing up to second order statistics, hence will not suffice in maintaining maximum information content. Both the PCA and the ICA are applied to two scenarios that involve a highly eccentric orbit with a lower apriori uncertainty covariance and a less eccentric orbit with a higher a priori uncertainty covariance, to illustrate the capability of the ICA in relation to the PCA.

  2. Identification of fungal phytopathogens using Fourier transform infrared-attenuated total reflection spectroscopy and advanced statistical methods

    NASA Astrophysics Data System (ADS)

    Salman, Ahmad; Lapidot, Itshak; Pomerantz, Ami; Tsror, Leah; Shufan, Elad; Moreh, Raymond; Mordechai, Shaul; Huleihel, Mahmoud

    2012-01-01

    The early diagnosis of phytopathogens is of a great importance; it could save large economical losses due to crops damaged by fungal diseases, and prevent unnecessary soil fumigation or the use of fungicides and bactericides and thus prevent considerable environmental pollution. In this study, 18 isolates of three different fungi genera were investigated; six isolates of Colletotrichum coccodes, six isolates of Verticillium dahliae and six isolates of Fusarium oxysporum. Our main goal was to differentiate these fungi samples on the level of isolates, based on their infrared absorption spectra obtained using the Fourier transform infrared-attenuated total reflection (FTIR-ATR) sampling technique. Advanced statistical and mathematical methods: principal component analysis (PCA), linear discriminant analysis (LDA), and k-means were applied to the spectra after manipulation. Our results showed significant spectral differences between the various fungi genera examined. The use of k-means enabled classification between the genera with a 94.5% accuracy, whereas the use of PCA [3 principal components (PCs)] and LDA has achieved a 99.7% success rate. However, on the level of isolates, the best differentiation results were obtained using PCA (9 PCs) and LDA for the lower wavenumber region (800-1775 cm-1), with identification success rates of 87%, 85.5%, and 94.5% for Colletotrichum, Fusarium, and Verticillium strains, respectively.

  3. E-nose based rapid prediction of early mouldy grain using probabilistic neural networks

    PubMed Central

    Ying, Xiaoguo; Liu, Wei; Hui, Guohua; Fu, Jun

    2015-01-01

    In this paper, early mouldy grain rapid prediction method using probabilistic neural network (PNN) and electronic nose (e-nose) was studied. E-nose responses to rice, red bean, and oat samples with different qualities were measured and recorded. E-nose data was analyzed using principal component analysis (PCA), back propagation (BP) network, and PNN, respectively. Results indicated that PCA and BP network could not clearly discriminate grain samples with different mouldy status and showed poor predicting accuracy. PNN showed satisfying discriminating abilities to grain samples with an accuracy of 93.75%. E-nose combined with PNN is effective for early mouldy grain prediction. PMID:25714125

  4. The pre-image problem in kernel methods.

    PubMed

    Kwok, James Tin-yau; Tsang, Ivor Wai-hung

    2004-11-01

    In this paper, we address the problem of finding the pre-image of a feature vector in the feature space induced by a kernel. This is of central importance in some kernel applications, such as on using kernel principal component analysis (PCA) for image denoising. Unlike the traditional method which relies on nonlinear optimization, our proposed method directly finds the location of the pre-image based on distance constraints in the feature space. It is noniterative, involves only linear algebra and does not suffer from numerical instability or local minimum problems. Evaluations on performing kernel PCA and kernel clustering on the USPS data set show much improved performance.

  5. Tensor decomposition-based and principal-component-analysis-based unsupervised feature extraction applied to the gene expression and methylation profiles in the brains of social insects with multiple castes.

    PubMed

    Taguchi, Y-H

    2018-05-08

    Even though coexistence of multiple phenotypes sharing the same genomic background is interesting, it remains incompletely understood. Epigenomic profiles may represent key factors, with unknown contributions to the development of multiple phenotypes, and social-insect castes are a good model for elucidation of the underlying mechanisms. Nonetheless, previous studies have failed to identify genes associated with aberrant gene expression and methylation profiles because of the lack of suitable methodology that can address this problem properly. A recently proposed principal component analysis (PCA)-based and tensor decomposition (TD)-based unsupervised feature extraction (FE) can solve this problem because these two approaches can deal with gene expression and methylation profiles even when a small number of samples is available. PCA-based and TD-based unsupervised FE methods were applied to the analysis of gene expression and methylation profiles in the brains of two social insects, Polistes canadensis and Dinoponera quadriceps. Genes associated with differential expression and methylation between castes were identified, and analysis of enrichment of Gene Ontology terms confirmed reliability of the obtained sets of genes from the biological standpoint. Biologically relevant genes, shown to be associated with significant differential gene expression and methylation between castes, were identified here for the first time. The identification of these genes may help understand the mechanisms underlying epigenetic control of development of multiple phenotypes under the same genomic conditions.

  6. Near-infrared confocal micro-Raman spectroscopy combined with PCA-LDA multivariate analysis for detection of esophageal cancer

    NASA Astrophysics Data System (ADS)

    Chen, Long; Wang, Yue; Liu, Nenrong; Lin, Duo; Weng, Cuncheng; Zhang, Jixue; Zhu, Lihuan; Chen, Weisheng; Chen, Rong; Feng, Shangyuan

    2013-06-01

    The diagnostic capability of using tissue intrinsic micro-Raman signals to obtain biochemical information from human esophageal tissue is presented in this paper. Near-infrared micro-Raman spectroscopy combined with multivariate analysis was applied for discrimination of esophageal cancer tissue from normal tissue samples. Micro-Raman spectroscopy measurements were performed on 54 esophageal cancer tissues and 55 normal tissues in the 400-1750 cm-1 range. The mean Raman spectra showed significant differences between the two groups. Tentative assignments of the Raman bands in the measured tissue spectra suggested some changes in protein structure, a decrease in the relative amount of lactose, and increases in the percentages of tryptophan, collagen and phenylalanine content in esophageal cancer tissue as compared to those of a normal subject. The diagnostic algorithms based on principal component analysis (PCA) and linear discriminate analysis (LDA) achieved a diagnostic sensitivity of 87.0% and specificity of 70.9% for separating cancer from normal esophageal tissue samples. The result demonstrated that near-infrared micro-Raman spectroscopy combined with PCA-LDA analysis could be an effective and sensitive tool for identification of esophageal cancer.

  7. SU-E-I-58: Objective Models of Breast Shape Undergoing Mammography and Tomosynthesis Using Principal Component Analysis.

    PubMed

    Feng, Ssj; Sechopoulos, I

    2012-06-01

    To develop an objective model of the shape of the compressed breast undergoing mammographic or tomosynthesis acquisition. Automated thresholding and edge detection was performed on 984 anonymized digital mammograms (492 craniocaudal (CC) view mammograms and 492 medial lateral oblique (MLO) view mammograms), to extract the edge of each breast. Principal Component Analysis (PCA) was performed on these edge vectors to identify a limited set of parameters and eigenvectors that. These parameters and eigenvectors comprise a model that can be used to describe the breast shapes present in acquired mammograms and to generate realistic models of breasts undergoing acquisition. Sample breast shapes were then generated from this model and evaluated. The mammograms in the database were previously acquired for a separate study and authorized for use in further research. The PCA successfully identified two principal components and their corresponding eigenvectors, forming the basis for the breast shape model. The simulated breast shapes generated from the model are reasonable approximations of clinically acquired mammograms. Using PCA, we have obtained models of the compressed breast undergoing mammographic or tomosynthesis acquisition based on objective analysis of a large image database. Up to now, the breast in the CC view has been approximated as a semi-circular tube, while there has been no objectively-obtained model for the MLO view breast shape. Such models can be used for various breast imaging research applications, such as x-ray scatter estimation and correction, dosimetry estimates, and computer-aided detection and diagnosis. © 2012 American Association of Physicists in Medicine.

  8. InterFace: A software package for face image warping, averaging, and principal components analysis.

    PubMed

    Kramer, Robin S S; Jenkins, Rob; Burton, A Mike

    2017-12-01

    We describe InterFace, a software package for research in face recognition. The package supports image warping, reshaping, averaging of multiple face images, and morphing between faces. It also supports principal components analysis (PCA) of face images, along with tools for exploring the "face space" produced by PCA. The package uses a simple graphical user interface, allowing users to perform these sophisticated image manipulations without any need for programming knowledge. The program is available for download in the form of an app, which requires that users also have access to the (freely available) MATLAB Runtime environment.

  9. Dimensionality and summary measures of the SF-36 v1.6: comparison of scale- and item-based approach across ECRHS II adults population.

    PubMed

    Grassi, Mario; Nucera, Andrea

    2010-01-01

    The objective of this study was twofold: 1) to confirm the hypothetical eight scales and two-component summaries of the questionnaire Short Form 36 Health Survey (SF-36), and 2) to evaluate the performance of two alternative measures to the original physical component summary (PCS) and mental component summary (MCS). We performed principal component analysis (PCA) based on 35 items, after optimal scaling via multiple correspondence analysis (MCA), and subsequently on eight scales, after standard summative scoring. Item-based summary measures were planned. Data from the European Community Respiratory Health Survey II follow-up of 8854 subjects from 25 centers were analyzed to cross-validate the original and the novel PCS and MCS. Overall, the scale- and item-based comparison indicated that the SF-36 scales and summaries meet the supposed dimensionality. However, vitality, social functioning, and general health items did not fit data optimally. The novel measures, derived a posteriori by unit-rule from an oblique (correlated) MCA/PCA solution, are simple item sums or weighted scale sums where the weights are the raw scale ranges. These item-based scores yielded consistent scale-summary results for outliers profiles, with an expected known-group differences validity. We were able to confirm the hypothesized dimensionality of eight scales and two summaries of the SF-36. The alternative scoring reaches at least the same required standards of the original scoring. In addition, it can reduce the item-scale inconsistencies without loss of predictive validity.

  10. Robust prediction of protein subcellular localization combining PCA and WSVMs.

    PubMed

    Tian, Jiang; Gu, Hong; Liu, Wenqi; Gao, Chiyang

    2011-08-01

    Automated prediction of protein subcellular localization is an important tool for genome annotation and drug discovery, and Support Vector Machines (SVMs) can effectively solve this problem in a supervised manner. However, the datasets obtained from real experiments are likely to contain outliers or noises, which can lead to poor generalization ability and classification accuracy. To explore this problem, we adopt strategies to lower the effect of outliers. First we design a method based on Weighted SVMs, different weights are assigned to different data points, so the training algorithm will learn the decision boundary according to the relative importance of the data points. Second we analyse the influence of Principal Component Analysis (PCA) on WSVM classification, propose a hybrid classifier combining merits of both PCA and WSVM. After performing dimension reduction operations on the datasets, kernel-based possibilistic c-means algorithm can generate more suitable weights for the training, as PCA transforms the data into a new coordinate system with largest variances affected greatly by the outliers. Experiments on benchmark datasets show promising results, which confirms the effectiveness of the proposed method in terms of prediction accuracy. Copyright © 2011 Elsevier Ltd. All rights reserved.

  11. Identification of regional activation by factorization of high-density surface EMG signals: A comparison of Principal Component Analysis and Non-negative Matrix factorization.

    PubMed

    Gallina, Alessio; Garland, S Jayne; Wakeling, James M

    2018-05-22

    In this study, we investigated whether principal component analysis (PCA) and non-negative matrix factorization (NMF) perform similarly for the identification of regional activation within the human vastus medialis. EMG signals from 64 locations over the VM were collected from twelve participants while performing a low-force isometric knee extension. The envelope of the EMG signal of each channel was calculated by low-pass filtering (8 Hz) the monopolar EMG signal after rectification. The data matrix was factorized using PCA and NMF, and up to 5 factors were considered for each algorithm. Association between explained variance, spatial weights and temporal scores between the two algorithms were compared using Pearson correlation. For both PCA and NMF, a single factor explained approximately 70% of the variance of the signal, while two and three factors explained just over 85% or 90%. The variance explained by PCA and NMF was highly comparable (R > 0.99). Spatial weights and temporal scores extracted with non-negative reconstruction of PCA and NMF were highly associated (all p < 0.001, mean R > 0.97). Regional VM activation can be identified using high-density surface EMG and factorization algorithms. Regional activation explains up to 30% of the variance of the signal, as identified through both PCA and NMF. Copyright © 2018 Elsevier Ltd. All rights reserved.

  12. Discrimination of selected species of pathogenic bacteria using near-infrared Raman spectroscopy and principal components analysis

    NASA Astrophysics Data System (ADS)

    de Siqueira e Oliveira, Fernanda SantAna; Giana, Hector Enrique; Silveira, Landulfo

    2012-10-01

    A method, based on Raman spectroscopy, for identification of different microorganisms involved in bacterial urinary tract infections has been proposed. Spectra were collected from different bacterial colonies (Gram-negative: Escherichia coli, Klebsiella pneumoniae, Proteus mirabilis, Pseudomonas aeruginosa and Enterobacter cloacae, and Gram-positive: Staphylococcus aureus and Enterococcus spp.), grown on culture medium (agar), using a Raman spectrometer with a fiber Raman probe (830 nm). Colonies were scraped from the agar surface and placed on an aluminum foil for Raman measurements. After preprocessing, spectra were submitted to a principal component analysis and Mahalanobis distance (PCA/MD) discrimination algorithm. We found that the mean Raman spectra of different bacterial species show similar bands, and S. aureus was well characterized by strong bands related to carotenoids. PCA/MD could discriminate Gram-positive bacteria with sensitivity and specificity of 100% and Gram-negative bacteria with sensitivity ranging from 58 to 88% and specificity ranging from 87% to 99%.

  13. WFIRST: Principal Components Analysis of H4RG-10 Near-IR Detector Data Cubes

    NASA Astrophysics Data System (ADS)

    Rauscher, Bernard

    2018-01-01

    The Wide Field Infrared Survey Telescope’s (WFIRST) Wide Field Instrument (WFI) incorporates an array of eighteen Teledyne H4RG-10 near-IR detector arrays. Because WFIRST’s science investigations require controlling systematic uncertainties to state-of-the-art levels, we conducted principal components analysis (PCA) of some H4RG-10 test data obtained in the NASA Goddard Space Flight Center Detector Characterization Laboratory (DCL). The PCA indicates that the Legendre polynomials provide a nearly orthogonal representation of up-the-ramp sampled illuminated data cubes, and suggests other representations that may provide an even more compact representation of the data in some circumstances. We hypothesize that by using orthogonal representations, such as those described here, it may be possible to control systematic errors better than has been achieved before for NASA missions. We believe that these findings are probably applicable to other H4RG, H2RG, and H1RG based systems.

  14. Design of experiments and principal component analysis as approaches for enhancing performance of gas-diffusional air-breathing bilirubin oxidase cathode

    NASA Astrophysics Data System (ADS)

    Babanova, Sofia; Artyushkova, Kateryna; Ulyanova, Yevgenia; Singhal, Sameer; Atanassov, Plamen

    2014-01-01

    Two statistical methods, design of experiments (DOE) and principal component analysis (PCA) are employed to investigate and improve performance of air-breathing gas-diffusional enzymatic electrodes. DOE is utilized as a tool for systematic organization and evaluation of various factors affecting the performance of the composite system. Based on the results from the DOE, an improved cathode is constructed. The current density generated utilizing the improved cathode (755 ± 39 μA cm-2 at 0.3 V vs. Ag/AgCl) is 2-5 times higher than the highest current density previously achieved. Three major factors contributing to the cathode performance are identified: the amount of enzyme, the volume of phosphate buffer used to immobilize the enzyme, and the thickness of the gas-diffusion layer (GDL). PCA is applied as an independent confirmation tool to support conclusions made by DOE and to visualize the contribution of factors in individual cathode configurations.

  15. Plaque Tissue Morphology-Based Stroke Risk Stratification Using Carotid Ultrasound: A Polling-Based PCA Learning Paradigm.

    PubMed

    Saba, Luca; Jain, Pankaj K; Suri, Harman S; Ikeda, Nobutaka; Araki, Tadashi; Singh, Bikesh K; Nicolaides, Andrew; Shafique, Shoaib; Gupta, Ajay; Laird, John R; Suri, Jasjit S

    2017-06-01

    Severe atherosclerosis disease in carotid arteries causes stenosis which in turn leads to stroke. Machine learning systems have been previously developed for plaque wall risk assessment using morphology-based characterization. The fundamental assumption in such systems is the extraction of the grayscale features of the plaque region. Even though these systems have the ability to perform risk stratification, they lack the ability to achieve higher performance due their inability to select and retain dominant features. This paper introduces a polling-based principal component analysis (PCA) strategy embedded in the machine learning framework to select and retain dominant features, resulting in superior performance. This leads to more stability and reliability. The automated system uses offline image data along with the ground truth labels to generate the parameters, which are then used to transform the online grayscale features to predict the risk of stroke. A set of sixteen grayscale plaque features is computed. Utilizing the cross-validation protocol (K = 10), and the PCA cutoff of 0.995, the machine learning system is able to achieve an accuracy of 98.55 and 98.83%corresponding to the carotidfar wall and near wall plaques, respectively. The corresponding reliability of the system was 94.56 and 95.63%, respectively. The automated system was validated against the manual risk assessment system and the precision of merit for same cross-validation settings and PCA cutoffs are 98.28 and 93.92%for the far and the near wall, respectively.PCA-embedded morphology-based plaque characterization shows a powerful strategy for risk assessment and can be adapted in clinical settings.

  16. Feature Extraction of Event-Related Potentials Using Wavelets: An Application to Human Performance Monitoring

    NASA Technical Reports Server (NTRS)

    Trejo, Leonard J.; Shensa, Mark J.; Remington, Roger W. (Technical Monitor)

    1998-01-01

    This report describes the development and evaluation of mathematical models for predicting human performance from discrete wavelet transforms (DWT) of event-related potentials (ERP) elicited by task-relevant stimuli. The DWT was compared to principal components analysis (PCA) for representation of ERPs in linear regression and neural network models developed to predict a composite measure of human signal detection performance. Linear regression models based on coefficients of the decimated DWT predicted signal detection performance with half as many f ree parameters as comparable models based on PCA scores. In addition, the DWT-based models were more resistant to model degradation due to over-fitting than PCA-based models. Feed-forward neural networks were trained using the backpropagation,-, algorithm to predict signal detection performance based on raw ERPs, PCA scores, or high-power coefficients of the DWT. Neural networks based on high-power DWT coefficients trained with fewer iterations, generalized to new data better, and were more resistant to overfitting than networks based on raw ERPs. Networks based on PCA scores did not generalize to new data as well as either the DWT network or the raw ERP network. The results show that wavelet expansions represent the ERP efficiently and extract behaviorally important features for use in linear regression or neural network models of human performance. The efficiency of the DWT is discussed in terms of its decorrelation and energy compaction properties. In addition, the DWT models provided evidence that a pattern of low-frequency activity (1 to 3.5 Hz) occurring at specific times and scalp locations is a reliable correlate of human signal detection performance.

  17. Feature extraction of event-related potentials using wavelets: an application to human performance monitoring

    NASA Technical Reports Server (NTRS)

    Trejo, L. J.; Shensa, M. J.

    1999-01-01

    This report describes the development and evaluation of mathematical models for predicting human performance from discrete wavelet transforms (DWT) of event-related potentials (ERP) elicited by task-relevant stimuli. The DWT was compared to principal components analysis (PCA) for representation of ERPs in linear regression and neural network models developed to predict a composite measure of human signal detection performance. Linear regression models based on coefficients of the decimated DWT predicted signal detection performance with half as many free parameters as comparable models based on PCA scores. In addition, the DWT-based models were more resistant to model degradation due to over-fitting than PCA-based models. Feed-forward neural networks were trained using the backpropagation algorithm to predict signal detection performance based on raw ERPs, PCA scores, or high-power coefficients of the DWT. Neural networks based on high-power DWT coefficients trained with fewer iterations, generalized to new data better, and were more resistant to overfitting than networks based on raw ERPs. Networks based on PCA scores did not generalize to new data as well as either the DWT network or the raw ERP network. The results show that wavelet expansions represent the ERP efficiently and extract behaviorally important features for use in linear regression or neural network models of human performance. The efficiency of the DWT is discussed in terms of its decorrelation and energy compaction properties. In addition, the DWT models provided evidence that a pattern of low-frequency activity (1 to 3.5 Hz) occurring at specific times and scalp locations is a reliable correlate of human signal detection performance. Copyright 1999 Academic Press.

  18. The use of principal component and cluster analysis to differentiate banana peel flours based on their starch and dietary fibre components.

    PubMed

    Ramli, Saifullah; Ismail, Noryati; Alkarkhi, Abbas Fadhl Mubarek; Easa, Azhar Mat

    2010-08-01

    Banana peel flour (BPF) prepared from green or ripe Cavendish and Dream banana fruits were assessed for their total starch (TS), digestible starch (DS), resistant starch (RS), total dietary fibre (TDF), soluble dietary fibre (SDF) and insoluble dietary fibre (IDF). Principal component analysis (PCA) identified that only 1 component was responsible for 93.74% of the total variance in the starch and dietary fibre components that differentiated ripe and green banana flours. Cluster analysis (CA) applied to similar data obtained two statistically significant clusters (green and ripe bananas) to indicate difference in behaviours according to the stages of ripeness based on starch and dietary fibre components. We concluded that the starch and dietary fibre components could be used to discriminate between flours prepared from peels obtained from fruits of different ripeness. The results were also suggestive of the potential of green and ripe BPF as functional ingredients in food.

  19. The Use of Principal Component and Cluster Analysis to Differentiate Banana Peel Flours Based on Their Starch and Dietary Fibre Components

    PubMed Central

    Ramli, Saifullah; Ismail, Noryati; Alkarkhi, Abbas Fadhl Mubarek; Easa, Azhar Mat

    2010-01-01

    Banana peel flour (BPF) prepared from green or ripe Cavendish and Dream banana fruits were assessed for their total starch (TS), digestible starch (DS), resistant starch (RS), total dietary fibre (TDF), soluble dietary fibre (SDF) and insoluble dietary fibre (IDF). Principal component analysis (PCA) identified that only 1 component was responsible for 93.74% of the total variance in the starch and dietary fibre components that differentiated ripe and green banana flours. Cluster analysis (CA) applied to similar data obtained two statistically significant clusters (green and ripe bananas) to indicate difference in behaviours according to the stages of ripeness based on starch and dietary fibre components. We concluded that the starch and dietary fibre components could be used to discriminate between flours prepared from peels obtained from fruits of different ripeness. The results were also suggestive of the potential of green and ripe BPF as functional ingredients in food. PMID:24575193

  20. Comparison of five Lonicera flowers by simultaneous determination of multi-components with single reference standard method and principal component analysis.

    PubMed

    Gao, Wen; Wang, Rui; Li, Dan; Liu, Ke; Chen, Jun; Li, Hui-Jun; Xu, Xiaojun; Li, Ping; Yang, Hua

    2016-01-05

    The flowers of Lonicera japonica Thunb. were extensively used to treat many diseases. As the demands for L. japonica increased, some related Lonicera plants were often confused or misused. Caffeoylquinic acids were always regarded as chemical markers in the quality control of L. japonica, but they could be found in all Lonicera species. Thus, a simple and reliable method for the evaluation of different Lonicera flowers is necessary to be established. In this work a method based on single standard to determine multi-components (SSDMC) combined with principal component analysis (PCA) for control and distinguish of Lonicera species flowers have been developed. Six components including three caffeoylquinic acids and three iridoid glycosides were assayed simultaneously using chlorogenic acid as the reference standard. The credibility and feasibility of the SSDMC method were carefully validated and the results demonstrated that there were no remarkable differences compared with external standard method. Finally, a total of fifty-one batches covering five Lonicera species were analyzed and PCA was successfully applied to distinguish the Lonicera species. This strategy simplifies the processes in the quality control of multiple-componential herbal medicine which effectively adapted for improving the quality control of those herbs belonging to closely related species. Copyright © 2015 Elsevier B.V. All rights reserved.

  1. Geochemical differentiation processes for arc magma of the Sengan volcanic cluster, Northeastern Japan, constrained from principal component analysis

    NASA Astrophysics Data System (ADS)

    Ueki, Kenta; Iwamori, Hikaru

    2017-10-01

    In this study, with a view of understanding the structure of high-dimensional geochemical data and discussing the chemical processes at work in the evolution of arc magmas, we employed principal component analysis (PCA) to evaluate the compositional variations of volcanic rocks from the Sengan volcanic cluster of the Northeastern Japan Arc. We analyzed the trace element compositions of various arc volcanic rocks, sampled from 17 different volcanoes in a volcanic cluster. The PCA results demonstrated that the first three principal components accounted for 86% of the geochemical variation in the magma of the Sengan region. Based on the relationships between the principal components and the major elements, the mass-balance relationships with respect to the contributions of minerals, the composition of plagioclase phenocrysts, geothermal gradient, and seismic velocity structure in the crust, the first, the second, and the third principal components appear to represent magma mixing, crystallizations of olivine/pyroxene, and crystallizations of plagioclase, respectively. These represented 59%, 20%, and 6%, respectively, of the variance in the entire compositional range, indicating that magma mixing accounted for the largest variance in the geochemical variation of the arc magma. Our result indicated that crustal processes dominate the geochemical variation of magma in the Sengan volcanic cluster.

  2. Principal component analysis of socioeconomic factors and their association with malaria and arbovirus risk in Tanzania: a sensitivity analysis.

    PubMed

    Homenauth, Esha; Kajeguka, Debora; Kulkarni, Manisha A

    2017-11-01

    Principal component analysis (PCA) is frequently adopted for creating socioeconomic proxies in order to investigate the independent effects of wealth on disease status. The guidelines and methods for the creation of these proxies are well described and validated. The Demographic and Health Survey, World Health Survey and the Living Standards Measurement Survey are examples of large data sets that use PCA to create wealth indices particularly in low and middle-income countries (LMIC), where quantifying wealth-disease associations is problematic due to the unavailability of reliable income and expenditure data. However, the application of this method to smaller survey data sets, especially in rural LMIC settings, is less rigorously studied.In this paper, we aimed to highlight some of these issues by investigating the association of derived wealth indices using PCA on risk of vector-borne disease infection in Tanzania focusing on malaria and key arboviruses (ie, dengue and chikungunya). We demonstrated that indices consisting of subsets of socioeconomic indicators provided the least methodologically flawed representations of household wealth compared with an index that combined all socioeconomic variables. These results suggest that the choice of the socioeconomic indicators included in a wealth proxy can influence the relative position of households in the overall wealth hierarchy, and subsequently the strength of disease associations. This can, therefore, influence future resource planning activities and should be considered among investigators who use a PCA-derived wealth index based on community-level survey data to influence programme or policy decisions in rural LMIC settings. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  3. Feasibility of Rapid Multitracer PET Tumor Imaging

    NASA Astrophysics Data System (ADS)

    Kadrmas, D. J.; Rust, T. C.

    2005-10-01

    Positron emission tomography (PET) can characterize different aspects of tumor physiology using various tracers. PET scans are usually performed using only one tracer since there is no explicit signal for distinguishing multiple tracers. We tested the feasibility of rapidly imaging multiple PET tracers using dynamic imaging techniques, where the signals from each tracer are separated based upon differences in tracer half-life, kinetics, and distribution. Time-activity curve populations for FDG, acetate, ATSM, and PTSM were simulated using appropriate compartment models, and noisy dual-tracer curves were computed by shifting and adding the single-tracer curves. Single-tracer components were then estimated from dual-tracer data using two methods: principal component analysis (PCA)-based fits of single-tracer components to multitracer data, and parallel multitracer compartment models estimating single-tracer rate parameters from multitracer time-activity curves. The PCA analysis found that there is information content present for separating multitracer data, and that tracer separability depends upon tracer kinetics, injection order and timing. Multitracer compartment modeling recovered rate parameters for individual tracers with good accuracy but somewhat higher statistical uncertainty than single-tracer results when the injection delay was >10 min. These approaches to processing rapid multitracer PET data may potentially provide a new tool for characterizing multiple aspects of tumor physiology in vivo.

  4. Differentiating malignant from benign breast tumors on acoustic radiation force impulse imaging using fuzzy-based neural networks with principle component analysis

    NASA Astrophysics Data System (ADS)

    Liu, Hsiao-Chuan; Chou, Yi-Hong; Tiu, Chui-Mei; Hsieh, Chi-Wen; Liu, Brent; Shung, K. Kirk

    2017-03-01

    Many modalities have been developed as screening tools for breast cancer. A new screening method called acoustic radiation force impulse (ARFI) imaging was created for distinguishing breast lesions based on localized tissue displacement. This displacement was quantitated by virtual touch tissue imaging (VTI). However, VTIs sometimes express reverse results to intensity information in clinical observation. In the study, a fuzzy-based neural network with principle component analysis (PCA) was proposed to differentiate texture patterns of malignant breast from benign tumors. Eighty VTIs were randomly retrospected. Thirty four patients were determined as BI-RADS category 2 or 3, and the rest of them were determined as BI-RADS category 4 or 5 by two leading radiologists. Morphological method and Boolean algebra were performed as the image preprocessing to acquire region of interests (ROIs) on VTIs. Twenty four quantitative parameters deriving from first-order statistics (FOS), fractal dimension and gray level co-occurrence matrix (GLCM) were utilized to analyze the texture pattern of breast tumors on VTIs. PCA was employed to reduce the dimension of features. Fuzzy-based neural network as a classifier to differentiate malignant from benign breast tumors. Independent samples test was used to examine the significance of the difference between benign and malignant breast tumors. The area Az under the receiver operator characteristic (ROC) curve, sensitivity, specificity and accuracy were calculated to evaluate the performance of the system. Most all of texture parameters present significant difference between malignant and benign tumors with p-value of less than 0.05 except the average of fractal dimension. For all features classified by fuzzy-based neural network, the sensitivity, specificity, accuracy and Az were 95.7%, 97.1%, 95% and 0.964, respectively. However, the sensitivity, specificity, accuracy and Az can be increased to 100%, 97.1%, 98.8% and 0.985, respectively if PCA was performed to reduce the dimension of features. Patterns of breast tumors on VTIs can effectively be recognized by quantitative texture parameters, and differentiated malignant from benign lesions by fuzzy-based neural network with PCA.

  5. Using multi-scale entropy and principal component analysis to monitor gears degradation via the motor current signature analysis

    NASA Astrophysics Data System (ADS)

    Aouabdi, Salim; Taibi, Mahmoud; Bouras, Slimane; Boutasseta, Nadir

    2017-06-01

    This paper describes an approach for identifying localized gear tooth defects, such as pitting, using phase currents measured from an induction machine driving the gearbox. A new tool of anomaly detection based on multi-scale entropy (MSE) algorithm SampEn which allows correlations in signals to be identified over multiple time scales. The motor current signature analysis (MCSA) in conjunction with principal component analysis (PCA) and the comparison of observed values with those predicted from a model built using nominally healthy data. The Simulation results show that the proposed method is able to detect gear tooth pitting in current signals.

  6. Comparison of receptor models for source apportionment of the PM10 in Zaragoza (Spain).

    PubMed

    Callén, M S; de la Cruz, M T; López, J M; Navarro, M V; Mastral, A M

    2009-08-01

    Receptor models are useful to understand the chemical and physical characteristics of air pollutants by identifying their sources and by estimating contributions of each source to receptor concentrations. In this work, three receptor models based on principal component analysis with absolute principal component scores (PCA-APCS), Unmix and positive matrix factorization (PMF) were applied to study for the first time the apportionment of the airborne particulate matter less or equal than 10microm (PM10) in Zaragoza, Spain, during 1year sampling campaign (2003-2004). The PM10 samples were characterized regarding their concentrations in inorganic components: trace elements and ions and also organic components: polycyclic aromatic hydrocarbons (PAH) not only in the solid phase but also in the gas phase. A comparison of the three receptor models was carried out in order to do a more robust characterization of the PM10. The three models predicted that the major sources of PM10 in Zaragoza were related to natural sources (60%, 75% and 47%, respectively, for PCA-APCS, Unmix and PMF) although anthropogenic sources also contributed to PM10 (28%, 25% and 39%). With regard to the anthropogenic sources, while PCA and PMF allowed high discrimination in the sources identification associated with different combustion sources such as traffic and industry, fossil fuel, biomass and fuel-oil combustion, heavy traffic and evaporative emissions, the Unmix model only allowed the identification of industry and traffic emissions, evaporative emissions and heavy-duty vehicles. The three models provided good correlations between the experimental and modelled PM10 concentrations with major precision and the closest agreement between the PMF and PCA models.

  7. Retest of a Principal Components Analysis of Two Household Environmental Risk Instruments.

    PubMed

    Oneal, Gail A; Postma, Julie; Odom-Maryon, Tamara; Butterfield, Patricia

    2016-08-01

    Household Risk Perception (HRP) and Self-Efficacy in Environmental Risk Reduction (SEERR) instruments were developed for a public health nurse-delivered intervention designed to reduce home-based, environmental health risks among rural, low-income families. The purpose of this study was to test both instruments in a second low-income population that differed geographically and economically from the original sample. Participants (N = 199) were recruited from the Women, Infants, and Children (WIC) program. Paper and pencil surveys were collected at WIC sites by research-trained student nurses. Exploratory principal components analysis (PCA) was conducted, and comparisons were made to the original PCA for the purpose of data reduction. Instruments showed satisfactory Cronbach alpha values for all components. HRP components were reduced from five to four, which explained 70% of variance. The components were labeled sensed risks, unseen risks, severity of risks, and knowledge. In contrast to the original testing, environmental tobacco smoke (ETS) items was not a separate component of the HRP. The SEERR analysis demonstrated four components explaining 71% of variance, with similar patterns of items as in the first study, including a component on ETS, but some differences in item location. Although low-income populations constituted both samples, differences in demographics and risk exposures may have played a role in component and item locations. Findings provided justification for changing or reducing items, and for tailoring the instruments to population-level risks and behaviors. Although analytic refinement will continue, both instruments advance the measurement of environmental health risk perception and self-efficacy. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.

  8. ToF-SIMS PCA analysis of Myrtus communis L.

    NASA Astrophysics Data System (ADS)

    Piras, F. M.; Dettori, M. F.; Magnani, A.

    2009-06-01

    Nowadays there is a growing interest of researchers for the application of sophisticated analytical techniques in conjunction with statistical data analysis methods to the characterization of natural products to assure their authenticity and quality, and for the possibility of direct analysis of food to obtain maximum information. In this work, time-of-flight secondary ion mass spectrometry (ToF-SIMS) in conjunction with principal components analysis (PCA) are applied to study the chemical composition and variability of Sardinian myrtle ( Myrtus communis L.) through the analysis of both berries alcoholic extracts and berries epicarp. ToF-SIMS spectra of berries epicarp show that the epicuticular waxes consist mainly of carboxylic acids with chain length ranging from C20 to C30, or identical species formed from fragmentation of long-chain esters. PCA of ToF-SIMS data from myrtle berries epicarp distinguishes two groups characterized by a different surface concentration of triacontanoic acid. Variability in antocyanins, flavonols, α-tocopherol, and myrtucommulone contents is showed by ToF-SIMS PCA analysis of myrtle berries alcoholic extracts.

  9. [Identification of green tea brand based on hyperspectra imaging technology].

    PubMed

    Zhang, Hai-Liang; Liu, Xiao-Li; Zhu, Feng-Le; He, Yong

    2014-05-01

    Hyperspectral imaging technology was developed to identify different brand famous green tea based on PCA information and image information fusion. First 512 spectral images of six brands of famous green tea in the 380 approximately 1 023 nm wavelength range were collected and principal component analysis (PCA) was performed with the goal of selecting two characteristic bands (545 and 611 nm) that could potentially be used for classification system. Then, 12 gray level co-occurrence matrix (GLCM) features (i. e., mean, covariance, homogeneity, energy, contrast, correlation, entropy, inverse gap, contrast, difference from the second-order and autocorrelation) based on the statistical moment were extracted from each characteristic band image. Finally, integration of the 12 texture features and three PCA spectral characteristics for each green tea sample were extracted as the input of LS-SVM. Experimental results showed that discriminating rate was 100% in the prediction set. The receiver operating characteristic curve (ROC) assessment methods were used to evaluate the LS-SVM classification algorithm. Overall results sufficiently demonstrate that hyperspectral imaging technology can be used to perform classification of green tea.

  10. Principal component analysis of phenolic acid spectra

    USDA-ARS?s Scientific Manuscript database

    Phenolic acids are common plant metabolites that exhibit bioactive properties and have applications in functional food and animal feed formulations. The ultraviolet (UV) and infrared (IR) spectra of four closely related phenolic acid structures were evaluated by principal component analysis (PCA) to...

  11. Discrimination of multilocus sequence typing-based Campylobacter jejuni subgroups by MALDI-TOF mass spectrometry.

    PubMed

    Zautner, Andreas Erich; Masanta, Wycliffe Omurwa; Tareen, Abdul Malik; Weig, Michael; Lugert, Raimond; Groß, Uwe; Bader, Oliver

    2013-11-07

    Campylobacter jejuni, the most common bacterial pathogen causing gastroenteritis, shows a wide genetic diversity. Previously, we demonstrated by the combination of multi locus sequence typing (MLST)-based UPGMA-clustering and analysis of 16 genetic markers that twelve different C. jejuni subgroups can be distinguished. Among these are two prominent subgroups. The first subgroup contains the majority of hyperinvasive strains and is characterized by a dimeric form of the chemotaxis-receptor Tlp7(m+c). The second has an extended amino acid metabolism and is characterized by the presence of a periplasmic asparaginase (ansB) and gamma-glutamyl-transpeptidase (ggt). Phyloproteomic principal component analysis (PCA) hierarchical clustering of MALDI-TOF based intact cell mass spectrometry (ICMS) spectra was able to group particular C. jejuni subgroups of phylogenetic related isolates in distinct clusters. Especially the aforementioned Tlp7(m+c)(+) and ansB+/ ggt+ subgroups could be discriminated by PCA. Overlay of ICMS spectra of all isolates led to the identification of characteristic biomarker ions for these specific C. jejuni subgroups. Thus, mass peak shifts can be used to identify the C. jejuni subgroup with an extended amino acid metabolism. Although the PCA hierarchical clustering of ICMS-spectra groups the tested isolates into a different order as compared to MLST-based UPGMA-clustering, the isolates of the indicator-groups form predominantly coherent clusters. These clusters reflect phenotypic aspects better than phylogenetic clustering, indicating that the genes corresponding to the biomarker ions are phylogenetically coupled to the tested marker genes. Thus, PCA clustering could be an additional tool for analyzing the relatedness of bacterial isolates.

  12. Neural network ensemble based CAD system for focal liver lesions from B-mode ultrasound.

    PubMed

    Virmani, Jitendra; Kumar, Vinod; Kalra, Naveen; Khandelwal, Niranjan

    2014-08-01

    A neural network ensemble (NNE) based computer-aided diagnostic (CAD) system to assist radiologists in differential diagnosis between focal liver lesions (FLLs), including (1) typical and atypical cases of Cyst, hemangioma (HEM) and metastatic carcinoma (MET) lesions, (2) small and large hepatocellular carcinoma (HCC) lesions, along with (3) normal (NOR) liver tissue is proposed in the present work. Expert radiologists, visualize the textural characteristics of regions inside and outside the lesions to differentiate between different FLLs, accordingly texture features computed from inside lesion regions of interest (IROIs) and texture ratio features computed from IROIs and surrounding lesion regions of interests (SROIs) are taken as input. Principal component analysis (PCA) is used for reducing the dimensionality of the feature space before classifier design. The first step of classification module consists of a five class PCA-NN based primary classifier which yields probability outputs for five liver image classes. The second step of classification module consists of ten binary PCA-NN based secondary classifiers for NOR/Cyst, NOR/HEM, NOR/HCC, NOR/MET, Cyst/HEM, Cyst/HCC, Cyst/MET, HEM/HCC, HEM/MET and HCC/MET classes. The probability outputs of five class PCA-NN based primary classifier is used to determine the first two most probable classes for a test instance, based on which it is directed to the corresponding binary PCA-NN based secondary classifier for crisp classification between two classes. By including the second step of the classification module, classification accuracy increases from 88.7 % to 95 %. The promising results obtained by the proposed system indicate its usefulness to assist radiologists in differential diagnosis of FLLs.

  13. Principal component analysis vs. self-organizing maps combined with hierarchical clustering for pattern recognition in volcano seismic spectra

    NASA Astrophysics Data System (ADS)

    Unglert, K.; Radić, V.; Jellinek, A. M.

    2016-06-01

    Variations in the spectral content of volcano seismicity related to changes in volcanic activity are commonly identified manually in spectrograms. However, long time series of monitoring data at volcano observatories require tools to facilitate automated and rapid processing. Techniques such as self-organizing maps (SOM) and principal component analysis (PCA) can help to quickly and automatically identify important patterns related to impending eruptions. For the first time, we evaluate the performance of SOM and PCA on synthetic volcano seismic spectra constructed from observations during two well-studied eruptions at Klauea Volcano, Hawai'i, that include features observed in many volcanic settings. In particular, our objective is to test which of the techniques can best retrieve a set of three spectral patterns that we used to compose a synthetic spectrogram. We find that, without a priori knowledge of the given set of patterns, neither SOM nor PCA can directly recover the spectra. We thus test hierarchical clustering, a commonly used method, to investigate whether clustering in the space of the principal components and on the SOM, respectively, can retrieve the known patterns. Our clustering method applied to the SOM fails to detect the correct number and shape of the known input spectra. In contrast, clustering of the data reconstructed by the first three PCA modes reproduces these patterns and their occurrence in time more consistently. This result suggests that PCA in combination with hierarchical clustering is a powerful practical tool for automated identification of characteristic patterns in volcano seismic spectra. Our results indicate that, in contrast to PCA, common clustering algorithms may not be ideal to group patterns on the SOM and that it is crucial to evaluate the performance of these tools on a control dataset prior to their application to real data.

  14. Distribution of a low dose compound within pharmaceutical tablet by using multivariate curve resolution on Raman hyperspectral images.

    PubMed

    Boiret, Mathieu; de Juan, Anna; Gorretta, Nathalie; Ginot, Yves-Michel; Roger, Jean-Michel

    2015-01-25

    In this work, Raman hyperspectral images and multivariate curve resolution-alternating least squares (MCR-ALS) are used to study the distribution of actives and excipients within a pharmaceutical drug product. This article is mainly focused on the distribution of a low dose constituent. Different approaches are compared, using initially filtered or non-filtered data, or using a column-wise augmented dataset before starting the MCR-ALS iterative process including appended information on the low dose component. In the studied formulation, magnesium stearate is used as a lubricant to improve powder flowability. With a theoretical concentration of 0.5% (w/w) in the drug product, the spectral variance contained in the data is weak. By using a principal component analysis (PCA) filtered dataset as a first step of the MCR-ALS approach, the lubricant information is lost in the non-explained variance and its associated distribution in the tablet cannot be highlighted. A sufficient number of components to generate the PCA noise-filtered matrix has to be used in order to keep the lubricant variability within the data set analyzed or, otherwise, work with the raw non-filtered data. Different models are built using an increasing number of components to perform the PCA reduction. It is shown that the magnesium stearate information can be extracted from a PCA model using a minimum of 20 components. In the last part, a column-wise augmented matrix, including a reference spectrum of the lubricant, is used before starting MCR-ALS process. PCA reduction is performed on the augmented matrix, so the magnesium stearate contribution is included within the MCR-ALS calculations. By using an appropriate PCA reduction, with a sufficient number of components, or by using an augmented dataset including appended information on the low dose component, the distribution of the two actives, the two main excipients and the low dose lubricant are correctly recovered. Copyright © 2014 Elsevier B.V. All rights reserved.

  15. Continuous Metabolic Syndrome Scores for Children Using Salivary Biomarkers.

    PubMed

    Shi, Ping; Goodson, J Max; Hartman, Mor-Li; Hasturk, Hatice; Yaskell, Tina; Vargas, Jorel; Cugini, Maryann; Barake, Roula; Alsmadi, Osama; Al-Mutawa, Sabiha; Ariga, Jitendra; Soparkar, Pramod; Behbehani, Jawad; Behbehani, Kazem; Welty, Francine

    2015-01-01

    Binary definitions of the metabolic syndrome based on the presence of a particular number of individual risk factors are limited, particularly in the pediatric population. To address this limitation, we aimed at constructing composite and continuous metabolic syndrome scores (cmetS) to represent an overall measure of metabolic syndrome (MetS) in a large cohort of metabolically at-risk children, focusing on the use of the usual clinical parameters (waist circumference (WC) and systolic blood pressure (SBP), supplemented with two salivary surrogate variables (glucose and high density lipoprotein cholesterol (HDLC). Two different approaches used to create the scores were evaluated in comparison. Data from 8,112 Kuwaiti children (10.00 ± 0.67 years) were used to construct two cmetS for each subject. The first cmetS (cmetS-Z) was created by summing standardized residuals of each variable regressed on age and gender; and the second cmetS (cmetS-PCA) was defined as the first principal component from gender-specific principal component analysis based on the four variables. There was a graded relationship between both scores and the number of adverse risk factors. The areas under the curve using cmetS-Z and cmetS-PCA as predictors for severe metabolic syndrome (defined as the presence of ≥3 metabolic risk factors) were 0.935 and 0.912, respectively. cmetS-Z was positively associated with WC, SBP, and glucose, but inversely associated with HDLC. Except for the lack of association with glucose, cmetS-PCA was similar to cmetS-Z in boys, but had minimum loading on HDLC in girls. Analysis using quantile regression showed an inverse association of fitness level with cmetS-PCA (p = 0.001 for boys; p = 0.002 for girls), and comparison of cmetS-Z and cmetS-PCA suggested that WC and SBP were main contributory components. Significant alterations in the relationship between cmetS and salivary adipocytokines were demonstrated in overweight and obese children as compared to underweight and normal-weight children. We have derived continuous summary scores for MetS from a large-scale pediatric study using two different approaches, incorporating salivary measures as surrogate for plasma measures. The derived scores were viable expressions of metabolic risk, and can be utilized to study the relationships of MetS with various aspects of the metabolic disease process.

  16. Lithological mapping of Kanjamalai hill using hyperspectral remote sensing tools in Salem district, Tamil Nadu, India

    NASA Astrophysics Data System (ADS)

    Arulbalaji, Palanisamy; Balasubramanian, Gurugnanam

    2017-07-01

    This study uses advanced spaceborne thermal emission and reflection radiometer (ASTER) hyperspectral remote sensing techniques to discriminate rock types composing Kanjamalai hill located in the Salem district of Tamil Nadu, India. Kanjamalai hill is of particular interest because it contains economically viable iron ore deposits. ASTER hyperspectral data were subjected to principal component analysis (PCA), independent component analysis (ICA), and minimum noise fraction (MNF) to improve identification of lithologies remotely and to compare these digital data results with published geologic maps. Hyperspectral remote sensing analysis indicates that PCA (R∶G∶B=2∶1∶3), MNF (R∶G∶B=3∶2∶1), and ICA (R∶G∶B=1∶3∶2) provide the best band combination for effective discrimination of lithological rock types composing Kanjamalai hill. The remote sensing-derived lithological map compares favorably with a published geological map from Geological Survey of India and has been verified with ground truth field investigations. Therefore, ASTER data-based lithological mapping provides fast, cost-effective, and accurate geologic data useful for lithological discrimination and identification of ore deposits.

  17. Classification using NMR-based metabolomics of Sophora flavescens grown in Japan and China.

    PubMed

    Suzuki, Ryuichiro; Ikeda, Yuriko; Yamamoto, Akari; Saima, Toyoe; Fujita, Tatsuya; Fukuda, Tatsuo; Fukuda, Eriko; Baba, Masaki; Okada, Yoshihito; Shirataki, Yoshiaki

    2012-11-01

    We demonstrate that NMR-based metabolomics can be used to identify the country of growth (Japan or China) of Sophora flavescens plants. Principle Component Analysis (PCA) conducted on extracts of S. flavescens grown in China provided data distinct from that of extracts of plants grown in Japan. Loading plot analysis showed signals characteristic of Japanese S. flavescens. NMR analyses showed these signals to be due to kurarinol (1) and kushenol H (2). These compounds were confirmed by HPLC analysis to be distinctive markers for Japanese S. flavescens.

  18. PEM-PCA: a parallel expectation-maximization PCA face recognition architecture.

    PubMed

    Rujirakul, Kanokmon; So-In, Chakchai; Arnonkijpanich, Banchar

    2014-01-01

    Principal component analysis or PCA has been traditionally used as one of the feature extraction techniques in face recognition systems yielding high accuracy when requiring a small number of features. However, the covariance matrix and eigenvalue decomposition stages cause high computational complexity, especially for a large database. Thus, this research presents an alternative approach utilizing an Expectation-Maximization algorithm to reduce the determinant matrix manipulation resulting in the reduction of the stages' complexity. To improve the computational time, a novel parallel architecture was employed to utilize the benefits of parallelization of matrix computation during feature extraction and classification stages including parallel preprocessing, and their combinations, so-called a Parallel Expectation-Maximization PCA architecture. Comparing to a traditional PCA and its derivatives, the results indicate lower complexity with an insignificant difference in recognition precision leading to high speed face recognition systems, that is, the speed-up over nine and three times over PCA and Parallel PCA.

  19. Strategies for reducing large fMRI data sets for independent component analysis.

    PubMed

    Wang, Ze; Wang, Jiongjiong; Calhoun, Vince; Rao, Hengyi; Detre, John A; Childress, Anna R

    2006-06-01

    In independent component analysis (ICA), principal component analysis (PCA) is generally used to reduce the raw data to a few principal components (PCs) through eigenvector decomposition (EVD) on the data covariance matrix. Although this works for spatial ICA (sICA) on moderately sized fMRI data, it is intractable for temporal ICA (tICA), since typical fMRI data have a high spatial dimension, resulting in an unmanageable data covariance matrix. To solve this problem, two practical data reduction methods are presented in this paper. The first solution is to calculate the PCs of tICA from the PCs of sICA. This approach works well for moderately sized fMRI data; however, it is highly computationally intensive, even intractable, when the number of scans increases. The second solution proposed is to perform PCA decomposition via a cascade recursive least squared (CRLS) network, which provides a uniform data reduction solution for both sICA and tICA. Without the need to calculate the covariance matrix, CRLS extracts PCs directly from the raw data, and the PC extraction can be terminated after computing an arbitrary number of PCs without the need to estimate the whole set of PCs. Moreover, when the whole data set becomes too large to be loaded into the machine memory, CRLS-PCA can save data retrieval time by reading the data once, while the conventional PCA requires numerous data retrieval steps for both covariance matrix calculation and PC extractions. Real fMRI data were used to evaluate the PC extraction precision, computational expense, and memory usage of the presented methods.

  20. High Performance Parallel Architectures

    NASA Technical Reports Server (NTRS)

    El-Ghazawi, Tarek; Kaewpijit, Sinthop

    1998-01-01

    Traditional remote sensing instruments are multispectral, where observations are collected at a few different spectral bands. Recently, many hyperspectral instruments, that can collect observations at hundreds of bands, have been operational. Furthermore, there have been ongoing research efforts on ultraspectral instruments that can produce observations at thousands of spectral bands. While these remote sensing technology developments hold great promise for new findings in the area of Earth and space science, they present many challenges. These include the need for faster processing of such increased data volumes, and methods for data reduction. Dimension Reduction is a spectral transformation, aimed at concentrating the vital information and discarding redundant data. One such transformation, which is widely used in remote sensing, is the Principal Components Analysis (PCA). This report summarizes our progress on the development of a parallel PCA and its implementation on two Beowulf cluster configuration; one with fast Ethernet switch and the other with a Myrinet interconnection. Details of the implementation and performance results, for typical sets of multispectral and hyperspectral NASA remote sensing data, are presented and analyzed based on the algorithm requirements and the underlying machine configuration. It will be shown that the PCA application is quite challenging and hard to scale on Ethernet-based clusters. However, the measurements also show that a high- performance interconnection network, such as Myrinet, better matches the high communication demand of PCA and can lead to a more efficient PCA execution.

  1. An Internet of Things System for Underground Mine Air Quality Pollutant Prediction Based on Azure Machine Learning

    PubMed Central

    Jo, ByungWan

    2018-01-01

    The implementation of wireless sensor networks (WSNs) for monitoring the complex, dynamic, and harsh environment of underground coal mines (UCMs) is sought around the world to enhance safety. However, previously developed smart systems are limited to monitoring or, in a few cases, can report events. Therefore, this study introduces a reliable, efficient, and cost-effective internet of things (IoT) system for air quality monitoring with newly added features of assessment and pollutant prediction. This system is comprised of sensor modules, communication protocols, and a base station, running Azure Machine Learning (AML) Studio over it. Arduino-based sensor modules with eight different parameters were installed at separate locations of an operational UCM. Based on the sensed data, the proposed system assesses mine air quality in terms of the mine environment index (MEI). Principal component analysis (PCA) identified CH4, CO, SO2, and H2S as the most influencing gases significantly affecting mine air quality. The results of PCA were fed into the ANN model in AML studio, which enabled the prediction of MEI. An optimum number of neurons were determined for both actual input and PCA-based input parameters. The results showed a better performance of the PCA-based ANN for MEI prediction, with R2 and RMSE values of 0.6654 and 0.2104, respectively. Therefore, the proposed Arduino and AML-based system enhances mine environmental safety by quickly assessing and predicting mine air quality. PMID:29561777

  2. An Internet of Things System for Underground Mine Air Quality Pollutant Prediction Based on Azure Machine Learning.

    PubMed

    Jo, ByungWan; Khan, Rana Muhammad Asad

    2018-03-21

    The implementation of wireless sensor networks (WSNs) for monitoring the complex, dynamic, and harsh environment of underground coal mines (UCMs) is sought around the world to enhance safety. However, previously developed smart systems are limited to monitoring or, in a few cases, can report events. Therefore, this study introduces a reliable, efficient, and cost-effective internet of things (IoT) system for air quality monitoring with newly added features of assessment and pollutant prediction. This system is comprised of sensor modules, communication protocols, and a base station, running Azure Machine Learning (AML) Studio over it. Arduino-based sensor modules with eight different parameters were installed at separate locations of an operational UCM. Based on the sensed data, the proposed system assesses mine air quality in terms of the mine environment index (MEI). Principal component analysis (PCA) identified CH₄, CO, SO₂, and H₂S as the most influencing gases significantly affecting mine air quality. The results of PCA were fed into the ANN model in AML studio, which enabled the prediction of MEI. An optimum number of neurons were determined for both actual input and PCA-based input parameters. The results showed a better performance of the PCA-based ANN for MEI prediction, with R ² and RMSE values of 0.6654 and 0.2104, respectively. Therefore, the proposed Arduino and AML-based system enhances mine environmental safety by quickly assessing and predicting mine air quality.

  3. Isolation–By–Distance–and–Time in a stepping–stone model

    PubMed Central

    Duforet-Frebourg, Nicolas; Slatkin, Montgomery

    2015-01-01

    With the great advances in ancient DNA extraction, genetic data are now obtained from geographically separated individuals from both present and past. However, population genetics theory about the joint effect of space and time has not been thoroughly studied. Based on the classical stepping–stone model, we develop the theory of Isolation by Distance and Time. We derive the correlation of allele frequencies between demes in the case where ancient samples are present, and investigate the impact of edge effects with forward–in–time simulations. We also derive results about coalescent times in circular and toroidal models. As one of the most common ways to investigate population structure is principal components analysis (PCA), we evaluate the impact of our theory on PCA plots. Our results demonstrate that time between samples is an important factor. Ancient samples tend to be drawn to the center of a PCA plot. PMID:26592162

  4. Evaluation of three pumpkin species: correlation with physicochemical, antioxidant properties and classification using SPME-GC-MS and E-nose methods.

    PubMed

    Zhou, Chun-Li; Mi, Li; Hu, Xue-Yan; Zhu, Bi-Hua

    2017-09-01

    To ascertain the most discriminant variables for three pumpkin species principal component analysis (PCA) was performed. Twenty-four parameters (pH, conductivity, sucrose, glucose, total soluble solids, L* , a* , b* , individual weight, edible rate, firmness, citric acid, fumaric acid, l-ascorbic acid, malic acid, PPO activity, POD activity, total flavonoids, vitamin E, total phenolics, DPPH, FRAP, β-carotene, and aroma) were considered. The studied pumpkin species were Cucurbita maxima , Cucurbita moschata , and Cucurbita pepo . Three pumpkin species were classified by PCA based on aroma, physicochemical and antioxidant properties because the sum of PC1 and PC2 were both greater than 85% (85.06 and 93.64% respectively). Results were validated by the PCA and showed that PPO activity, total flavonoid, sucrose, glucose, TSS, a* , pH, malic acid, vitamin E, DPPH, FRAP and β-carotene, and aroma are highly useful parameters to classify pumpkin species.

  5. Automatically detect and track infrared small targets with kernel Fukunaga-Koontz transform and Kalman prediction.

    PubMed

    Liu, Ruiming; Liu, Erqi; Yang, Jie; Zeng, Yong; Wang, Fanglin; Cao, Yuan

    2007-11-01

    Fukunaga-Koontz transform (FKT), stemming from principal component analysis (PCA), is used in many pattern recognition and image-processing fields. It cannot capture the higher-order statistical property of natural images, so its detection performance is not satisfying. PCA has been extended into kernel PCA in order to capture the higher-order statistics. However, thus far there have been no researchers who have definitely proposed kernel FKT (KFKT) and researched its detection performance. For accurately detecting potential small targets from infrared images, we first extend FKT into KFKT to capture the higher-order statistical properties of images. Then a framework based on Kalman prediction and KFKT, which can automatically detect and track small targets, is developed. Results of experiments show that KFKT outperforms FKT and the proposed framework is competent to automatically detect and track infrared point targets.

  6. Automatically detect and track infrared small targets with kernel Fukunaga-Koontz transform and Kalman prediction

    NASA Astrophysics Data System (ADS)

    Liu, Ruiming; Liu, Erqi; Yang, Jie; Zeng, Yong; Wang, Fanglin; Cao, Yuan

    2007-11-01

    Fukunaga-Koontz transform (FKT), stemming from principal component analysis (PCA), is used in many pattern recognition and image-processing fields. It cannot capture the higher-order statistical property of natural images, so its detection performance is not satisfying. PCA has been extended into kernel PCA in order to capture the higher-order statistics. However, thus far there have been no researchers who have definitely proposed kernel FKT (KFKT) and researched its detection performance. For accurately detecting potential small targets from infrared images, we first extend FKT into KFKT to capture the higher-order statistical properties of images. Then a framework based on Kalman prediction and KFKT, which can automatically detect and track small targets, is developed. Results of experiments show that KFKT outperforms FKT and the proposed framework is competent to automatically detect and track infrared point targets.

  7. Decoupled ARX and RBF Neural Network Modeling Using PCA and GA Optimization for Nonlinear Distributed Parameter Systems.

    PubMed

    Zhang, Ridong; Tao, Jili; Lu, Renquan; Jin, Qibing

    2018-02-01

    Modeling of distributed parameter systems is difficult because of their nonlinearity and infinite-dimensional characteristics. Based on principal component analysis (PCA), a hybrid modeling strategy that consists of a decoupled linear autoregressive exogenous (ARX) model and a nonlinear radial basis function (RBF) neural network model are proposed. The spatial-temporal output is first divided into a few dominant spatial basis functions and finite-dimensional temporal series by PCA. Then, a decoupled ARX model is designed to model the linear dynamics of the dominant modes of the time series. The nonlinear residual part is subsequently parameterized by RBFs, where genetic algorithm is utilized to optimize their hidden layer structure and the parameters. Finally, the nonlinear spatial-temporal dynamic system is obtained after the time/space reconstruction. Simulation results of a catalytic rod and a heat conduction equation demonstrate the effectiveness of the proposed strategy compared to several other methods.

  8. Discrimination of transgenic soybean seeds by terahertz spectroscopy

    NASA Astrophysics Data System (ADS)

    Liu, Wei; Liu, Changhong; Chen, Feng; Yang, Jianbo; Zheng, Lei

    2016-10-01

    Discrimination of genetically modified organisms is increasingly demanded by legislation and consumers worldwide. The feasibility of a non-destructive discrimination of glyphosate-resistant and conventional soybean seeds and their hybrid descendants was examined by terahertz time-domain spectroscopy system combined with chemometrics. Principal component analysis (PCA), least squares-support vector machines (LS-SVM) and PCA-back propagation neural network (PCA-BPNN) models with the first and second derivative and standard normal variate (SNV) transformation pre-treatments were applied to classify soybean seeds based on genotype. Results demonstrated clear differences among glyphosate-resistant, hybrid descendants and conventional non-transformed soybean seeds could easily be visualized with an excellent classification (accuracy was 88.33% in validation set) using the LS-SVM and the spectra with SNV pre-treatment. The results indicated that THz spectroscopy techniques together with chemometrics would be a promising technique to distinguish transgenic soybean seeds from non-transformed seeds with high efficiency and without any major sample preparation.

  9. Water quality analysis of the Rapur area, Andhra Pradesh, South India using multivariate techniques

    NASA Astrophysics Data System (ADS)

    Nagaraju, A.; Sreedhar, Y.; Thejaswi, A.; Sayadi, Mohammad Hossein

    2017-10-01

    The groundwater samples from Rapur area were collected from different sites to evaluate the major ion chemistry. The large number of data can lead to difficulties in the integration, interpretation, and representation of the results. Two multivariate statistical methods, hierarchical cluster analysis (HCA) and factor analysis (FA), were applied to evaluate their usefulness to classify and identify geochemical processes controlling groundwater geochemistry. Four statistically significant clusters were obtained from 30 sampling stations. This has resulted two important clusters viz., cluster 1 (pH, Si, CO3, Mg, SO4, Ca, K, HCO3, alkalinity, Na, Na + K, Cl, and hardness) and cluster 2 (EC and TDS) which are released to the study area from different sources. The application of different multivariate statistical techniques, such as principal component analysis (PCA), assists in the interpretation of complex data matrices for a better understanding of water quality of a study area. From PCA, it is clear that the first factor (factor 1), accounted for 36.2% of the total variance, was high positive loading in EC, Mg, Cl, TDS, and hardness. Based on the PCA scores, four significant cluster groups of sampling locations were detected on the basis of similarity of their water quality.

  10. Fast detection of Piscirickettsia salmonis in Salmo salar serum through MALDI-TOF-MS profiling.

    PubMed

    Olate, Verónica R; Nachtigall, Fabiane M; Santos, Leonardo S; Soto, Alex; Araya, Macarena; Oyanedel, Sandra; Díaz, Verónica; Marchant, Vanessa; Rios-Momberg, Mauricio

    2016-03-01

    Piscirickettsia salmonis is a pathogenic bacteria known as the aetiological agent of the salmonid rickettsial syndrome and causes a high mortality in farmed salmonid fishes. Detection of P. salmonis in farmed fishes is based mainly on molecular biology and immunohistochemistry techniques. These techniques are in most of the cases expensive and time consuming. In the search of new alternatives to detect the presence of P. salmonis in salmonid fishes, this work proposed the use of MALDI-TOF-MS to compare serum protein profiles from Salmo salar fish, including experimentally infected and non-infected fishes using principal component analysis (PCA). Samples were obtained from a controlled bioassay where S. salar was challenged with P. salmonis in a cohabitation model and classified according to the presence or absence of the bacteria by real time PCR analysis. MALDI spectra of the fish serum samples showed differences in its serum protein composition. These differences were corroborated with PCA analysis. The results demonstrated that the use of both MALDI-TOF-MS and PCA represents a useful tool to discriminate the fish status through the analysis of salmonid serum samples. Copyright © 2016 John Wiley & Sons, Ltd.

  11. Mini-DIAL system measurements coupled with multivariate data analysis to identify TIC and TIM simulants: preliminary absorption database analysis.

    NASA Astrophysics Data System (ADS)

    Gaudio, P.; Malizia, A.; Gelfusa, M.; Martinelli, E.; Di Natale, C.; Poggi, L. A.; Bellecci, C.

    2017-01-01

    Nowadays Toxic Industrial Components (TICs) and Toxic Industrial Materials (TIMs) are one of the most dangerous and diffuse vehicle of contamination in urban and industrial areas. The academic world together with the industrial and military one are working on innovative solutions to monitor the diffusion in atmosphere of such pollutants. In this phase the most common commercial sensors are based on “point detection” technology but it is clear that such instruments cannot satisfy the needs of the smart cities. The new challenge is developing stand-off systems to continuously monitor the atmosphere. Quantum Electronics and Plasma Physics (QEP) research group has a long experience in laser system development and has built two demonstrators based on DIAL (Differential Absorption of Light) technology could be able to identify chemical agents in atmosphere. In this work the authors will present one of those DIAL system, the miniaturized one, together with the preliminary results of an experimental campaign conducted on TICs and TIMs simulants in cell with aim of use the absorption database for the further atmospheric an analysis using the same DIAL system. The experimental results are analysed with standard multivariate data analysis technique as Principal Component Analysis (PCA) to develop a classification model aimed at identifying organic chemical compound in atmosphere. The preliminary results of absorption coefficients of some chemical compound are shown together pre PCA analysis.

  12. Investigation of cell wall composition related to stem lodging resistance in wheat (Triticum aestivum L.) by FTIR spectroscopy.

    PubMed

    Wang, Jian; Zhu, Jinmao; Huang, RuZhu; Yang, YuSheng

    2012-07-01

    We explored the rapid qualitative analysis of wheat cultivars with good lodging resistances by Fourier transform infrared resonance (FTIR) spectroscopy and multivariate statistical analysis. FTIR imaging showing that wheat stem cell walls were mainly composed of cellulose, pectin, protein, and lignin. Principal components analysis (PCA) was used to eliminate multicollinearity among multiple peak absorptions. PCA revealed the developmental internodes of wheat stems could be distributed from low to high along the load of the second principal component, which was consistent with the corresponding bands of cellulose in the FTIR spectra of the cell walls. Furthermore, four distinct stem populations could also be identified by spectral features related to their corresponding mechanical properties via PCA and cluster analysis. Histochemical staining of four types of wheat stems with various abilities to resist lodging revealed that cellulose contributed more than lignin to the ability to resist lodging. These results strongly suggested that the main cell wall component responsible for these differences was cellulose. Therefore, the combination of multivariate analysis and FTIR could rapidly screen wheat cultivars with good lodging resistance. Furthermore, the application of these methods to a much wider range of cultivars of unknown mechanical properties promises to be of interest.

  13. Evaluation of Staining-Dependent Colour Changes in Resin Composites Using Principal Component Analysis

    PubMed Central

    Manojlovic, D.; Lenhardt, L.; Milićević, B.; Antonov, M.; Miletic, V.; Dramićanin, M. D.

    2015-01-01

    Colour changes in Gradia Direct™ composite after immersion in tea, coffee, red wine, Coca-Cola, Colgate mouthwash, and distilled water were evaluated using principal component analysis (PCA) and the CIELAB colour coordinates. The reflection spectra of the composites were used as input data for the PCA. The output data (scores and loadings) provided information about the magnitude and origin of the surface reflection changes after exposure to the staining solutions. The reflection spectra of the stained samples generally exhibited lower reflection in the blue spectral range, which was manifested in the lower content of the blue shade for the samples. Both analyses demonstrated the high staining abilities of tea, coffee, and red wine, which produced total colour changes of 4.31, 6.61, and 6.22, respectively, according to the CIELAB analysis. PCA revealed subtle changes in the reflection spectra of composites immersed in Coca-Cola, demonstrating Coca-Cola’s ability to stain the composite to a small degree. PMID:26450008

  14. Evaluation of Staining-Dependent Colour Changes in Resin Composites Using Principal Component Analysis.

    PubMed

    Manojlovic, D; Lenhardt, L; Milićević, B; Antonov, M; Miletic, V; Dramićanin, M D

    2015-10-09

    Colour changes in Gradia Direct™ composite after immersion in tea, coffee, red wine, Coca-Cola, Colgate mouthwash, and distilled water were evaluated using principal component analysis (PCA) and the CIELAB colour coordinates. The reflection spectra of the composites were used as input data for the PCA. The output data (scores and loadings) provided information about the magnitude and origin of the surface reflection changes after exposure to the staining solutions. The reflection spectra of the stained samples generally exhibited lower reflection in the blue spectral range, which was manifested in the lower content of the blue shade for the samples. Both analyses demonstrated the high staining abilities of tea, coffee, and red wine, which produced total colour changes of 4.31, 6.61, and 6.22, respectively, according to the CIELAB analysis. PCA revealed subtle changes in the reflection spectra of composites immersed in Coca-Cola, demonstrating Coca-Cola's ability to stain the composite to a small degree.

  15. Principal component and normal mode analysis of proteins; a quantitative comparison using the GroEL subunit.

    PubMed

    Skjaerven, Lars; Martinez, Aurora; Reuter, Nathalie

    2011-01-01

    Principal component analysis (PCA) and normal mode analysis (NMA) have emerged as two invaluable tools for studying conformational changes in proteins. To compare these approaches for studying protein dynamics, we have used a subunit of the GroEL chaperone, whose dynamics is well characterized. We first show that both PCA on trajectories from molecular dynamics (MD) simulations and NMA reveal a general dynamical behavior in agreement with what has previously been described for GroEL. We thus compare the reproducibility of PCA on independent MD runs and subsequently investigate the influence of the length of the MD simulations. We show that there is a relatively poor one-to-one correspondence between eigenvectors obtained from two independent runs and conclude that caution should be taken when analyzing principal components individually. We also observe that increasing the simulation length does not improve the agreement with the experimental structural difference. In fact, relatively short MD simulations are sufficient for this purpose. We observe a rapid convergence of the eigenvectors (after ca. 6 ns). Although there is not always a clear one-to-one correspondence, there is a qualitatively good agreement between the movements described by the first five modes obtained with the three different approaches; PCA, all-atoms NMA, and coarse-grained NMA. It is particularly interesting to relate this to the computational cost of the three methods. The results we obtain on the GroEL subunit contribute to the generalization of robust and reproducible strategies for the study of protein dynamics, using either NMA or PCA of trajectories from MD simulations. © 2010 Wiley-Liss, Inc.

  16. Dihedral angle principal component analysis of molecular dynamics simulations.

    PubMed

    Altis, Alexandros; Nguyen, Phuong H; Hegger, Rainer; Stock, Gerhard

    2007-06-28

    It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles {phi(n)} to the metric coordinate space {x(n)=cos phi(n),y(n)=sin phi(n)} was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which N angular variables naturally lead to N eigenvalues and eigenvectors. Applying the methodology to the construction of the free energy landscape of decaalanine from a 300 ns molecular dynamics simulation, a critical comparison of the various methods is given.

  17. Dihedral angle principal component analysis of molecular dynamics simulations

    NASA Astrophysics Data System (ADS)

    Altis, Alexandros; Nguyen, Phuong H.; Hegger, Rainer; Stock, Gerhard

    2007-06-01

    It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles {φn} to the metric coordinate space {xn=cosφn,yn=sinφn} was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which N angular variables naturally lead to N eigenvalues and eigenvectors. Applying the methodology to the construction of the free energy landscape of decaalanine from a 300ns molecular dynamics simulation, a critical comparison of the various methods is given.

  18. Principal Components Analysis Studies of Martian Clouds

    NASA Astrophysics Data System (ADS)

    Klassen, D. R.; Bell, J. F., III

    2001-11-01

    We present the principal components analysis (PCA) of absolutely calibrated multi-spectral images of Mars as a function of Martian season. The PCA technique is a mathematical rotation and translation of the data from a brightness/wavelength space to a vector space of principal ``traits'' that lie along the directions of maximal variance. The first of these traits, accounting for over 90% of the data variance, is overall brightness and represented by an average Mars spectrum. Interpretation of the remaining traits, which account for the remaining ~10% of the variance, is not always the same and depends upon what other components are in the scene and thus, varies with Martian season. For example, during seasons with large amounts of water ice in the scene, the second trait correlates with the ice and anti-corrlates with temperature. We will investigate the interpretation of the second, and successive important PCA traits. Although these PCA traits are orthogonal in their own vector space, it is unlikely that any one trait represents a singular, mineralogic, spectral end-member. It is more likely that there are many spectral endmembers that vary identically to within the noise level, that the PCA technique will not be able to distinguish them. Another possibility is that similar absorption features among spectral endmembers may be tied to one PCA trait, for example ''amount of 2 \\micron\\ absorption''. We thus attempt to extract spectral endmembers by matching linear combinations of the PCA traits to USGS, JHU, and JPL spectral libraries as aquired through the JPL Aster project. The recovered spectral endmembers are then linearly combined to model the multi-spectral image set. We present here the spectral abundance maps of the water ice/frost endmember which allow us to track Martian clouds and ground frosts. This work supported in part through NASA Planetary Astronomy Grant NAG5-6776. All data gathered at the NASA Infrared Telescope Facility in collaboration with the telescope operators and with thanks to the support staff and day crew.

  19. Investigation of probabilistic principal component analysis compared to proper orthogonal decomposition methods for basis extraction and missing data estimation

    NASA Astrophysics Data System (ADS)

    Lee, Kyunghoon

    To evaluate the maximum likelihood estimates (MLEs) of probabilistic principal component analysis (PPCA) parameters such as a factor-loading, PPCA can invoke an expectation-maximization (EM) algorithm, yielding an EM algorithm for PPCA (EM-PCA). In order to examine the benefits of the EM-PCA for aerospace engineering applications, this thesis attempts to qualitatively and quantitatively scrutinize the EM-PCA alongside both POD and gappy POD using high-dimensional simulation data. In pursuing qualitative investigations, the theoretical relationship between POD and PPCA is transparent such that the factor-loading MLE of PPCA, evaluated by the EM-PCA, pertains to an orthogonal basis obtained by POD. By contrast, the analytical connection between gappy POD and the EM-PCA is nebulous because they distinctively approximate missing data due to their antithetical formulation perspectives: gappy POD solves a least-squares problem whereas the EM-PCA relies on the expectation of the observation probability model. To juxtapose both gappy POD and the EM-PCA, this research proposes a unifying least-squares perspective that embraces the two disparate algorithms within a generalized least-squares framework. As a result, the unifying perspective reveals that both methods address similar least-squares problems; however, their formulations contain dissimilar bases and norms. Furthermore, this research delves into the ramifications of the different bases and norms that will eventually characterize the traits of both methods. To this end, two hybrid algorithms of gappy POD and the EM-PCA are devised and compared to the original algorithms for a qualitative illustration of the different basis and norm effects. After all, a norm reflecting a curve-fitting method is found to more significantly affect estimation error reduction than a basis for two example test data sets: one is absent of data only at a single snapshot and the other misses data across all the snapshots. From a numerical performance aspect, the EM-PCA is computationally less efficient than POD for intact data since it suffers from slow convergence inherited from the EM algorithm. For incomplete data, this thesis quantitatively found that the number of data missing snapshots predetermines whether the EM-PCA or gappy POD outperforms the other because of the computational cost of a coefficient evaluation, resulting from a norm selection. For instance, gappy POD demands laborious computational effort in proportion to the number of data-missing snapshots as a consequence of the gappy norm. In contrast, the computational cost of the EM-PCA is invariant to the number of data-missing snapshots thanks to the L2 norm. In general, the higher the number of data-missing snapshots, the wider the gap between the computational cost of gappy POD and the EM-PCA. Based on the numerical experiments reported in this thesis, the following criterion is recommended regarding the selection between gappy POD and the EM-PCA for computational efficiency: gappy POD for an incomplete data set containing a few data-missing snapshots and the EM-PCA for an incomplete data set involving multiple data-missing snapshots. Last, the EM-PCA is applied to two aerospace applications in comparison to gappy POD as a proof of concept: one with an emphasis on basis extraction and the other with a focus on missing data reconstruction for a given incomplete data set with scattered missing data. The first application exploits the EM-PCA to efficiently construct reduced-order models of engine deck responses obtained by the numerical propulsion system simulation (NPSS), some of whose results are absent due to failed analyses caused by numerical instability. Model-prediction tests validate that engine performance metrics estimated by the reduced-order NPSS model exhibit considerably good agreement with those directly obtained by NPSS. Similarly, the second application illustrates that the EM-PCA is significantly more cost effective than gappy POD at repairing spurious PIV measurements obtained from acoustically-excited, bluff-body jet flow experiments. The EM-PCA reduces computational cost on factors 8 ˜ 19 compared to gappy POD while generating the same restoration results as those evaluated by gappy POD. All in all, through comprehensive theoretical and numerical investigation, this research establishes that the EM-PCA is an efficient alternative to gappy POD for an incomplete data set containing missing data over an entire data set. (Abstract shortened by UMI.)

  20. An Efficient Taguchi Approach for the Performance Optimization of Health, Safety, Environment and Ergonomics in Generation Companies.

    PubMed

    Azadeh, Ali; Sheikhalishahi, Mohammad

    2015-06-01

    A unique framework for performance optimization of generation companies (GENCOs) based on health, safety, environment, and ergonomics (HSEE) indicators is presented. To rank this sector of industry, the combination of data envelopment analysis (DEA), principal component analysis (PCA), and Taguchi are used for all branches of GENCOs. These methods are applied in an integrated manner to measure the performance of GENCO. The preferred model between DEA, PCA, and Taguchi is selected based on sensitivity analysis and maximum correlation between rankings. To achieve the stated objectives, noise is introduced into input data. The results show that Taguchi outperforms other methods. Moreover, a comprehensive experiment is carried out to identify the most influential factor for ranking GENCOs. The approach developed in this study could be used for continuous assessment and improvement of GENCO's performance in supplying energy with respect to HSEE factors. The results of such studies would help managers to have better understanding of weak and strong points in terms of HSEE factors.

  1. Characterization and source identification of pollutants in runoff from a mixed land use watershed using ordination analyses.

    PubMed

    Lee, Dong Hoon; Kim, Jin Hwi; Mendoza, Joseph A; Lee, Chang Hee; Kang, Joo-Hyon

    2016-05-01

    While identification of critical pollutant sources is the key initial step for cost-effective runoff management, it is challenging due to the highly uncertain nature of runoff pollution, especially during a storm event. To identify critical sources and their quantitative contributions to runoff pollution (especially focusing on phosphorous), two ordination methods were used in this study: principal component analysis (PCA) and positive matrix factorization (PMF). For the ordination analyses, we used runoff quality data for 14 storm events, including data for phosphorus, 11 heavy metal species, and eight ionic species measured at the outlets of subcatchments with different land use compositions in a mixed land use watershed. Five factors as sources of runoff pollutants were identified by PCA: agrochemicals, groundwater, native soils, domestic sewage, and urban sources (building materials and automotive activities). PMF identified similar factors to those identified by PCA, with more detailed source mechanisms for groundwater (i.e., nitrate leaching and cation exchange) and urban sources (vehicle components/motor oils/building materials and vehicle exhausts), confirming the sources identified by PCA. PMF was further used to quantify contributions of the identified sources to the water quality. Based on the results, agrochemicals and automotive activities were the two dominant and ubiquitous phosphorus sources (39-61 and 16-47 %, respectively) in the study area, regardless of land use types.

  2. Tracing and separating plasma components causing matrix effects in hydrophilic interaction chromatography-electrospray ionization mass spectrometry.

    PubMed

    Ekdahl, Anja; Johansson, Maria C; Ahnoff, Martin

    2013-04-01

    Matrix effects on electrospray ionization were investigated for plasma samples analysed by hydrophilic interaction chromatography (HILIC) in gradient elution mode, and HILIC columns of different chemistries were tested for separation of plasma components and model analytes. By combining mass spectral data with post-column infusion traces, the following components of protein-precipitated plasma were identified and found to have significant effect on ionization: urea, creatinine, phosphocholine, lysophosphocholine, sphingomyelin, sodium ion, chloride ion, choline and proline betaine. The observed effect on ionization was both matrix-component and analyte dependent. The separation of identified plasma components and model analytes on eight columns was compared, using pair-wise linear correlation analysis and principal component analysis (PCA). Large changes in selectivity could be obtained by change of column, while smaller changes were seen when the mobile phase buffer was changed from ammonium formate pH 3.0 to ammonium acetate pH 4.5. While results from PCA and linear correlation analysis were largely in accord, linear correlation analysis was judged to be more straight-forward in terms of conduction and interpretation.

  3. Age-related differences in early novelty processing: Using PCA to parse the overlapping anterior P2 and N2 components

    PubMed Central

    Daffner, Kirk R.; Alperin, Brittany R.; Mott, Katherine K.; Tusch, Erich; Holcomb, Phillip J.

    2015-01-01

    Previous work demonstrated age-associated increases in the anterior P2 and age-related decreases in the anterior N2 in response to novel stimuli. Principal component analysis (PCA) was used to determine if the inverse relationship between these components was due to their temporal and spatial overlap. PCA revealed an early anterior P2, sensitive to task relevance, and a late anterior P2, responsive to novelty, both exhibiting age-related amplitude increases. A PCA factor representing the anterior N2, sensitive to novelty, exhibited age-related amplitude decreases. The late P2 and N2 to novels inversely correlated. Larger late P2 amplitude to novels was associated with better behavioral performance. Age-related differences in the anterior P2 and N2 to novel stimuli likely represent age-associated changes in independent cognitive operations. Enhanced anterior P2 activity (indexing augmentation in motivational salience) may be a compensatory mechanism for diminished anterior N2 activity (indexing reduced ability of older adults to process ambiguous representations). PMID:25596483

  4. Evaluation of redundancy analysis to identify signatures of local adaptation.

    PubMed

    Capblancq, Thibaut; Luu, Keurcien; Blum, Michael G B; Bazin, Eric

    2018-05-26

    Ordination is a common tool in ecology that aims at representing complex biological information in a reduced space. In landscape genetics, ordination methods such as principal component analysis (PCA) have been used to detect adaptive variation based on genomic data. Taking advantage of environmental data in addition to genotype data, redundancy analysis (RDA) is another ordination approach that is useful to detect adaptive variation. This paper aims at proposing a test statistic based on RDA to search for loci under selection. We compare redundancy analysis to pcadapt, which is a nonconstrained ordination method, and to a latent factor mixed model (LFMM), which is a univariate genotype-environment association method. Individual-based simulations identify evolutionary scenarios where RDA genome scans have a greater statistical power than genome scans based on PCA. By constraining the analysis with environmental variables, RDA performs better than PCA in identifying adaptive variation when selection gradients are weakly correlated with population structure. Additionally, we show that if RDA and LFMM have a similar power to identify genetic markers associated with environmental variables, the RDA-based procedure has the advantage to identify the main selective gradients as a combination of environmental variables. To give a concrete illustration of RDA in population genomics, we apply this method to the detection of outliers and selective gradients on an SNP data set of Populus trichocarpa (Geraldes et al., 2013). The RDA-based approach identifies the main selective gradient contrasting southern and coastal populations to northern and continental populations in the northwestern American coast. This article is protected by copyright. All rights reserved. This article is protected by copyright. All rights reserved.

  5. Detection of compatibility between baclofen and excipients with aid of infrared spectroscopy and chemometry

    NASA Astrophysics Data System (ADS)

    Rojek, Barbara; Wesolowski, Marek; Suchacz, Bogdan

    2013-12-01

    In the paper infrared (IR) spectroscopy and multivariate exploration techniques: principal component analysis (PCA) and cluster analysis (CA) were applied as supportive methods for the detection of physicochemical incompatibilities between baclofen and excipients. In the course of research, the most useful rotational strategy in PCA proved to be varimax normalized, while in CA Ward's hierarchical agglomeration with Euclidean distance measure enabled to yield the most interpretable results. Chemometrical calculations confirmed the suitability of PCA and CA as the auxiliary methods for interpretation of infrared spectra in order to recognize whether compatibilities or incompatibilities between active substance and excipients occur. On the basis of IR spectra and the results of PCA and CA it was possible to demonstrate that the presence of lactose, β-cyclodextrin and meglumine in binary mixtures produce interactions with baclofen. The results were verified using differential scanning calorimetry, differential thermal analysis, thermogravimetry/differential thermogravimetry and X-ray powder diffraction analyses.

  6. Principal component analysis to assess the efficiency and mechanism for enhanced coagulation of natural algae-laden water using a novel dual coagulant system.

    PubMed

    Ou, Hua-Se; Wei, Chao-Hai; Deng, Yang; Gao, Nai-Yun; Ren, Yuan; Hu, Yun

    2014-02-01

    A novel dual coagulant system of polyaluminum chloride sulfate (PACS) and polydiallyldimethylammonium chloride (PDADMAC) was used to treat natural algae-laden water from Meiliang Gulf, Lake Taihu. PACS (Aln(OH)mCl3n-m-2k(SO4)k) has a mass ratio of 10 %, a SO4 (2-)/Al3 (+) mole ratio of 0.0664, and an OH/Al mole ratio of 2. The PDADMAC ([C8H16NCl]m) has a MW which ranges from 5 × 10(5) to 20 × 10(5) Da. The variations of contaminants in water samples during treatments were estimated in the form of principal component analysis (PCA) factor scores and conventional variables (turbidity, DOC, etc.). Parallel factor analysis determined four chromophoric dissolved organic matters (CDOM) components, and PCA identified four integrated principle factors. PCA factor 1 had significant correlations with chlorophyll-a (r=0.718), protein-like CDOM C1 (0.689), and C2 (0.756). Factor 2 correlated with UV254 (0.672), humic-like CDOM component C3 (0.716), and C4 (0.758). Factors 3 and 4 had correlations with NH3-N (0.748) and T-P (0.769), respectively. The variations of PCA factors scores revealed that PACS contributed less aluminum dissolution than PAC to obtain equivalent removal efficiency of contaminants. This might be due to the high cationic charge and pre-hydrolyzation of PACS. Compared with PACS coagulation (20 mg L(-1)), the removal of PCA factors 1, 2, and 4 increased 45, 33, and 12 %, respectively, in combined PACS-PDADMAC treatment (0.8 mg L(-1) +20 mg L(-1)). Since PAC contained more Al (0.053 g/1 g) than PACS (0.028 g/1 g), the results indicated that PACS contributed less Al dissolution into the water to obtain equivalent removal efficiency.

  7. Tailored multivariate analysis for modulated enhanced diffraction

    DOE PAGES

    Caliandro, Rocco; Guccione, Pietro; Nico, Giovanni; ...

    2015-10-21

    Modulated enhanced diffraction (MED) is a technique allowing the dynamic structural characterization of crystalline materials subjected to an external stimulus, which is particularly suited forin situandoperandostructural investigations at synchrotron sources. Contributions from the (active) part of the crystal system that varies synchronously with the stimulus can be extracted by an offline analysis, which can only be applied in the case of periodic stimuli and linear system responses. In this paper a new decomposition approach based on multivariate analysis is proposed. The standard principal component analysis (PCA) is adapted to treat MED data: specific figures of merit based on their scoresmore » and loadings are found, and the directions of the principal components obtained by PCA are modified to maximize such figures of merit. As a result, a general method to decompose MED data, called optimum constrained components rotation (OCCR), is developed, which produces very precise results on simulated data, even in the case of nonperiodic stimuli and/or nonlinear responses. Furthermore, the multivariate analysis approach is able to supply in one shot both the diffraction pattern related to the active atoms (through the OCCR loadings) and the time dependence of the system response (through the OCCR scores). Furthermore, when applied to real data, OCCR was able to supply only the latter information, as the former was hindered by changes in abundances of different crystal phases, which occurred besides structural variations in the specific case considered. In order to develop a decomposition procedure able to cope with this combined effect represents the next challenge in MED analysis.« less

  8. Two-dimensional statistical linear discriminant analysis for real-time robust vehicle-type recognition

    NASA Astrophysics Data System (ADS)

    Zafar, I.; Edirisinghe, E. A.; Acar, S.; Bez, H. E.

    2007-02-01

    Automatic vehicle Make and Model Recognition (MMR) systems provide useful performance enhancements to vehicle recognitions systems that are solely based on Automatic License Plate Recognition (ALPR) systems. Several car MMR systems have been proposed in literature. However these approaches are based on feature detection algorithms that can perform sub-optimally under adverse lighting and/or occlusion conditions. In this paper we propose a real time, appearance based, car MMR approach using Two Dimensional Linear Discriminant Analysis that is capable of addressing this limitation. We provide experimental results to analyse the proposed algorithm's robustness under varying illumination and occlusions conditions. We have shown that the best performance with the proposed 2D-LDA based car MMR approach is obtained when the eigenvectors of lower significance are ignored. For the given database of 200 car images of 25 different make-model classifications, a best accuracy of 91% was obtained with the 2D-LDA approach. We use a direct Principle Component Analysis (PCA) based approach as a benchmark to compare and contrast the performance of the proposed 2D-LDA approach to car MMR. We conclude that in general the 2D-LDA based algorithm supersedes the performance of the PCA based approach.

  9. Systems genetic analysis of multivariate response to iron deficiency in mice

    PubMed Central

    Yin, Lina; Unger, Erica L.; Jellen, Leslie C.; Earley, Christopher J.; Allen, Richard P.; Tomaszewicz, Ann; Fleet, James C.

    2012-01-01

    The aim of this study was to identify genes that influence iron regulation under varying dietary iron availability. Male and female mice from 20+ BXD recombinant inbred strains were fed iron-poor or iron-adequate diets from weaning until 4 mo of age. At death, the spleen, liver, and blood were harvested for the measurement of hemoglobin, hematocrit, total iron binding capacity, transferrin saturation, and liver, spleen and plasma iron concentration. For each measure and diet, we found large, strain-related variability. A principal-components analysis (PCA) was performed on the strain means for the seven parameters under each dietary condition for each sex, followed by quantitative trait loci (QTL) analysis on the factors. Compared with the iron-adequate diet, iron deficiency altered the factor structure of the principal components. QTL analysis, combined with PosMed (a candidate gene searching system) published gene expression data and literature citations, identified seven candidate genes, Ptprd, Mdm1, Picalm, lip1, Tcerg1, Skp2, and Frzb based on PCA factor, diet, and sex. Expression of each of these is cis-regulated, significantly correlated with the corresponding PCA factor, and previously reported to regulate iron, directly or indirectly. We propose that polymorphisms in multiple genes underlie individual differences in iron regulation, especially in response to dietary iron challenge. This research shows that iron management is a highly complex trait, influenced by multiple genes. Systems genetics analysis of iron homeostasis holds promise for developing new methods for prevention and treatment of iron deficiency anemia and related diseases. PMID:22461179

  10. Multiscale 3D Shape Analysis using Spherical Wavelets

    PubMed Central

    Nain, Delphine; Haker, Steven; Bobick, Aaron; Tannenbaum, Allen

    2013-01-01

    Shape priors attempt to represent biological variations within a population. When variations are global, Principal Component Analysis (PCA) can be used to learn major modes of variation, even from a limited training set. However, when significant local variations exist, PCA typically cannot represent such variations from a small training set. To address this issue, we present a novel algorithm that learns shape variations from data at multiple scales and locations using spherical wavelets and spectral graph partitioning. Our results show that when the training set is small, our algorithm significantly improves the approximation of shapes in a testing set over PCA, which tends to oversmooth data. PMID:16685992

  11. Multiscale 3D shape analysis using spherical wavelets.

    PubMed

    Nain, Delphine; Haker, Steven; Bobick, Aaron; Tannenbaum, Allen R

    2005-01-01

    Shape priors attempt to represent biological variations within a population. When variations are global, Principal Component Analysis (PCA) can be used to learn major modes of variation, even from a limited training set. However, when significant local variations exist, PCA typically cannot represent such variations from a small training set. To address this issue, we present a novel algorithm that learns shape variations from data at multiple scales and locations using spherical wavelets and spectral graph partitioning. Our results show that when the training set is small, our algorithm significantly improves the approximation of shapes in a testing set over PCA, which tends to oversmooth data.

  12. GO-PCA: An Unsupervised Method to Explore Gene Expression Data Using Prior Knowledge

    PubMed Central

    Wagner, Florian

    2015-01-01

    Method Genome-wide expression profiling is a widely used approach for characterizing heterogeneous populations of cells, tissues, biopsies, or other biological specimen. The exploratory analysis of such data typically relies on generic unsupervised methods, e.g. principal component analysis (PCA) or hierarchical clustering. However, generic methods fail to exploit prior knowledge about the molecular functions of genes. Here, I introduce GO-PCA, an unsupervised method that combines PCA with nonparametric GO enrichment analysis, in order to systematically search for sets of genes that are both strongly correlated and closely functionally related. These gene sets are then used to automatically generate expression signatures with functional labels, which collectively aim to provide a readily interpretable representation of biologically relevant similarities and differences. The robustness of the results obtained can be assessed by bootstrapping. Results I first applied GO-PCA to datasets containing diverse hematopoietic cell types from human and mouse, respectively. In both cases, GO-PCA generated a small number of signatures that represented the majority of lineages present, and whose labels reflected their respective biological characteristics. I then applied GO-PCA to human glioblastoma (GBM) data, and recovered signatures associated with four out of five previously defined GBM subtypes. My results demonstrate that GO-PCA is a powerful and versatile exploratory method that reduces an expression matrix containing thousands of genes to a much smaller set of interpretable signatures. In this way, GO-PCA aims to facilitate hypothesis generation, design of further analyses, and functional comparisons across datasets. PMID:26575370

  13. GO-PCA: An Unsupervised Method to Explore Gene Expression Data Using Prior Knowledge.

    PubMed

    Wagner, Florian

    2015-01-01

    Genome-wide expression profiling is a widely used approach for characterizing heterogeneous populations of cells, tissues, biopsies, or other biological specimen. The exploratory analysis of such data typically relies on generic unsupervised methods, e.g. principal component analysis (PCA) or hierarchical clustering. However, generic methods fail to exploit prior knowledge about the molecular functions of genes. Here, I introduce GO-PCA, an unsupervised method that combines PCA with nonparametric GO enrichment analysis, in order to systematically search for sets of genes that are both strongly correlated and closely functionally related. These gene sets are then used to automatically generate expression signatures with functional labels, which collectively aim to provide a readily interpretable representation of biologically relevant similarities and differences. The robustness of the results obtained can be assessed by bootstrapping. I first applied GO-PCA to datasets containing diverse hematopoietic cell types from human and mouse, respectively. In both cases, GO-PCA generated a small number of signatures that represented the majority of lineages present, and whose labels reflected their respective biological characteristics. I then applied GO-PCA to human glioblastoma (GBM) data, and recovered signatures associated with four out of five previously defined GBM subtypes. My results demonstrate that GO-PCA is a powerful and versatile exploratory method that reduces an expression matrix containing thousands of genes to a much smaller set of interpretable signatures. In this way, GO-PCA aims to facilitate hypothesis generation, design of further analyses, and functional comparisons across datasets.

  14. Chemometrics-based Approach in Analysis of Arnicae flos

    PubMed Central

    Zheleva-Dimitrova, Dimitrina Zh.; Balabanova, Vessela; Gevrenova, Reneta; Doichinova, Irini; Vitkova, Antonina

    2015-01-01

    Introduction: Arnica montana flowers have a long history as herbal medicines for external use on injuries and rheumatic complaints. Objective: To investigate Arnicae flos of cultivated accessions from Bulgaria, Poland, Germany, Finland, and Pharmacy store for phenolic derivatives and sesquiterpene lactones (STLs). Materials and Methods: Samples of Arnica from nine origins were prepared by ultrasound-assisted extraction with 80% methanol for phenolic compounds analysis. Subsequent reverse-phase high-performance liquid chromatography (HPLC) separation of the analytes was performed using gradient elution and ultraviolet detection at 280 and 310 nm (phenolic acids), and 360 nm (flavonoids). Total STLs were determined in chloroform extracts by solid-phase extraction-HPLC at 225 nm. The HPLC generated chromatographic data were analyzed using principal component analysis (PCA) and hierarchical clustering (HC). Results: The highest total amount of phenolic acids was found in the sample from Botanical Garden at Joensuu University, Finland (2.36 mg/g dw). Astragalin, isoquercitrin, and isorhamnetin 3-glucoside were the main flavonol glycosides being present up to 3.37 mg/g (astragalin). Three well-defined clusters were distinguished by PCA and HC. Cluster C1 comprised of the German and Finnish accessions characterized by the highest content of flavonols. Cluster C2 included the Bulgarian and Polish samples presenting a low content of flavonoids. Cluster C3 consisted only of one sample from a pharmacy store. Conclusion: A validated HPLC method for simultaneous determination of phenolic acids, flavonoid glycosides, and aglycones in A. montana flowers was developed. The PCA loading plot showed that quercetin, kaempferol, and isorhamnetin can be used to distinguish different Arnica accessions. SUMMARY A principal component analysis (PCA) on 13 phenolic compounds and total amount of sesquiterpene lactones in Arnicae flos collection tended to cluster the studied 9 accessions into three main groups. The profiles obtained demonstrated that the samples from Germany and Finland are characterized by greater amounts of phenolic derivatives than the Bulgarian and Polish ones. The PCA loading plot showed that quercetin, kaemferol and isorhamnetin can be used to distinguish different arnica accessions. PMID:27013791

  15. Application of EOF/PCA-based methods in the post-processing of GRACE derived water variations

    NASA Astrophysics Data System (ADS)

    Forootan, Ehsan; Kusche, Jürgen

    2010-05-01

    Two problems that users of monthly GRACE gravity field solutions face are 1) the presence of correlated noise in the Stokes coefficients that increases with harmonic degree and causes ‘striping', and 2) the fact that different physical signals are overlaid and difficult to separate from each other in the data. These problems are termed the signal-noise separation problem and the signal-signal separation problem. Methods that are based on principal component analysis and empirical orthogonal functions (PCA/EOF) have been frequently proposed to deal with these problems for GRACE. However, different strategies have been applied to different (spatial: global/regional, spectral: global/order-wise, geoid/equivalent water height) representations of the GRACE level 2 data products, leading to differing results and a general feeling that PCA/EOF-based methods are to be applied ‘with care'. In addition, it is known that conventional EOF/PCA methods force separated modes to be orthogonal, and that, on the other hand, to either EOFs or PCs an arbitrary orthogonal rotation can be applied. The aim of this paper is to provide a common theoretical framework and to study the application of PCA/EOF-based methods as a signal separation tool due to post-process GRACE data products. In order to investigate and illustrate the applicability of PCA/EOF-based methods, we have employed them on GRACE level 2 monthly solutions based on the Center for Space Research, University of Texas (CSR/UT) RL04 products and on the ITG-GRACE03 solutions from the University of Bonn, and on various representations of them. Our results show that EOF modes do reveal the dominating annual, semiannual and also long-periodic signals in the global water storage variations, but they also show how choosing different strategies changes the outcome and may lead to unexpected results.

  16. Decomposing the Apoptosis Pathway Into Biologically Interpretable Principal Components

    PubMed Central

    Wang, Min; Kornblau, Steven M; Coombes, Kevin R

    2018-01-01

    Principal component analysis (PCA) is one of the most common techniques in the analysis of biological data sets, but applying PCA raises 2 challenges. First, one must determine the number of significant principal components (PCs). Second, because each PC is a linear combination of genes, it rarely has a biological interpretation. Existing methods to determine the number of PCs are either subjective or computationally extensive. We review several methods and describe a new R package, PCDimension, that implements additional methods, the most important being an algorithm that extends and automates a graphical Bayesian method. Using simulations, we compared the methods. Our newly automated procedure is competitive with the best methods when considering both accuracy and speed and is the most accurate when the number of objects is small compared with the number of attributes. We applied the method to a proteomics data set from patients with acute myeloid leukemia. Proteins in the apoptosis pathway could be explained using 6 PCs. By clustering the proteins in PC space, we were able to replace the PCs by 6 “biological components,” 3 of which could be immediately interpreted from the current literature. We expect this approach combining PCA with clustering to be widely applicable. PMID:29881252

  17. Nonlinear multivariate and time series analysis by neural network methods

    NASA Astrophysics Data System (ADS)

    Hsieh, William W.

    2004-03-01

    Methods in multivariate statistical analysis are essential for working with large amounts of geophysical data, data from observational arrays, from satellites, or from numerical model output. In classical multivariate statistical analysis, there is a hierarchy of methods, starting with linear regression at the base, followed by principal component analysis (PCA) and finally canonical correlation analysis (CCA). A multivariate time series method, the singular spectrum analysis (SSA), has been a fruitful extension of the PCA technique. The common drawback of these classical methods is that only linear structures can be correctly extracted from the data. Since the late 1980s, neural network methods have become popular for performing nonlinear regression and classification. More recently, neural network methods have been extended to perform nonlinear PCA (NLPCA), nonlinear CCA (NLCCA), and nonlinear SSA (NLSSA). This paper presents a unified view of the NLPCA, NLCCA, and NLSSA techniques and their applications to various data sets of the atmosphere and the ocean (especially for the El Niño-Southern Oscillation and the stratospheric quasi-biennial oscillation). These data sets reveal that the linear methods are often too simplistic to describe real-world systems, with a tendency to scatter a single oscillatory phenomenon into numerous unphysical modes or higher harmonics, which can be largely alleviated in the new nonlinear paradigm.

  18. In Situ Aerosol Profile Measurements and Comparisons with SAGE 3 Aerosol Extinction and Surface Area Profiles at 68 deg North

    NASA Technical Reports Server (NTRS)

    2005-01-01

    Under funding from this proposal three in situ profile measurements of stratospheric sulfate aerosol and ozone were completed from balloon-borne platforms. The measured quantities are aerosol size resolved number concentration and ozone. The one derived product is aerosol size distribution, from which aerosol moments, such as surface area, volume, and extinction can be calculated for comparison with SAGE III measurements and SAGE III derived products, such as surface area. The analysis of these profiles and comparison with SAGE III extinction measurements and SAGE III derived surface areas are provided in Yongxiao (2005), which comprised the research thesis component of Mr. Jian Yongxiao's M.S. degree in Atmospheric Science at the University of Wyoming. In addition analysis continues on using principal component analysis (PCA) to derive aerosol surface area from the 9 wavelength extinction measurements available from SAGE III. Ths paper will present PCA components to calculate surface area from SAGE III measurements and compare these derived surface areas with those available directly from in situ size distribution measurements, as well as surface areas which would be derived from PCA and Thomason's algorithm applied to the four wavelength SAGE II extinction measurements.

  19. Application of principal component analysis in the pollution assessment with heavy metals of vegetable food chain in the old mining areas

    PubMed Central

    2012-01-01

    Background The aim of the paper is to assess by the principal components analysis (PCA) the heavy metal contamination of soil and vegetables widely used as food for people who live in areas contaminated by heavy metals (HMs) due to long-lasting mining activities. This chemometric technique allowed us to select the best model for determining the risk of HMs on the food chain as well as on people's health. Results Many PCA models were computed with different variables: heavy metals contents and some agro-chemical parameters which characterize the soil samples from contaminated and uncontaminated areas, HMs contents of different types of vegetables grown and consumed in these areas, and the complex parameter target hazard quotients (THQ). Results were discussed in terms of principal component analysis. Conclusion There were two major benefits in processing the data PCA: firstly, it helped in optimizing the number and type of data that are best in rendering the HMs contamination of the soil and vegetables. Secondly, it was valuable for selecting the vegetable species which present the highest/minimum risk of a negative impact on the food chain and human health. PMID:23234365

  20. Origin Discrimination of Osmanthus fragrans var. thunbergii Flowers using GC-MS and UPLC-PDA Combined with Multivariable Analysis Methods.

    PubMed

    Zhou, Fei; Zhao, Yajing; Peng, Jiyu; Jiang, Yirong; Li, Maiquan; Jiang, Yuan; Lu, Baiyi

    2017-07-01

    Osmanthus fragrans flowers are used as folk medicine and additives for teas, beverages and foods. The metabolites of O. fragrans flowers from different geographical origins were inconsistent in some extent. Chromatography and mass spectrometry combined with multivariable analysis methods provides an approach for discriminating the origin of O. fragrans flowers. To discriminate the Osmanthus fragrans var. thunbergii flowers from different origins with the identified metabolites. GC-MS and UPLC-PDA were conducted to analyse the metabolites in O. fragrans var. thunbergii flowers (in total 150 samples). Principal component analysis (PCA), soft independent modelling of class analogy analysis (SIMCA) and random forest (RF) analysis were applied to group the GC-MS and UPLC-PDA data. GC-MS identified 32 compounds common to all samples while UPLC-PDA/QTOF-MS identified 16 common compounds. PCA of the UPLC-PDA data generated a better clustering than PCA of the GC-MS data. Ten metabolites (six from GC-MS and four from UPLC-PDA) were selected as effective compounds for discrimination by PCA loadings. SIMCA and RF analysis were used to build classification models, and the RF model, based on the four effective compounds (caffeic acid derivative, acteoside, ligustroside and compound 15), yielded better results with the classification rate of 100% in the calibration set and 97.8% in the prediction set. GC-MS and UPLC-PDA combined with multivariable analysis methods can discriminate the origin of Osmanthus fragrans var. thunbergii flowers. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  1. Discrimination Enhancement with Transient Feature Analysis of a Graphene Chemical Sensor.

    PubMed

    Nallon, Eric C; Schnee, Vincent P; Bright, Collin J; Polcha, Michael P; Li, Qiliang

    2016-01-19

    A graphene chemical sensor is subjected to a set of structurally and chemically similar hydrocarbon compounds consisting of toluene, o-xylene, p-xylene, and mesitylene. The fractional change in resistance of the sensor upon exposure to these compounds exhibits a similar response magnitude among compounds, whereas large variation is observed within repetitions for each compound, causing a response overlap. Therefore, traditional features depending on maximum response change will cause confusion during further discrimination and classification analysis. More robust features that are less sensitive to concentration, sampling, and drift variability would provide higher quality information. In this work, we have explored the advantage of using transient-based exponential fitting coefficients to enhance the discrimination of similar compounds. The advantages of such feature analysis to discriminate each compound is evaluated using principle component analysis (PCA). In addition, machine learning-based classification algorithms were used to compare the prediction accuracies when using fitting coefficients as features. The additional features greatly enhanced the discrimination between compounds while performing PCA and also improved the prediction accuracy by 34% when using linear discrimination analysis.

  2. Detection of Iberian ham aroma by a semiconductor multisensorial system.

    PubMed

    Otero, Laura; Horrillo, M A Carmen; García, María; Sayago, Isabel; Aleixandre, Manuel; Fernández, M A Jesús; Arés, Luis; Gutiérrez, Javier

    2003-11-01

    A semiconductor multisensorial system, based on tin oxide, to control the quality of dry-cured Iberian hams is described. Two types of ham (submitted to different drying temperatures) were selected. Good responses were obtained from the 12 elements forming the multisensor for different operating temperatures. Discrimination between the two types of ham was successfully realised through principal component analysis (PCA).

  3. Detection of Fungus Infection on Petals of Rapeseed (Brassica napus L.) Using NIR Hyperspectral Imaging

    NASA Astrophysics Data System (ADS)

    Zhao, Yan-Ru; Yu, Ke-Qiang; Li, Xiaoli; He, Yong

    2016-12-01

    Infected petals are often regarded as the source for the spread of fungi Sclerotinia sclerotiorum in all growing process of rapeseed (Brassica napus L.) plants. This research aimed to detect fungal infection of rapeseed petals by applying hyperspectral imaging in the spectral region of 874-1734 nm coupled with chemometrics. Reflectance was extracted from regions of interest (ROIs) in the hyperspectral image of each sample. Firstly, principal component analysis (PCA) was applied to conduct a cluster analysis with the first several principal components (PCs). Then, two methods including X-loadings of PCA and random frog (RF) algorithm were used and compared for optimizing wavebands selection. Least squares-support vector machine (LS-SVM) methodology was employed to establish discriminative models based on the optimal and full wavebands. Finally, area under the receiver operating characteristics curve (AUC) was utilized to evaluate classification performance of these LS-SVM models. It was found that LS-SVM based on the combination of all optimal wavebands had the best performance with AUC of 0.929. These results were promising and demonstrated the potential of applying hyperspectral imaging in fungus infection detection on rapeseed petals.

  4. Spatiotemporal deformation patterns of the Lake Urmia Causeway as characterized by multisensor InSAR analysis.

    PubMed

    Karimzadeh, Sadra; Matsuoka, Masashi; Ogushi, Fumitaka

    2018-04-03

    We present deformation patterns in the Lake Urmia Causeway (LUC) in NW Iran based on data collected from four SAR sensors in the form of interferometric synthetic aperture radar (InSAR) time series. Sixty-eight images from Envisat (2004-2008), ALOS-1 (2006-2010), TerraSAR-X (2012-2013) and Sentinel-1 (2015-2017) were acquired, and 227 filtered interferograms were generated using the small baseline subset (SBAS) technique. The rate of line-of-sight (LOS) subsidence of the LUC peaked at 90 mm/year between 2012 and 2013, mainly due to the loss of most of the water in Lake Urmia. Principal component analysis (PCA) was conducted on 200 randomly selected time series of the LUC, and the results are presented in the form of the three major components. The InSAR scores obtained from the PCA were used in a hydro-thermal model to investigate the dynamics of consolidation settlement along the LUC based on detrended water level and temperature data. The results can be used to establish a geodetic network around the LUC to identify more detailed deformation patterns and to help plan future efforts to reduce the possible costs of damage.

  5. Machine learning-based analysis of MR radiomics can help to improve the diagnostic performance of PI-RADS v2 in clinically relevant prostate cancer.

    PubMed

    Wang, Jing; Wu, Chen-Jiang; Bao, Mei-Ling; Zhang, Jing; Wang, Xiao-Ning; Zhang, Yu-Dong

    2017-10-01

    To investigate whether machine learning-based analysis of MR radiomics can help improve the performance PI-RADS v2 in clinically relevant prostate cancer (PCa). This IRB-approved study included 54 patients with PCa undergoing multi-parametric (mp) MRI before prostatectomy. Imaging analysis was performed on 54 tumours, 47 normal peripheral (PZ) and 48 normal transitional (TZ) zone based on histological-radiological correlation. Mp-MRI was scored via PI-RADS, and quantified by measuring radiomic features. Predictive model was developed using a novel support vector machine trained with: (i) radiomics, (ii) PI-RADS scores, (iii) radiomics and PI-RADS scores. Paired comparison was made via ROC analysis. For PCa versus normal TZ, the model trained with radiomics had a significantly higher area under the ROC curve (Az) (0.955 [95% CI 0.923-0.976]) than PI-RADS (Az: 0.878 [0.834-0.914], p < 0.001). The Az between them was insignificant for PCa versus PZ (0.972 [0.945-0.988] vs. 0.940 [0.905-0.965], p = 0.097). When radiomics was added, performance of PI-RADS was significantly improved for PCa versus PZ (Az: 0.983 [0.960-0.995]) and PCa versus TZ (Az: 0.968 [0.940-0.985]). Machine learning analysis of MR radiomics can help improve the performance of PI-RADS in clinically relevant PCa. • Machine-based analysis of MR radiomics outperformed in TZ cancer against PI-RADS. • Adding MR radiomics significantly improved the performance of PI-RADS. • DKI-derived Dapp and Kapp were two strong markers for the diagnosis of PCa.

  6. Performance comparisons between PCA-EA-LBG and PCA-LBG-EA approaches in VQ codebook generation for image compression

    NASA Astrophysics Data System (ADS)

    Tsai, Jinn-Tsong; Chou, Ping-Yi; Chou, Jyh-Horng

    2015-11-01

    The aim of this study is to generate vector quantisation (VQ) codebooks by integrating principle component analysis (PCA) algorithm, Linde-Buzo-Gray (LBG) algorithm, and evolutionary algorithms (EAs). The EAs include genetic algorithm (GA), particle swarm optimisation (PSO), honey bee mating optimisation (HBMO), and firefly algorithm (FF). The study is to provide performance comparisons between PCA-EA-LBG and PCA-LBG-EA approaches. The PCA-EA-LBG approaches contain PCA-GA-LBG, PCA-PSO-LBG, PCA-HBMO-LBG, and PCA-FF-LBG, while the PCA-LBG-EA approaches contain PCA-LBG, PCA-LBG-GA, PCA-LBG-PSO, PCA-LBG-HBMO, and PCA-LBG-FF. All training vectors of test images are grouped according to PCA. The PCA-EA-LBG used the vectors grouped by PCA as initial individuals, and the best solution gained by the EAs was given for LBG to discover a codebook. The PCA-LBG approach is to use the PCA to select vectors as initial individuals for LBG to find a codebook. The PCA-LBG-EA used the final result of PCA-LBG as an initial individual for EAs to find a codebook. The search schemes in PCA-EA-LBG first used global search and then applied local search skill, while in PCA-LBG-EA first used local search and then employed global search skill. The results verify that the PCA-EA-LBG indeed gain superior results compared to the PCA-LBG-EA, because the PCA-EA-LBG explores a global area to find a solution, and then exploits a better one from the local area of the solution. Furthermore the proposed PCA-EA-LBG approaches in designing VQ codebooks outperform existing approaches shown in the literature.

  7. Metabolic profiling of Angelica acutiloba roots utilizing gas chromatography-time-of-flight-mass spectrometry for quality assessment based on cultivation area and cultivar via multivariate pattern recognition.

    PubMed

    Tianniam, Sukanda; Tarachiwin, Lucksanaporn; Bamba, Takeshi; Kobayashi, Akio; Fukusaki, Eiichiro

    2008-06-01

    Gas chromatography time-of-flight mass spectrometry was applied to elucidate the profiling of primary metabolites and to evaluate the differences between quality differences in Angelica acutiloba (or Yamato-toki) roots through the utilization of multivariate pattern recognition-principal component analysis (PCA). Twenty-two metabolites consisting of sugars, amino and organic acids were identified. PCA analysis successfully discriminated the good, the moderate and the bad quality Yamato-toki roots in accordance to their cultivation areas. The results signified two reducing sugars, fructose and glucose being the most accumulated in the bad quality, whereas higher quantity of phosphoric acid, proline, malic acid and citric acid were found in the good and the moderate quality toki roots. PCA was also effective in discriminating samples derive from different cultivars. Yamato-toki roots with the moderate quality were compared by means of PCA, and the results illustrated good discrimination which was influenced most by malic acid. Overall, this study demonstrated that metabolomics technique is accurate and efficient in determining the quality differences in Yamato-toki roots, and has a potential to be a superior and suitable method to assess the quality of this medicinal plant.

  8. Sequential analysis of hydrochemical data for watershed characterization.

    PubMed

    Thyne, Geoffrey; Güler, Cüneyt; Poeter, Eileen

    2004-01-01

    A methodology for characterizing the hydrogeology of watersheds using hydrochemical data that combine statistical, geochemical, and spatial techniques is presented. Surface water and ground water base flow and spring runoff samples (180 total) from a single watershed are first classified using hierarchical cluster analysis. The statistical clusters are analyzed for spatial coherence confirming that the clusters have a geological basis corresponding to topographic flowpaths and showing that the fractured rock aquifer behaves as an equivalent porous medium on the watershed scale. Then principal component analysis (PCA) is used to determine the sources of variation between parameters. PCA analysis shows that the variations within the dataset are related to variations in calcium, magnesium, SO4, and HCO3, which are derived from natural weathering reactions, and pH, NO3, and chlorine, which indicate anthropogenic impact. PHREEQC modeling is used to quantitatively describe the natural hydrochemical evolution for the watershed and aid in discrimination of samples that have an anthropogenic component. Finally, the seasonal changes in the water chemistry of individual sites were analyzed to better characterize the spatial variability of vertical hydraulic conductivity. The integrated result provides a method to characterize the hydrogeology of the watershed that fully utilizes traditional data.

  9. A Novel Weighted Kernel PCA-Based Method for Optimization and Uncertainty Quantification

    NASA Astrophysics Data System (ADS)

    Thimmisetty, C.; Talbot, C.; Chen, X.; Tong, C. H.

    2016-12-01

    It has been demonstrated that machine learning methods can be successfully applied to uncertainty quantification for geophysical systems through the use of the adjoint method coupled with kernel PCA-based optimization. In addition, it has been shown through weighted linear PCA how optimization with respect to both observation weights and feature space control variables can accelerate convergence of such methods. Linear machine learning methods, however, are inherently limited in their ability to represent features of non-Gaussian stochastic random fields, as they are based on only the first two statistical moments of the original data. Nonlinear spatial relationships and multipoint statistics leading to the tortuosity characteristic of channelized media, for example, are captured only to a limited extent by linear PCA. With the aim of coupling the kernel-based and weighted methods discussed, we present a novel mathematical formulation of kernel PCA, Weighted Kernel Principal Component Analysis (WKPCA), that both captures nonlinear relationships and incorporates the attribution of significance levels to different realizations of the stochastic random field of interest. We also demonstrate how new instantiations retaining defining characteristics of the random field can be generated using Bayesian methods. In particular, we present a novel WKPCA-based optimization method that minimizes a given objective function with respect to both feature space random variables and observation weights through which optimal snapshot significance levels and optimal features are learned. We showcase how WKPCA can be applied to nonlinear optimal control problems involving channelized media, and in particular demonstrate an application of the method to learning the spatial distribution of material parameter values in the context of linear elasticity, and discuss further extensions of the method to stochastic inversion.

  10. Removal of BCG artefact from concurrent fMRI-EEG recordings based on EMD and PCA.

    PubMed

    Javed, Ehtasham; Faye, Ibrahima; Malik, Aamir Saeed; Abdullah, Jafri Malin

    2017-11-01

    Simultaneous electroencephalography (EEG) and functional magnetic resonance image (fMRI) acquisitions provide better insight into brain dynamics. Some artefacts due to simultaneous acquisition pose a threat to the quality of the data. One such problematic artefact is the ballistocardiogram (BCG) artefact. We developed a hybrid algorithm that combines features of empirical mode decomposition (EMD) with principal component analysis (PCA) to reduce the BCG artefact. The algorithm does not require extra electrocardiogram (ECG) or electrooculogram (EOG) recordings to extract the BCG artefact. The method was tested with both simulated and real EEG data of 11 participants. From the simulated data, the similarity index between the extracted BCG and the simulated BCG showed the effectiveness of the proposed method in BCG removal. On the other hand, real data were recorded with two conditions, i.e. resting state (eyes closed dataset) and task influenced (event-related potentials (ERPs) dataset). Using qualitative (visual inspection) and quantitative (similarity index, improved normalized power spectrum (INPS) ratio, power spectrum, sample entropy (SE)) evaluation parameters, the assessment results showed that the proposed method can efficiently reduce the BCG artefact while preserving the neuronal signals. Compared with conventional methods, namely, average artefact subtraction (AAS), optimal basis set (OBS) and combined independent component analysis and principal component analysis (ICA-PCA), the statistical analyses of the results showed that the proposed method has better performance, and the differences were significant for all quantitative parameters except for the power and sample entropy. The proposed method does not require any reference signal, prior information or assumption to extract the BCG artefact. It will be very useful in circumstances where the reference signal is not available. Copyright © 2017 Elsevier B.V. All rights reserved.

  11. Chemometric Methods to Quantify 1D and 2D NMR Spectral Differences Among Similar Protein Therapeutics.

    PubMed

    Chen, Kang; Park, Junyong; Li, Feng; Patil, Sharadrao M; Keire, David A

    2018-04-01

    NMR spectroscopy is an emerging analytical tool for measuring complex drug product qualities, e.g., protein higher order structure (HOS) or heparin chemical composition. Most drug NMR spectra have been visually analyzed; however, NMR spectra are inherently quantitative and multivariate and thus suitable for chemometric analysis. Therefore, quantitative measurements derived from chemometric comparisons between spectra could be a key step in establishing acceptance criteria for a new generic drug or a new batch after manufacture change. To measure the capability of chemometric methods to differentiate comparator NMR spectra, we calculated inter-spectra difference metrics on 1D/2D spectra of two insulin drugs, Humulin R® and Novolin R®, from different manufacturers. Both insulin drugs have an identical drug substance but differ in formulation. Chemometric methods (i.e., principal component analysis (PCA), 3-way Tucker3 or graph invariant (GI)) were performed to calculate Mahalanobis distance (D M ) between the two brands (inter-brand) and distance ratio (D R ) among the different lots (intra-brand). The PCA on 1D inter-brand spectral comparison yielded a D M value of 213. In comparing 2D spectra, the Tucker3 analysis yielded the highest differentiability value (D M  = 305) in the comparisons made followed by PCA (D M  = 255) then the GI method (D M  = 40). In conclusion, drug quality comparisons among different lots might benefit from PCA on 1D spectra for rapidly comparing many samples, while higher resolution but more time-consuming 2D-NMR-data-based comparisons using Tucker3 analysis or PCA provide a greater level of assurance for drug structural similarity evaluation between drug brands.

  12. Large Deformation Diffeomorphism and Momentum Based Hippocampal Shape Discrimination in Dementia of the Alzheimer type

    PubMed Central

    Wang, Lei; Beg, Faisal; Ratnanather, Tilak; Ceritoglu, Can; Younes, Laurent; Morris, John C.; Csernansky, John G.; Miller, Michael I.

    2010-01-01

    In large-deformation diffeomorphic metric mapping (LDDMM), the diffeomorphic matching of images are modeled as evolution in time, or a flow, of an associated smooth velocity vector field v controlling the evolution. The initial momentum parameterizes the whole geodesic and encodes the shape and form of the target image. Thus, methods such as principal component analysis (PCA) of the initial momentum leads to analysis of anatomical shape and form in target images without being restricted to small-deformation assumption in the analysis of linear displacements. We apply this approach to a study of dementia of the Alzheimer type (DAT). The left hippocampus in the DAT group shows significant shape abnormality while the right hippocampus shows similar pattern of abnormality. Further, PCA of the initial momentum leads to correct classification of 12 out of 18 DAT subjects and 22 out of 26 control subjects. PMID:17427733

  13. Assessment of the water quality monitoring network of the Piabanha River experimental watersheds in Rio de Janeiro, Brazil, using autoassociative neural networks.

    PubMed

    Villas-Boas, Mariana D; Olivera, Francisco; de Azevedo, Jose Paulo S

    2017-09-01

    Water quality monitoring is a complex issue that requires support tools in order to provide information for water resource management. Budget constraints as well as an inadequate water quality network design call for the development of evaluation tools to provide efficient water quality monitoring. For this purpose, a nonlinear principal component analysis (NLPCA) based on an autoassociative neural network was performed to assess the redundancy of the parameters and monitoring locations of the water quality network in the Piabanha River watershed. Oftentimes, a small number of variables contain the most relevant information, while the others add little or no interpretation to the variability of water quality. Principal component analysis (PCA) is widely used for this purpose. However, conventional PCA is not able to capture the nonlinearities of water quality data, while neural networks can represent those nonlinear relationships. The results presented in this work demonstrate that NLPCA performs better than PCA in the reconstruction of the water quality data of Piabanha watershed, explaining most of data variance. From the results of NLPCA, the most relevant water quality parameter is fecal coliforms (FCs) and the least relevant is chemical oxygen demand (COD). Regarding the monitoring locations, the most relevant is Poço Tarzan (PT) and the least is Parque Petrópolis (PP).

  14. Principal component analysis of dietary and lifestyle patterns in relation to risk of subtypes of esophageal and gastric cancer

    PubMed Central

    Silvera, Stephanie A. Navarro; Mayne, Susan T; Risch, Harvey A.; Gammon, Marilie D; Vaughan, Thomas; Chow, Wong-Ho; Dubin, Joel A; Dubrow, Robert; Schoenberg, Janet; Stanford, Janet L; West, A. Brian; Rotterdam, Heidrun; Blot, William J

    2011-01-01

    Purpose To perform pattern analyses of dietary and lifestyle factors in relation to risk of esophageal and gastric cancers. Methods We evaluated risk factors for esophageal adenocarcinoma (EA), esophageal squamous cell carcinoma (ESCC), gastric cardia adenocarcinoma (GCA), and other gastric cancers (OGA) using data from a population-based case-control study conducted in Connecticut, New Jersey, and western Washington state. Dietary/lifestyle patterns were created using principal component analysis (PCA). Impact of the resultant scores on cancer risk was estimated through logistic regression. Results PCA identified six patterns: meat/nitrite, fruit/vegetable, smoking/alcohol, legume/meat alternate, GERD/BMI, and fish/vitamin C. Risk of each cancer under study increased with rising meat/nitrite score. Risk of EA increased with increasing GERD/BMI score, and risk of ESCC rose with increasing smoking/alcohol score and decreasing GERD/BMI score. Fruit/vegetable scores were inversely associated with EA, ESCC, and GCA. Conclusions PCA may provide a useful approach for summarizing extensive dietary/lifestyle data into fewer interpretable combinations that discriminate between cancer cases and controls. The analyses suggest that meat/nitrite intake is associated with elevated risk of each cancer under study, while fruit/vegetable intake reduces risk of EA, ESCC, and GCA. GERD/obesity were confirmed as risk factors for EA and smoking/alcohol as risk factors for ESCC. PMID:21435900

  15. Discrimination of Schisandrae Chinensis Fructus and Schisandrae Sphenantherae Fructus based on fingerprint profiles of hydrophilic components by high-performance liquid chromatography with ultraviolet detection.

    PubMed

    Oshima, Ryusei; Kotani, Akira; Kuroda, Minpei; Yamamoto, Kazuhiro; Mimaki, Yoshihiro; Hakamata, Hideki

    2018-03-01

    High-performance liquid chromatography with ultraviolet detection (HPLC-UV) using 20 mM phosphate mobile phase and an octadecylsilyl column (Triart C18, 150 × 3.0 mm i.d., 3 μm) has been developed for the analysis of hydrophilic compounds in the water extract of Schisandrae Fructus samples. The present HPLC-UV method permits the accurate and precise determination of malic, citric, and protocatechuic acids in the Japanese Pharmacopoeia (JP) Schisandrae Fructus, Schisandrae Chinensis Fructus and Schisandrae Sphenantherae Fructus. The JP Schisandrae Fructus studied contains 27.98 mg/g malic, 107.08 mg/g citric, and 0.42 mg/g protocatechuic acids, with a relative standard deviation (RSD) of repeatability of <0.9% (n = 6). The content of malic acids in Schisandrae Chinensis Fructus is approximately ten times that in Schisandrae Sphenantherae Fructus. To examine whether the HPLC-UV method is applicable to the fingerprint-based discrimination of Schisandrae Fructus samples obtained from Chinese markets, principal component analysis (PCA) was performed using the determined contents of organic acids and the ratio of six characteristic unknown peaks derived from hydrophilic components to internal standard peak areas. On the score plots, Schisandrae Chinensis Fructus and Schisandrae Sphenantherae Fructus samples are clearly discriminated. Therefore, the HPLC-UV method for the analysis of hydrophilic components coupled with PCA has been shown to be practical and useful in the quality control of Schisandrae Fructus.

  16. Enlightening discriminative network functional modules behind Principal Component Analysis separation in differential-omic science studies

    PubMed Central

    Ciucci, Sara; Ge, Yan; Durán, Claudio; Palladini, Alessandra; Jiménez-Jiménez, Víctor; Martínez-Sánchez, Luisa María; Wang, Yuting; Sales, Susanne; Shevchenko, Andrej; Poser, Steven W.; Herbig, Maik; Otto, Oliver; Androutsellis-Theotokis, Andreas; Guck, Jochen; Gerl, Mathias J.; Cannistraci, Carlo Vittorio

    2017-01-01

    Omic science is rapidly growing and one of the most employed techniques to explore differential patterns in omic datasets is principal component analysis (PCA). However, a method to enlighten the network of omic features that mostly contribute to the sample separation obtained by PCA is missing. An alternative is to build correlation networks between univariately-selected significant omic features, but this neglects the multivariate unsupervised feature compression responsible for the PCA sample segregation. Biologists and medical researchers often prefer effective methods that offer an immediate interpretation to complicated algorithms that in principle promise an improvement but in practice are difficult to be applied and interpreted. Here we present PC-corr: a simple algorithm that associates to any PCA segregation a discriminative network of features. Such network can be inspected in search of functional modules useful in the definition of combinatorial and multiscale biomarkers from multifaceted omic data in systems and precision biomedicine. We offer proofs of PC-corr efficacy on lipidomic, metagenomic, developmental genomic, population genetic, cancer promoteromic and cancer stem-cell mechanomic data. Finally, PC-corr is a general functional network inference approach that can be easily adopted for big data exploration in computer science and analysis of complex systems in physics. PMID:28287094

  17. Extended principle component analysis - a useful tool to understand processes governing water quality at catchment scales

    NASA Astrophysics Data System (ADS)

    Selle, B.; Schwientek, M.

    2012-04-01

    Water quality of ground and surface waters in catchments is typically driven by many complex and interacting processes. While small scale processes are often studied in great detail, their relevance and interplay at catchment scales remain often poorly understood. For many catchments, extensive monitoring data on water quality have been collected for different purposes. These heterogeneous data sets contain valuable information on catchment scale processes but are rarely analysed using integrated methods. Principle component analysis (PCA) has previously been applied to this kind of data sets. However, a detailed analysis of scores, which are an important result of a PCA, is often missing. Mathematically, PCA expresses measured variables on water quality, e.g. nitrate concentrations, as linear combination of independent, not directly observable key processes. These computed key processes are represented by principle components. Their scores are interpretable as process intensities which vary in space and time. Subsequently, scores can be correlated with other key variables and catchment characteristics, such as water travel times and land use that were not considered in PCA. This detailed analysis of scores represents an extension of the commonly applied PCA which could considerably improve the understanding of processes governing water quality at catchment scales. In this study, we investigated the 170 km2 Ammer catchment in SW Germany which is characterised by an above average proportion of agricultural (71%) and urban (17%) areas. The Ammer River is mainly fed by karstic springs. For PCA, we separately analysed concentrations from (a) surface waters of the Ammer River and its tributaries, (b) spring waters from the main aquifers and (c) deep groundwater from production wells. This analysis was extended by a detailed analysis of scores. We analysed measured concentrations on major ions and selected organic micropollutants. Additionally, redox-sensitive variables and environmental tracers indicating groundwater age were analysed for deep groundwater from production wells. For deep groundwater, we found that microbial turnover was stronger influenced by local availability of energy sources than by travel times of groundwater to the wells. Groundwater quality primarily reflected the input of pollutants determined by landuse, e.g. agrochemicals. We concluded that for water quality in the Ammer catchment, conservative mixing of waters with different origin is more important than reactive transport processes along the flow path.

  18. Triggers in advanced neurological conditions: prediction and management of the terminal phase.

    PubMed

    Hussain, Jamilla; Adams, Debi; Allgar, Victoria; Campbell, Colin

    2014-03-01

    The challenge to provide a palliative care service for individuals with advanced neurological conditions is compounded by variability in disease trajectories and symptom profiles. The National End of Life Care Programme (2010) recommended seven 'triggers' for a palliative approach to care for patients with advanced neurological conditions. To establish the frequency of triggers in the palliative phase, and if they could be reduced to fewer components. Management of the terminal phase also was evaluated. Retrospective study of 62 consecutive patients under the care of a specialist palliative neurology service, who had died. Principle component analysis (PCA) was performed to establish the interrelationship between triggers. Frequency of triggers increased as each patient approached death. PCA found that four symptom components explained 76.8% of the variance. These represented: rapid physical decline; significant complex symptoms, including pain; infection in combination with cognitive impairment; and risk of aspiration. Median follow-up under the palliative care service was 336 days. In 56.5% of patients, the cause of death was pneumonia. The terminal phase was recognised in 72.6%. The duration of the terminal phase was 8.8 days on average, and the Liverpool Care of the dying Pathway was commenced in 33.9%. All carers were offered bereavement support. Referral criteria based on the triggers can facilitate appropriate and timely patient access to palliative care. The components deduced through PCA have face validity; however larger studies prospectively validating the triggers are required. Closer scrutiny of the terminal phase is necessary to optimise management.

  19. PCA-based polling strategy in machine learning framework for coronary artery disease risk assessment in intravascular ultrasound: A link between carotid and coronary grayscale plaque morphology.

    PubMed

    Araki, Tadashi; Ikeda, Nobutaka; Shukla, Devarshi; Jain, Pankaj K; Londhe, Narendra D; Shrivastava, Vimal K; Banchhor, Sumit K; Saba, Luca; Nicolaides, Andrew; Shafique, Shoaib; Laird, John R; Suri, Jasjit S

    2016-05-01

    Percutaneous coronary interventional procedures need advance planning prior to stenting or an endarterectomy. Cardiologists use intravascular ultrasound (IVUS) for screening, risk assessment and stratification of coronary artery disease (CAD). We hypothesize that plaque components are vulnerable to rupture due to plaque progression. Currently, there are no standard grayscale IVUS tools for risk assessment of plaque rupture. This paper presents a novel strategy for risk stratification based on plaque morphology embedded with principal component analysis (PCA) for plaque feature dimensionality reduction and dominant feature selection technique. The risk assessment utilizes 56 grayscale coronary features in a machine learning framework while linking information from carotid and coronary plaque burdens due to their common genetic makeup. This system consists of a machine learning paradigm which uses a support vector machine (SVM) combined with PCA for optimal and dominant coronary artery morphological feature extraction. Carotid artery proven intima-media thickness (cIMT) biomarker is adapted as a gold standard during the training phase of the machine learning system. For the performance evaluation, K-fold cross validation protocol is adapted with 20 trials per fold. For choosing the dominant features out of the 56 grayscale features, a polling strategy of PCA is adapted where the original value of the features is unaltered. Different protocols are designed for establishing the stability and reliability criteria of the coronary risk assessment system (cRAS). Using the PCA-based machine learning paradigm and cross-validation protocol, a classification accuracy of 98.43% (AUC 0.98) with K=10 folds using an SVM radial basis function (RBF) kernel was achieved. A reliability index of 97.32% and machine learning stability criteria of 5% were met for the cRAS. This is the first Computer aided design (CADx) system of its kind that is able to demonstrate the ability of coronary risk assessment and stratification while demonstrating a successful design of the machine learning system based on our assumptions. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  20. [Discrimination among different brands of coffee by using vis-near infrared spectra].

    PubMed

    Wang, Yan-Yan; He, Yong; Shao, Yong-Ni; Zhang, Zhi-Fei

    2007-04-01

    Near infrared spectroscopy technology was used to distinguish three different brands of coffee bought from the supermarket. Two models, PCA-BP and WT-BP, were employed for the analysis and prediction of the samples. The discrimination among the different brands of coffee was based on the combination of the function of data compression in the PCA and WT technology and the ability of learning and prediction of the artificial neural network. In the experiment, 60 samples were used for model calibration and 30 for brand prediction. The result showed that both the PCA-BP and WT-BP models achieved 100% discrimination rate, and the wavelet transforms technology is superior to the principal component analysis both in time-consuming and the capability of data compression. It is indicated that the model set up by the combination of WT technology and BP neural network in the present study is rapid in analysis and precise in pattern discrimination. It can be concluded that a new approach to distinguishing pure coffee was of fered and the result of this experiment established the foundation for the determination of the raw material (coffee bean) of different brands of coffee in the market.

  1. Spectral decomposition of asteroid Itokawa based on principal component analysis

    NASA Astrophysics Data System (ADS)

    Koga, Sumire C.; Sugita, Seiji; Kamata, Shunichi; Ishiguro, Masateru; Hiroi, Takahiro; Tatsumi, Eri; Sasaki, Sho

    2018-01-01

    The heliocentric stratification of asteroid spectral types may hold important information on the early evolution of the Solar System. Asteroid spectral taxonomy is based largely on principal component analysis. However, how the surface properties of asteroids, such as the composition and age, are projected in the principal-component (PC) space is not understood well. We decompose multi-band disk-resolved visible spectra of the Itokawa surface with principal component analysis (PCA) in comparison with main-belt asteroids. The obtained distribution of Itokawa spectra projected in the PC space of main-belt asteroids follows a linear trend linking the Q-type and S-type regions and is consistent with the results of space-weathering experiments on ordinary chondrites and olivine, suggesting that this trend may be a space-weathering-induced spectral evolution track for S-type asteroids. Comparison with space-weathering experiments also yield a short average surface age (< a few million years) for Itokawa, consistent with the cosmic-ray-exposure time of returned samples from Itokawa. The Itokawa PC score distribution exhibits asymmetry along the evolution track, strongly suggesting that space weathering has begun saturated on this young asteroid. The freshest spectrum found on Itokawa exhibits a clear sign for space weathering, indicating again that space weathering occurs very rapidly on this body. We also conducted PCA on Itokawa spectra alone and compared the results with space-weathering experiments. The obtained results indicate that the first principal component of Itokawa surface spectra is consistent with spectral change due to space weathering and that the spatial variation in the degree of space weathering is very large (a factor of three in surface age), which would strongly suggest the presence of strong regional/local resurfacing process(es) on this small asteroid.

  2. Dynamic Responses in Brain Networks to Social Feedback: A Dual EEG Acquisition Study in Adolescent Couples

    PubMed Central

    Kuo, Ching-Chang; Ha, Thao; Ebbert, Ashley M.; Tucker, Don M.; Dishion, Thomas J.

    2017-01-01

    Adolescence is a sensitive period for the development of romantic relationships. During this period the maturation of frontolimbic networks is particularly important for the capacity to regulate emotional experiences. In previous research, both functional magnetic resonance imaging (fMRI) and dense array electroencephalography (dEEG) measures have suggested that responses in limbic regions are enhanced in adolescents experiencing social rejection. In the present research, we examined social acceptance and rejection from romantic partners as they engaged in a Chatroom Interact Task. Dual 128-channel dEEG systems were used to record neural responses to acceptance and rejection from both adolescent romantic partners and unfamiliar peers (N = 75). We employed a two-step temporal principal component analysis (PCA) and spatial independent component analysis (ICA) approach to statistically identify the neural components related to social feedback. Results revealed that the early (288 ms) discrimination between acceptance and rejection reflected by the P3a component was significant for the romantic partner but not the unfamiliar peer. In contrast, the later (364 ms) P3b component discriminated between acceptance and rejection for both partners and peers. The two-step approach (PCA then ICA) was better able than either PCA or ICA alone in separating these components of the brain's electrical activity that reflected both temporal and spatial phases of the brain's processing of social feedback. PMID:28620292

  3. Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis

    PubMed Central

    2013-01-01

    Background Protein-protein interactions (PPIs) play crucial roles in the execution of various cellular processes and form the basis of biological mechanisms. Although large amount of PPIs data for different species has been generated by high-throughput experimental techniques, current PPI pairs obtained with experimental methods cover only a fraction of the complete PPI networks, and further, the experimental methods for identifying PPIs are both time-consuming and expensive. Hence, it is urgent and challenging to develop automated computational methods to efficiently and accurately predict PPIs. Results We present here a novel hierarchical PCA-EELM (principal component analysis-ensemble extreme learning machine) model to predict protein-protein interactions only using the information of protein sequences. In the proposed method, 11188 protein pairs retrieved from the DIP database were encoded into feature vectors by using four kinds of protein sequences information. Focusing on dimension reduction, an effective feature extraction method PCA was then employed to construct the most discriminative new feature set. Finally, multiple extreme learning machines were trained and then aggregated into a consensus classifier by majority voting. The ensembling of extreme learning machine removes the dependence of results on initial random weights and improves the prediction performance. Conclusions When performed on the PPI data of Saccharomyces cerevisiae, the proposed method achieved 87.00% prediction accuracy with 86.15% sensitivity at the precision of 87.59%. Extensive experiments are performed to compare our method with state-of-the-art techniques Support Vector Machine (SVM). Experimental results demonstrate that proposed PCA-EELM outperforms the SVM method by 5-fold cross-validation. Besides, PCA-EELM performs faster than PCA-SVM based method. Consequently, the proposed approach can be considered as a new promising and powerful tools for predicting PPI with excellent performance and less time. PMID:23815620

  4. Principal Component Analysis for pulse-shape discrimination of scintillation radiation detectors

    NASA Astrophysics Data System (ADS)

    Alharbi, T.

    2016-01-01

    In this paper, we report on the application of Principal Component analysis (PCA) for pulse-shape discrimination (PSD) of scintillation radiation detectors. The details of the method are described and the performance of the method is experimentally examined by discriminating between neutrons and gamma-rays with a liquid scintillation detector in a mixed radiation field. The performance of the method is also compared against that of the conventional charge-comparison method, demonstrating the superior performance of the method particularly at low light output range. PCA analysis has the important advantage of automatic extraction of the pulse-shape characteristics which makes the PSD method directly applicable to various scintillation detectors without the need for the adjustment of a PSD parameter.

  5. An electrophysiological index of changes in risk decision-making strategies.

    PubMed

    Zhang, Dandan; Gu, Ruolei; Wu, Tingting; Broster, Lucas S; Luo, Yi; Jiang, Yang; Luo, Yue-jia

    2013-07-01

    Human decision-making is significantly modulated by previously experienced outcomes. Using event-related potentials (ERPs), we examined whether ERP components evoked by outcome feedbacks could serve as biomarkers to signal the influence of current outcome evaluation on subsequent decision-making. In this study, 18 adult volunteers participated in a simple monetary gambling task, in which they were asked to choose between two options that differed in risk. Their decisions were immediately followed by outcome presentation. Temporospatial principle component analysis (PCA) was applied to the outcome-onset locked ERPs in the 200-1000 ms time window. The PCA factors that approximated classical ERP components (P2, feedback-related negativity, P3a, and P3b) in terms of time course and scalp distribution were tested for their association with subsequent decision-making strategies. Our results revealed that a fronto-central PCA factor approximating the classical P3a was related to changes of decision-making strategies on subsequent trials. The decision to switch between high- and low-risk options resulted in a larger P3a relative to the decision to retain the same choice. According to the results, we suggest that the amplitude of the fronto-central P3a is an electrophysiological index of the influence of current outcome on subsequent risk decision-making. Furthermore, the ERP source analysis indicated that the activations of the frontopolar cortex and sensorimotor cortex were involved in subsequent changes of strategies, which enriches our understanding of the neural mechanisms of adjusting decision-making strategies based on previous experience. Copyright © 2013 Elsevier Ltd. All rights reserved.

  6. An electrophysiological index of changes in risk decision-making strategies

    PubMed Central

    Zhang, Dandan; Gu, Ruolei; Wu, Tingting; Broster, Lucas S.; Luo, Yi; Jiang, Yang; Luo, Yue-jia

    2014-01-01

    Human decision-making is significantly modulated by previously experienced outcomes. Using event-related potentials (ERPs), we examined whether ERP components evoked by outcome feedbacks could serve as biomarkers to signal the influence of current outcome evaluation on subsequent decision-making. In this study, eighteen adult volunteers participated in a simple monetary gambling task, in which they were asked to choose between two options that differed in risk. Their decisions were immediately followed by outcome presentation. Temporospatial principle component analysis (PCA) was applied to the outcome-onset locked ERPs in the -200 – 1000 ms time window. The PCA factors that approximated classical ERP components (P2, feedback-related negativity, P3a, & P3b) in terms of time course and scalp distribution were tested for their association with subsequent decision-making strategies. Our results revealed that a fronto-central PCA factor approximating the classical P3a was related to changes of decision-making strategies on subsequent trials. The decision to switch between high- and low-risk options resulted in a larger P3a relative to the decision to retain the same choice. According to the results, we suggest the amplitude of the fronto-central P3a is an electrophysiological index of the influence of current outcome on subsequent risk decision-making. Furthermore, the ERP source analysis indicated that the activations of the frontopolar cortex and sensorimotor cortex were involved in subsequent changes of strategies, which enriches our understanding of the neural mechanisms of adjusting decision-making strategies based on previous experience. PMID:23643796

  7. Evaluation of Coptidis Rhizoma-Euodiae Fructus couple and Zuojin products based on HPLC fingerprint chromatogram and simultaneous determination of main bioactive constituents.

    PubMed

    Gao, Xin; Yang, Xiu-Wei; Marriott, Philip J

    2013-11-01

    Coptidis Rhizoma-Euodiae Fructus couple (CEC) is a classic traditional Chinese medicine preparation consisting of Coptidis Rhizoma and Euodiae Fructus at the ratio of 6:1, and used to treat gastro-intestinal disorders. Alkaloids are the main bioactive component. This research provides comprehensive analysis information for the quality control of CEC. To develop a high-performance liquid chromatography-diode array detection fingerprint for chemical composition characteristics of CEC and its products. The samples were separated with a Gemini C18 column by using gradient elution with water-formic acid (100:0.03) and acetonitrile as mobile phase. Flow rate was 1.0 mL/min and detection wavelength was 250 nm. Similarity analysis and principal component analysis (PCA) were employed to evaluate quality consistencies of analytes. Mean chromatograms and correlation coefficients of analytes were calculated by the software "Similarity Evaluation System for Chromatographic Fingerprint of Traditional Chinese Medicine". Fingerprint chromatogram comparison determined 20 representative general fingerprint peaks, and the fingerprint chromatogram resemblances are all better than 0.988. Consistent results were obtained to show that CEC and its related samples could be successfully divided into three groups. Contribution plots generated by PCA were performed to interpret differences among the sample groups while peaks which significantly contributed to classification were identified. Seven bioactive constituents in the samples were verified by quantitative analysis. The chromatographic fingerprint with similarity evaluation and PCA assay combined with quantification of seven compounds could be utilized as a quality control method for the herbal couple.

  8. Quantitative structure-activity relationship study of P2X7 receptor inhibitors using combination of principal component analysis and artificial intelligence methods.

    PubMed

    Ahmadi, Mehdi; Shahlaei, Mohsen

    2015-01-01

    P2X7 antagonist activity for a set of 49 molecules of the P2X7 receptor antagonists, derivatives of purine, was modeled with the aid of chemometric and artificial intelligence techniques. The activity of these compounds was estimated by means of combination of principal component analysis (PCA), as a well-known data reduction method, genetic algorithm (GA), as a variable selection technique, and artificial neural network (ANN), as a non-linear modeling method. First, a linear regression, combined with PCA, (principal component regression) was operated to model the structure-activity relationships, and afterwards a combination of PCA and ANN algorithm was employed to accurately predict the biological activity of the P2X7 antagonist. PCA preserves as much of the information as possible contained in the original data set. Seven most important PC's to the studied activity were selected as the inputs of ANN box by an efficient variable selection method, GA. The best computational neural network model was a fully-connected, feed-forward model with 7-7-1 architecture. The developed ANN model was fully evaluated by different validation techniques, including internal and external validation, and chemical applicability domain. All validations showed that the constructed quantitative structure-activity relationship model suggested is robust and satisfactory.

  9. Quantitative structure–activity relationship study of P2X7 receptor inhibitors using combination of principal component analysis and artificial intelligence methods

    PubMed Central

    Ahmadi, Mehdi; Shahlaei, Mohsen

    2015-01-01

    P2X7 antagonist activity for a set of 49 molecules of the P2X7 receptor antagonists, derivatives of purine, was modeled with the aid of chemometric and artificial intelligence techniques. The activity of these compounds was estimated by means of combination of principal component analysis (PCA), as a well-known data reduction method, genetic algorithm (GA), as a variable selection technique, and artificial neural network (ANN), as a non-linear modeling method. First, a linear regression, combined with PCA, (principal component regression) was operated to model the structure–activity relationships, and afterwards a combination of PCA and ANN algorithm was employed to accurately predict the biological activity of the P2X7 antagonist. PCA preserves as much of the information as possible contained in the original data set. Seven most important PC's to the studied activity were selected as the inputs of ANN box by an efficient variable selection method, GA. The best computational neural network model was a fully-connected, feed-forward model with 7−7−1 architecture. The developed ANN model was fully evaluated by different validation techniques, including internal and external validation, and chemical applicability domain. All validations showed that the constructed quantitative structure–activity relationship model suggested is robust and satisfactory. PMID:26600858

  10. PCA leverage: outlier detection for high-dimensional functional magnetic resonance imaging data.

    PubMed

    Mejia, Amanda F; Nebel, Mary Beth; Eloyan, Ani; Caffo, Brian; Lindquist, Martin A

    2017-07-01

    Outlier detection for high-dimensional (HD) data is a popular topic in modern statistical research. However, one source of HD data that has received relatively little attention is functional magnetic resonance images (fMRI), which consists of hundreds of thousands of measurements sampled at hundreds of time points. At a time when the availability of fMRI data is rapidly growing-primarily through large, publicly available grassroots datasets-automated quality control and outlier detection methods are greatly needed. We propose principal components analysis (PCA) leverage and demonstrate how it can be used to identify outlying time points in an fMRI run. Furthermore, PCA leverage is a measure of the influence of each observation on the estimation of principal components, which are often of interest in fMRI data. We also propose an alternative measure, PCA robust distance, which is less sensitive to outliers and has controllable statistical properties. The proposed methods are validated through simulation studies and are shown to be highly accurate. We also conduct a reliability study using resting-state fMRI data from the Autism Brain Imaging Data Exchange and find that removal of outliers using the proposed methods results in more reliable estimation of subject-level resting-state networks using independent components analysis. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  11. RP-HPLC method using 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate incorporated with normalization technique in principal component analysis to differentiate the bovine, porcine and fish gelatins.

    PubMed

    Azilawati, M I; Hashim, D M; Jamilah, B; Amin, I

    2015-04-01

    The amino acid compositions of bovine, porcine and fish gelatin were determined by amino acid analysis using 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate as derivatization reagent. Sixteen amino acids were identified with similar spectral chromatograms. Data pre-treatment via centering and transformation of data by normalization were performed to provide data that are more suitable for analysis and easier to be interpreted. Principal component analysis (PCA) transformed the original data matrix into a number of principal components (PCs). Three principal components (PCs) described 96.5% of the total variance, and 2 PCs (91%) explained the highest variances. The PCA model demonstrated the relationships among amino acids in the correlation loadings plot to the group of gelatins in the scores plot. Fish gelatin was correlated to threonine, serine and methionine on the positive side of PC1; bovine gelatin was correlated to the non-polar side chains amino acids that were proline, hydroxyproline, leucine, isoleucine and valine on the negative side of PC1 and porcine gelatin was correlated to the polar side chains amino acids that were aspartate, glutamic acid, lysine and tyrosine on the negative side of PC2. Verification on the database using 12 samples from commercial products gelatin-based had confirmed the grouping patterns and the variables correlations. Therefore, this quantitative method is very useful as a screening method to determine gelatin from various sources. Copyright © 2014 Elsevier Ltd. All rights reserved.

  12. Self-aggregation in scaled principal component space

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ding, Chris H.Q.; He, Xiaofeng; Zha, Hongyuan

    2001-10-05

    Automatic grouping of voluminous data into meaningful structures is a challenging task frequently encountered in broad areas of science, engineering and information processing. These data clustering tasks are frequently performed in Euclidean space or a subspace chosen from principal component analysis (PCA). Here we describe a space obtained by a nonlinear scaling of PCA in which data objects self-aggregate automatically into clusters. Projection into this space gives sharp distinctions among clusters. Gene expression profiles of cancer tissue subtypes, Web hyperlink structure and Internet newsgroups are analyzed to illustrate interesting properties of the space.

  13. Principal Component Analysis for Enhancement of Infrared Spectra Monitoring

    NASA Astrophysics Data System (ADS)

    Haney, Ricky Lance

    The issue of air quality within the aircraft cabin is receiving increasing attention from both pilot and flight attendant unions. This is due to exposure events caused by poor air quality that in some cases may have contained toxic oil components due to bleed air that flows from outside the aircraft and then through the engines into the aircraft cabin. Significant short and long-term medical issues for aircraft crew have been attributed to exposure. The need for air quality monitoring is especially evident in the fact that currently within an aircraft there are no sensors to monitor the air quality and potentially harmful gas levels (detect-to-warn sensors), much less systems to monitor and purify the air (detect-to-treat sensors) within the aircraft cabin. The specific purpose of this research is to utilize a mathematical technique called principal component analysis (PCA) in conjunction with principal component regression (PCR) and proportionality constant calculations (PCC) to simplify complex, multi-component infrared (IR) spectra data sets into a reduced data set used for determination of the concentrations of the individual components. Use of PCA can significantly simplify data analysis as well as improve the ability to determine concentrations of individual target species in gas mixtures where significant band overlap occurs in the IR spectrum region. Application of this analytical numerical technique to IR spectrum analysis is important in improving performance of commercial sensors that airlines and aircraft manufacturers could potentially use in an aircraft cabin environment for multi-gas component monitoring. The approach of this research is two-fold, consisting of a PCA application to compare simulation and experimental results with the corresponding PCR and PCC to determine quantitatively the component concentrations within a mixture. The experimental data sets consist of both two and three component systems that could potentially be present as air contaminants in an aircraft cabin. In addition, experimental data sets are analyzed for a hydrogen peroxide (H2O2) aqueous solution mixture to determine H2O2 concentrations at various levels that could be produced during use of a vapor phase hydrogen peroxide (VPHP) decontamination system. After the PCA application to two and three component systems, the analysis technique is further expanded to include the monitoring of potential bleed air contaminants from engine oil combustion. Simulation data sets created from database spectra were utilized to predict gas components and concentrations in unknown engine oil samples at high temperatures as well as time-evolved gases from the heating of engine oils.

  14. Population Analysis of Disabled Children by Departments in France

    NASA Astrophysics Data System (ADS)

    Meidatuzzahra, Diah; Kuswanto, Heri; Pech, Nicolas; Etchegaray, Amélie

    2017-06-01

    In this study, a statistical analysis is performed by model the variations of the disabled about 0-19 years old population among French departments. The aim is to classify the departments according to their profile determinants (socioeconomic and behavioural profiles). The analysis is focused on two types of methods: principal component analysis (PCA) and multiple correspondences factorial analysis (MCA) to review which one is the best methods for interpretation of the correlation between the determinants of disability (independent variable). The PCA is the best method for interpretation of the correlation between the determinants of disability (independent variable). The PCA reduces 14 determinants of disability to 4 axes, keeps 80% of total information, and classifies them into 7 classes. The MCA reduces the determinants to 3 axes, retains only 30% of information, and classifies them into 4 classes.

  15. A reduced basis method for molecular dynamics simulation

    NASA Astrophysics Data System (ADS)

    Vincent-Finley, Rachel Elisabeth

    In this dissertation, we develop a method for molecular simulation based on principal component analysis (PCA) of a molecular dynamics trajectory and least squares approximation of a potential energy function. Molecular dynamics (MD) simulation is a computational tool used to study molecular systems as they evolve through time. With respect to protein dynamics, local motions, such as bond stretching, occur within femtoseconds, while rigid body and large-scale motions, occur within a range of nanoseconds to seconds. To capture motion at all levels, time steps on the order of a femtosecond are employed when solving the equations of motion and simulations must continue long enough to capture the desired large-scale motion. To date, simulations of solvated proteins on the order of nanoseconds have been reported. It is typically the case that simulations of a few nanoseconds do not provide adequate information for the study of large-scale motions. Thus, the development of techniques that allow longer simulation times can advance the study of protein function and dynamics. In this dissertation we use principal component analysis (PCA) to identify the dominant characteristics of an MD trajectory and to represent the coordinates with respect to these characteristics. We augment PCA with an updating scheme based on a reduced representation of a molecule and consider equations of motion with respect to the reduced representation. We apply our method to butane and BPTI and compare the results to standard MD simulations of these molecules. Our results indicate that the molecular activity with respect to our simulation method is analogous to that observed in the standard MD simulation with simulations on the order of picoseconds.

  16. Prediction of protein structural classes by Chou's pseudo amino acid composition: approached using continuous wavelet transform and principal component analysis.

    PubMed

    Li, Zhan-Chao; Zhou, Xi-Bin; Dai, Zong; Zou, Xiao-Yong

    2009-07-01

    A prior knowledge of protein structural classes can provide useful information about its overall structure, so it is very important for quick and accurate determination of protein structural class with computation method in protein science. One of the key for computation method is accurate protein sample representation. Here, based on the concept of Chou's pseudo-amino acid composition (AAC, Chou, Proteins: structure, function, and genetics, 43:246-255, 2001), a novel method of feature extraction that combined continuous wavelet transform (CWT) with principal component analysis (PCA) was introduced for the prediction of protein structural classes. Firstly, the digital signal was obtained by mapping each amino acid according to various physicochemical properties. Secondly, CWT was utilized to extract new feature vector based on wavelet power spectrum (WPS), which contains more abundant information of sequence order in frequency domain and time domain, and PCA was then used to reorganize the feature vector to decrease information redundancy and computational complexity. Finally, a pseudo-amino acid composition feature vector was further formed to represent primary sequence by coupling AAC vector with a set of new feature vector of WPS in an orthogonal space by PCA. As a showcase, the rigorous jackknife cross-validation test was performed on the working datasets. The results indicated that prediction quality has been improved, and the current approach of protein representation may serve as a useful complementary vehicle in classifying other attributes of proteins, such as enzyme family class, subcellular localization, membrane protein types and protein secondary structure, etc.

  17. Discrimination of liver cancer in cellular level based on backscatter micro-spectrum with PCA algorithm and BP neural network

    NASA Astrophysics Data System (ADS)

    Yang, Jing; Wang, Cheng; Cai, Gan; Dong, Xiaona

    2016-10-01

    The incidence and mortality rate of the primary liver cancer are very high and its postoperative metastasis and recurrence have become important factors to the prognosis of patients. Circulating tumor cells (CTC), as a new tumor marker, play important roles in the early diagnosis and individualized treatment. This paper presents an effective method to distinguish liver cancer based on the cellular scattering spectrum, which is a non-fluorescence technique based on the fiber confocal microscopic spectrometer. Combining the principal component analysis (PCA) with back propagation (BP) neural network were utilized to establish an automatic recognition model for backscatter spectrum of the liver cancer cells from blood cell. PCA was applied to reduce the dimension of the scattering spectral data which obtained by the fiber confocal microscopic spectrometer. After dimensionality reduction by PCA, a neural network pattern recognition model with 2 input layer nodes, 11 hidden layer nodes, 3 output nodes was established. We trained the network with 66 samples and also tested it. Results showed that the recognition rate of the three types of cells is more than 90%, the relative standard deviation is only 2.36%. The experimental results showed that the fiber confocal microscopic spectrometer combining with the algorithm of PCA and BP neural network can automatically identify the liver cancer cell from the blood cells. This will provide a better tool for investigating the metastasis of liver cancers in vivo, the biology metabolic characteristics of liver cancers and drug transportation. Additionally, it is obviously referential in practical application.

  18. Differentiation of red wines using an electronic nose based on surface acoustic wave devices.

    PubMed

    García, M; Fernández, M J; Fontecha, J L; Lozano, J; Santos, J P; Aleixandre, M; Sayago, I; Gutiérrez, J; Horrillo, M C

    2006-02-15

    An electronic nose, utilizing the principle of surface acoustic waves (SAW), was used to differentiate among different wines of the same variety of grapes which come from the same cellar. The electronic nose is based on eight surface acoustic wave sensors, one is a reference sensor and the others are coated by different polymers by spray coating technique. Data analysis was performed by two pattern recognition methods; principal component analysis (PCA) and probabilistic neuronal network (PNN). The results showed that electronic nose was able to identify the tested wines.

  19. Item response theory and factor analysis as a mean to characterize occurrence of response shift in a longitudinal quality of life study in breast cancer patients

    PubMed Central

    2014-01-01

    Background The occurrence of response shift (RS) in longitudinal health-related quality of life (HRQoL) studies, reflecting patient adaptation to disease, has already been demonstrated. Several methods have been developed to detect the three different types of response shift (RS), i.e. recalibration RS, 2) reprioritization RS, and 3) reconceptualization RS. We investigated two complementary methods that characterize the occurrence of RS: factor analysis, comprising Principal Component Analysis (PCA) and Multiple Correspondence Analysis (MCA), and a method of Item Response Theory (IRT). Methods Breast cancer patients (n = 381) completed the EORTC QLQ-C30 and EORTC QLQ-BR23 questionnaires at baseline, immediately following surgery, and three and six months after surgery, according to the “then-test/post-test” design. Recalibration was explored using MCA and a model of IRT, called the Linear Logistic Model with Relaxed Assumptions (LLRA) using the then-test method. Principal Component Analysis (PCA) was used to explore reconceptualization and reprioritization. Results MCA highlighted the main profiles of recalibration: patients with high HRQoL level report a slightly worse HRQoL level retrospectively and vice versa. The LLRA model indicated a downward or upward recalibration for each dimension. At six months, the recalibration effect was statistically significant for 11/22 dimensions of the QLQ-C30 and BR23 according to the LLRA model (p ≤ 0.001). Regarding the QLQ-C30, PCA indicated a reprioritization of symptom scales and reconceptualization via an increased correlation between functional scales. Conclusions Our findings demonstrate the usefulness of these analyses in characterizing the occurrence of RS. MCA and IRT model had convergent results with then-test method to characterize recalibration component of RS. PCA is an indirect method in investigating the reprioritization and reconceptualization components of RS. PMID:24606836

  20. Source Apportionment and Risk Assessment of Emerging Contaminants: An Approach of Pharmaco-Signature in Water Systems

    PubMed Central

    Jiang, Jheng Jie; Lee, Chon Lin; Fang, Meng Der; Boyd, Kenneth G.; Gibb, Stuart W.

    2015-01-01

    This paper presents a methodology based on multivariate data analysis for characterizing potential source contributions of emerging contaminants (ECs) detected in 26 river water samples across multi-scape regions during dry and wet seasons. Based on this methodology, we unveil an approach toward potential source contributions of ECs, a concept we refer to as the “Pharmaco-signature.” Exploratory analysis of data points has been carried out by unsupervised pattern recognition (hierarchical cluster analysis, HCA) and receptor model (principal component analysis-multiple linear regression, PCA-MLR) in an attempt to demonstrate significant source contributions of ECs in different land-use zone. Robust cluster solutions grouped the database according to different EC profiles. PCA-MLR identified that 58.9% of the mean summed ECs were contributed by domestic impact, 9.7% by antibiotics application, and 31.4% by drug abuse. Diclofenac, ibuprofen, codeine, ampicillin, tetracycline, and erythromycin-H2O have significant pollution risk quotients (RQ>1), indicating potentially high risk to aquatic organisms in Taiwan. PMID:25874375

  1. Identification and elucidation of anthropogenic source contribution in PM10 pollutant: Insight gain from dispersion and receptor models.

    PubMed

    Roy, Debananda; Singh, Gurdeep; Yadav, Pankaj

    2016-10-01

    Source apportionment study of PM 10 (Particulate Matter) in a critically polluted area of Jharia coalfield, India has been carried out using Dispersion model, Principle Component Analysis (PCA) and Chemical Mass Balance (CMB) techniques. Dispersion model Atmospheric Dispersion Model (AERMOD) was introduced to simplify the complexity of sources in Jharia coalfield. PCA and CMB analysis indicates that monitoring stations near the mining area were mainly affected by the emission from open coal mining and its associated activities such as coal transportation, loading and unloading of coal. Mine fire emission also contributed a considerable amount of particulate matters in monitoring stations. Locations in the city area were mostly affected by vehicular, Liquid Petroleum Gas (LPG) & Diesel Generator (DG) set emissions, residential, and commercial activities. The experimental data sampling and their analysis could aid understanding how dispersion based model technique along with receptor model based concept can be strategically used for quantitative analysis of Natural and Anthropogenic sources of PM 10 . Copyright © 2016. Published by Elsevier B.V.

  2. Finding Imaging Patterns of Structural Covariance via Non-Negative Matrix Factorization

    PubMed Central

    Sotiras, Aristeidis; Resnick, Susan M.; Davatzikos, Christos

    2015-01-01

    In this paper, we investigate the use of Non-Negative Matrix Factorization (NNMF) for the analysis of structural neuroimaging data. The goal is to identify the brain regions that co-vary across individuals in a consistent way, hence potentially being part of underlying brain networks or otherwise influenced by underlying common mechanisms such as genetics and pathologies. NNMF offers a directly data-driven way of extracting relatively localized co-varying structural regions, thereby transcending limitations of Principal Component Analysis (PCA), Independent Component Analysis (ICA) and other related methods that tend to produce dispersed components of positive and negative loadings. In particular, leveraging upon the well known ability of NNMF to produce parts-based representations of image data, we derive decompositions that partition the brain into regions that vary in consistent ways across individuals. Importantly, these decompositions achieve dimensionality reduction via highly interpretable ways and generalize well to new data as shown via split-sample experiments. We empirically validate NNMF in two data sets: i) a Diffusion Tensor (DT) mouse brain development study, and ii) a structural Magnetic Resonance (sMR) study of human brain aging. We demonstrate the ability of NNMF to produce sparse parts-based representations of the data at various resolutions. These representations seem to follow what we know about the underlying functional organization of the brain and also capture some pathological processes. Moreover, we show that these low dimensional representations favorably compare to descriptions obtained with more commonly used matrix factorization methods like PCA and ICA. PMID:25497684

  3. Comparison of discrete Fourier transform (DFT) and principal component analysis/DFT as forecasting tools for absorbance time series received by UV-visible probes installed in urban sewer systems.

    PubMed

    Plazas-Nossa, Leonardo; Torres, Andrés

    2014-01-01

    The objective of this work is to introduce a forecasting method for UV-Vis spectrometry time series that combines principal component analysis (PCA) and discrete Fourier transform (DFT), and to compare the results obtained with those obtained by using DFT. Three time series for three different study sites were used: (i) Salitre wastewater treatment plant (WWTP) in Bogotá; (ii) Gibraltar pumping station in Bogotá; and (iii) San Fernando WWTP in Itagüí (in the south part of Medellín). Each of these time series had an equal number of samples (1051). In general terms, the results obtained are hardly generalizable, as they seem to be highly dependent on specific water system dynamics; however, some trends can be outlined: (i) for UV range, DFT and PCA/DFT forecasting accuracy were almost the same; (ii) for visible range, the PCA/DFT forecasting procedure proposed gives systematically lower forecasting errors and variability than those obtained with the DFT procedure; and (iii) for short forecasting times the PCA/DFT procedure proposed is more suitable than the DFT procedure, according to processing times obtained.

  4. Recursive approach of EEG-segment-based principal component analysis substantially reduces cryogenic pump artifacts in simultaneous EEG-fMRI data.

    PubMed

    Kim, Hyun-Chul; Yoo, Seung-Schik; Lee, Jong-Hwan

    2015-01-01

    Electroencephalography (EEG) data simultaneously acquired with functional magnetic resonance imaging (fMRI) data are preprocessed to remove gradient artifacts (GAs) and ballistocardiographic artifacts (BCAs). Nonetheless, these data, especially in the gamma frequency range, can be contaminated by residual artifacts produced by mechanical vibrations in the MRI system, in particular the cryogenic pump that compresses and transports the helium that chills the magnet (the helium-pump). However, few options are available for the removal of helium-pump artifacts. In this study, we propose a recursive approach of EEG-segment-based principal component analysis (rsPCA) that enables the removal of these helium-pump artifacts. Using the rsPCA method, feature vectors representing helium-pump artifacts were successfully extracted as eigenvectors, and the reconstructed signals of the feature vectors were subsequently removed. A test using simultaneous EEG-fMRI data acquired from left-hand (LH) and right-hand (RH) clenching tasks performed by volunteers found that the proposed rsPCA method substantially reduced helium-pump artifacts in the EEG data and significantly enhanced task-related gamma band activity levels (p=0.0038 and 0.0363 for LH and RH tasks, respectively) in EEG data that have had GAs and BCAs removed. The spatial patterns of the fMRI data were estimated using a hemodynamic response function (HRF) modeled from the estimated gamma band activity in a general linear model (GLM) framework. Active voxel clusters were identified in the post-/pre-central gyri of motor area, only from the rsPCA method (uncorrected p<0.001 for both LH/RH tasks). In addition, the superior temporal pole areas were consistently observed (uncorrected p<0.001 for the LH task and uncorrected p<0.05 for the RH task) in the spatial patterns of the HRF model for gamma band activity when the task paradigm and movement were also included in the GLM. Copyright © 2014 Elsevier Inc. All rights reserved.

  5. Associations between markers of subclinical atherosclerosis and dietary patterns derived by principal components analysis and reduced rank regression in the Multi-Ethnic Study of Atherosclerosis (MESA)1–3

    PubMed Central

    Nettleton, Jennifer A; Steffen, Lyn M; Schulze, Matthias B; Jenny, Nancy S; Barr, R Graham; Bertoni, Alain G; Jacobs, David R

    2010-01-01

    Background The association between diet and cardiovascular disease (CVD) may be mediated partly through inflammatory processes and reflected by markers of subclinical atherosclerosis. Objective We investigated whether empirically derived dietary patterns are associated with coronary artery calcium (CAC) and common and internal carotid artery intima media thickness (IMT) and whether prior information about inflammatory processes would increase the strength of the associations. Design At baseline, dietary patterns were derived with the use of a food-frequency questionnaire, and inflammatory biomarkers, CAC, and IMT were measured in 5089 participants aged 45–84 y, who had no clinical CVD or diabetes, in the Multi-Ethnic Study of Atherosclerosis. Dietary patterns based on variations in C-reactive protein, interleukin-6, homocysteine, and fibrinogen concentrations were created with reduced rank regression (RRR). Dietary patterns based on variations in food group intake were created with principal components analysis (PCA). Results The primary RRR(RRR 1) and PCA(PCA factor 1) dietary patterns were high in total and saturated fat and low in fiber and micronutrients. However, the food sources of these nutrients differed between the dietary patterns. RRR 1 was positively associated with CAC [Agatston score >0: OR(95% CI) for quartile 5 compared with quartile 1 = 1.34 (1.05, 1.71); ln(Agatston score = 1): P for trend = 0.023] and with common carotid IMT [≥1.0 mm: OR (95% CI) for quartile 5 compared with quartile 1 = 1.33 (0.99, 1.79); ln(common carotid IMT): P for trend = 0.006]. PCA 1 was not associated with CAC or IMT. Conclusion The results suggest that subtle differences in dietary pattern composition, realized by incorporating measures of inflammatory processes, affect associations with markers of subclinical atherosclerosis. PMID:17556701

  6. [Research on optimal modeling strategy for licorice extraction process based on near-infrared spectroscopy technology].

    PubMed

    Wang, Hai-Xia; Suo, Tong-Chuan; Yu, He-Shui; Li, Zheng

    2016-10-01

    The manufacture of traditional Chinese medicine (TCM) products is always accompanied by processing complex raw materials and real-time monitoring of the manufacturing process. In this study, we investigated different modeling strategies for the extraction process of licorice. Near-infrared spectra associate with the extraction time was used to detemine the states of the extraction processes. Three modeling approaches, i.e., principal component analysis (PCA), partial least squares regression (PLSR) and parallel factor analysis-PLSR (PARAFAC-PLSR), were adopted for the prediction of the real-time status of the process. The overall results indicated that PCA, PLSR and PARAFAC-PLSR can effectively detect the errors in the extraction procedure and predict the process trajectories, which has important significance for the monitoring and controlling of the extraction processes. Copyright© by the Chinese Pharmaceutical Association.

  7. Feature extraction for ultrasonic sensor based defect detection in ceramic components

    NASA Astrophysics Data System (ADS)

    Kesharaju, Manasa; Nagarajah, Romesh

    2014-02-01

    High density silicon carbide materials are commonly used as the ceramic element of hard armour inserts used in traditional body armour systems to reduce their weight, while providing improved hardness, strength and elastic response to stress. Currently, armour ceramic tiles are inspected visually offline using an X-ray technique that is time consuming and very expensive. In addition, from X-rays multiple defects are also misinterpreted as single defects. Therefore, to address these problems the ultrasonic non-destructive approach is being investigated. Ultrasound based inspection would be far more cost effective and reliable as the methodology is applicable for on-line quality control including implementation of accept/reject criteria. This paper describes a recently developed methodology to detect, locate and classify various manufacturing defects in ceramic tiles using sub band coding of ultrasonic test signals. The wavelet transform is applied to the ultrasonic signal and wavelet coefficients in the different frequency bands are extracted and used as input features to an artificial neural network (ANN) for purposes of signal classification. Two different classifiers, using artificial neural networks (supervised) and clustering (un-supervised) are supplied with features selected using Principal Component Analysis(PCA) and their classification performance compared. This investigation establishes experimentally that Principal Component Analysis(PCA) can be effectively used as a feature selection method that provides superior results for classifying various defects in the context of ultrasonic inspection in comparison with the X-ray technique.

  8. ERP Go/NoGo condition effects are better detected with separate PCAs.

    PubMed

    Barry, Robert J; De Blasio, Frances M; Fogarty, Jack S; Karamacoska, Diana

    2016-08-01

    We explored the separation of Go and NoGo effects in the ERP components elicited in an equiprobable Go/NoGo task, using different forms of temporal Principal Components Analysis (PCA). Following exploratory simulation studies assessing the PCA impact of latency jitter and between-condition latency differences in the P3 latency range, an empirical study compared results of a Combined PCA carried out using both Go and NoGo ERPs together as input, with those from two Separate PCAs carried out on the Go and NoGo ERPs separately. The simulation studies indicated that Separate PCAs provide adequate component recovery in the presence of P3 latency jitter, and that Combined PCAs provide good separation of components only when systematic condition-related latency differences are sufficiently large (here ~110ms). In the empirical data, broadly-similar components were obtained from the Combined and Separate PCAs, supporting previous findings from Combined PCA investigations, and the consequent interpretations of the sequential processing involved. However, the Separate PCAs generated latency differences for components in the Go and NoGo processing chains that better matched the late Go/NoGo ERP peaks, and produced better-defined and larger components that fitted the stages in a hypothetical processing schema developed for this paradigm. Overall, the Separate PCAs yielded a better partitioning of the ERP variance associated with the Go and NoGo conditions, and should be considered as the first choice in future investigations if systematic component or subcomponent latency differences are present or suspected. Copyright © 2016 Elsevier B.V. All rights reserved.

  9. Algorithms for accelerated convergence of adaptive PCA.

    PubMed

    Chatterjee, C; Kang, Z; Roychowdhury, V P

    2000-01-01

    We derive and discuss new adaptive algorithms for principal component analysis (PCA) that are shown to converge faster than the traditional PCA algorithms due to Oja, Sanger, and Xu. It is well known that traditional PCA algorithms that are derived by using gradient descent on an objective function are slow to converge. Furthermore, the convergence of these algorithms depends on appropriate choices of the gain sequences. Since online applications demand faster convergence and an automatic selection of gains, we present new adaptive algorithms to solve these problems. We first present an unconstrained objective function, which can be minimized to obtain the principal components. We derive adaptive algorithms from this objective function by using: 1) gradient descent; 2) steepest descent; 3) conjugate direction; and 4) Newton-Raphson methods. Although gradient descent produces Xu's LMSER algorithm, the steepest descent, conjugate direction, and Newton-Raphson methods produce new adaptive algorithms for PCA. We also provide a discussion on the landscape of the objective function, and present a global convergence proof of the adaptive gradient descent PCA algorithm using stochastic approximation theory. Extensive experiments with stationary and nonstationary multidimensional Gaussian sequences show faster convergence of the new algorithms over the traditional gradient descent methods.We also compare the steepest descent adaptive algorithm with state-of-the-art methods on stationary and nonstationary sequences.

  10. Perfume Fragrance Discrimination Using Resistance And Capacitance Responses Of Polymer Sensors

    NASA Astrophysics Data System (ADS)

    Lima, John Paul Hempel; Vandendriessche, Thomas; Fonseca, Fernando J.; Lammertyn, Jeroen; Nicolai, Bart M.; de Andrade, Adnei Melges

    2009-05-01

    This work shows a comparison between electrical resistance and capacitance responses of ethanol and five different fragrances using an electronic nose based on conducting polymers. Gas chromatography—mass spectrometry (GC-MS) measurements were performed to evaluate the main differences between the analytes. It is shown that although the fragrances are quite similar in their compositions the sensors are able to discriminate them through PCA (Principal Component Analysis) and ANNs (Artificial Neural Network) analysis.

  11. Multivariate frequency domain analysis of protein dynamics

    NASA Astrophysics Data System (ADS)

    Matsunaga, Yasuhiro; Fuchigami, Sotaro; Kidera, Akinori

    2009-03-01

    Multivariate frequency domain analysis (MFDA) is proposed to characterize collective vibrational dynamics of protein obtained by a molecular dynamics (MD) simulation. MFDA performs principal component analysis (PCA) for a bandpass filtered multivariate time series using the multitaper method of spectral estimation. By applying MFDA to MD trajectories of bovine pancreatic trypsin inhibitor, we determined the collective vibrational modes in the frequency domain, which were identified by their vibrational frequencies and eigenvectors. At near zero temperature, the vibrational modes determined by MFDA agreed well with those calculated by normal mode analysis. At 300 K, the vibrational modes exhibited characteristic features that were considerably different from the principal modes of the static distribution given by the standard PCA. The influences of aqueous environments were discussed based on two different sets of vibrational modes, one derived from a MD simulation in water and the other from a simulation in vacuum. Using the varimax rotation, an algorithm of the multivariate statistical analysis, the representative orthogonal set of eigenmodes was determined at each vibrational frequency.

  12. Principal component analysis of chemical shift perturbation data of a multiple-ligand-binding system for elucidation of respective binding mechanism.

    PubMed

    Konuma, Tsuyoshi; Lee, Young-Ho; Goto, Yuji; Sakurai, Kazumasa

    2013-01-01

    Chemical shift perturbations (CSPs) in NMR spectra provide useful information about the interaction of a protein with its ligands. However, in a multiple-ligand-binding system, determining quantitative parameters such as a dissociation constant (K(d) ) is difficult. Here, we used a method we named CS-PCA, a principal component analysis (PCA) of chemical shift (CS) data, to analyze the interaction between bovine β-lactoglobulin (βLG) and 1-anilinonaphthalene-8-sulfonate (ANS), which is a multiple-ligand-binding system. The CSP on the binding of ANS involved contributions from two distinct binding sites. PCA of the titration data successfully separated the CSP pattern into contributions from each site. Docking simulations based on the separated CSP patterns provided the structures of βLG-ANS complexes for each binding site. In addition, we determined the K(d) values as 3.42 × 10⁻⁴ M² and 2.51 × 10⁻³ M for Sites 1 and 2, respectively. In contrast, it was difficult to obtain reliable K(d) values for respective sites from the isothermal titration calorimetry experiments. Two ANS molecules were found to bind at Site 1 simultaneously, suggesting that the binding occurs cooperatively with a partial unfolding of the βLG structure. On the other hand, the binding of ANS to Site 2 was a simple attachment without a significant conformational change. From the present results, CS-PCA was confirmed to provide not only the positions and the K(d) values of binding sites but also information about the binding mechanism. Thus, it is anticipated to be a general method to investigate protein-ligand interactions. Copyright © 2012 Wiley Periodicals, Inc.

  13. Novel algorithm for simultaneous component detection and pseudo-molecular ion characterization in liquid chromatography-mass spectrometry.

    PubMed

    Zhang, Yufeng; Wang, Xiaoan; Wo, Siukwan; Ho, Hingman; Han, Quanbin; Fan, Xiaohui; Zuo, Zhong

    2015-01-01

    Resolving components and determining their pseudo-molecular ions (PMIs) are crucial steps in identifying complex herbal mixtures by liquid chromatography-mass spectrometry. To tackle such labor-intensive steps, we present here a novel algorithm for simultaneous detection of components and their PMIs. Our method consists of three steps: (1) obtaining a simplified dataset containing only mono-isotopic masses by removal of background noise and isotopic cluster ions based on the isotopic distribution model derived from all the reported natural compounds in dictionary of natural products; (2) stepwise resolving and removing all features of the highest abundant component from current simplified dataset and calculating PMI of each component according to an adduct-ion model, in which all non-fragment ions in a mass spectrum are considered as PMI plus one or several neutral species; (3) visual classification of detected components by principal component analysis (PCA) to exclude possible non-natural compounds (such as pharmaceutical excipients). This algorithm has been successfully applied to a standard mixture and three herbal extract/preparations. It indicated that our algorithm could detect components' features as a whole and report their PMI with an accuracy of more than 98%. Furthermore, components originated from excipients/contaminants could be easily separated from those natural components in the bi-plots of PCA. Copyright © 2014 Elsevier B.V. All rights reserved.

  14. Principal Components Analysis of a JWST NIRSpec Detector Subsystem

    NASA Technical Reports Server (NTRS)

    Arendt, Richard G.; Fixsen, D. J.; Greenhouse, Matthew A.; Lander, Matthew; Lindler, Don; Loose, Markus; Moseley, S. H.; Mott, D. Brent; Rauscher, Bernard J.; Wen, Yiting; hide

    2013-01-01

    We present principal component analysis (PCA) of a flight-representative James Webb Space Telescope NearInfrared Spectrograph (NIRSpec) Detector Subsystem. Although our results are specific to NIRSpec and its T - 40 K SIDECAR ASICs and 5 m cutoff H2RG detector arrays, the underlying technical approach is more general. We describe how we measured the systems response to small environmental perturbations by modulating a set of bias voltages and temperature. We used this information to compute the systems principal noise components. Together with information from the astronomical scene, we show how the zeroth principal component can be used to calibrate out the effects of small thermal and electrical instabilities to produce cosmetically cleaner images with significantly less correlated noise. Alternatively, if one were designing a new instrument, one could use a similar PCA approach to inform a set of environmental requirements (temperature stability, electrical stability, etc.) that enabled the planned instrument to meet performance requirements

  15. Common mode error in Antarctic GPS coordinate time series on its effect on bedrock-uplift estimates

    NASA Astrophysics Data System (ADS)

    Liu, Bin; King, Matt; Dai, Wujiao

    2018-05-01

    Spatially-correlated common mode error always exists in regional, or-larger, GPS networks. We applied independent component analysis (ICA) to GPS vertical coordinate time series in Antarctica from 2010 to 2014 and made a comparison with the principal component analysis (PCA). Using PCA/ICA, the time series can be decomposed into a set of temporal components and their spatial responses. We assume the components with common spatial responses are common mode error (CME). An average reduction of ˜40% about the RMS values was achieved in both PCA and ICA filtering. However, the common mode components obtained from the two approaches have different spatial and temporal features. ICA time series present interesting correlations with modeled atmospheric and non-tidal ocean loading displacements. A white noise (WN) plus power law noise (PL) model was adopted in the GPS velocity estimation using maximum likelihood estimation (MLE) analysis, with ˜55% reduction of the velocity uncertainties after filtering using ICA. Meanwhile, spatiotemporal filtering reduces the amplitude of PL and periodic terms in the GPS time series. Finally, we compare the GPS uplift velocities, after correction for elastic effects, with recent models of glacial isostatic adjustment (GIA). The agreements of the GPS observed velocities and four GIA models are generally improved after the spatiotemporal filtering, with a mean reduction of ˜0.9 mm/yr of the WRMS values, possibly allowing for more confident separation of various GIA model predictions.

  16. Study of ionospheric anomalies due to impact of typhoon using Principal Component Analysis and image processing

    NASA Astrophysics Data System (ADS)

    LIN, JYH-WOEI

    2012-08-01

    Principal Component Analysis (PCA) and image processing are used to determine Total Electron Content (TEC) anomalies in the F-layer of the ionosphere relating to Typhoon Nakri for 29 May, 2008 (UTC). PCA and image processing are applied to the global ionospheric map (GIM) with transforms conducted for the time period 12:00-14:00 UT on 29 May, 2008 when the wind was most intense. Results show that at a height of approximately 150-200 km the TEC anomaly is highly localized; however, it becomes more intense and widespread with height. Potential causes of these results are discussed with emphasis given to acoustic gravity waves caused by wind force.

  17. On the application of the Principal Component Analysis for an efficient climate downscaling of surface wind fields

    NASA Astrophysics Data System (ADS)

    Chavez, Roberto; Lozano, Sergio; Correia, Pedro; Sanz-Rodrigo, Javier; Probst, Oliver

    2013-04-01

    With the purpose of efficiently and reliably generating long-term wind resource maps for the wind energy industry, the application and verification of a statistical methodology for the climate downscaling of wind fields at surface level is presented in this work. This procedure is based on the combination of the Monte Carlo and the Principal Component Analysis (PCA) statistical methods. Firstly the Monte Carlo method is used to create a huge number of daily-based annual time series, so called climate representative years, by the stratified sampling of a 33-year-long time series corresponding to the available period of the NCAR/NCEP global reanalysis data set (R-2). Secondly the representative years are evaluated such that the best set is chosen according to its capability to recreate the Sea Level Pressure (SLP) temporal and spatial fields from the R-2 data set. The measure of this correspondence is based on the Euclidean distance between the Empirical Orthogonal Functions (EOF) spaces generated by the PCA (Principal Component Analysis) decomposition of the SLP fields from both the long-term and the representative year data sets. The methodology was verified by comparing the selected 365-days period against a 9-year period of wind fields generated by dynamical downscaling the Global Forecast System data with the mesoscale model SKIRON for the Iberian Peninsula. These results showed that, compared to the traditional method of dynamical downscaling any random 365-days period, the error in the average wind velocity by the PCA's representative year was reduced by almost 30%. Moreover the Mean Absolute Errors (MAE) in the monthly and daily wind profiles were also reduced by almost 25% along all SKIRON grid points. These results showed also that the methodology presented maximum error values in the wind speed mean of 0.8 m/s and maximum MAE in the monthly curves of 0.7 m/s. Besides the bulk numbers, this work shows the spatial distribution of the errors across the Iberian domain and additional wind statistics such as the velocity and directional frequency. Additional repetitions were performed to prove the reliability and robustness of this kind-of statistical-dynamical downscaling method.

  18. A probability index for surface zonda wind occurrence at Mendoza city through vertical sounding principal components analysis

    NASA Astrophysics Data System (ADS)

    Otero, Federico; Norte, Federico; Araneo, Diego

    2018-01-01

    The aim of this work is to obtain an index for predicting the probability of occurrence of zonda event at surface level from sounding data at Mendoza city, Argentine. To accomplish this goal, surface zonda wind events were previously found with an objective classification method (OCM) only considering the surface station values. Once obtained the dates and the onset time of each event, the prior closest sounding for each event was taken to realize a principal component analysis (PCA) that is used to identify the leading patterns of the vertical structure of the atmosphere previously to a zonda wind event. These components were used to construct the index model. For the PCA an entry matrix of temperature ( T) and dew point temperature (Td) anomalies for the standard levels between 850 and 300 hPa was build. The analysis yielded six significant components with a 94 % of the variance explained and the leading patterns of favorable weather conditions for the development of the phenomenon were obtained. A zonda/non-zonda indicator c can be estimated by a logistic multiple regressions depending on the PCA component loadings, determining a zonda probability index \\widehat{c} calculable from T and Td profiles and it depends on the climatological features of the region. The index showed 74.7 % efficiency. The same analysis was performed by adding surface values of T and Td from Mendoza Aero station increasing the index efficiency to 87.8 %. The results revealed four significantly correlated PCs with a major improvement in differentiating zonda cases and a reducing of the uncertainty interval.

  19. Permeability Estimation of Rock Reservoir Based on PCA and Elman Neural Networks

    NASA Astrophysics Data System (ADS)

    Shi, Ying; Jian, Shaoyong

    2018-03-01

    an intelligent method which based on fuzzy neural networks with PCA algorithm, is proposed to estimate the permeability of rock reservoir. First, the dimensionality reduction process is utilized for these parameters by principal component analysis method. Further, the mapping relationship between rock slice characteristic parameters and permeability had been found through fuzzy neural networks. The estimation validity and reliability for this method were tested with practical data from Yan’an region in Ordos Basin. The result showed that the average relative errors of permeability estimation for this method is 6.25%, and this method had the better convergence speed and more accuracy than other. Therefore, by using the cheap rock slice related information, the permeability of rock reservoir can be estimated efficiently and accurately, and it is of high reliability, practicability and application prospect.

  20. Diagnostics and Active Control of Aircraft Interior Noise

    NASA Technical Reports Server (NTRS)

    Fuller, C. R.

    1998-01-01

    This project deals with developing advanced methods for investigating and controlling interior noise in aircraft. The work concentrates on developing and applying the techniques of Near Field Acoustic Holography (NAH) and Principal Component Analysis (PCA) to the aircraft interior noise dynamic problem. This involves investigating the current state of the art, developing new techniques and then applying them to the particular problem being studied. The knowledge gained under the first part of the project was then used to develop and apply new, advanced noise control techniques for reducing interior noise. A new fully active control approach based on the PCA was developed and implemented on a test cylinder. Finally an active-passive approach based on tunable vibration absorbers was to be developed and analytically applied to a range of test structures from simple plates to aircraft fuselages.

  1. Degradation trend estimation of slewing bearing based on LSSVM model

    NASA Astrophysics Data System (ADS)

    Lu, Chao; Chen, Jie; Hong, Rongjing; Feng, Yang; Li, Yuanyuan

    2016-08-01

    A novel prediction method is proposed based on least squares support vector machine (LSSVM) to estimate the slewing bearing's degradation trend with small sample data. This method chooses the vibration signal which contains rich state information as the object of the study. Principal component analysis (PCA) was applied to fuse multi-feature vectors which could reflect the health state of slewing bearing, such as root mean square, kurtosis, wavelet energy entropy, and intrinsic mode function (IMF) energy. The degradation indicator fused by PCA can reflect the degradation more comprehensively and effectively. Then the degradation trend of slewing bearing was predicted by using the LSSVM model optimized by particle swarm optimization (PSO). The proposed method was demonstrated to be more accurate and effective by the whole life experiment of slewing bearing. Therefore, it can be applied in engineering practice.

  2. Neural-network-based state of health diagnostics for an automated radioxenon sampler/analyzer

    NASA Astrophysics Data System (ADS)

    Keller, Paul E.; Kangas, Lars J.; Hayes, James C.; Schrom, Brian T.; Suarez, Reynold; Hubbard, Charles W.; Heimbigner, Tom R.; McIntyre, Justin I.

    2009-05-01

    Artificial neural networks (ANNs) are used to determine the state-of-health (SOH) of the Automated Radioxenon Analyzer/Sampler (ARSA). ARSA is a gas collection and analysis system used for non-proliferation monitoring in detecting radioxenon released during nuclear tests. SOH diagnostics are important for automated, unmanned sensing systems so that remote detection and identification of problems can be made without onsite staff. Both recurrent and feed-forward ANNs are presented. The recurrent ANN is trained to predict sensor values based on current valve states, which control air flow, so that with only valve states the normal SOH sensor values can be predicted. Deviation between modeled value and actual is an indication of a potential problem. The feed-forward ANN acts as a nonlinear version of principal components analysis (PCA) and is trained to replicate the normal SOH sensor values. Because of ARSA's complexity, this nonlinear PCA is better able to capture the relationships among the sensors than standard linear PCA and is applicable to both sensor validation and recognizing off-normal operating conditions. Both models provide valuable information to detect impending malfunctions before they occur to avoid unscheduled shutdown. Finally, the ability of ANN methods to predict the system state is presented.

  3. Plaque echodensity and textural features are associated with histologic carotid plaque instability.

    PubMed

    Doonan, Robert J; Gorgui, Jessica; Veinot, Jean P; Lai, Chi; Kyriacou, Efthyvoulos; Corriveau, Marc M; Steinmetz, Oren K; Daskalopoulou, Stella S

    2016-09-01

    Carotid plaque echodensity and texture features predict cerebrovascular symptomatology. Our purpose was to determine the association of echodensity and textural features obtained from a digital image analysis (DIA) program with histologic features of plaque instability as well as to identify the specific morphologic characteristics of unstable plaques. Patients scheduled to undergo carotid endarterectomy were recruited and underwent carotid ultrasound imaging. DIA was performed to extract echodensity and textural features using Plaque Texture Analysis software (LifeQ Medical Ltd, Nicosia, Cyprus). Carotid plaque surgical specimens were obtained and analyzed histologically. Principal component analysis (PCA) was performed to reduce imaging variables. Logistic regression models were used to determine if PCA variables and individual imaging variables predicted histologic features of plaque instability. Image analysis data from 160 patients were analyzed. Individual imaging features of plaque echolucency and homogeneity were associated with a more unstable plaque phenotype on histology. These results were independent of age, sex, and degree of carotid stenosis. PCA reduced 39 individual imaging variables to five PCA variables. PCA1 and PCA2 were significantly associated with overall plaque instability on histology (both P = .02), whereas PCA3 did not achieve statistical significance (P = .07). DIA features of carotid plaques are associated with histologic plaque instability as assessed by multiple histologic features. Importantly, unstable plaques on histology appear more echolucent and homogeneous on ultrasound imaging. These results are independent of stenosis, suggesting that image analysis may have a role in refining the selection of patients who undergo carotid endarterectomy. Copyright © 2016 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.

  4. Binding Isotherms and Time Courses Readily from Magnetic Resonance.

    PubMed

    Xu, Jia; Van Doren, Steven R

    2016-08-16

    Evidence is presented that binding isotherms, simple or biphasic, can be extracted directly from noninterpreted, complex 2D NMR spectra using principal component analysis (PCA) to reveal the largest trend(s) across the series. This approach renders peak picking unnecessary for tracking population changes. In 1:1 binding, the first principal component captures the binding isotherm from NMR-detected titrations in fast, slow, and even intermediate and mixed exchange regimes, as illustrated for phospholigand associations with proteins. Although the sigmoidal shifts and line broadening of intermediate exchange distorts binding isotherms constructed conventionally, applying PCA directly to these spectra along with Pareto scaling overcomes the distortion. Applying PCA to time-domain NMR data also yields binding isotherms from titrations in fast or slow exchange. The algorithm readily extracts from magnetic resonance imaging movie time courses such as breathing and heart rate in chest imaging. Similarly, two-step binding processes detected by NMR are easily captured by principal components 1 and 2. PCA obviates the customary focus on specific peaks or regions of images. Applying it directly to a series of complex data will easily delineate binding isotherms, equilibrium shifts, and time courses of reactions or fluctuations.

  5. Identification of More Feasible MicroRNA-mRNA Interactions within Multiple Cancers Using Principal Component Analysis Based Unsupervised Feature Extraction.

    PubMed

    Taguchi, Y-H

    2016-05-10

    MicroRNA(miRNA)-mRNA interactions are important for understanding many biological processes, including development, differentiation and disease progression, but their identification is highly context-dependent. When computationally derived from sequence information alone, the identification should be verified by integrated analyses of mRNA and miRNA expression. The drawback of this strategy is the vast number of identified interactions, which prevents an experimental or detailed investigation of each pair. In this paper, we overcome this difficulty by the recently proposed principal component analysis (PCA)-based unsupervised feature extraction (FE), which reduces the number of identified miRNA-mRNA interactions that properly discriminate between patients and healthy controls without losing biological feasibility. The approach is applied to six cancers: hepatocellular carcinoma, non-small cell lung cancer, esophageal squamous cell carcinoma, prostate cancer, colorectal/colon cancer and breast cancer. In PCA-based unsupervised FE, the significance does not depend on the number of samples (as in the standard case) but on the number of features, which approximates the number of miRNAs/mRNAs. To our knowledge, we have newly identified miRNA-mRNA interactions in multiple cancers based on a single common (universal) criterion. Moreover, the number of identified interactions was sufficiently small to be sequentially curated by literature searches.

  6. Intelligence, Surveillance, and Reconnaissance Fusion for Coalition Operations

    DTIC Science & Technology

    2008-07-01

    classification of the targets of interest. The MMI features extracted in this manner have two properties that provide a sound justification for...are generalizations of well- known feature extraction methods such as Principal Components Analysis (PCA) and Independent Component Analysis (ICA...augment (without degrading performance) a large class of generic fusion processes. Ontologies Classifications Feature extraction Feature analysis

  7. Analysis of environmental variation in a Great Plains reservoir using principal components analysis and geographic information systems

    USGS Publications Warehouse

    Long, J.M.; Fisher, W.L.

    2006-01-01

    We present a method for spatial interpretation of environmental variation in a reservoir that integrates principal components analysis (PCA) of environmental data with geographic information systems (GIS). To illustrate our method, we used data from a Great Plains reservoir (Skiatook Lake, Oklahoma) with longitudinal variation in physicochemical conditions. We measured 18 physicochemical features, mapped them using GIS, and then calculated and interpreted four principal components. Principal component 1 (PC1) was readily interpreted as longitudinal variation in water chemistry, but the other principal components (PC2-4) were difficult to interpret. Site scores for PC1-4 were calculated in GIS by summing weighted overlays of the 18 measured environmental variables, with the factor loadings from the PCA as the weights. PC1-4 were then ordered into a landscape hierarchy, an emergent property of this technique, which enabled their interpretation. PC1 was interpreted as a reservoir scale change in water chemistry, PC2 was a microhabitat variable of rip-rap substrate, PC3 identified coves/embayments and PC4 consisted of shoreline microhabitats related to slope. The use of GIS improved our ability to interpret the more obscure principal components (PC2-4), which made the spatial variability of the reservoir environment more apparent. This method is applicable to a variety of aquatic systems, can be accomplished using commercially available software programs, and allows for improved interpretation of the geographic environmental variability of a system compared to using typical PCA plots. ?? Copyright by the North American Lake Management Society 2006.

  8. Application of principal component analysis (PCA) and improved joint probability distributions to the inverse first-order reliability method (I-FORM) for predicting extreme sea states

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Eckert-Gallup, Aubrey C.; Sallaberry, Cédric J.; Dallman, Ann R.

    Environmental contours describing extreme sea states are generated as the input for numerical or physical model simulations as a part of the standard current practice for designing marine structures to survive extreme sea states. These environmental contours are characterized by combinations of significant wave height (H s) and either energy period (T e) or peak period (T p) values calculated for a given recurrence interval using a set of data based on hindcast simulations or buoy observations over a sufficient period of record. The use of the inverse first-order reliability method (I-FORM) is a standard design practice for generating environmentalmore » contours. This paper develops enhanced methodologies for data analysis prior to the application of the I-FORM, including the use of principal component analysis (PCA) to create an uncorrelated representation of the variables under consideration as well as new distribution and parameter fitting techniques. As a result, these modifications better represent the measured data and, therefore, should contribute to the development of more realistic representations of environmental contours of extreme sea states for determining design loads for marine structures.« less

  9. Application of principal component analysis (PCA) and improved joint probability distributions to the inverse first-order reliability method (I-FORM) for predicting extreme sea states

    DOE PAGES

    Eckert-Gallup, Aubrey C.; Sallaberry, Cédric J.; Dallman, Ann R.; ...

    2016-01-06

    Environmental contours describing extreme sea states are generated as the input for numerical or physical model simulations as a part of the standard current practice for designing marine structures to survive extreme sea states. These environmental contours are characterized by combinations of significant wave height (H s) and either energy period (T e) or peak period (T p) values calculated for a given recurrence interval using a set of data based on hindcast simulations or buoy observations over a sufficient period of record. The use of the inverse first-order reliability method (I-FORM) is a standard design practice for generating environmentalmore » contours. This paper develops enhanced methodologies for data analysis prior to the application of the I-FORM, including the use of principal component analysis (PCA) to create an uncorrelated representation of the variables under consideration as well as new distribution and parameter fitting techniques. As a result, these modifications better represent the measured data and, therefore, should contribute to the development of more realistic representations of environmental contours of extreme sea states for determining design loads for marine structures.« less

  10. Testing the Technology Acceptance Model: HIV case managers' intention to use a continuity of care record with context-specific links.

    PubMed

    Schnall, Rebecca; Bakken, Suzanne

    2011-09-01

    To assess the applicability of the Technology Acceptance Model (TAM) constructs in explaining HIV case managers' behavioural intention to use a continuity of care record (CCR) with context-specific links designed to meet their information needs. Data were collected from 94 case managers who provide care to persons living with HIV (PLWH) using an online survey comprising three components: (1) demographic information: age, gender, ethnicity, race, Internet usage and computer experience; (2) mock-up of CCR with context-specific links; and items related to TAM constructs. Data analysis included: principal components factor analysis (PCA), assessment of internal consistency reliability and univariate and multivariate analysis. PCA extracted three factors (Perceived Ease of Use, Perceived Usefulness and Perceived Barriers to Use), explained variance = 84.9%, Cronbach's ά = 0.69-0.91. In a linear regression model, Perceived Ease of Use, Perceived Usefulness and Perceived Barriers to Use explained 43.6% (p < 0.001) of the variance in Behavioural Intention to use a CCR with context-specific links. Our study contributes to the evidence base regarding TAM in health care through expanding the type of professional surveyed, study setting and Health Information Technology assessed.

  11. Fixed Eigenvector Analysis of Thermographic NDE Data

    NASA Technical Reports Server (NTRS)

    Cramer, K. Elliott; Winfree, William P.

    2011-01-01

    Principal Component Analysis (PCA) has been shown effective for reducing thermographic NDE data. This paper will discuss an alternative method of analysis that has been developed where a predetermined set of eigenvectors is used to process the thermal data from both reinforced carbon-carbon (RCC) and graphiteepoxy honeycomb materials. These eigenvectors can be generated either from an analytic model of the thermal response of the material system under examination, or from a large set of experimental data. This paper provides the details of the analytic model, an overview of the PCA process, as well as a quantitative signal-to-noise comparison of the results of performing both conventional PCA and fixed eigenvector analysis on thermographic data from two specimens, one Reinforced Carbon-Carbon with flat bottom holes and the second a sandwich construction with graphite-epoxy face sheets and aluminum honeycomb core.

  12. Application of chemometrics in quality control of Turmeric (Curcuma longa) based on Ultra-violet, Fourier transform-infrared and 1H NMR spectroscopy.

    PubMed

    Gad, Haidy A; Bouzabata, Amel

    2017-12-15

    Turmeric (Curcuma longa L.) belongs to the family Zingiberaceae that is widely used as a spice in food preparations in addition to its biological activities. UV, FT-IR, 1 H NMR in addition to HPLC were applied to construct a metabolic fingerprint for Turmeric in an attempt to assess its quality. 30 samples were analyzed, and then principal component analysis (PCA) and hierarchical clustering analysis (HCA) were utilized to assess the differences and similarities between collected samples. PCA score plot based on both HPLC and UV spectroscopy showed the same discriminatory pattern, where the samples were segregated into four main groups depending on their total curcuminoids content. The results revealed that UV could be utilized as a simple and rapid alternative for HPLC. However, FT-IR failed to discriminate between the same species. By applying 1 H NMR, the metabolic variability between samples was more evident in the essential oils/fatty acid region. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. Systematic study of anharmonic features in a principal component analysis of gramicidin A.

    PubMed

    Kurylowicz, Martin; Yu, Ching-Hsing; Pomès, Régis

    2010-02-03

    We use principal component analysis (PCA) to detect functionally interesting collective motions in molecular-dynamics simulations of membrane-bound gramicidin A. We examine the statistical and structural properties of all PCA eigenvectors and eigenvalues for the backbone and side-chain atoms. All eigenvalue spectra show two distinct power-law scaling regimes, quantitatively separating large from small covariance motions. Time trajectories of the largest PCs converge to Gaussian distributions at long timescales, but groups of small-covariance PCs, which are usually ignored as noise, have subdiffusive distributions. These non-Gaussian distributions imply anharmonic motions on the free-energy surface. We characterize the anharmonic components of motion by analyzing the mean-square displacement for all PCs. The subdiffusive components reveal picosecond-scale oscillations in the mean-square displacement at frequencies consistent with infrared measurements. In this regime, the slowest backbone mode exhibits tilting of the peptide planes, which allows carbonyl oxygen atoms to provide surrogate solvation for water and cation transport in the channel lumen. Higher-frequency modes are also apparent, and we describe their vibrational spectra. Our findings expand the utility of PCA for quantifying the essential features of motion on the anharmonic free-energy surface made accessible by atomistic molecular-dynamics simulations. Copyright (c) 2010 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  14. Data on Support Vector Machines (SVM) model to forecast photovoltaic power.

    PubMed

    Malvoni, M; De Giorgi, M G; Congedo, P M

    2016-12-01

    The data concern the photovoltaic (PV) power, forecasted by a hybrid model that considers weather variations and applies a technique to reduce the input data size, as presented in the paper entitled "Photovoltaic forecast based on hybrid pca-lssvm using dimensionality reducted data" (M. Malvoni, M.G. De Giorgi, P.M. Congedo, 2015) [1]. The quadratic Renyi entropy criteria together with the principal component analysis (PCA) are applied to the Least Squares Support Vector Machines (LS-SVM) to predict the PV power in the day-ahead time frame. The data here shared represent the proposed approach results. Hourly PV power predictions for 1,3,6,12, 24 ahead hours and for different data reduction sizes are provided in Supplementary material.

  15. ClustVis: a web tool for visualizing clustering of multivariate data using Principal Component Analysis and heatmap

    PubMed Central

    Metsalu, Tauno; Vilo, Jaak

    2015-01-01

    The Principal Component Analysis (PCA) is a widely used method of reducing the dimensionality of high-dimensional data, often followed by visualizing two of the components on the scatterplot. Although widely used, the method is lacking an easy-to-use web interface that scientists with little programming skills could use to make plots of their own data. The same applies to creating heatmaps: it is possible to add conditional formatting for Excel cells to show colored heatmaps, but for more advanced features such as clustering and experimental annotations, more sophisticated analysis tools have to be used. We present a web tool called ClustVis that aims to have an intuitive user interface. Users can upload data from a simple delimited text file that can be created in a spreadsheet program. It is possible to modify data processing methods and the final appearance of the PCA and heatmap plots by using drop-down menus, text boxes, sliders etc. Appropriate defaults are given to reduce the time needed by the user to specify input parameters. As an output, users can download PCA plot and heatmap in one of the preferred file formats. This web server is freely available at http://biit.cs.ut.ee/clustvis/. PMID:25969447

  16. A structural investigation into the compaction behavior of pharmaceutical composites using powder X-ray diffraction and total scattering analysis.

    PubMed

    Moore, Michael D; Steinbach, Alison M; Buckner, Ira S; Wildfong, Peter L D

    2009-11-01

    To use advanced powder X-ray diffraction (PXRD) to characterize the structure of anhydrous theophylline following compaction, alone, and as part of a binary mixture with either alpha-lactose monohydrate or microcrystalline cellulose. Compacts formed from (1) pure theophylline and (2) each type of binary mixture were analyzed intact using PXRD. A novel mathematical technique was used to accurately separate multi-component diffraction patterns. The pair distribution function (PDF) of isolated theophylline diffraction data was employed to assess structural differences induced by consolidation and evaluated by principal components analysis (PCA). Changes induced in PXRD patterns by increasing compaction pressure were amplified by the PDF. Simulated data suggest PDF dampening is attributable to molecular deviations from average crystalline position. Samples compacted at different pressures were identified and differentiated using PCA. Samples compacted at common pressures exhibited similar inter-atomic correlations, where excipient concentration factored in the analyses involving lactose. Practical real-space structural analysis of PXRD data by PDF was accomplished for intact, compacted crystalline drug with and without excipient. PCA was used to compare multiple PDFs and successfully differentiated pattern changes consistent with compaction-induced disordering of theophylline as a single component and in the presence of another material.

  17. Classification of time-of-flight secondary ion mass spectrometry spectra from complex Cu-Fe sulphides by principal component analysis and artificial neural networks.

    PubMed

    Kalegowda, Yogesh; Harmer, Sarah L

    2013-01-08

    Artificial neural network (ANN) and a hybrid principal component analysis-artificial neural network (PCA-ANN) classifiers have been successfully implemented for classification of static time-of-flight secondary ion mass spectrometry (ToF-SIMS) mass spectra collected from complex Cu-Fe sulphides (chalcopyrite, bornite, chalcocite and pyrite) at different flotation conditions. ANNs are very good pattern classifiers because of: their ability to learn and generalise patterns that are not linearly separable; their fault and noise tolerance capability; and high parallelism. In the first approach, fragments from the whole ToF-SIMS spectrum were used as input to the ANN, the model yielded high overall correct classification rates of 100% for feed samples, 88% for conditioned feed samples and 91% for Eh modified samples. In the second approach, the hybrid pattern classifier PCA-ANN was integrated. PCA is a very effective multivariate data analysis tool applied to enhance species features and reduce data dimensionality. Principal component (PC) scores which accounted for 95% of the raw spectral data variance, were used as input to the ANN, the model yielded high overall correct classification rates of 88% for conditioned feed samples and 95% for Eh modified samples. Copyright © 2012 Elsevier B.V. All rights reserved.

  18. Interpretation of data on the aggregate composition of typical chernozems under different land use by cluster and principal component analyses

    NASA Astrophysics Data System (ADS)

    Kholodov, V. A.; Yaroslavtseva, N. V.; Lazarev, V. I.; Frid, A. S.

    2016-09-01

    Cluster analysis and principal component analysis (PCA) have been used for the interpretation of dry sieving data. Chernozems from the treatments of long-term field experiments with different land-use patterns— annually mowed steppe, continuous potato culture, permanent black fallow, and untilled fallow since 1998 after permanent black fallow—have been used. Analysis of dry sieving data by PCA has shown that the treatments of untilled fallow after black fallow and annually mowed steppe differ most in the series considered; the content of dry aggregates of 10-7 mm makes the largest contribution to the distribution of objects along the first principal component. This fraction has been sieved in water and analyzed by PCA. In contrast to dry sieving data, the wet sieving data showed the closest mathematical distance between the treatment of untilled fallow after black fallow and the undisturbed treatment of annually mowed steppe, while the untilled fallow after black fallow and the permanent black fallow were the most distant treatments. Thus, it may be suggested that the water stability of structure is first restored after the removal of destructive anthropogenic load. However, the restoration of the distribution of structural separates to the parameters characteristic of native soils is a significantly longer process.

  19. Genetic variants in RNA-induced silencing complex genes and prostate cancer.

    PubMed

    Nikolić, Z; Savić Pavićević, D; Vučić, N; Cerović, S; Vukotić, V; Brajušković, G

    2017-04-01

    The purpose of this study is to evaluate the potential association between genetic variants in genes encoding the components of RNA-induced silencing complex and prostate cancer (PCa) risk. Genetic variants chosen for this study are rs3742330 in DICER1, rs4961280 in AGO2, rs784567 in TARBP2, rs7813 in GEMIN4 and rs197414 in GEMIN3. The study involved 355 PCa patients, 360 patients with benign prostatic hyperplasia and 318 healthy controls. For individuals diagnosed with PCa, clinicopathological characteristics including serum prostate-specific antigen level at diagnosis, Gleason score (GS) and clinical stage were determined. Genotyping was performed using high-resolution melting analysis, PCR-RFLP, TaqMan SNP Genotyping Assay and real-time PCR-based genotyping assay using specific probes. Allelic and genotypic associations were evaluated by unconditional linear and logistic regression methods. The study provided no evidence of association between the analyzed genetic variants and PCa risk. Nevertheless, allele A of rs784567 was found to confer the reduced risk of higher serum PSA level at diagnosis (P = 0.046; Difference = -66.64, 95 % CI -131.93 to 1.35, for log-additive model). Furthermore, rs4961280, as well as rs3742330, were shown to be associated with GS. These variants, together with rs7813, were found to be associated with the lower clinical stage of PCa. Also, rs3742330 minor allele G was found to be associated with lower PCa aggressiveness (P = 0.036; OR 0.14, 95 % CI 0.023-1.22, for recessive model). According to our data, rs3742330, rs4961280 and rs7813 qualify for potentially protective genetic variants against PCa progression. These variants were not shown to be associated with PCa risk.

  20. Preliminary PCA/TT Results on MRO CRISM Multispectral Images

    NASA Astrophysics Data System (ADS)

    Klassen, David R.; Smith, M. D.

    2010-10-01

    Mars Reconnaissance Orbiter arrived at Mars in March 2006 and by September had achieved its science-phase orbit with the Compact Reconnaissance Imaging Spectrometer for Mars (CRISM) beginning its visible to near-infrared (VIS/NIR) spectral imaging shortly thereafter. One goal of CRISM is to fill in the spatial gaps between the various targeted observations, eventually mapping the entire surface. Due to the large volume of data this would create, the instrument works in a reduced spectral sampling mode creating "multispectral” images. From these data we can create image cubes using 64 wavelengths from 0.410 to 3.923 µm. We present here our analysis of these multispectral mode data products using Principal Components Analysis (PCA) and Target Transformation (TT) [1]. Previous work with ground-based images [2-5] has shown that over an entire visible hemisphere, there are only three to four meaningful components using 32-105 wavelengths over 1.5-4.1 µm the first two are consistent over all temporal scales. The TT retrieved spectral endmembers show nearly the same level of consistency [5]. The preliminary work on the CRISM images cubes implies similar results; three to four significant principal components that are fairly consistent over time. These components are then used in TT to find spectral endmembers which can be used to characterize the surface reflectance for future use in radiative transfer cloud optical depth retrievals. We present here the PCA/TT results comparing the principal components and recovered endmembers from six reconstructed CRISM multi-spectral image cubes. References: [1] Bandfield, J. L., et al. (2000) JGR, 105, 9573. [2] Klassen, D. R. and Bell III, J. F. (2001) BAAS 33, 1069. [3] Klassen, D. R. and Bell III, J. F. (2003) BAAS, 35, 936. [4] Klassen, D. R., Wark, T. J., Cugliotta, C. G. (2005) BAAS, 37, 693. [5] Klassen, D. R. (2009) Icarus, 204, 32.

  1. Identifying overarching excipient properties towards an in-depth understanding of process and product performance for continuous twin-screw wet granulation.

    PubMed

    Willecke, N; Szepes, A; Wunderlich, M; Remon, J P; Vervaet, C; De Beer, T

    2017-04-30

    The overall objective of this work is to understand how excipient characteristics influence the process and product performance for a continuous twin-screw wet granulation process. The knowledge gained through this study is intended to be used for a Quality by Design (QbD)-based formulation design approach and formulation optimization. A total of 9 preferred fillers and 9 preferred binders were selected for this study. The selected fillers and binders were extensively characterized regarding their physico-chemical and solid state properties using 21 material characterization techniques. Subsequently, principal component analysis (PCA) was performed on the data sets of filler and binder characteristics in order to reduce the variety of single characteristics to a limited number of overarching properties. Four principal components (PC) explained 98.4% of the overall variability in the fillers data set, while three principal components explained 93.4% of the overall variability in the data set of binders. Both PCA models allowed in-depth evaluation of similarities and differences in the excipient properties. Copyright © 2017. Published by Elsevier B.V.

  2. Principal components analysis based control of a multi-DoF underactuated prosthetic hand.

    PubMed

    Matrone, Giulia C; Cipriani, Christian; Secco, Emanuele L; Magenes, Giovanni; Carrozza, Maria Chiara

    2010-04-23

    Functionality, controllability and cosmetics are the key issues to be addressed in order to accomplish a successful functional substitution of the human hand by means of a prosthesis. Not only the prosthesis should duplicate the human hand in shape, functionality, sensorization, perception and sense of body-belonging, but it should also be controlled as the natural one, in the most intuitive and undemanding way. At present, prosthetic hands are controlled by means of non-invasive interfaces based on electromyography (EMG). Driving a multi degrees of freedom (DoF) hand for achieving hand dexterity implies to selectively modulate many different EMG signals in order to make each joint move independently, and this could require significant cognitive effort to the user. A Principal Components Analysis (PCA) based algorithm is used to drive a 16 DoFs underactuated prosthetic hand prototype (called CyberHand) with a two dimensional control input, in order to perform the three prehensile forms mostly used in Activities of Daily Living (ADLs). Such Principal Components set has been derived directly from the artificial hand by collecting its sensory data while performing 50 different grasps, and subsequently used for control. Trials have shown that two independent input signals can be successfully used to control the posture of a real robotic hand and that correct grasps (in terms of involved fingers, stability and posture) may be achieved. This work demonstrates the effectiveness of a bio-inspired system successfully conjugating the advantages of an underactuated, anthropomorphic hand with a PCA-based control strategy, and opens up promising possibilities for the development of an intuitively controllable hand prosthesis.

  3. Model Reduction via Principe Component Analysis and Markov Chain Monte Carlo (MCMC) Methods

    NASA Astrophysics Data System (ADS)

    Gong, R.; Chen, J.; Hoversten, M. G.; Luo, J.

    2011-12-01

    Geophysical and hydrogeological inverse problems often include a large number of unknown parameters, ranging from hundreds to millions, depending on parameterization and problems undertaking. This makes inverse estimation and uncertainty quantification very challenging, especially for those problems in two- or three-dimensional spatial domains. Model reduction technique has the potential of mitigating the curse of dimensionality by reducing total numbers of unknowns while describing the complex subsurface systems adequately. In this study, we explore the use of principal component analysis (PCA) and Markov chain Monte Carlo (MCMC) sampling methods for model reduction through the use of synthetic datasets. We compare the performances of three different but closely related model reduction approaches: (1) PCA methods with geometric sampling (referred to as 'Method 1'), (2) PCA methods with MCMC sampling (referred to as 'Method 2'), and (3) PCA methods with MCMC sampling and inclusion of random effects (referred to as 'Method 3'). We consider a simple convolution model with five unknown parameters as our goal is to understand and visualize the advantages and disadvantages of each method by comparing their inversion results with the corresponding analytical solutions. We generated synthetic data with noise added and invert them under two different situations: (1) the noised data and the covariance matrix for PCA analysis are consistent (referred to as the unbiased case), and (2) the noise data and the covariance matrix are inconsistent (referred to as biased case). In the unbiased case, comparison between the analytical solutions and the inversion results show that all three methods provide good estimates of the true values and Method 1 is computationally more efficient. In terms of uncertainty quantification, Method 1 performs poorly because of relatively small number of samples obtained, Method 2 performs best, and Method 3 overestimates uncertainty due to inclusion of random effects. However, in the biased case, only Method 3 correctly estimates all the unknown parameters, and both Methods 1 and 2 provide wrong values for the biased parameters. The synthetic case study demonstrates that if the covariance matrix for PCA analysis is inconsistent with true models, the PCA methods with geometric or MCMC sampling will provide incorrect estimates.

  4. Multivariate Analysis of Fruit Antioxidant Activities of Blackberry Treated with 1-Methylcyclopropene or Vacuum Precooling

    PubMed Central

    Li, Jian; Ma, Guowei; Ma, Lin; Bao, Xiaolin; Li, Liping; Zhao, Qian

    2018-01-01

    Effects of 1-methylcyclopropene (1-MCP) and vacuum precooling on quality and antioxidant properties of blackberries (Rubus spp.) were evaluated using one-way analysis of variance, principal component analysis (PCA), partial least squares (PLS), and path analysis. Results showed that the activities of antioxidant enzymes were enhanced by both 1-MCP treatment and vacuum precooling. PCA could discriminate 1-MCP treated fruit and the vacuum precooled fruit and showed that the radical-scavenging activities in vacuum precooled fruit were higher than those in 1-MCP treated fruit. The scores of PCA showed that H2O2 content was the most important variables of blackberry fruit. PLSR results showed that peroxidase (POD) activity negatively correlated with H2O2 content. The results of path coefficient analysis indicated that glutathione (GSH) also had an indirect effect on H2O2 content. PMID:29487622

  5. Detecting phase separation of freeze-dried binary amorphous systems using pair-wise distribution function and multivariate data analysis.

    PubMed

    Chieng, Norman; Trnka, Hjalte; Boetker, Johan; Pikal, Michael; Rantanen, Jukka; Grohganz, Holger

    2013-09-15

    The purpose of this study is to investigate the use of multivariate data analysis for powder X-ray diffraction-pair-wise distribution function (PXRD-PDF) data to detect phase separation in freeze-dried binary amorphous systems. Polymer-polymer and polymer-sugar binary systems at various ratios were freeze-dried. All samples were analyzed by PXRD, transformed to PDF and analyzed by principal component analysis (PCA). These results were validated by differential scanning calorimetry (DSC) through characterization of glass transition of the maximally freeze-concentrate solute (Tg'). Analysis of PXRD-PDF data using PCA provides a more clear 'miscible' or 'phase separated' interpretation through the distribution pattern of samples on a score plot presentation compared to residual plot method. In a phase separated system, samples were found to be evenly distributed around the theoretical PDF profile. For systems that were miscible, a clear deviation of samples away from the theoretical PDF profile was observed. Moreover, PCA analysis allows simultaneous analysis of replicate samples. Comparatively, the phase behavior analysis from PXRD-PDF-PCA method was in agreement with the DSC results. Overall, the combined PXRD-PDF-PCA approach improves the clarity of the PXRD-PDF results and can be used as an alternative explorative data analytical tool in detecting phase separation in freeze-dried binary amorphous systems. Copyright © 2013 Elsevier B.V. All rights reserved.

  6. Differentiating Organic and Conventional Sage by Chromatographic and Mass Spectrometry Flow-Injection Fingerprints Combined with Principal Component Analysis

    PubMed Central

    Gao, Boyan; Lu, Yingjian; Sheng, Yi; Chen, Pei; Yu, Liangli (Lucy)

    2013-01-01

    High performance liquid chromatography (HPLC) and flow injection electrospray ionization with ion trap mass spectrometry (FIMS) fingerprints combined with the principal component analysis (PCA) were examined for their potential in differentiating commercial organic and conventional sage samples. The individual components in the sage samples were also characterized with an ultra-performance liquid chromatography with a quadrupole-time of flight mass spectrometer (UPLC Q-TOF MS). The results suggested that both HPLC and FIMS fingerprints combined with PCA could differentiate organic and conventional sage samples effectively. FIMS may serve as a quick test capable of distinguishing organic and conventional sages in 1 min, and could potentially be developed for high-throughput applications; whereas HPLC fingerprints could provide more chemical composition information with a longer analytical time. PMID:23464755

  7. Integrated Low-Rank-Based Discriminative Feature Learning for Recognition.

    PubMed

    Zhou, Pan; Lin, Zhouchen; Zhang, Chao

    2016-05-01

    Feature learning plays a central role in pattern recognition. In recent years, many representation-based feature learning methods have been proposed and have achieved great success in many applications. However, these methods perform feature learning and subsequent classification in two separate steps, which may not be optimal for recognition tasks. In this paper, we present a supervised low-rank-based approach for learning discriminative features. By integrating latent low-rank representation (LatLRR) with a ridge regression-based classifier, our approach combines feature learning with classification, so that the regulated classification error is minimized. In this way, the extracted features are more discriminative for the recognition tasks. Our approach benefits from a recent discovery on the closed-form solutions to noiseless LatLRR. When there is noise, a robust Principal Component Analysis (PCA)-based denoising step can be added as preprocessing. When the scale of a problem is large, we utilize a fast randomized algorithm to speed up the computation of robust PCA. Extensive experimental results demonstrate the effectiveness and robustness of our method.

  8. Discrimination of rectal cancer through human serum using surface-enhanced Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Li, Xiaozhou; Yang, Tianyue; Li, Siqi; Zhang, Su; Jin, Lili

    2015-05-01

    In this paper, surface-enhanced Raman spectroscopy (SERS) was used to detect the changes in blood serum components that accompany rectal cancer. The differences in serum SERS data between rectal cancer patients and healthy controls were examined. Postoperative rectal cancer patients also participated in the comparison to monitor the effects of cancer treatments. The results show that there are significant variations at certain wavenumbers which indicates alteration of corresponding biological substances. Principal component analysis (PCA) and parameters of intensity ratios were used on the original SERS spectra for the extraction of featured variables. These featured variables then underwent linear discriminant analysis (LDA) and classification and regression tree (CART) for the discrimination analysis. Accuracies of 93.5 and 92.4 % were obtained for PCA-LDA and parameter-CART, respectively.

  9. Ultra-sensitive high performance liquid chromatography-laser-induced fluorescence based proteomics for clinical applications.

    PubMed

    Patil, Ajeetkumar; Bhat, Sujatha; Pai, Keerthilatha M; Rai, Lavanya; Kartha, V B; Chidangil, Santhosh

    2015-09-08

    An ultra-sensitive high performance liquid chromatography-laser induced fluorescence (HPLC-LIF) based technique has been developed by our group at Manipal, for screening, early detection, and staging for various cancers, using protein profiling of clinical samples like, body fluids, cellular specimens, and biopsy-tissue. More than 300 protein profiles of different clinical samples (serum, saliva, cellular samples and tissue homogenates) from volunteers (normal, and different pre-malignant/malignant conditions) were recorded using this set-up. The protein profiles were analyzed using principal component analysis (PCA) to achieve objective detection and classification of malignant, premalignant and healthy conditions with high sensitivity and specificity. The HPLC-LIF protein profiling combined with PCA, as a routine method for screening, diagnosis, and staging of cervical cancer and oral cancer, is discussed in this paper. In recent years, proteomics techniques have advanced tremendously in life sciences and medical sciences for the detection and identification of proteins in body fluids, tissue homogenates and cellular samples to understand biochemical mechanisms leading to different diseases. Some of the methods include techniques like high performance liquid chromatography, 2D-gel electrophoresis, MALDI-TOF-MS, SELDI-TOF-MS, CE-MS and LC-MS techniques. We have developed an ultra-sensitive high performance liquid chromatography-laser induced fluorescence (HPLC-LIF) based technique, for screening, early detection, and staging for various cancers, using protein profiling of clinical samples like, body fluids, cellular specimens, and biopsy-tissue. More than 300 protein profiles of different clinical samples (serum, saliva, cellular samples and tissue homogenates) from healthy and volunteers with different malignant conditions were recorded by using this set-up. The protein profile data were analyzed using principal component analysis (PCA) for objective classification and detection of malignant, premalignant and healthy conditions. The method is extremely sensitive to detect proteins with limit of detection of the order of femto-moles. The HPLC-LIF combined with PCA as a potential proteomic method for the diagnosis of oral cancer and cervical cancer has been discussed in this paper. This article is part of a Special Issue entitled: Proteomics in India. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. A Principal Components Analysis and Validation of the Coping with the College Environment Scale (CWCES)

    ERIC Educational Resources Information Center

    Ackermann, Margot Elise; Morrow, Jennifer Ann

    2008-01-01

    The present study describes the development and initial validation of the Coping with the College Environment Scale (CWCES). Participants included 433 college students who took an online survey. Principal Components Analysis (PCA) revealed six coping strategies: planning and self-management, seeking support from institutional resources, escaping…

  11. Principal Component Analysis: Resources for an Essential Application of Linear Algebra

    ERIC Educational Resources Information Center

    Pankavich, Stephen; Swanson, Rebecca

    2015-01-01

    Principal Component Analysis (PCA) is a highly useful topic within an introductory Linear Algebra course, especially since it can be used to incorporate a number of applied projects. This method represents an essential application and extension of the Spectral Theorem and is commonly used within a variety of fields, including statistics,…

  12. Learning Principal Component Analysis by Using Data from Air Quality Networks

    ERIC Educational Resources Information Center

    Perez-Arribas, Luis Vicente; Leon-González, María Eugenia; Rosales-Conrado, Noelia

    2017-01-01

    With the final objective of using computational and chemometrics tools in the chemistry studies, this paper shows the methodology and interpretation of the Principal Component Analysis (PCA) using pollution data from different cities. This paper describes how students can obtain data on air quality and process such data for additional information…

  13. Real time on-chip sequential adaptive principal component analysis for data feature extraction and image compression

    NASA Technical Reports Server (NTRS)

    Duong, T. A.

    2004-01-01

    In this paper, we present a new, simple, and optimized hardware architecture sequential learning technique for adaptive Principle Component Analysis (PCA) which will help optimize the hardware implementation in VLSI and to overcome the difficulties of the traditional gradient descent in learning convergence and hardware implementation.

  14. A novel principal component analysis for spatially misaligned multivariate air pollution data.

    PubMed

    Jandarov, Roman A; Sheppard, Lianne A; Sampson, Paul D; Szpiro, Adam A

    2017-01-01

    We propose novel methods for predictive (sparse) PCA with spatially misaligned data. These methods identify principal component loading vectors that explain as much variability in the observed data as possible, while also ensuring the corresponding principal component scores can be predicted accurately by means of spatial statistics at locations where air pollution measurements are not available. This will make it possible to identify important mixtures of air pollutants and to quantify their health effects in cohort studies, where currently available methods cannot be used. We demonstrate the utility of predictive (sparse) PCA in simulated data and apply the approach to annual averages of particulate matter speciation data from national Environmental Protection Agency (EPA) regulatory monitors.

  15. Chemical information obtained from Auger depth profiles by means of advanced factor analysis (MLCFA)

    NASA Astrophysics Data System (ADS)

    De Volder, P.; Hoogewijs, R.; De Gryse, R.; Fiermans, L.; Vennik, J.

    1993-01-01

    The advanced multivariate statistical technique "maximum likelihood common factor analysis (MLCFA)" is shown to be superior to "principal component analysis (PCA)" for decomposing overlapping peaks into their individual component spectra of which neither the number of components nor the peak shape of the component spectra is known. An examination of the maximum resolving power of both techniques, MLCFA and PCA, by means of artificially created series of multicomponent spectra confirms this finding unambiguously. Substantial progress in the use of AES as a chemical-analysis technique is accomplished through the implementation of MLCFA. Chemical information from Auger depth profiles is extracted by investigating the variation of the line shape of the Auger signal as a function of the changing chemical state of the element. In particular, MLCFA combined with Auger depth profiling has been applied to problems related to steelcord-rubber tyre adhesion. MLCFA allows one to elucidate the precise nature of the interfacial layer of reaction products between natural rubber vulcanized on a thin brass layer. This study reveals many interesting chemical aspects of the oxi-sulfidation of brass undetectable with classical AES.

  16. An Efficient Taguchi Approach for the Performance Optimization of Health, Safety, Environment and Ergonomics in Generation Companies

    PubMed Central

    Azadeh, Ali; Sheikhalishahi, Mohammad

    2014-01-01

    Background A unique framework for performance optimization of generation companies (GENCOs) based on health, safety, environment, and ergonomics (HSEE) indicators is presented. Methods To rank this sector of industry, the combination of data envelopment analysis (DEA), principal component analysis (PCA), and Taguchi are used for all branches of GENCOs. These methods are applied in an integrated manner to measure the performance of GENCO. The preferred model between DEA, PCA, and Taguchi is selected based on sensitivity analysis and maximum correlation between rankings. To achieve the stated objectives, noise is introduced into input data. Results The results show that Taguchi outperforms other methods. Moreover, a comprehensive experiment is carried out to identify the most influential factor for ranking GENCOs. Conclusion The approach developed in this study could be used for continuous assessment and improvement of GENCO's performance in supplying energy with respect to HSEE factors. The results of such studies would help managers to have better understanding of weak and strong points in terms of HSEE factors. PMID:26106505

  17. The classification of lung cancers and their degree of malignancy by FTIR, PCA-LDA analysis, and a physics-based computational model.

    PubMed

    Kaznowska, E; Depciuch, J; Łach, K; Kołodziej, M; Koziorowska, A; Vongsvivut, J; Zawlik, I; Cholewa, M; Cebulski, J

    2018-08-15

    Lung cancer has the highest mortality rate of all malignant tumours. The current effects of cancer treatment, as well as its diagnostics, are unsatisfactory. Therefore it is very important to introduce modern diagnostic tools, which will allow for rapid classification of lung cancers and their degree of malignancy. For this purpose, the authors propose the use of Fourier Transform InfraRed (FTIR) spectroscopy combined with Principal Component Analysis-Linear Discriminant Analysis (PCA-LDA) and a physics-based computational model. The results obtained for lung cancer tissues, adenocarcinoma and squamous cell carcinoma FTIR spectra, show a shift in wavenumbers compared to control tissue FTIR spectra. Furthermore, in the FTIR spectra of adenocarcinoma there are no peaks corresponding to glutamate or phospholipid functional groups. Moreover, in the case of G2 and G3 malignancy of adenocarcinoma lung cancer, the absence of an OH groups peak was noticed. Thus, it seems that FTIR spectroscopy is a valuable tool to classify lung cancer and to determine the degree of its malignancy. Copyright © 2018 Elsevier B.V. All rights reserved.

  18. Y-chromosome phylogeographic analysis of the Greek-Cypriot population reveals elements consistent with Neolithic and Bronze Age settlements.

    PubMed

    Voskarides, Konstantinos; Mazières, Stéphane; Hadjipanagi, Despina; Di Cristofaro, Julie; Ignatiou, Anastasia; Stefanou, Charalambos; King, Roy J; Underhill, Peter A; Chiaroni, Jacques; Deltas, Constantinos

    2016-01-01

    The archeological record indicates that the permanent settlement of Cyprus began with pioneering agriculturalists circa 11,000 years before present, (ca. 11,000 y BP). Subsequent colonization events followed, some recognized regionally. Here, we assess the Y-chromosome structure of Cyprus in context to regional populations and correlate it to phases of prehistoric colonization. Analysis of haplotypes from 574 samples showed that island-wide substructure was barely significant in a spatial analysis of molecular variance (SAMOVA). However, analyses of molecular variance (AMOVA) of haplogroups using 92 binary markers genotyped in 629 Cypriots revealed that the proportion of variance among the districts was irregularly distributed. Principal component analysis (PCA) revealed potential genetic associations of Greek-Cypriots with neighbor populations. Contrasting haplogroups in the PCA were used as surrogates of parental populations. Admixture analyses suggested that the majority of G2a-P15 and R1b-M269 components were contributed by Anatolia and Levant sources, respectively, while Greece Balkans supplied the majority of E-V13 and J2a-M67. Haplotype-based expansion times were at historical levels suggestive of recent demography. Analyses of Cypriot haplogroup data are consistent with two stages of prehistoric settlement. E-V13 and E-M34 are widespread, and PCA suggests sourcing them to the Balkans and Levant/Anatolia, respectively. The persistent pre-Greek component is represented by elements of G2-U5(xL30) haplogroups: U5*, PF3147, and L293. J2b-M205 may contribute also to the pre-Greek strata. The majority of R1b-Z2105 lineages occur in both the westernmost and easternmost districts. Distinctively, sub-haplogroup R1b- M589 occurs only in the east. The absence of R1b- M589 lineages in Crete and the Balkans and the presence in Asia Minor are compatible with Late Bronze Age influences from Anatolia rather than from Mycenaean Greeks.

  19. Prostate Cancer Associated Lipid Signatures in Serum Studied by ESI-Tandem Mass Spectrometryas Potential New Biomarkers.

    PubMed

    Duscharla, Divya; Bhumireddy, Sudarshana Reddy; Lakshetti, Sridhar; Pospisil, Heike; Murthy, P V L N; Walther, Reinhard; Sripadi, Prabhakar; Ummanni, Ramesh

    2016-01-01

    Prostate cancer (PCa) is one amongst the most common cancersin western men. Incidence rate ofPCa is on the rise worldwide. The present study deals with theserum lipidome profiling of patients diagnosed with PCa to identify potential new biomarkers. We employed ESI-MS/MS and GC-MS for identification of significantly altered lipids in cancer patient's serum compared to controls. Lipidomic data revealed 24 lipids are significantly altered in cancer patinet's serum (n = 18) compared to normal (n = 18) with no history of PCa. By using hierarchical clustering and principal component analysis (PCA) we could clearly separate cancer patients from control group. Correlation and partition analysis along with Formal Concept Analysis (FCA) have identified that PC (39:6) and FA (22:3) could classify samples with higher certainty. Both the lipids, PC (39:6) and FA (22:3) could influence the cataloging of patients with 100% sensitivity (all 18 control samples are classified correctly) and 77.7% specificity (of 18 tumor samples 4 samples are misclassified) with p-value of 1.612×10-6 in Fischer's exact test. Further, we performed GC-MS to denote fatty acids altered in PCa patients and found that alpha-linolenic acid (ALA) levels are altered in PCa. We also performed an in vitro proliferation assay to determine the effect of ALA in survival of classical human PCa cell lines LNCaP and PC3. We hereby report that the altered lipids PC (39:6) and FA (22:3) offer a new set of biomarkers in addition to the existing diagnostic tests that could significantly improve sensitivity and specificity in PCa diagnosis.

  20. Prostate Cancer Associated Lipid Signatures in Serum Studied by ESI-Tandem Mass Spectrometryas Potential New Biomarkers

    PubMed Central

    Duscharla, Divya; Bhumireddy, Sudarshana Reddy; Lakshetti, Sridhar; Pospisil, Heike; Murthy, P. V. L. N.; Walther, Reinhard; Sripadi, Prabhakar; Ummanni, Ramesh

    2016-01-01

    Prostate cancer (PCa) is one amongst the most common cancersin western men. Incidence rate ofPCa is on the rise worldwide. The present study deals with theserum lipidome profiling of patients diagnosed with PCa to identify potential new biomarkers. We employed ESI-MS/MS and GC-MS for identification of significantly altered lipids in cancer patient’s serum compared to controls. Lipidomic data revealed 24 lipids are significantly altered in cancer patinet’s serum (n = 18) compared to normal (n = 18) with no history of PCa. By using hierarchical clustering and principal component analysis (PCA) we could clearly separate cancer patients from control group. Correlation and partition analysis along with Formal Concept Analysis (FCA) have identified that PC (39:6) and FA (22:3) could classify samples with higher certainty. Both the lipids, PC (39:6) and FA (22:3) could influence the cataloging of patients with 100% sensitivity (all 18 control samples are classified correctly) and 77.7% specificity (of 18 tumor samples 4 samples are misclassified) with p-value of 1.612×10−6 in Fischer’s exact test. Further, we performed GC-MS to denote fatty acids altered in PCa patients and found that alpha-linolenic acid (ALA) levels are altered in PCa. We also performed an in vitro proliferation assay to determine the effect of ALA in survival of classical human PCa cell lines LNCaP and PC3. We hereby report that the altered lipids PC (39:6) and FA (22:3) offer a new set of biomarkers in addition to the existing diagnostic tests that could significantly improve sensitivity and specificity in PCa diagnosis. PMID:26958841

  1. Neuronal substrates of Corsi Block span: Lesion symptom mapping analyses in relation to attentional competition and spatial bias.

    PubMed

    Chechlacz, Magdalena; Rotshtein, Pia; Humphreys, Glyn W

    2014-11-01

    Spatial working memory problems are frequently reported following brain damage within both left and right hemispheres but with the severity often being grater in individuals with right hemisphere lesions. Clinically, deficits in spatial working memory have also been noted in patients with visuospatial disorders such as unilateral neglect. Here, we examined neural substrates of short-term memory for spatial locations based on the Corsi Block tapping task and the relationship with the visuospatial deficits of neglect and extinction in a group of chronic neuropsychological patients. Principal Component Analysis (PCA) was used to distinguish shared and dissociate functional components. The neural substrates of spatial short-term memory deficits and the components identified by PCA were examined using whole brain voxel-based morphometry and tract-wise lesion deficits analyses. We found that bilateral lesions within occipital cortex (middle occipital gyrus) and right posterior parietal cortex, along with disconnection of the right parieto-temporal segment of arcuate fasciculus, were associated with low spatial memory span. A single component revealed by PCA accounted for over half of the variance and was linked to damage to right posterior brain regions (temporo-parietal junction, the inferior parietal lobule and middle temporal gyrus extending into middle occipital gyrus). We also found link to disconnections within several association pathways including the superior longitudinal fasciculus, arcuate fasciculus, inferior fronto-occipital fasciculus and inferior longitudinal fasciculus. These results indicate that different visuospatial deficits converge into a single component mapped within posterior parietal areas and fronto-parietal white matter pathways. Furthermore, the data presented here fit with the role of posterior parietal cortex/temporo-parietal junction in maintaining a map of salient locations in space, with Corsi Block performance being impaired when the spatial map is damaged. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. Comparative Analysis of Hybrid Models for Prediction of BP Reactivity to Crossed Legs.

    PubMed

    Kaur, Gurmanik; Arora, Ajat Shatru; Jain, Vijender Kumar

    2017-01-01

    Crossing the legs at the knees, during BP measurement, is one of the several physiological stimuli that considerably influence the accuracy of BP measurements. Therefore, it is paramount to develop an appropriate prediction model for interpreting influence of crossed legs on BP. This research work described the use of principal component analysis- (PCA-) fused forward stepwise regression (FSWR), artificial neural network (ANN), adaptive neuro fuzzy inference system (ANFIS), and least squares support vector machine (LS-SVM) models for prediction of BP reactivity to crossed legs among the normotensive and hypertensive participants. The evaluation of the performance of the proposed prediction models using appropriate statistical indices showed that the PCA-based LS-SVM (PCA-LS-SVM) model has the highest prediction accuracy with coefficient of determination ( R 2 ) = 93.16%, root mean square error (RMSE) = 0.27, and mean absolute percentage error (MAPE) = 5.71 for SBP prediction in normotensive subjects. Furthermore, R 2  = 96.46%, RMSE = 0.19, and MAPE = 1.76 for SBP prediction and R 2  = 95.44%, RMSE = 0.21, and MAPE = 2.78 for DBP prediction in hypertensive subjects using the PCA-LSSVM model. This assessment presents the importance and advantages posed by hybrid computing models for the prediction of variables in biomedical research studies.

  3. Discerning some Tylenol brands using attenuated total reflection Fourier transform infrared data and multivariate analysis techniques.

    PubMed

    Msimanga, Huggins Z; Ollis, Robert J

    2010-06-01

    Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were used to classify acetaminophen-containing medicines using their attenuated total reflection Fourier transform infrared (ATR-FT-IR) spectra. Four formulations of Tylenol (Arthritis Pain Relief, Extra Strength Pain Relief, 8 Hour Pain Relief, and Extra Strength Pain Relief Rapid Release) along with 98% pure acetaminophen were selected for this study because of the similarity of their spectral features, with correlation coefficients ranging from 0.9857 to 0.9988. Before acquiring spectra for the predictor matrix, the effects on spectral precision with respect to sample particle size (determined by sieve size opening), force gauge of the ATR accessory, sample reloading, and between-tablet variation were examined. Spectra were baseline corrected and normalized to unity before multivariate analysis. Analysis of variance (ANOVA) was used to study spectral precision. The large particles (35 mesh) showed large variance between spectra, while fine particles (120 mesh) indicated good spectral precision based on the F-test. Force gauge setting did not significantly affect precision. Sample reloading using the fine particle size and a constant force gauge setting of 50 units also did not compromise precision. Based on these observations, data acquisition for the predictor matrix was carried out with the fine particles (sieve size opening of 120 mesh) at a constant force gauge setting of 50 units. After removing outliers, PCA successfully classified the five samples in the first and second components, accounting for 45.0% and 24.5% of the variances, respectively. The four-component PLS-DA model (R(2)=0.925 and Q(2)=0.906) gave good test spectra predictions with an overall average of 0.961 +/- 7.1% RSD versus the expected 1.0 prediction for the 20 test spectra used.

  4. A composite measure to explore visual disability in primary progressive multiple sclerosis.

    PubMed

    Poretto, Valentina; Petracca, Maria; Saiote, Catarina; Mormina, Enricomaria; Howard, Jonathan; Miller, Aaron; Lublin, Fred D; Inglese, Matilde

    2017-01-01

    Optical coherence tomography (OCT) and magnetic resonance imaging (MRI) can provide complementary information on visual system damage in multiple sclerosis (MS). The objective of this paper is to determine whether a composite OCT/MRI score, reflecting cumulative damage along the entire visual pathway, can predict visual deficits in primary progressive multiple sclerosis (PPMS). Twenty-five PPMS patients and 20 age-matched controls underwent neuro-ophthalmologic evaluation, spectral-domain OCT, and 3T brain MRI. Differences between groups were assessed by univariate general linear model and principal component analysis (PCA) grouped instrumental variables into main components. Linear regression analysis was used to assess the relationship between low-contrast visual acuity (LCVA), OCT/MRI-derived metrics and PCA-derived composite scores. PCA identified four main components explaining 80.69% of data variance. Considering each variable independently, LCVA 1.25% was significantly predicted by ganglion cell-inner plexiform layer (GCIPL) thickness, thalamic volume and optic radiation (OR) lesion volume (adjusted R 2 0.328, p  = 0.00004; adjusted R 2 0.187, p  = 0.002 and adjusted R 2 0.180, p  = 0.002). The PCA composite score of global visual pathway damage independently predicted both LCVA 1.25% (adjusted R 2 value 0.361, p  = 0.00001) and LCVA 2.50% (adjusted R 2 value 0.323, p  = 0.00003). A multiparametric score represents a more comprehensive and effective tool to explain visual disability than a single instrumental metric in PPMS.

  5. Estimation of human emotions using thermal facial information

    NASA Astrophysics Data System (ADS)

    Nguyen, Hung; Kotani, Kazunori; Chen, Fan; Le, Bac

    2014-01-01

    In recent years, research on human emotion estimation using thermal infrared (IR) imagery has appealed to many researchers due to its invariance to visible illumination changes. Although infrared imagery is superior to visible imagery in its invariance to illumination changes and appearance differences, it has difficulties in handling transparent glasses in the thermal infrared spectrum. As a result, when using infrared imagery for the analysis of human facial information, the regions of eyeglasses are dark and eyes' thermal information is not given. We propose a temperature space method to correct eyeglasses' effect using the thermal facial information in the neighboring facial regions, and then use Principal Component Analysis (PCA), Eigen-space Method based on class-features (EMC), and PCA-EMC method to classify human emotions from the corrected thermal images. We collected the Kotani Thermal Facial Emotion (KTFE) database and performed the experiments, which show the improved accuracy rate in estimating human emotions.

  6. Automated X-Ray Diffraction of Irradiated Materials

    DOE PAGES

    Rodman, John; Lin, Yuewei; Sprouster, David; ...

    2017-10-26

    Synchrotron-based X-ray diffraction (XRD) and small-angle Xray scattering (SAXS) characterization techniques used on unirradiated and irradiated reactor pressure vessel steels yield large amounts of data. Machine learning techniques, including PCA, offer a novel method of analyzing and visualizing these large data sets in order to determine the effects of chemistry and irradiation conditions on the formation of radiation induced precipitates. In order to run analysis on these data sets, preprocessing must be carried out to convert the data to a usable format and mask the 2-D detector images to account for experimental variations. Once the data has been preprocessed, itmore » can be organized and visualized using principal component analysis (PCA), multi-dimensional scaling, and k-means clustering. In conclusion, from these techniques, it is shown that sample chemistry has a notable effect on the formation of the radiation induced precipitates in reactor pressure vessel steels.« less

  7. Localized Principal Component Analysis based Curve Evolution: A Divide and Conquer Approach

    PubMed Central

    Appia, Vikram; Ganapathy, Balaji; Yezzi, Anthony; Faber, Tracy

    2014-01-01

    We propose a novel localized principal component analysis (PCA) based curve evolution approach which evolves the segmenting curve semi-locally within various target regions (divisions) in an image and then combines these locally accurate segmentation curves to obtain a global segmentation. The training data for our approach consists of training shapes and associated auxiliary (target) masks. The masks indicate the various regions of the shape exhibiting highly correlated variations locally which may be rather independent of the variations in the distant parts of the global shape. Thus, in a sense, we are clustering the variations exhibited in the training data set. We then use a parametric model to implicitly represent each localized segmentation curve as a combination of the local shape priors obtained by representing the training shapes and the masks as a collection of signed distance functions. We also propose a parametric model to combine the locally evolved segmentation curves into a single hybrid (global) segmentation. Finally, we combine the evolution of these semilocal and global parameters to minimize an objective energy function. The resulting algorithm thus provides a globally accurate solution, which retains the local variations in shape. We present some results to illustrate how our approach performs better than the traditional approach with fully global PCA. PMID:25520901

  8. Prediction of p38 map kinase inhibitory activity of 3, 4-dihydropyrido [3, 2-d] pyrimidone derivatives using an expert system based on principal component analysis and least square support vector machine

    PubMed Central

    Shahlaei, M.; Saghaie, L.

    2014-01-01

    A quantitative structure–activity relationship (QSAR) study is suggested for the prediction of biological activity (pIC50) of 3, 4-dihydropyrido [3,2-d] pyrimidone derivatives as p38 inhibitors. Modeling of the biological activities of compounds of interest as a function of molecular structures was established by means of principal component analysis (PCA) and least square support vector machine (LS-SVM) methods. The results showed that the pIC50 values calculated by LS-SVM are in good agreement with the experimental data, and the performance of the LS-SVM regression model is superior to the PCA-based model. The developed LS-SVM model was applied for the prediction of the biological activities of pyrimidone derivatives, which were not in the modeling procedure. The resulted model showed high prediction ability with root mean square error of prediction of 0.460 for LS-SVM. The study provided a novel and effective approach for predicting biological activities of 3, 4-dihydropyrido [3,2-d] pyrimidone derivatives as p38 inhibitors and disclosed that LS-SVM can be used as a powerful chemometrics tool for QSAR studies. PMID:26339262

  9. Comparison of the Trace Elements and Active Components of Lonicera japonica flos and Lonicera flos Using ICP-MS and HPLC-PDA.

    PubMed

    Zhao, Yueran; Dou, Deqiang; Guo, Yueqiu; Qi, Yue; Li, Jun; Jia, Dong

    2018-06-01

    Thirteen trace elements and active constituents of 40 batches of Lonicera japonica flos and Lonicera flos were comparatively studied using inductively coupled plasma mass-spectrometry (ICP-MS) and high-performance liquid chromatography-photodiode array (HPLC-PDA). The trace elements were 24 Mg, 52 Cr, 55 Mn, 57 Fe, 60 Ni, 63 Cu, 66 Zn, 75 As, 82 Se, 98 Mo, 114 Cd, 202 Hg, and 208 Pb, and the active compounds were chlorogenic acid, 3,5-O-dicaffeoylquinc acid, 4,5-O-dicaffeoylquinc acid, luteolin-7-O-glucoside, and 4-O-caffeoylquinic acid. The data of 18 variables were statistically processed using principal component analysis (PCA) and discriminate analysis (DA) to classify L. japonica flos and L. flos. The validated method was developed to divide the 40 samples into two groups based on the PCA in terms of 18 variables. Furthermore, the species of Lonicera was better discriminated by using DA with 12 variables. These results suggest that the method and statistical analysis of the contents of trace elements and chemical components can classify the L. japonica flos and L. flos using 12 variables, such as 3,5-O-dicaffeoylquincacid, luteolin-7-O-glucoside, Cd, Mn, Hg, Pb, Ni, 4-O-caffeoyl-quinic acid, 4,5-O-dicaffeoylquinc acid, Fe, Mg, and Cr.

  10. Feasibility Study on the Use of On-line Multivariate Statistical Process Control for Safeguards Applications in Natural Uranium Conversion Plants

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ladd-Lively, Jennifer L

    2014-01-01

    The objective of this work was to determine the feasibility of using on-line multivariate statistical process control (MSPC) for safeguards applications in natural uranium conversion plants. Multivariate statistical process control is commonly used throughout industry for the detection of faults. For safeguards applications in uranium conversion plants, faults could include the diversion of intermediate products such as uranium dioxide, uranium tetrafluoride, and uranium hexafluoride. This study was limited to a 100 metric ton of uranium (MTU) per year natural uranium conversion plant (NUCP) using the wet solvent extraction method for the purification of uranium ore concentrate. A key component inmore » the multivariate statistical methodology is the Principal Component Analysis (PCA) approach for the analysis of data, development of the base case model, and evaluation of future operations. The PCA approach was implemented through the use of singular value decomposition of the data matrix where the data matrix represents normal operation of the plant. Component mole balances were used to model each of the process units in the NUCP. However, this approach could be applied to any data set. The monitoring framework developed in this research could be used to determine whether or not a diversion of material has occurred at an NUCP as part of an International Atomic Energy Agency (IAEA) safeguards system. This approach can be used to identify the key monitoring locations, as well as locations where monitoring is unimportant. Detection limits at the key monitoring locations can also be established using this technique. Several faulty scenarios were developed to test the monitoring framework after the base case or normal operating conditions of the PCA model were established. In all of the scenarios, the monitoring framework was able to detect the fault. Overall this study was successful at meeting the stated objective.« less

  11. Activation of Beta-Catenin Signaling in Androgen Receptor–Negative Prostate Cancer Cells

    PubMed Central

    Wan, Xinhai; Liu, Jie; Lu, Jing-Fang; Tzelepi, Vassiliki; Yang, Jun; Starbuck, Michael W.; Diao, Lixia; Wang, Jing; Efstathiou, Eleni; Vazquez, Elba S.; Troncoso, Patricia; Maity, Sankar N.; Navone, Nora M.

    2012-01-01

    Purpose To study Wnt/beta-catenin in castrate-resistant prostate cancer (CRPC) and understand its function independently of the beta-catenin–androgen receptor (AR) interaction. Experimental Design We performed beta-catenin immunocytochemical analysis, evaluated TOP-flash reporter activity (a reporter of beta-catenin–mediated transcription), and sequenced the beta-catenin gene in MDA PCa 118a, MDA PCa 118b, MDA PCa 2b, and PC-3 prostate cancer (PCa) cells. We knocked down beta-catenin in AR-negative MDA PCa 118b cells and performed comparative gene-array analysis. We also immunohistochemically analyzed beta-catenin and AR in 27 bone metastases of human CRPCs. Results Beta-catenin nuclear accumulation and TOP-flash reporter activity were high in MDA PCa 118b but not in MDA PCa 2b or PC-3 cells. MDA PCa 118a and 118b cells carry a mutated beta-catenin at codon 32 (D32G). Ten genes were expressed differently (false discovery rate, 0.05) in MDA PCa 118b cells with downregulated beta-catenin. One such gene, hyaluronan synthase 2 (HAS2), synthesizes hyaluronan, a core component of the extracellular matrix. We confirmed HAS2 upregulation in PC-3 cells transfected with D32G-mutant beta-catenin. Finally, we found nuclear localization of beta-catenin in 10 of 27 human tissue specimens; this localization was inversely associated with AR expression (P = 0.056, Fisher’s exact test), suggesting that reduced AR expression enables Wnt/beta-catenin signaling. Conclusion We identified a previously unknown downstream target of beta-catenin, HAS2, in PCa, and found that high beta-catenin nuclear localization and low or no AR expression may define a subpopulation of men with bone-metastatic PCa. These findings may guide physicians in managing these patients. PMID:22298898

  12. Satellite image fusion based on principal component analysis and high-pass filtering.

    PubMed

    Metwalli, Mohamed R; Nasr, Ayman H; Allah, Osama S Farag; El-Rabaie, S; Abd El-Samie, Fathi E

    2010-06-01

    This paper presents an integrated method for the fusion of satellite images. Several commercial earth observation satellites carry dual-resolution sensors, which provide high spatial resolution or simply high-resolution (HR) panchromatic (pan) images and low-resolution (LR) multi-spectral (MS) images. Image fusion methods are therefore required to integrate a high-spectral-resolution MS image with a high-spatial-resolution pan image to produce a pan-sharpened image with high spectral and spatial resolutions. Some image fusion methods such as the intensity, hue, and saturation (IHS) method, the principal component analysis (PCA) method, and the Brovey transform (BT) method provide HR MS images, but with low spectral quality. Another family of image fusion methods, such as the high-pass-filtering (HPF) method, operates on the basis of the injection of high frequency components from the HR pan image into the MS image. This family of methods provides less spectral distortion. In this paper, we propose the integration of the PCA method and the HPF method to provide a pan-sharpened MS image with superior spatial resolution and less spectral distortion. The experimental results show that the proposed fusion method retains the spectral characteristics of the MS image and, at the same time, improves the spatial resolution of the pan-sharpened image.

  13. [Research on discrimination of cabbage and weeds based on visible and near-infrared spectrum analysis].

    PubMed

    Zu, Qin; Zhao, Chun-Jiang; Deng, Wei; Wang, Xiu

    2013-05-01

    The automatic identification of weeds forms the basis for precision spraying of crops infest. The canopy spectral reflectance within the 350-2 500 nm band of two strains of cabbages and five kinds of weeds such as barnyard grass, setaria, crabgrass, goosegrass and pigweed was acquired by ASD spectrometer. According to the spectral curve characteristics, the data in different bands were compressed with different levels to improve the operation efficiency. Firstly, the spectrum was denoised in accordance with the different order of multiple scattering correction (MSC) method and Savitzky-Golay (SG) convolution smoothing method set by different parameters, then the model was built by combining the principal component analysis (PCA) method to extract principal components, finally all kinds of plants were classified by using the soft independent modeling of class analogy (SIMCA) taxonomy and the classification results were compared. The tests results indicate that after the pretreatment of the spectral data with the method of the combination of MSC and SG set with 3rd order, 5th degree polynomial, 21 smoothing points, and the top 10 principal components extraction using PCA as a classification model input variable, 100% correct classification rate was achieved, and it is able to identify cabbage and several kinds of common weeds quickly and nondestructively.

  14. Origin of fecal contamination in waters from contrasted areas: stanols as Microbial Source Tracking markers.

    PubMed

    Derrien, M; Jardé, E; Gruau, G; Pourcher, A M; Gourmelon, M; Jadas-Hécart, A; Pierson Wickmann, A C

    2012-09-01

    Improving the microbiological quality of coastal and river waters relies on the development of reliable markers that are capable of determining sources of fecal pollution. Recently, a principal component analysis (PCA) method based on six stanol compounds (i.e. 5β-cholestan-3β-ol (coprostanol), 5β-cholestan-3α-ol (epicoprostanol), 24-methyl-5α-cholestan-3β-ol (campestanol), 24-ethyl-5α-cholestan-3β-ol (sitostanol), 24-ethyl-5β-cholestan-3β-ol (24-ethylcoprostanol) and 24-ethyl-5β-cholestan-3α-ol (24-ethylepicoprostanol)) was shown to be suitable for distinguishing between porcine and bovine feces. In this study, we tested if this PCA method, using the above six stanols, could be used as a tool in "Microbial Source Tracking (MST)" methods in water from areas of intensive agriculture where diffuse fecal contamination is often marked by the co-existence of human and animal sources. In particular, well-defined and stable clusters were found in PCA score plots clustering samples of "pure" human, bovine and porcine feces along with runoff and diluted waters in which the source of contamination is known. A good consistency was also observed between the source assignments made by the 6-stanol-based PCA method and the microbial markers for river waters contaminated by fecal matter of unknown origin. More generally, the tests conducted in this study argue for the addition of the PCA method based on six stanols in the MST toolbox to help identify fecal contamination sources. The data presented in this study show that this addition would improve the determination of fecal contamination sources when the contamination levels are low to moderate. Copyright © 2012 Elsevier Ltd. All rights reserved.

  15. Land cover in Upper Egypt assessed using regional and global land cover products derived from MODIS imagery.

    PubMed

    Fuller, Douglas O; Parenti, Michael S; Gad, Adel M; Beier, John C

    2012-01-01

    Irrigation along the Nile River has resulted in dramatic changes in the biophysical environment of Upper Egypt. In this study we used a combination of MODIS 250 m NDVI data and Landsat imagery to identify areas that changed from 2001-2008 as a result of irrigation and water-level fluctuations in the Nile River and nearby water bodies. We used two different methods of time series analysis -- principal components (PCA) and harmonic decomposition (HD), applied to the MODIS 250 m NDVI images to derive simple three-class land cover maps and then assessed their accuracy using a set of reference polygons derived from 30 m Landsat 5 and 7 imagery. We analyzed our MODIS 250 m maps against a new MODIS global land cover product (MOD12Q1 collection 5) to assess whether regionally specific mapping approaches are superior to a standard global product. Results showed that the accuracy of the PCA-based product was greater than the accuracy of either the HD or MOD12Q1 products for the years 2001, 2003, and 2008. However, the accuracy of the PCA product was only slightly better than the MOD12Q1 for 2001 and 2003. Overall, the results suggest that our PCA-based approach produces a high level of user and producer accuracies, although the MOD12Q1 product also showed consistently high accuracy. Overlay of 2001-2008 PCA-based maps showed a net increase of 12 129 ha of irrigated vegetation, with the largest increase found from 2006-2008 around the Districts of Edfu and Kom Ombo. This result was unexpected in light of ambitious government plans to develop 336 000 ha of irrigated agriculture around the Toshka Lakes.

  16. Improvements in Alzheimer's disease diagnosis using principle components analysis (PCA) in combination with Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Archer, John K. J.; Sudworth, Caroline D.; Williams, Rachel; How, Thien; Stone, Nicholas; Mann, David; Black, Richard A.

    2007-07-01

    The significant achievements of medical science over the last century are evident in the increasing age of the global population, however this now brings new problems, the most prominent being the growth in the number of people suffering from dementia. Over half the people with dementia in the UK are sufferers of Alzheimer's disease, a condition in which intraneuronal neurofibrillary tangles and extraneuronal senile tangles take over neurons prompting their death. A definitive diagnosis is still only currently available post-mortem, whilst current symptom based processes of elimination are far from perfect, especially when the only treatments available are symptom inhibiting drugs. Principal component analysis (PCA) of the Raman spectra taken from brain tissue has proved to be a potential tool in the diagnosis. However, this work now has to be refined in order to progress to tissue less associated with the symptoms of Alzheimer's disease. The first step of this has already been taken in progressing from frontal tissue to occipital tissue point spectra taken at random positions from bulk tissue. Now we present initial work into acquiring Raman spectral maps from across a tissue area, in pursuit of identifying unique plaque and tangle spectra. These spectra are presented alongside synthetic β-Amyloid spectra, in a study of the role that the peptide plays in the biomarker spectra, and how this information can aid the PCA of bulk tissue, and point towards a Raman spectroscopic test on less sensitive tissue, such as blood.

  17. Screening and Analysis of the Marker Components in Ganoderma lucidum by HPLC and HPLC-MSn with the Aid of Chemometrics.

    PubMed

    Wu, Lingfang; Liang, Wenyi; Chen, Wenjing; Li, Shi; Cui, Yaping; Qi, Qi; Zhang, Lanzhen

    2017-04-06

    Ganoderma triterpenes (GTs) are the major secondary metabolites of Ganoderma lucidum , which is a popularly used traditional Chinese medicine for complementary cancer therapy. The present study was to establish a fingerprint evaluation system based on Similarity Analysis (SA), Cluster Analysis (CA) and Principal Component Analysis (PCA) for the identification and quality control of G. lucidum . Fifteen samples from the Chinese provinces of Hainan, Neimeng, Shangdong, Jilin, Anhui, Henan, Yunnan, Guangxi and Fujian were analyzed by HPLC-PAD and HPLC-MS n . Forty-seven compounds were detected by HPLC, of which forty-two compounds were tentatively identified by comparing their retention times and mass spectrometry data with that of reference compounds and reviewing the literature. Ganoderic acid B, 3,7,15-trihydroxy-11,23-dioxolanost-8,16-dien-26-oic acid, lucidenic acid A, ganoderic acid G, and 3,7-oxo-12-acetylganoderic acid DM were deemed to be the marker compounds to distinguish the samples with different quality according to both CA and PCA. This study provides helpful chemical information for further research on the anti-tumor activity and mechanism of action of G. lucidum . The results proved that fingerprints combined with chemometrics are a simple, rapid and effective method for the quality control of G. lucidum .

  18. Direct analysis in real time mass spectrometry and multivariate data analysis: a novel approach to rapid identification of analytical markers for quality control of traditional Chinese medicine preparation.

    PubMed

    Zeng, Shanshan; Wang, Lu; Chen, Teng; Wang, Yuefei; Mo, Huanbiao; Qu, Haibin

    2012-07-06

    The paper presents a novel strategy to identify analytical markers of traditional Chinese medicine preparation (TCMP) rapidly via direct analysis in real time mass spectrometry (DART-MS). A commonly used TCMP, Danshen injection, was employed as a model. The optimal analysis conditions were achieved by measuring the contribution of various experimental parameters to the mass spectra. Salvianolic acids and saccharides were simultaneously determined within a single 1-min DART-MS run. Furthermore, spectra of Danshen injections supplied by five manufacturers were processed with principal component analysis (PCA). Obvious clustering was observed in the PCA score plot, and candidate markers were recognized from the contribution plots of PCA. The suitability of potential markers was then confirmed by contrasting with the results of traditional analysis methods. Using this strategy, fructose, glucose, sucrose, protocatechuic aldehyde and salvianolic acid A were rapidly identified as the markers of Danshen injections. The combination of DART-MS with PCA provides a reliable approach to the identification of analytical markers for quality control of TCMP. Copyright © 2012 Elsevier B.V. All rights reserved.

  19. Comprehensive analysis of commercial willow bark extracts by new technology platform: combined use of metabolomics, high-performance liquid chromatography-solid-phase extraction-nuclear magnetic resonance spectroscopy and high-resolution radical scavenging assay.

    PubMed

    Agnolet, Sara; Wiese, Stefanie; Verpoorte, Robert; Staerk, Dan

    2012-11-02

    Here, proof-of-concept of a new analytical platform used for the comprehensive analysis of a small set of commercial willow bark products is presented, and compared with a traditional standardization solely based on analysis of salicin and salicin derivatives. The platform combines principal component analysis (PCA) of two chemical fingerprints, i.e., HPLC and (1)H NMR data, and a pharmacological fingerprint, i.e., high-resolution 2,2'-azino-bis(3-ethylbenzothiazoline-6-sulfonate) radical cation (ABTS(+)) reduction profile, with targeted identification of constituents of interest by hyphenated HPLC-solid-phase extraction-tube transfer NMR, i.e., HPLC-SPE-ttNMR. Score plots from PCA of HPLC and (1)H NMR fingerprints showed the same distinct grouping of preparations formulated as capsules of Salix alba bark and separation of S. alba cortex. Loading plots revealed this to be due to high amount of salicin in capsules and ampelopsin, taxifolin, 7-O-methyltaxifolin-3'-O-glucoside, and 7-O-methyltaxifolin in S. alba cortex, respectively. PCA of high-resolution radical scavenging profiles revealed clear separation of preparations along principal component 1 due to the major radical scavengers (+)-catechin and ampelopsin. The new analytical platform allowed identification of 16 compounds in commercial willow bark extracts, and identification of ampelopsin, taxifolin, 7-O-methyltaxifolin-3'-O-glucoside, and 7-O-methyltaxifolin in S. alba bark extract is reported for the first time. The detection of the novel compound, ethyl 1-hydroxy-6-oxocyclohex-2-enecarboxylate, is also described. Copyright © 2012 Elsevier B.V. All rights reserved.

  20. Estimating the leaf nitrogen content of paddy rice by using the combined reflectance and laser-induced fluorescence spectra.

    PubMed

    Yang, Jian; Du, Lin; Sun, Jia; Zhang, Zhenbing; Chen, Biwu; Shi, Shuo; Gong, Wei; Song, Shalei

    2016-08-22

    Paddy rice is one of the most important crops in China, and leaf nitrogen content (LNC) serves as a significant indictor for monitoring crop status. A reliable method is needed for precise and fast quantification of LNC. Laser-induced fluorescence (LIF) technology and reflectance spectra of crops are widely used to monitor leaf biochemical content. However, comparison between the fluorescence and reflectance spectra has been rarely investigated in the monitoring of LNC. In this study, the performance of the fluorescence and reflectance spectra for LNC estimation was discussed based on principal component analysis (PCA) and back-propagation neural network (BPNN). The combination of fluorescence and reflectance spectra was also proposed to monitor paddy rice LNC. The fluorescence and reflectance spectra exhibited a high degree of multi-collinearity. About 95.38%, and 97.76% of the total variance included in the spectra were efficiently extracted by using the first three PCs in PCA. The BPNN was implemented for LNC prediction based on new variables calculated using PCA. The experimental results demonstrated that the fluorescence spectra (R2 = 0.810, 0.804 for 2014 and 2015, respectively) are superior to the reflectance spectra (R2 = 0.721, 0.671 for 2014 and 2015, respectively) for estimating LNC based on the PCA-BPNN model. The proposed combination of fluorescence and reflectance spectra can greatly improve the accuracy of LNC estimation (R2 = 0.912, 0.890 for 2014 and 2015, respectively).

  1. Energy resolution improvement of CdTe detectors by using the principal component analysis technique

    NASA Astrophysics Data System (ADS)

    Alharbi, T.

    2018-02-01

    In this paper, we report on the application of the Principal Component Analysis (PCA) technique for the improvement of the γ-ray energy resolution of CdTe detectors. The PCA technique is used to estimate the amount of charge-trapping effect which is reflected in the shape of each detector pulse, thereby correcting for the charge-trapping effect. The details of the method are described and the results obtained with a CdTe detector are shown. We have achieved an energy resolution of 1.8 % (FWHM) at 662 keV with full detection efficiency from a 1 mm thick CdTe detector which gives an energy resolution of 4.5 % (FWHM) by using the standard pulse processing method.

  2. Biological Assessment of Newfoundland Rivers Based on the Depauperate Macroinvertebrate Fauna With Emphasis on the Mayfly, Stonefly and Caddisfly Fauna: a Cautionary Tale.

    NASA Astrophysics Data System (ADS)

    Colbo, M. H.; Cote, D.; Kendall, V.

    2005-05-01

    The fauna of insular Newfoundland compared to the mainland is depauperate, e.g., 35 spp of mayflies versus about 160 spp in Maine. The question is can this depauperate fauna provide good biomonitoring information? A comparative study using present /absent data from 85 sites in eastern Newfoundland was analyzed with particular reference to the Ephemeroptera, Plecoptera and Trichoptera fauna. The results indicated these data did differentiate regions sampling type and land use when run through a simple Principle Component Analysis (PCA). The PCA 1 and 2 components varied from no to a strong correlation with both Ephemeroptera and Trichoptera richness and taxa richness of the major feeding groups. However, applying previously published biotic indices of species that occur in Newfoundland did not provide satisfactory results. It appears these indices may reflect oxygen gradients rather than pollutant concentrations and in our cold well oxygenated streams yield unsatisfactory discrimination among natural and perturb streams.

  3. Rolling Bearing Fault Diagnosis Based on an Improved HTT Transform

    PubMed Central

    Tang, Guiji; Tian, Tian; Zhou, Chong

    2018-01-01

    When rolling bearing failure occurs, vibration signals generally contain different signal components, such as impulsive fault feature signals, background noise and harmonic interference signals. One of the most challenging aspects of rolling bearing fault diagnosis is how to inhibit noise and harmonic interference signals, while enhancing impulsive fault feature signals. This paper presents a novel bearing fault diagnosis method, namely an improved Hilbert time–time (IHTT) transform, by combining a Hilbert time–time (HTT) transform with principal component analysis (PCA). Firstly, the HTT transform was performed on vibration signals to derive a HTT transform matrix. Then, PCA was employed to de-noise the HTT transform matrix in order to improve the robustness of the HTT transform. Finally, the diagonal time series of the de-noised HTT transform matrix was extracted as the enhanced impulsive fault feature signal and the contained fault characteristic information was identified through further analyses of amplitude and envelope spectrums. Both simulated and experimental analyses validated the superiority of the presented method for detecting bearing failures. PMID:29662013

  4. Comparison of wavelet based denoising schemes for gear condition monitoring: An Artificial Neural Network based Approach

    NASA Astrophysics Data System (ADS)

    Ahmed, Rounaq; Srinivasa Pai, P.; Sriram, N. S.; Bhat, Vasudeva

    2018-02-01

    Vibration Analysis has been extensively used in recent past for gear fault diagnosis. The vibration signals extracted is usually contaminated with noise and may lead to wrong interpretation of results. The denoising of extracted vibration signals helps the fault diagnosis by giving meaningful results. Wavelet Transform (WT) increases signal to noise ratio (SNR), reduces root mean square error (RMSE) and is effective to denoise the gear vibration signals. The extracted signals have to be denoised by selecting a proper denoising scheme in order to prevent the loss of signal information along with noise. An approach has been made in this work to show the effectiveness of Principal Component Analysis (PCA) to denoise gear vibration signal. In this regard three selected wavelet based denoising schemes namely PCA, Empirical Mode Decomposition (EMD), Neighcoeff Coefficient (NC), has been compared with Adaptive Threshold (AT) an extensively used wavelet based denoising scheme for gear vibration signal. The vibration signals acquired from a customized gear test rig were denoised by above mentioned four denoising schemes. The fault identification capability as well as SNR, Kurtosis and RMSE for the four denoising schemes have been compared. Features extracted from the denoised signals have been used to train and test artificial neural network (ANN) models. The performances of the four denoising schemes have been evaluated based on the performance of the ANN models. The best denoising scheme has been identified, based on the classification accuracy results. PCA is effective in all the regards as a best denoising scheme.

  5. A modified receptor model for source apportionment of heavy metal pollution in soil.

    PubMed

    Huang, Ying; Deng, Meihua; Wu, Shaofu; Japenga, Jan; Li, Tingqiang; Yang, Xiaoe; He, Zhenli

    2018-07-15

    Source apportionment is a crucial step toward reduction of heavy metal pollution in soil. Existing methods are generally based on receptor models. However, overestimation or underestimation occurs when they are applied to heavy metal source apportionment in soil. Therefore, a modified model (PCA-MLRD) was developed, which is based on principal component analysis (PCA) and multiple linear regression with distance (MLRD). This model was applied to a case study conducted in a peri-urban area in southeast China where soils were contaminated by arsenic (As), cadmium (Cd), mercury (Hg) and lead (Pb). Compared with existing models, PCA-MLRD is able to identify specific sources and quantify the extent of influence for each emission. The zinc (Zn)-Pb mine was identified as the most important anthropogenic emission, which affected approximately half area for Pb and As accumulation, and approximately one third for Cd. Overall, the influence extent of the anthropogenic emissions decreased in the order of mine (3 km) > dyeing mill (2 km) ≈ industrial hub (2 km) > fluorescent factory (1.5 km) > road (0.5 km). Although algorithm still needs to improved, the PCA-MLRD model has the potential to become a useful tool for heavy metal source apportionment in soil. Copyright © 2018 Elsevier B.V. All rights reserved.

  6. The utilization of Depth Invariant Index and Principle Component Analysis for mapping seagrass ecosystem of Kotok Island and Karang Bongkok, Indonesia

    NASA Astrophysics Data System (ADS)

    Manuputty, Agnestesya; Lumban Gaol, Jonson; Bahri Agus, Syamsul; Wayan Nurjaya, I.

    2017-01-01

    Seagrass perform a variety of functions within ecosystems, and have both economic and ecological values, therefore it has to be kept sustainable. One of the stages to preserve seagrass ecosystems is monitoring by utilizing thespatial data accurately. The purpose of the study was to assess and compare the accuracy of DII and PCA transformationsfor mapping of seagrass ecosystems. Fieldstudy was carried out in Karang Bongkok and Kotok Island waters, in Agustus 2014 and in March 2015. A WorldView-2 image acquisition date of 5 October 2013 was used in the study. The transformations for image processing data were Depth Invariant Index (DII) and Principle Component Analysis (PCA) using Support Vector Machine (SVM) classification. The result shows that benthic habitat mapping of Karang Bongkok using DII and PCA transformations were 72%and 81% overall’s accuracy respectively, whereas of Kotok Island were 83% and 84% overall’s accuracy respectively. There were seven benthic habitat types found in karang Bongkok waters and in Kotok Island namely seagrass, sand, rubble, coral, logoon, sand mix seagrass, and sand mix rubble. PCA transformation was effectively to improve mapping accuracy of sea grass mapping in Kotok Island and Karang Bongkok.

  7. Forecasting of UV-Vis absorbance time series using artificial neural networks combined with principal component analysis.

    PubMed

    Plazas-Nossa, Leonardo; Hofer, Thomas; Gruber, Günter; Torres, Andres

    2017-02-01

    This work proposes a methodology for the forecasting of online water quality data provided by UV-Vis spectrometry. Therefore, a combination of principal component analysis (PCA) to reduce the dimensionality of a data set and artificial neural networks (ANNs) for forecasting purposes was used. The results obtained were compared with those obtained by using discrete Fourier transform (DFT). The proposed methodology was applied to four absorbance time series data sets composed by a total number of 5705 UV-Vis spectra. Absolute percentage errors obtained by applying the proposed PCA/ANN methodology vary between 10% and 13% for all four study sites. In general terms, the results obtained were hardly generalizable, as they appeared to be highly dependent on specific dynamics of the water system; however, some trends can be outlined. PCA/ANN methodology gives better results than PCA/DFT forecasting procedure by using a specific spectra range for the following conditions: (i) for Salitre wastewater treatment plant (WWTP) (first hour) and Graz West R05 (first 18 min), from the last part of UV range to all visible range; (ii) for Gibraltar pumping station (first 6 min) for all UV-Vis absorbance spectra; and (iii) for San Fernando WWTP (first 24 min) for all of UV range to middle part of visible range.

  8. A novel method for qualitative analysis of edible oil oxidation using an electronic nose.

    PubMed

    Xu, Lirong; Yu, Xiuzhu; Liu, Lei; Zhang, Rui

    2016-07-01

    An electronic nose (E-nose) was used for rapid assessment of the degree of oxidation in edible oils. Peroxide and acid values of edible oil samples were analyzed using data obtained by the American Oil Chemists' Society (AOCS) Official Method for reference. Qualitative discrimination between non-oxidized and oxidized oils was conducted using the E-nose technique developed in combination with cluster analysis (CA), principal component analysis (PCA), and linear discriminant analysis (LDA). The results from CA, PCA and LDA indicated that the E-nose technique could be used for differentiation of non-oxidized and oxidized oils. LDA produced slightly better results than CA and PCA. The proposed approach can be used as an alternative to AOCS Official Method as an innovative tool for rapid detection of edible oil oxidation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. A deviation based assessment methodology for multiple machine health patterns classification and fault detection

    NASA Astrophysics Data System (ADS)

    Jia, Xiaodong; Jin, Chao; Buzza, Matt; Di, Yuan; Siegel, David; Lee, Jay

    2018-01-01

    Successful applications of Diffusion Map (DM) in machine failure detection and diagnosis have been reported in several recent studies. DM provides an efficient way to visualize the high-dimensional, complex and nonlinear machine data, and thus suggests more knowledge about the machine under monitoring. In this paper, a DM based methodology named as DM-EVD is proposed for machine degradation assessment, abnormality detection and diagnosis in an online fashion. Several limitations and challenges of using DM for machine health monitoring have been analyzed and addressed. Based on the proposed DM-EVD, a deviation based methodology is then proposed to include more dimension reduction methods. In this work, the incorporation of Laplacian Eigen-map and Principal Component Analysis (PCA) are explored, and the latter algorithm is named as PCA-Dev and is validated in the case study. To show the successful application of the proposed methodology, case studies from diverse fields are presented and investigated in this work. Improved results are reported by benchmarking with other machine learning algorithms.

  10. A principal component analysis of the relationship between the external body shape and internal skeleton for the upper body.

    PubMed

    Nerot, A; Skalli, W; Wang, X

    2016-10-03

    Recent progress in 3D scanning technologies allows easy access to 3D human body envelope. To create personalized human models with an articulated linkage for realistic re-posturing and motion analyses, an accurate estimation of internal skeleton points, including joint centers, from the external envelope is required. For this research project, 3D reconstructions of both internal skeleton and external envelope from low dose biplanar X-rays of 40 male adults were obtained. Using principal component analysis technique (PCA), a low-dimensional dataset was used to predict internal points of the upper body from the trunk envelope. A least squares method was used to find PC scores that fit the PCA-based model to the envelope of a new subject. To validate the proposed approach, estimated internal points were evaluated using a leave-one-out (LOO) procedure, i.e. successively considering each individual from our dataset as an extra-subject. In addition, different methods were proposed to reduce the variability in data and improve the performance of the PCA-based prediction. The best method was considered as the one providing the smallest errors between estimated and reference internal points with an average error of 8.3mm anterior-posteriorly, 6.7mm laterally and 6.5mm vertically. As the proposed approach relies on few or no bony landmarks, it could be easily applicable and generalizable to surface scans from any devices. Combined with automatic body scanning techniques, this study could potentially constitute a new step towards automatic generation of external/internal subject-specific manikins. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Investigation of inversion polymorphisms in the human genome using principal components analysis.

    PubMed

    Ma, Jianzhong; Amos, Christopher I

    2012-01-01

    Despite the significant advances made over the last few years in mapping inversions with the advent of paired-end sequencing approaches, our understanding of the prevalence and spectrum of inversions in the human genome has lagged behind other types of structural variants, mainly due to the lack of a cost-efficient method applicable to large-scale samples. We propose a novel method based on principal components analysis (PCA) to characterize inversion polymorphisms using high-density SNP genotype data. Our method applies to non-recurrent inversions for which recombination between the inverted and non-inverted segments in inversion heterozygotes is suppressed due to the loss of unbalanced gametes. Inside such an inversion region, an effect similar to population substructure is thus created: two distinct "populations" of inversion homozygotes of different orientations and their 1:1 admixture, namely the inversion heterozygotes. This kind of substructure can be readily detected by performing PCA locally in the inversion regions. Using simulations, we demonstrated that the proposed method can be used to detect and genotype inversion polymorphisms using unphased genotype data. We applied our method to the phase III HapMap data and inferred the inversion genotypes of known inversion polymorphisms at 8p23.1 and 17q21.31. These inversion genotypes were validated by comparing with literature results and by checking Mendelian consistency using the family data whenever available. Based on the PCA-approach, we also performed a preliminary genome-wide scan for inversions using the HapMap data, which resulted in 2040 candidate inversions, 169 of which overlapped with previously reported inversions. Our method can be readily applied to the abundant SNP data, and is expected to play an important role in developing human genome maps of inversions and exploring associations between inversions and susceptibility of diseases.

  12. Burst and Principal Components Analyses of MEA Data Separates Chemicals by Class

    EPA Science Inventory

    Microelectrode arrays (MEAs) detect drug and chemical induced changes in action potential "spikes" in neuronal networks and can be used to screen chemicals for neurotoxicity. Analytical "fingerprinting," using Principal Components Analysis (PCA) on spike trains recorded from prim...

  13. Multi-segmental movements as a function of experience in karate.

    PubMed

    Zago, Matteo; Codari, Marina; Iaia, F Marcello; Sforza, Chiarella

    2017-08-01

    Karate is a martial art that partly depends on subjective scoring of complex movements. Principal component analysis (PCA)-based methods can identify the fundamental synergies (principal movements) of motor system, providing a quantitative global analysis of technique. In this study, we aimed at describing the fundamental multi-joint synergies of a karate performance, under the hypothesis that the latter are skilldependent; estimate karateka's experience level, expressed as years of practice. A motion capture system recorded traditional karate techniques of 10 professional and amateur karateka. At any time point, the 3D-coordinates of body markers produced posture vectors that were normalised, concatenated from all karateka and submitted to a first PCA. Five principal movements described both gross movement synergies and individual differences. A second PCA followed by linear regression estimated the years of practice using principal movements (eigenpostures and weighting curves) and centre of mass kinematics (error: 3.71 years; R2 = 0.91, P ≪ 0.001). Principal movements and eigenpostures varied among different karateka and as functions of experience. This approach provides a framework to develop visual tools for the analysis of motor synergies in karate, allowing to detect the multi-joint motor patterns that should be restored after an injury, or to be specifically trained to increase performance.

  14. Using both principal component analysis and reduced rank regression to study dietary patterns and diabetes in Chinese adults.

    PubMed

    Batis, Carolina; Mendez, Michelle A; Gordon-Larsen, Penny; Sotres-Alvarez, Daniela; Adair, Linda; Popkin, Barry

    2016-02-01

    We examined the association between dietary patterns and diabetes using the strengths of two methods: principal component analysis (PCA) to identify the eating patterns of the population and reduced rank regression (RRR) to derive a pattern that explains the variation in glycated Hb (HbA1c), homeostasis model assessment of insulin resistance (HOMA-IR) and fasting glucose. We measured diet over a 3 d period with 24 h recalls and a household food inventory in 2006 and used it to derive PCA and RRR dietary patterns. The outcomes were measured in 2009. Adults (n 4316) from the China Health and Nutrition Survey. The adjusted odds ratio for diabetes prevalence (HbA1c≥6·5 %), comparing the highest dietary pattern score quartile with the lowest, was 1·26 (95 % CI 0·76, 2·08) for a modern high-wheat pattern (PCA; wheat products, fruits, eggs, milk, instant noodles and frozen dumplings), 0·76 (95 % CI 0·49, 1·17) for a traditional southern pattern (PCA; rice, meat, poultry and fish) and 2·37 (95 % CI 1·56, 3·60) for the pattern derived with RRR. By comparing the dietary pattern structures of RRR and PCA, we found that the RRR pattern was also behaviourally meaningful. It combined the deleterious effects of the modern high-wheat pattern (high intakes of wheat buns and breads, deep-fried wheat and soya milk) with the deleterious effects of consuming the opposite of the traditional southern pattern (low intakes of rice, poultry and game, fish and seafood). Our findings suggest that using both PCA and RRR provided useful insights when studying the association of dietary patterns with diabetes.

  15. Using both Principal Component Analysis and Reduced Rank Regression to Study Dietary Patterns and Diabetes in Chinese Adults

    PubMed Central

    Batis, Carolina; Mendez, Michelle A.; Gordon-Larsen, Penny; Sotres-Alvarez, Daniela; Adair, Linda; Popkin, Barry

    2014-01-01

    Objective We examined the association between dietary patterns and diabetes using the strengths of two methods: principal component analysis (PCA) to identify the eating patterns of the population and reduced rank regression (RRR) to derive a pattern that explains the variation in hemoglobin A1c (HbA1c), homeostasis model of insulin resistance (HOMA-IR), and fasting glucose. Design We measured diet over a 3-day period with 24-hour recalls and a household food inventory in 2006 and used it to derive PCA and RRR dietary patterns. The outcomes were measured in 2009. Setting Adults (n = 4,316) from the China Health and Nutrition Survey. Results The adjusted odds ratio for diabetes prevalence (HbA1c ≥ 6.5%), comparing the highest dietary pattern score quartile to the lowest, was 1.26 (0.76, 2.08) for a modern high-wheat pattern (PCA; wheat products, fruits, eggs, milk, instant noodles and frozen dumplings), 0.76 (0.49, 1.17) for a traditional southern pattern (PCA; rice, meat, poultry, and fish), and 2.37 (1.56, 3.60) for the pattern derived with RRR. By comparing the dietary pattern structures of RRR and PCA, we found that the RRR pattern was also behaviorally meaningful. It combined the deleterious effects of the modern high-wheat (high intake of wheat buns and breads, deep-fried wheat, and soy milk) with the deleterious effects of consuming the opposite of the traditional southern (low intake of rice, poultry and game, fish and seafood). Conclusions Our findings suggest that using both PCA and RRR provided useful insights when studying the association of dietary patterns with diabetes. PMID:26784586

  16. Finding imaging patterns of structural covariance via Non-Negative Matrix Factorization.

    PubMed

    Sotiras, Aristeidis; Resnick, Susan M; Davatzikos, Christos

    2015-03-01

    In this paper, we investigate the use of Non-Negative Matrix Factorization (NNMF) for the analysis of structural neuroimaging data. The goal is to identify the brain regions that co-vary across individuals in a consistent way, hence potentially being part of underlying brain networks or otherwise influenced by underlying common mechanisms such as genetics and pathologies. NNMF offers a directly data-driven way of extracting relatively localized co-varying structural regions, thereby transcending limitations of Principal Component Analysis (PCA), Independent Component Analysis (ICA) and other related methods that tend to produce dispersed components of positive and negative loadings. In particular, leveraging upon the well known ability of NNMF to produce parts-based representations of image data, we derive decompositions that partition the brain into regions that vary in consistent ways across individuals. Importantly, these decompositions achieve dimensionality reduction via highly interpretable ways and generalize well to new data as shown via split-sample experiments. We empirically validate NNMF in two data sets: i) a Diffusion Tensor (DT) mouse brain development study, and ii) a structural Magnetic Resonance (sMR) study of human brain aging. We demonstrate the ability of NNMF to produce sparse parts-based representations of the data at various resolutions. These representations seem to follow what we know about the underlying functional organization of the brain and also capture some pathological processes. Moreover, we show that these low dimensional representations favorably compare to descriptions obtained with more commonly used matrix factorization methods like PCA and ICA. Copyright © 2014 Elsevier Inc. All rights reserved.

  17. Inhibitory effect of eupatilin and jaceosidin isolated from Artemisia princeps in IgE-induced hypersensitivity.

    PubMed

    Lee, Seung Hoon; Bae, Eun-Ah; Park, Eun-Kyung; Shin, Yong-Wook; Baek, Nam-In; Han, Eun-Joo; Chung, Hae-Gon; Kim, Dong-Hyun

    2007-12-15

    To understand the antiallergic effect of Artemisia princeps (AP), which has been found to show inhibitory activity against degranulation and a passive cutaneous anaphylaxis (PCA) reaction, eupatilin and jaceosidin, as the active components, were isolated by degranulation-inhibitory activity-guided fractionation, with their antiallergic activity investigated. These isolated components potently inhibited the release of beta-hexosaminidase from RBL-2H3 cells induced by the IgE-antigen complex, with IC(50) values of 3.4 and 4.5muM, respectively. Eupatilin and jaceosidin potently inhibited the PCA reaction and scratching behaviors induced by IgE- antigen complex and compound 48/80, respectively. Orally administered jaceosidin more potently inhibited the PCA reaction than that of eupatilin, although the PCA reaction-inhibitory activity of intraperitoneally administered jaceosidin was nearly the same as that of eupatilin. Eupatilin and jaceosidin inhibited the gene expressions of TNF-alpha and IL-4 in RBL-2H3 cells stimulated by IgE-antigen complex. Eupatilin and jaceosidin inhibited the activation of NF-kB. Based on these findings, eupatilin and jaceosidin may be useful for protection from the PCA and itching reactions, which are IgE-mediated representative skin allergic diseases.

  18. Characterization of Leaf Extracts of Schinus terebinthifolius Raddi by GC-MS and Chemometric Analysis

    PubMed Central

    Carneiro, Fabíola B.; Lopes, Pablo Q.; Ramalho, Ricardo C.; Scotti, Marcus T.; Santos, Sócrates G.; Soares, Luiz A. L.

    2017-01-01

    Background: Schinus terebinthifolius Raddi belongs to Anacardiacea family and is widely known as “aroeira.” This species originates from South America, and its extracts are used in folk medicine due to its therapeutic properties, which include antimicrobial, anti-inflammatory, and antipyretic effects. The complexity and variability of the chemical constitution of the herbal raw material establishes the quality of the respective herbal medicine products. Objective: Thus, the purpose of this study was to investigate the variability of the volatile compounds from leaves of S. terebinthifolius. Materials and Methods: The samples were collected from different states of the Northeast region of Brazil and analyzed with a gas chromatograph coupled to a mass spectrometer (GC-MS). The collected data were analyzed using multivariate data analysis. Results: The samples’ chromatograms, obtained by GC-MS, showed similar chemical profiles in a number of peaks, but some differences were observed in the intensity of these analytical markers. The chromatographic fingerprints obtained by GC-MS were suitable for discrimination of the samples; these results along with a statistical treatment (principal component analysis [PCA]) were used as a tool for comparative analysis between the different samples of S. terebinthifolius. Conclusion: The experimental data show that the PCA used in this study clustered the samples into groups with similar chemical profiles, which builds an appropriate approach to evaluate the similarity in the phytochemical pattern found in the different leaf samples. SUMMARY The leave extracts of Schinus terebinthifolius were obtained by turbo-extractionThe extracts were partitioned with hexane and analyzed by GC-MSThe chromatographic data were analyzed using the principal component analysis (PCA)The PCA plots showed the main compounds (phellandrene, limonene, and carene), which were used to group the samples from a different geographical location in accordance to their chemical similarity. Abbreviations used: AL: Alagoas, BA: Bahia, CE: Ceará, CPETEC: Center for Weather Forecasting and Climate Studies, GC-MS: Gas chromatograph coupled to a mass spectrometer, MA: Maranhão, MVA: Multivariate data analysis, PB: Paraíba, PC1: Direction that describes the maximum variance of the original data, PC2: Maximum direction variance of the data in the subspace orthogonal to PC1, PCA: Principal component analysis, PE: Pernambuco, PI: Piauí, RN: Rio Grande do Norte, SE: Sergipe. PMID:29142431

  19. Principal component analysis-based anatomical motion models for use in adaptive radiation therapy of head and neck cancer patients

    NASA Astrophysics Data System (ADS)

    Chetvertkov, Mikhail A.

    Purpose: To develop standard and regularized principal component analysis (PCA) models of anatomical changes from daily cone beam CTs (CBCTs) of head and neck (H&N) patients, assess their potential use in adaptive radiation therapy (ART), and to extract quantitative information for treatment response assessment. Methods: Planning CT (pCT) images of H&N patients were artificially deformed to create "digital phantom" images, which modeled systematic anatomical changes during Radiation Therapy (RT). Artificial deformations closely mirrored patients' actual deformations, and were interpolated to generate 35 synthetic CBCTs, representing evolving anatomy over 35 fractions. Deformation vector fields (DVFs) were acquired between pCT and synthetic CBCTs (i.e., digital phantoms), and between pCT and clinical CBCTs. Patient-specific standard PCA (SPCA) and regularized PCA (RPCA) models were built from these synthetic and clinical DVF sets. Eigenvectors, or eigenDVFs (EDVFs), having the largest eigenvalues were hypothesized to capture the major anatomical deformations during treatment. Modeled anatomies were used to assess the dose deviations with respect to the planned dose distribution. Results: PCA models achieve variable results, depending on the size and location of anatomical change. Random changes prevent or degrade SPCA's ability to detect underlying systematic change. RPCA is able to detect smaller systematic changes against the background of random fraction-to-fraction changes, and is therefore more successful than SPCA at capturing systematic changes early in treatment. SPCA models were less successful at modeling systematic changes in clinical patient images, which contain a wider range of random motion than synthetic CBCTs, while the regularized approach was able to extract major modes of motion. For dose assessment it has been shown that the modeled dose distribution was different from the planned dose for the parotid glands due to their shrinkage and shift into the higher dose volumes during the radiotherapy course. Modeled DVHs still underestimated the effect of parotid shrinkage due to the large compression factor (CF) used to acquire DVFs. Conclusion: Leading EDVFs from both PCA approaches have the potential to capture systematic anatomical changes during H&N radiotherapy when systematic changes are large enough with respect to random fraction-to-fraction changes. In all cases the RPCA approach appears to be more reliable than SPCA at capturing systematic changes, enabling dosimetric consequences to be projected to the future treatment fractions based on trends established early in a treatment course, or, potentially, based on population models. This work showed that PCA has a potential in identifying the major mode of anatomical changes during the radiotherapy course and subsequent use of this information in future dose predictions is feasible. Use of smaller CF values for DVFs is preferred, otherwise anatomical motion will be underestimated.

  20. The biometric-based module of smart grid system

    NASA Astrophysics Data System (ADS)

    Engel, E.; Kovalev, I. V.; Ermoshkina, A.

    2015-10-01

    Within Smart Grid concept the flexible biometric-based module base on Principal Component Analysis (PCA) and selective Neural Network is developed. The formation of the selective Neural Network the biometric-based module uses the method which includes three main stages: preliminary processing of the image, face localization and face recognition. Experiments on the Yale face database show that (i) selective Neural Network exhibits promising classification capability for face detection, recognition problems; and (ii) the proposed biometric-based module achieves near real-time face detection, recognition speed and the competitive performance, as compared to some existing subspaces-based methods.

Top