Sample records for component analysis pca

  1. Principal component and spatial correlation analysis of spectroscopic-imaging data in scanning probe microscopy.

    PubMed

    Jesse, Stephen; Kalinin, Sergei V

    2009-02-25

    An approach for the analysis of multi-dimensional, spectroscopic-imaging data based on principal component analysis (PCA) is explored. PCA selects and ranks relevant response components based on variance within the data. It is shown that for examples with small relative variations between spectra, the first few PCA components closely coincide with results obtained using model fitting, and this is achieved at rates approximately four orders of magnitude faster. For cases with strong response variations, PCA allows an effective approach to rapidly process, de-noise, and compress data. The prospects for PCA combined with correlation function analysis of component maps as a universal tool for data analysis and representation in microscopy are discussed.

  2. Priority of VHS Development Based in Potential Area using Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Meirawan, D.; Ana, A.; Saripudin, S.

    2018-02-01

    The current condition of VHS is still inadequate in quality, quantity and relevance. The purpose of this research is to analyse the development of VHS based on the development of regional potential by using principal component analysis (PCA) in Bandung, Indonesia. This study used descriptive qualitative data analysis using the principle of secondary data reduction component. The method used is PCA, carried out with the Minitab statistics software. The results of this study indicate the value of the lowest requirement is a priority of the construction of development VHS with a program of majors in accordance with the development of regional potential. The PCA scores show that the main priority in the development of VHS in Bandung is in Saguling, which has the lowest PCA value of 416.92 in area 1, Cihampelas with the lowest PCA value in region 2 and Padalarang with the lowest PCA value.

  3. Nonlinear Principal Components Analysis: Introduction and Application

    ERIC Educational Resources Information Center

    Linting, Marielle; Meulman, Jacqueline J.; Groenen, Patrick J. F.; van der Kooij, Anita J.

    2007-01-01

    The authors provide a didactic treatment of nonlinear (categorical) principal components analysis (PCA). This method is the nonlinear equivalent of standard PCA and reduces the observed variables to a number of uncorrelated principal components. The most important advantages of nonlinear over linear PCA are that it incorporates nominal and ordinal…

  4. Exploring patterns enriched in a dataset with contrastive principal component analysis.

    PubMed

    Abid, Abubakar; Zhang, Martin J; Bagaria, Vivek K; Zou, James

    2018-05-30

    Visualization and exploration of high-dimensional data is a ubiquitous challenge across disciplines. Widely used techniques such as principal component analysis (PCA) aim to identify dominant trends in one dataset. However, in many settings we have datasets collected under different conditions, e.g., a treatment and a control experiment, and we are interested in visualizing and exploring patterns that are specific to one dataset. This paper proposes a method, contrastive principal component analysis (cPCA), which identifies low-dimensional structures that are enriched in a dataset relative to comparison data. In a wide variety of experiments, we demonstrate that cPCA with a background dataset enables us to visualize dataset-specific patterns missed by PCA and other standard methods. We further provide a geometric interpretation of cPCA and strong mathematical guarantees. An implementation of cPCA is publicly available, and can be used for exploratory data analysis in many applications where PCA is currently used.
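    As an illustration only, the following minimal sketch shows one way to realise the contrastive idea. It assumes the formulation in which the contrastive direction is the top eigenvector of the target covariance minus alpha times the background covariance, and it is restricted to two variables so that the eigenvector has a closed form. The data, the function names and the value of alpha are hypothetical; this is not the authors' published implementation.

```python
import math

def cov2(points):
    """Return (a, b, c) of the sample covariance [[a, b], [b, c]] for 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    a = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    c = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return a, b, c

def top_direction(a, b, c):
    """Unit eigenvector for the largest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    return (math.cos(theta), math.sin(theta))

def cpca_direction(target, background, alpha):
    """Top eigenvector of cov(target) - alpha * cov(background) (assumed formulation)."""
    ta, tb, tc = cov2(target)
    ba, bb, bc = cov2(background)
    return top_direction(ta - alpha * ba, tb - alpha * bb, tc - alpha * bc)

# Hypothetical data: both sets share a large spread along the first axis,
# but only the target set has a noticeable spread along the second axis.
background = [(-3.0, 0.1), (3.0, -0.1), (-3.0, -0.1), (3.0, 0.1)]
target = [(-3.0, 1.0), (3.0, -1.0), (-3.0, -1.0), (3.0, 1.0)]

pca_dir = top_direction(*cov2(target))                     # ordinary PCA on the target
cpca_dir = cpca_direction(target, background, alpha=1.0)   # contrastive direction

print("PCA direction: ", pca_dir)
print("cPCA direction:", cpca_dir)
```

    In this toy case ordinary PCA on the target follows the first axis, which dominates both datasets, while the contrastive direction follows the second axis, where only the target varies.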

  5. Analysis of the principal component algorithm in phase-shifting interferometry.

    PubMed

    Vargas, J; Quiroga, J Antonio; Belenguer, T

    2011-06-15

    We recently presented a new asynchronous demodulation method for phase-sampling interferometry. The method is based on the principal component analysis (PCA) technique. In the former work, the PCA method was derived heuristically. In this work, we present an in-depth analysis of the PCA demodulation method.

  6. Stability of Nonlinear Principal Components Analysis: An Empirical Study Using the Balanced Bootstrap

    ERIC Educational Resources Information Center

    Linting, Marielle; Meulman, Jacqueline J.; Groenen, Patrick J. F.; van der Kooij, Anita J.

    2007-01-01

    Principal components analysis (PCA) is used to explore the structure of data sets containing linearly related numeric variables. Alternatively, nonlinear PCA can handle possibly nonlinearly related numeric as well as nonnumeric variables. For linear PCA, the stability of its solution can be established under the assumption of multivariate…

  7. Experimental Researches on the Durability Indicators and the Physiological Comfort of Fabrics using the Principal Component Analysis (PCA) Method

    NASA Astrophysics Data System (ADS)

    Hristian, L.; Ostafe, M. M.; Manea, L. R.; Apostol, L. L.

    2017-06-01

    The work examined the distribution of combed wool fabrics intended for the manufacture of outer clothing in terms of the values of durability and physiological comfort indices, using the mathematical model of Principal Component Analysis (PCA). PCA, as applied in this study, is a descriptive method for the analysis of multivariate (multi-dimensional) data that aims to reduce, in a controlled way, the number of variables (columns) of the data matrix as far as possible, to two or three. Therefore, based on the information about each group/assortment of fabrics, the goal is to replace the nine inter-correlated variables with only two or three new variables called components. The aim of PCA is to extract the smallest number of components that recover most of the total information contained in the initial data.

  8. Free energy landscape of a biomolecule in dihedral principal component space: sampling convergence and correspondence between structures and minima.

    PubMed

    Maisuradze, Gia G; Leitner, David M

    2007-05-15

    Dihedral principal component analysis (dPCA) has recently been developed and shown to display complex features of the free energy landscape of a biomolecule that may be absent in the free energy landscape plotted in principal component space due to mixing of internal and overall rotational motion that can occur in principal component analysis (PCA) [Mu et al., Proteins: Struct Funct Bioinfo 2005;58:45-52]. Another difficulty in the implementation of PCA is sampling convergence, which we address here for both dPCA and PCA using a tetrapeptide as an example. We find that for both methods the sampling convergence can be reached over a similar time. Minima in the free energy landscape in the space of the two largest dihedral principal components often correspond to unique structures, though we also find some distinct minima to correspond to the same structure.

  9. A stock market forecasting model combining two-directional two-dimensional principal component analysis and radial basis function neural network.

    PubMed

    Guo, Zhiqiang; Wang, Huaiqing; Yang, Jie; Miller, David J

    2015-01-01

    In this paper, we propose and implement a hybrid model combining two-directional two-dimensional principal component analysis ((2D)2PCA) and a Radial Basis Function Neural Network (RBFNN) to forecast stock market behavior. First, 36 stock market technical variables are selected as the input features, and a sliding window is used to obtain the input data of the model. Next, (2D)2PCA is utilized to reduce the dimension of the data and extract its intrinsic features. Finally, an RBFNN accepts the data processed by (2D)2PCA to forecast the next day's stock price or movement. The proposed model is used on the Shanghai stock market index, and the experiments show that the model achieves a good level of fitness. The proposed model is then compared with one that uses the traditional dimension reduction method principal component analysis (PCA) and independent component analysis (ICA). The empirical results show that the proposed model outperforms the PCA-based model, as well as alternative models based on ICA and on the multilayer perceptron.

  10. A Stock Market Forecasting Model Combining Two-Directional Two-Dimensional Principal Component Analysis and Radial Basis Function Neural Network

    PubMed Central

    Guo, Zhiqiang; Wang, Huaiqing; Yang, Jie; Miller, David J.

    2015-01-01

    In this paper, we propose and implement a hybrid model combining two-directional two-dimensional principal component analysis ((2D)2PCA) and a Radial Basis Function Neural Network (RBFNN) to forecast stock market behavior. First, 36 stock market technical variables are selected as the input features, and a sliding window is used to obtain the input data of the model. Next, (2D)2PCA is utilized to reduce the dimension of the data and extract its intrinsic features. Finally, an RBFNN accepts the data processed by (2D)2PCA to forecast the next day's stock price or movement. The proposed model is used on the Shanghai stock market index, and the experiments show that the model achieves a good level of fitness. The proposed model is then compared with one that uses the traditional dimension reduction method principal component analysis (PCA) and independent component analysis (ICA). The empirical results show that the proposed model outperforms the PCA-based model, as well as alternative models based on ICA and on the multilayer perceptron. PMID:25849483

  11. Temporal Processing of Dynamic Positron Emission Tomography via Principal Component Analysis in the Sinogram Domain

    NASA Astrophysics Data System (ADS)

    Chen, Zhe; Parker, B. J.; Feng, D. D.; Fulton, R.

    2004-10-01

    In this paper, we compare various temporal analysis schemes applied to dynamic PET for improved quantification, image quality and temporal compression purposes. We compare an optimal sampling schedule (OSS) design, principal component analysis (PCA) applied in the image domain, and principal component analysis applied in the sinogram domain; for region-of-interest quantification, sinogram-domain PCA is combined with the Huesman algorithm to quantify from the sinograms directly without requiring reconstruction of all PCA channels. Using a simulated phantom FDG brain study and three clinical studies, we evaluate the fidelity of the compressed data for estimation of local cerebral metabolic rate of glucose by a four-compartment model. Our results show that using a noise-normalized PCA in the sinogram domain gives similar compression ratio and quantitative accuracy to OSS, but with substantially better precision. These results indicate that sinogram-domain PCA for dynamic PET can be a useful preprocessing stage for PET compression and quantification applications.

  12. Incorporating biological information in sparse principal component analysis with application to genomic data.

    PubMed

    Li, Ziyi; Safo, Sandra E; Long, Qi

    2017-07-11

    Sparse principal component analysis (PCA) is a popular tool for dimensionality reduction, pattern recognition, and visualization of high dimensional data. It has been recognized that complex biological mechanisms occur through concerted relationships of multiple genes working in networks that are often represented by graphs. Recent work has shown that incorporating such biological information improves feature selection and prediction performance in regression analysis, but there has been limited work on extending this approach to PCA. In this article, we propose two new sparse PCA methods called Fused and Grouped sparse PCA that enable incorporation of prior biological information in variable selection. Our simulation studies suggest that, compared to existing sparse PCA methods, the proposed methods achieve higher sensitivity and specificity when the graph structure is correctly specified, and are fairly robust to misspecified graph structures. Application to a glioblastoma gene expression dataset identified pathways that are suggested in the literature to be related with glioblastoma. The proposed sparse PCA methods Fused and Grouped sparse PCA can effectively incorporate prior biological information in variable selection, leading to improved feature selection and more interpretable principal component loadings and potentially providing insights on molecular underpinnings of complex diseases.

  13. Descriptive Characteristics of Surface Water Quality in Hong Kong by a Self-Organising Map

    PubMed Central

    An, Yan; Zou, Zhihong; Li, Ranran

    2016-01-01

    In this study, principal component analysis (PCA) and a self-organising map (SOM) were used to analyse a complex dataset obtained from the river water monitoring stations in the Tolo Harbor and Channel Water Control Zone (Hong Kong), covering the period of 2009–2011. PCA was initially applied to identify the principal components (PCs) among the nonlinear and complex surface water quality parameters. SOM followed PCA, and was implemented to analyze the complex relationships and behaviors of the parameters. The results reveal that PCA reduced the multidimensional parameters to four significant PCs which are combinations of the original ones. The positive and inverse relationships of the parameters were shown explicitly by pattern analysis in the component planes. It was found that PCA and SOM are efficient tools to capture and analyze the behavior of multivariable, complex, and nonlinear related surface water quality data. PMID:26761018

  14. Descriptive Characteristics of Surface Water Quality in Hong Kong by a Self-Organising Map.

    PubMed

    An, Yan; Zou, Zhihong; Li, Ranran

    2016-01-08

    In this study, principal component analysis (PCA) and a self-organising map (SOM) were used to analyse a complex dataset obtained from the river water monitoring stations in the Tolo Harbor and Channel Water Control Zone (Hong Kong), covering the period of 2009-2011. PCA was initially applied to identify the principal components (PCs) among the nonlinear and complex surface water quality parameters. SOM followed PCA, and was implemented to analyze the complex relationships and behaviors of the parameters. The results reveal that PCA reduced the multidimensional parameters to four significant PCs which are combinations of the original ones. The positive and inverse relationships of the parameters were shown explicitly by pattern analysis in the component planes. It was found that PCA and SOM are efficient tools to capture and analyze the behavior of multivariable, complex, and nonlinear related surface water quality data.

  15. Principal components analysis in clinical studies.

    PubMed

    Zhang, Zhongheng; Castelló, Adela

    2017-09-01

    In multivariate analysis, independent variables are usually correlated with each other, which can introduce multicollinearity in the regression models. One approach to solve this problem is to apply principal components analysis (PCA) over these variables. This method uses orthogonal transformation to represent sets of potentially correlated variables with principal components (PC) that are linearly uncorrelated. PCs are ordered so that the first PC has the largest possible variance and only some components are selected to represent the correlated variables. As a result, the dimension of the variable space is reduced. This tutorial illustrates how to perform PCA in the R environment; the example is a simulated dataset in which two PCs are responsible for the majority of the variance in the data. Furthermore, the visualization of PCA is highlighted.
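    The steps summarised in this record (centre the variables, find orthogonal directions ordered by variance, keep only the leading ones) can be sketched as follows. This is a hedged illustration in Python rather than R, using only the standard library; power iteration with deflation is swapped in here for simplicity and is not the routine used in the tutorial. The data and all function names are hypothetical.

```python
import math

def covariance(rows):
    """Sample covariance matrix and mean-centred copy of the data."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    return cov, centered

def leading_eigenpair(mat, iters=200):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric PSD matrix."""
    p = len(mat)
    v = [1.0 / (i + 1) for i in range(p)]          # asymmetric start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:                            # matrix has no variance left
            break
        v = [x / norm for x in w]
    mv = [sum(mat[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(v[i] * mv[i] for i in range(p)), v  # Rayleigh quotient, vector

def pca(rows, k):
    """Return the first k components, their variances, and the projected scores."""
    cov, centered = covariance(rows)
    p = len(cov)
    components, variances = [], []
    for _ in range(k):
        lam, v = leading_eigenpair(cov)
        components.append(v)
        variances.append(lam)
        # Deflation: remove the variance already explained by this component.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    scores = [[sum(c[j] * v[j] for j in range(p)) for v in components]
              for c in centered]
    return components, variances, scores

# Hypothetical data: the second variable is roughly twice the first.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
components, variances, scores = pca(data, 2)
explained = [v / sum(variances) for v in variances]
print("explained variance ratio:", explained)
```

    With two strongly correlated variables, almost all of the variance falls on the first component, so the second can be dropped with little loss of information.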

  16. Investigation of domain walls in PPLN by confocal raman microscopy and PCA analysis

    NASA Astrophysics Data System (ADS)

    Shur, Vladimir Ya.; Zelenovskiy, Pavel; Bourson, Patrice

    2017-07-01

    Confocal Raman microscopy (CRM) is a powerful tool for investigation of ferroelectric domains. Mechanical stresses and electric fields existing in the vicinity of neutral and charged domain walls modify the frequency, intensity and width of spectral lines [1], thus allowing visualization of micro- and nanodomain structures both at the surface and in the bulk of the crystal [2,3]. Stresses and fields are naturally coupled in ferroelectrics due to the inverse piezoelectric effect and can hardly be separated in Raman spectra. PCA is a powerful statistical method for analysis of large data matrices, providing a set of orthogonal variables called principal components (PCs). PCA is widely used for classification of experimental data, for example, in crystallization experiments, for detection of small amounts of components in solid mixtures etc. [4,5]. In Raman spectroscopy PCA was applied for analysis of phase transitions and provided critical pressure with good accuracy [6]. In the present work we applied the Principal Component Analysis (PCA) method for the first time to the analysis of Raman spectra measured in periodically poled lithium niobate (PPLN). We found that principal components demonstrate different sensitivity to mechanical stresses and electric fields in the vicinity of the domain walls. This allowed us to separately visualize the spatial distribution of mechanical stresses and electric fields at the surface and in the bulk of PPLN.

  17. A feasibility study on age-related factors of wrist pulse using principal component analysis.

    PubMed

    Jang-Han Bae; Young Ju Jeon; Sanghun Lee; Jaeuk U Kim

    2016-08-01

    Various analysis methods for examining wrist pulse characteristics are needed for accurate pulse diagnosis. In this feasibility study, principal component analysis (PCA) was performed to observe age-related factors of wrist pulse from various analysis parameters. Forty subjects in their 20s and 40s participated, and their wrist pulse and respiration signals were acquired with a pulse tonometric device. After pre-processing of the signals, twenty analysis parameters that have been regarded as reflecting pulse characteristics were calculated and PCA was performed. As a result, we could reduce the complex parameters to a lower dimension, and age-related factors of wrist pulse were observed through new combined analysis parameters derived from PCA. These results demonstrate that PCA can be a useful tool for analyzing wrist pulse signals.

  18. Rotation of EOFs by the Independent Component Analysis: Towards A Solution of the Mixing Problem in the Decomposition of Geophysical Time Series

    NASA Technical Reports Server (NTRS)

    Aires, Filipe; Rossow, William B.; Chedin, Alain; Hansen, James E. (Technical Monitor)

    2001-01-01

    The Independent Component Analysis is a recently developed technique for component extraction. This new method requires the statistical independence of the extracted components, a stronger constraint that uses higher-order statistics, instead of the classical decorrelation, a weaker constraint that uses only second-order statistics. This technique has been used recently for the analysis of geophysical time series with the goal of investigating the causes of variability in observed data (i.e. exploratory approach). We demonstrate with a data simulation experiment that, if initialized with a Principal Component Analysis, the Independent Component Analysis performs a rotation of the classical PCA (or EOF) solution. This rotation uses no localization criterion like other Rotation Techniques (RT), only the global generalization of decorrelation by statistical independence is used. This rotation of the PCA solution seems to be able to solve the tendency of PCA to mix several physical phenomena, even when the signal is just their linear sum.

  19. Identification and classification of upper limb motions using PCA.

    PubMed

    Veer, Karan; Vig, Renu

    2018-03-28

    This paper describes the utility of principal component analysis (PCA) in classifying upper limb signals. PCA is a powerful tool for analyzing data of high dimension. Here, two different input strategies were explored. The first method uses upper arm dual-position-based myoelectric signal acquisition and the other solely uses PCA for classifying surface electromyogram (SEMG) signals. SEMG data from the biceps and the triceps brachii muscles and four independent muscle activities of the upper arm were measured in seven subjects (total dataset=56). The datasets used for the analysis are rotated by class-specific principal component matrices to decorrelate the measured data prior to feature extraction.

  20. A stable systemic risk ranking in China's banking sector: Based on principal component analysis

    NASA Astrophysics Data System (ADS)

    Fang, Libing; Xiao, Binqing; Yu, Honghai; You, Qixing

    2018-02-01

    In this paper, we compare five popular systemic risk rankings, and apply principal component analysis (PCA) model to provide a stable systemic risk ranking for the Chinese banking sector. Our empirical results indicate that five methods suggest vastly different systemic risk rankings for the same bank, while the combined systemic risk measure based on PCA provides a reliable ranking. Furthermore, according to factor loadings of the first component, PCA combined ranking is mainly based on fundamentals instead of market price data. We clearly find that price-based rankings are not as practical a method as fundamentals-based ones. This PCA combined ranking directly shows systemic risk contributions of each bank for banking supervision purpose and reminds banks to prevent and cope with the financial crisis in advance.

  1. Revealing the ultrafast outflow in IRAS 13224-3809 through spectral variability

    NASA Astrophysics Data System (ADS)

    Parker, M. L.; Alston, W. N.; Buisson, D. J. K.; Fabian, A. C.; Jiang, J.; Kara, E.; Lohfink, A.; Pinto, C.; Reynolds, C. S.

    2017-08-01

    We present an analysis of the long-term X-ray variability of the extreme narrow-line Seyfert 1 galaxy IRAS 13224-3809 using principal component analysis (PCA) and fractional excess variability (Fvar) spectra to identify model-independent spectral components. We identify a series of variability peaks in both the first PCA component and Fvar spectrum which correspond to the strongest predicted absorption lines from the ultrafast outflow (UFO) discovered by Parker et al. (2017). We also find higher order PCA components, which correspond to variability of the soft excess and reflection features. The subtle differences between RMS and PCA results argue that the observed flux-dependence of the absorption is due to increased ionization of the gas, rather than changes in column density or covering fraction. This result demonstrates that we can detect outflows from variability alone and that variability studies of UFOs are an extremely promising avenue for future research.

  2. The Influence Function of Principal Component Analysis by Self-Organizing Rule.

    PubMed

    Higuchi; Eguchi

    1998-07-28

    This article is concerned with a neural network approach to principal component analysis (PCA). An algorithm for PCA by the self-organizing rule has been proposed and its robustness observed through the simulation study by Xu and Yuille (1995). In this article, the robustness of the algorithm against outliers is investigated by using the theory of influence function. The influence function of the principal component vector is given in an explicit form. Through this expression, the method is shown to be robust against any directions orthogonal to the principal component vector. In addition, a statistic generated by the self-organizing rule is proposed to assess the influence of data in PCA.

  3. Common factor analysis versus principal component analysis: choice for symptom cluster research.

    PubMed

    Kim, Hee-Ju

    2008-03-01

    The purpose of this paper is to examine differences between two factor analytical methods and their relevance for symptom cluster research: common factor analysis (CFA) versus principal component analysis (PCA). Literature was critically reviewed to elucidate the differences between CFA and PCA. A secondary analysis (N = 84) was utilized to show the actual result differences from the two methods. CFA analyzes only the reliable common variance of data, while PCA analyzes all the variance of data. An underlying hypothetical process or construct is involved in CFA but not in PCA. PCA tends to increase factor loadings especially in a study with a small number of variables and/or low estimated communality. Thus, PCA is not appropriate for examining the structure of data. If the study purpose is to explain correlations among variables and to examine the structure of the data (this is usual for most cases in symptom cluster research), CFA provides a more accurate result. If the purpose of a study is to summarize data with a smaller number of variables, PCA is the choice. PCA can also be used as an initial step in CFA because it provides information regarding the maximum number and nature of factors. In using factor analysis for symptom cluster research, several issues need to be considered, including subjectivity of solution, sample size, symptom selection, and level of measure.

  4. Evaluation of skin melanoma in spectral range 450-950 nm using principal component analysis

    NASA Astrophysics Data System (ADS)

    Jakovels, D.; Lihacova, I.; Kuzmina, I.; Spigulis, J.

    2013-06-01

    Diagnostic potential of principal component analysis (PCA) of multi-spectral imaging data in the wavelength range 450-950 nm for distant skin melanoma recognition is discussed. Processing of the measured clinical data by means of PCA resulted in clear separation between malignant melanomas and pigmented nevi.

  5. Principle Component Analysis with Incomplete Data: A simulation of R pcaMethods package in Constructing an Environmental Quality Index with Missing Data

    EPA Science Inventory

    Missing data is a common problem in the application of statistical techniques. In principal component analysis (PCA), a technique for dimensionality reduction, incomplete data points are either discarded or imputed using interpolation methods. Such approaches are less valid when ...

  6. Application of principal component analysis (PCA) as a sensory assessment tool for fermented food products.

    PubMed

    Ghosh, Debasree; Chattopadhyay, Parimal

    2012-06-01

    The objective of the work was to use the method of quantitative descriptive analysis (QDA) to describe the sensory attributes of fermented food products prepared with the incorporation of lactic cultures. Panellists were selected and trained to evaluate various attributes, especially color and appearance, body texture, flavor, overall acceptability and acidity, of fermented food products such as cow milk curd and soymilk curd, idli, sauerkraut and probiotic ice cream. Principal component analysis (PCA) identified six significant principal components that accounted for more than 90% of the variance in the sensory attribute data. Overall product quality was modelled as a function of principal components using multiple least squares regression (R² = 0.8). The result from PCA was statistically analyzed by analysis of variance (ANOVA). These findings demonstrate the utility of quantitative descriptive analysis for identifying and measuring the fermented food product attributes that are important for consumer acceptability.

  7. 2L-PCA: a two-level principal component analyzer for quantitative drug design and its applications.

    PubMed

    Du, Qi-Shi; Wang, Shu-Qing; Xie, Neng-Zhong; Wang, Qing-Yan; Huang, Ri-Bo; Chou, Kuo-Chen

    2017-09-19

    A two-level principal component predictor (2L-PCA) was proposed based on the principal component analysis (PCA) approach. It can be used to quantitatively analyze various compounds and peptides about their functions or potentials to become useful drugs. One level is for dealing with the physicochemical properties of drug molecules, while the other level is for dealing with their structural fragments. The predictor has the self-learning and feedback features to automatically improve its accuracy. It is anticipated that 2L-PCA will become a very useful tool for timely providing various useful clues during the process of drug development.

  8. Application of principal component analysis to distinguish patients with schizophrenia from healthy controls based on fractional anisotropy measurements.

    PubMed

    Caprihan, A; Pearlson, G D; Calhoun, V D

    2008-08-15

    Principal component analysis (PCA) is often used to reduce the dimension of data before applying more sophisticated data analysis methods such as non-linear classification algorithms or independent component analysis. This practice is based on selecting components corresponding to the largest eigenvalues. If the ultimate goal is separation of data in two groups, then this set of components need not have the most discriminatory power. We measured the distance between two such populations using Mahalanobis distance and chose the eigenvectors to maximize it, a modified PCA method, which we call the discriminant PCA (DPCA). DPCA was applied to diffusion tensor-based fractional anisotropy images to distinguish age-matched schizophrenia subjects from healthy controls. The performance of the proposed method was evaluated by the leave-one-out method. We show that for this fractional anisotropy data set, the classification error with 60 components was close to the minimum error and that the Mahalanobis distance was twice as large with DPCA as with PCA. Finally, by masking the discriminant function with the white matter tracts of the Johns Hopkins University atlas, we identified the left superior longitudinal fasciculus as the tract which gave the least classification error. In addition, with six optimally chosen tracts the classification error was zero.

  9. Identification of the isomers using principal component analysis (PCA) method

    NASA Astrophysics Data System (ADS)

    Kepceoǧlu, Abdullah; Gündoǧdu, Yasemin; Ledingham, Kenneth William David; Kilic, Hamdi Sukur

    2016-03-01

    In this work, we have carried out a detailed statistical analysis of experimental mass spectra from xylene isomers. Principal Component Analysis (PCA) was used to identify the isomers, which cannot be distinguished using conventional statistical methods for interpretation of their mass spectra. Experiments have been carried out using a linear TOF-MS coupled to a femtosecond laser system as an energy source for the ionisation processes. We have performed experiments and collected data, which have been analysed and interpreted using PCA as a multivariate analysis of these spectra. This demonstrates the strength of the method in distinguishing isomers that cannot be identified using conventional mass analysis obtained through dissociative ionisation processes on these molecules. The dependence of the PCA results on the laser pulse energy and the background pressure in the spectrometers is presented in this work.

  10. PCA based clustering for brain tumor segmentation of T1w MRI images.

    PubMed

    Kaya, Irem Ersöz; Pehlivanlı, Ayça Çakmak; Sekizkardeş, Emine Gezmez; Ibrikci, Turgay

    2017-03-01

    Medical images are huge collections of information that are difficult to store and process, consuming extensive computing time. Therefore, reduction techniques are commonly used as a data pre-processing step to make the image data less complex so that high-dimensional data can be identified by an appropriate low-dimensional representation. PCA is one of the most popular multivariate methods for data reduction. This paper is focused on clustering of T1-weighted MRI images for brain tumor segmentation with dimension reduction by different common Principal Component Analysis (PCA) algorithms. Our primary aim is to present a comparison between different variations of PCA algorithms on MRIs for two cluster methods. The five most common PCA algorithms, namely the conventional PCA, Probabilistic Principal Component Analysis (PPCA), Expectation Maximization Based Principal Component Analysis (EM-PCA), Generalized Hebbian Algorithm (GHA), and Adaptive Principal Component Extraction (APEX), were applied to reduce dimensionality in advance of two clustering algorithms, K-Means and Fuzzy C-Means. In the study, T1-weighted MRI images of the human brain with brain tumor were used for clustering. In addition to the original size of 512 lines and 512 pixels per line, three more sizes, 256 × 256, 128 × 128 and 64 × 64, were included in the study to examine their effect on the methods. The obtained results were compared in terms of both the reconstruction errors and the Euclidean distance errors among the clustered images containing the same number of principal components. According to the findings, the PPCA obtained the best results among all others. Furthermore, the EM-PCA and the PPCA assisted the K-Means algorithm to accomplish the best clustering performance in the majority of cases, as well as achieving significant results with both clustering algorithms for all sizes of T1w MRI images.

  11. Comparison of common components analysis with principal components analysis and independent components analysis: Application to SPME-GC-MS volatolomic signatures.

    PubMed

    Bouhlel, Jihéne; Jouan-Rimbaud Bouveresse, Delphine; Abouelkaram, Said; Baéza, Elisabeth; Jondreville, Catherine; Travel, Angélique; Ratel, Jérémy; Engel, Erwan; Rutledge, Douglas N

    2018-02-01

    The aim of this work is to compare a novel exploratory chemometrics method, Common Components Analysis (CCA), with Principal Components Analysis (PCA) and Independent Components Analysis (ICA). CCA consists in adapting the multi-block statistical method known as Common Components and Specific Weights Analysis (CCSWA or ComDim) by applying it to a single data matrix, with one variable per block. As an application, the three methods were applied to SPME-GC-MS volatolomic signatures of livers in an attempt to reveal volatile organic compound (VOC) markers of chicken exposure to different types of micropollutants. An application of CCA to the initial SPME-GC-MS data revealed a drift in the sample Scores along CC2, as a function of injection order, probably resulting from time-related evolution in the instrument. This drift was eliminated by orthogonalization of the data set with respect to CC2, and the resulting data are used as the orthogonalized data input into each of the three methods. Since the first step in CCA is to norm-scale all the variables, preliminary data scaling has no effect on the results, so that CCA was applied only to orthogonalized SPME-GC-MS data, while PCA and ICA were applied to the "orthogonalized", "orthogonalized and Pareto-scaled", and "orthogonalized and autoscaled" data. The comparison showed that PCA results were highly dependent on the scaling of variables, contrary to ICA where the data scaling did not have a strong influence. Nevertheless, for both PCA and ICA the clearest separations of exposed groups were obtained after autoscaling of variables. The main part of this work was to compare the CCA results using the orthogonalized data with those obtained with PCA and ICA applied to orthogonalized and autoscaled variables. The clearest separations of exposed chicken groups were obtained by CCA. CCA Loadings also clearly identified the variables contributing most to the Common Components giving separations. The PCA Loadings did not highlight the most influencing variables for each separation, whereas the ICA Loadings highlighted the same variables as did CCA. This study shows the potential of CCA for the extraction of pertinent information from a data matrix, using a procedure based on an original optimisation criterion, to produce results that are complementary, and in some cases may be superior, to those of PCA and ICA. Copyright © 2017 Elsevier B.V. All rights reserved.
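
    The scaling dependence of PCA noted in this abstract can be illustrated with a minimal sketch. It assumes the usual chemometrics definitions (autoscaling divides each centered variable by its standard deviation, Pareto scaling by the square root of its standard deviation); the abstract does not spell these out, and the data below are synthetic and purely hypothetical.

```python
import numpy as np

def autoscale(X):
    """Mean-center each column and divide by its standard deviation."""
    Xc = X - X.mean(axis=0)
    return Xc / X.std(axis=0, ddof=1)

def pareto_scale(X):
    """Mean-center each column and divide by the square root of its standard deviation."""
    Xc = X - X.mean(axis=0)
    return Xc / np.sqrt(X.std(axis=0, ddof=1))

def first_pc(X_centered):
    """Return the first principal axis (unit vector) of already-centered data."""
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return Vt[0]

rng = np.random.default_rng(0)
# Hypothetical data: 50 samples, 3 variables with very different spreads.
X = rng.normal(size=(50, 3)) * np.array([100.0, 1.0, 0.01])

pc_raw = first_pc(X - X.mean(axis=0))    # dominated by the high-variance variable
pc_pareto = first_pc(pareto_scale(X))    # spread differences are reduced, not removed
pc_auto = first_pc(autoscale(X))         # every variable enters with unit variance
```

    Comparing `pc_raw`, `pc_pareto`, and `pc_auto` shows how the choice of scaling changes which variables drive the first component.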

  12. PCA-LBG-based algorithms for VQ codebook generation

    NASA Astrophysics Data System (ADS)

    Tsai, Jinn-Tsong; Yang, Po-Yuan

    2015-04-01

    Vector quantisation (VQ) codebooks are generated by combining principal component analysis (PCA) algorithms with Linde-Buzo-Gray (LBG) algorithms. All training vectors are grouped according to the projected values of the principal components. The PCA-LBG-based algorithms include (1) PCA-LBG-Median, which selects the median vector of each group, (2) PCA-LBG-Centroid, which adopts the centroid vector of each group, and (3) PCA-LBG-Random, which randomly selects a vector of each group. The LBG algorithm finds a codebook based on the better vectors sent to an initial codebook by the PCA. The PCA performs an orthogonal transformation to convert a set of potentially correlated variables into a set of variables that are not linearly correlated. Because the orthogonal transformation efficiently distinguishes test image vectors, the proposed PCA-LBG-based algorithm is expected to outperform conventional algorithms in designing VQ codebooks. The experimental results confirm that the proposed PCA-LBG-based algorithms indeed obtain better results compared to existing methods reported in the literature.
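
    The idea of seeding LBG with a PCA-derived codebook can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: grouping is done by sorting on the first principal component and splitting into equal-size groups, only the centroid variant is shown, and the function names and training data are hypothetical.

```python
import numpy as np

def pca_initial_codebook(vectors, n_codewords):
    """Build an initial codebook from groups ordered along the first principal component."""
    centered = vectors - vectors.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    projection = centered @ Vt[0]                 # projected value of each training vector
    order = np.argsort(projection)
    groups = np.array_split(order, n_codewords)   # assumption: equal-size groups
    return np.stack([vectors[g].mean(axis=0) for g in groups])  # centroid variant

def lbg_refine(vectors, codebook, n_iter=20):
    """LBG (generalized Lloyd) iterations: assign to nearest codeword, recompute centroids."""
    codebook = codebook.copy()
    for _ in range(n_iter):
        dists = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = dists.argmin(axis=1)
        for k in range(len(codebook)):
            members = vectors[nearest == k]
            if len(members) > 0:                  # keep the old codeword if a cell is empty
                codebook[k] = members.mean(axis=0)
    return codebook

def distortion(vectors, codebook):
    """Mean squared distance from each vector to its nearest codeword."""
    dists = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.min(axis=1).mean()

rng = np.random.default_rng(0)
training = rng.normal(size=(400, 16))   # hypothetical flattened 4x4 image blocks
initial = pca_initial_codebook(training, n_codewords=8)
final = lbg_refine(training, initial)
```

    Replacing the centroid in `pca_initial_codebook` with the median member or a randomly chosen member of each group would give analogues of the other two variants named in the abstract.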

  13. Use of Geochemistry Data Collected by the Mars Exploration Rover Spirit in Gusev Crater to Teach Geomorphic Zonation through Principal Components Analysis

    ERIC Educational Resources Information Center

    Rodrigue, Christine M.

    2011-01-01

    This paper presents a laboratory exercise used to teach principal components analysis (PCA) as a means of surface zonation. The lab was built around abundance data for 16 oxides and elements collected by the Mars Exploration Rover Spirit in Gusev Crater between Sol 14 and Sol 470. Students used PCA to reduce 15 of these into 3 components, which,…

  14. Complexity of free energy landscapes of peptides revealed by nonlinear principal component analysis.

    PubMed

    Nguyen, Phuong H

    2006-12-01

    Employing the recently developed hierarchical nonlinear principal component analysis (NLPCA) method of Saegusa et al. (Neurocomputing 2004;61:57-70 and IEICE Trans Inf Syst 2005;E88-D:2242-2248), the complexities of the free energy landscapes of several peptides, including triglycine, hexaalanine, and the C-terminal beta-hairpin of protein G, were studied. First, the performance of this NLPCA method was compared with the standard linear principal component analysis (PCA). In particular, we compared the two methods according to (1) their ability to reduce dimensionality and (2) how efficiently they represent peptide conformations in low-dimensional spaces spanned by the first few principal components. The study revealed that NLPCA reduces the dimensionality of the considered systems much better than PCA does. For example, to obtain a similar error due to representing the original beta-hairpin data in a low-dimensional space, one needs 4 and 21 principal components of NLPCA and PCA, respectively. Second, by representing the free energy landscapes of the considered systems as a function of the first two principal components obtained from PCA, we obtained relatively well-structured free energy landscapes. In contrast, the free energy landscapes of NLPCA are much more complicated, exhibiting many states which are hidden in the PCA maps, especially in the unfolded regions. Furthermore, the study also showed that many states in the PCA maps are mixtures of several peptide conformations, while those of the NLPCA maps are purer. This finding suggests that NLPCA should be used to capture the essential features of the systems. (c) 2006 Wiley-Liss, Inc.

  15. Q-mode versus R-mode principal component analysis for linear discriminant analysis (LDA)

    NASA Astrophysics Data System (ADS)

    Lee, Loong Chuen; Liong, Choong-Yeun; Jemain, Abdul Aziz

    2017-05-01

    Many studies apply Principal Component Analysis (PCA) as a preliminary visualization method, a variable construction method, or both. The focus of PCA can be on the samples (R-mode PCA) or on the variables (Q-mode PCA). Traditionally, R-mode PCA has been the usual approach to reduce high-dimensional data before the application of Linear Discriminant Analysis (LDA) to solve classification problems. The output of PCA is composed of two new matrices, known as the loadings and scores matrices. Each matrix can then be used to produce a plot: the loadings plot aids identification of important variables, whereas the scores plot presents the spatial distribution of samples on new axes that are also known as Principal Components (PCs). Fundamentally, the scores matrix always provides the input variables for building a classification model. A recent paper uses Q-mode PCA, but the focus of its analysis was not on the variables but on the samples. As a result, the authors exchanged the roles of the loadings and scores plots: clustering of samples was studied using the loadings plot, whereas the scores plot was used to identify important manifest variables. Therefore, the aim of this study is to statistically validate the proposed practice. Evaluation is based on the external error obtained from LDA models according to the number of PCs. In addition, bootstrapping was conducted to evaluate the external error of each of the LDA models. Results show that LDA models produced from R-mode PCs give logical performance and the matched external errors are unbiased, whereas those produced with Q-mode PCA show the opposite. We therefore conclude that PCs produced from Q-mode PCA are not statistically stable and thus should not be applied to problems of classifying samples, only variables. We hope this paper will provide some insight into these disputed issues.
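
    The two output matrices described in this abstract can be made concrete with a short sketch. It assumes a samples-by-variables data matrix that is column-centered and decomposed by SVD; the data and the variable name `lda_input` are hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))          # hypothetical: 30 samples (rows), 5 variables (columns)
Xc = X - X.mean(axis=0)               # column-center so PCA describes variance

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
loadings = Vt.T                       # (variables x PCs): weight of each variable on each PC
scores = U * s                        # (samples x PCs): coordinates of each sample on the PCs

n_pcs = 2
lda_input = scores[:, :n_pcs]         # samples described by the first two PCs
```

    Here `scores` has one row per sample and `loadings` one row per variable, which is why a scores plot shows how samples are distributed and a loadings plot shows which variables matter.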

  16. EMPCA and Cluster Analysis of Quasar Spectra: Construction and Application to Simulated Spectra

    NASA Astrophysics Data System (ADS)

    Marrs, Adam; Leighly, Karen; Wagner, Cassidy; Macinnis, Francis

    2017-01-01

    Quasars have complex spectra with emission lines influenced by many factors. Therefore, fully describing the spectrum requires specification of a large number of parameters, such as line equivalent width, blueshift, and ratios. Principal Component Analysis (PCA) aims to construct eigenvectors (or principal components) from the data with the goal of finding a few key parameters that can be used to predict the rest of the spectrum fairly well. Analysis of simulated quasar spectra was used to verify and justify our modified application of PCA. We used a variant of PCA called Weighted Expectation Maximization PCA (EMPCA; Bailey 2012) along with k-means cluster analysis to analyze simulated quasar spectra. Our approach combines both analytical methods to address two known problems with classical PCA. EMPCA uses weights to account for uncertainty and missing points in the spectra. K-means groups similar spectra together to address the nonlinearity of quasar spectra, specifically variance in blueshifts and widths of the emission lines. In producing and analyzing simulations, we first tested the effects of varying equivalent widths and blueshifts on the derived principal components, and explored the differences between standard PCA and EMPCA. We also tested the effects of varying signal-to-noise ratio. Next we used the results of fits to composite quasar spectra (see accompanying poster by Wagner et al.) to construct a set of realistic simulated spectra, and subjected those spectra to the EMPCA/k-means analysis. We concluded that our approach was validated when we found that the mean spectra from our k-means clusters derived from PCA projection coefficients reproduced the trends observed in the composite spectra. Furthermore, our method needed only two eigenvectors to identify both sets of correlations used to construct the simulations, as well as indicating the linear and nonlinear segments. Comparing this to regular PCA, which can require a dozen or more components, or to direct spectral analysis that may need measurement of 20 fit parameters, shows why the dual application of these two techniques is such a powerful tool.

  17. Strain Transient Detection Techniques: A Comparison of Source Parameter Inversions of Signals Isolated through Principal Component Analysis (PCA), Non-Linear PCA, and Rotated PCA

    NASA Astrophysics Data System (ADS)

    Lipovsky, B.; Funning, G. J.

    2009-12-01

    We compare several techniques for the analysis of geodetic time series with the ultimate aim of characterizing the physical processes represented therein. We compare three methods for the analysis of these data: Principal Component Analysis (PCA), Non-Linear PCA (NLPCA), and Rotated PCA (RPCA). We evaluate each method by its ability to isolate signals which may be any combination of low amplitude (near noise level), temporally transient, unaccompanied by seismic emissions, and small scale with respect to the spatial domain. PCA is a powerful tool for extracting structure from large datasets, traditionally realized either through the solution of an eigenvalue problem or through iterative methods. PCA is a transformation of the coordinate system of our data such that the new "principal" data axes retain maximal variance and minimal reconstruction error (Pearson, 1901; Hotelling, 1933). RPCA is achieved by an orthogonal transformation of the principal axes determined in PCA. In the analysis of meteorological data sets, RPCA has been seen to overcome domain shape dependencies, correct for sampling errors, and determine principal axes which more closely represent physical processes (e.g., Richman, 1986). NLPCA generalizes PCA such that principal axes are replaced by principal curves (e.g., Hsieh 2004). We achieve NLPCA through an auto-associative feed-forward neural network (Scholz, 2005). We show the geophysical relevance of these techniques by application of each to a synthetic data set. Results are compared by inverting principal axes to determine deformation source parameters. Temporal variability in source parameters, estimated by each method, is also compared.
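
    The two routes to PCA mentioned in this abstract (an eigenvalue problem or an iterative method) and the link between retained variance and reconstruction error can be sketched as below. The synthetic "station" data are hypothetical, and power iteration is used only as one simple example of an iterative method, not necessarily the one the authors have in mind.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical time series: 200 epochs at 6 stations, one shared signal plus noise.
signal = np.sin(np.linspace(0, 4 * np.pi, 200))[:, None] * rng.normal(size=(1, 6))
X = signal + 0.1 * rng.normal(size=(200, 6))
Xc = X - X.mean(axis=0)

# Eigenvalue route: principal axes are eigenvectors of the covariance matrix.
cov = Xc.T @ Xc / (len(Xc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)           # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Iterative route: power iteration converges to the leading principal axis.
v = np.ones(cov.shape[0]) / np.sqrt(cov.shape[0])
for _ in range(500):
    v = cov @ v
    v /= np.linalg.norm(v)

# Keeping k axes: the reconstruction error equals the variance that was discarded.
k = 1
recon = Xc @ eigvecs[:, :k] @ eigvecs[:, :k].T
error = ((Xc - recon) ** 2).sum() / (len(Xc) - 1)
```

    The last lines express the "maximal variance, minimal reconstruction error" property: `error` matches the sum of the eigenvalues of the discarded axes.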

  18. Structured Sparse Principal Components Analysis With the TV-Elastic Net Penalty.

    PubMed

    de Pierrefeu, Amicie; Lofstedt, Tommy; Hadj-Selem, Fouad; Dubois, Mathieu; Jardri, Renaud; Fovet, Thomas; Ciuciu, Philippe; Frouin, Vincent; Duchesnay, Edouard

    2018-02-01

    Principal component analysis (PCA) is an exploratory tool widely used in data analysis to uncover the dominant patterns of variability within a population. Despite its ability to represent a data set in a low-dimensional space, PCA's interpretability remains limited. Indeed, the components produced by PCA are often noisy or exhibit no visually meaningful patterns. Furthermore, the fact that the components are usually non-sparse may also impede interpretation, unless arbitrary thresholding is applied. However, in neuroimaging, it is essential to uncover clinically interpretable phenotypic markers that would account for the main variability in the brain images of a population. Recently, some alternatives to the standard PCA approach, such as sparse PCA (SPCA), have been proposed, their aim being to limit the density of the components. Nonetheless, sparsity alone does not entirely solve the interpretability problem in neuroimaging, since it may yield scattered and unstable components. We hypothesized that the incorporation of prior information regarding the structure of the data may lead to improved relevance and interpretability of brain patterns. We therefore present a simple extension of the popular PCA framework that adds structured sparsity penalties on the loading vectors in order to identify the few stable regions in the brain images that capture most of the variability. Such structured sparsity can be obtained by combining, e.g., ℓ1 and total variation (TV) penalties, where the TV regularization encodes information on the underlying structure of the data. This paper presents the structured SPCA (denoted SPCA-TV) optimization framework and its resolution. We demonstrate SPCA-TV's effectiveness and versatility on three different data sets. It can be applied to any kind of structured data, such as, e.g., N-dimensional array images or meshes of cortical surfaces. The gains of SPCA-TV over unstructured approaches (such as SPCA and ElasticNet PCA) or structured approaches (such as GraphNet PCA) are significant, since SPCA-TV reveals the variability within a data set in the form of intelligible brain patterns that are easier to interpret and more stable across different samples.

  19. Discrimination of Geographical Origin of Asian Garlic Using Isotopic and Chemical Datasets under Stepwise Principal Component Analysis.

    PubMed

    Liu, Tsang-Sen; Lin, Jhen-Nan; Peng, Tsung-Ren

    2018-01-16

    Isotopic compositions of δ²H, δ¹⁸O, δ¹³C, and δ¹⁵N and concentrations of 22 trace elements from garlic samples were analyzed and processed with stepwise principal component analysis (PCA) to discriminate garlic's country of origin among Asian regions including South Korea, Vietnam, Taiwan, and China. Results indicate that no single trace-element concentration or isotopic composition can accomplish the study's purpose, but the proposed stepwise PCA approach does allow for discrimination between countries on a regional basis. Sequentially, Step-1 PCA distinguishes garlic's country of origin among Taiwanese, South Korean, and Vietnamese samples; Step-2 PCA discriminates Chinese garlic from South Korean garlic; and Step-3 and Step-4 PCA discriminate Chinese garlic from Vietnamese garlic. In model tests, countries of origin of all audit samples were correctly discriminated by stepwise PCA. Consequently, this study demonstrates that stepwise PCA as applied is a simple and effective approach to discriminating country of origin among Asian garlics. © 2018 American Academy of Forensic Sciences.

  20. Visible micro-Raman spectroscopy of single human mammary epithelial cells exposed to x-ray radiation.

    PubMed

    Delfino, Ines; Perna, Giuseppe; Lasalvia, Maria; Capozzi, Vito; Manti, Lorenzo; Camerlingo, Carlo; Lepore, Maria

    2015-03-01

    A micro-Raman spectroscopy investigation has been performed in vitro on single human mammary epithelial cells after irradiation by graded x-ray doses. The analysis by principal component analysis (PCA) and interval-PCA (i-PCA) methods has allowed us to point out the small differences in the Raman spectra induced by irradiation. This experimental approach has enabled us to delineate radiation-induced changes in protein, nucleic acid, lipid, and carbohydrate content. In particular, the dose dependence of PCA and i-PCA components has been analyzed. Our results have confirmed that micro-Raman spectroscopy coupled to properly chosen data analysis methods is a very sensitive technique to detect early molecular changes at the single-cell level following exposure to ionizing radiation. This would help in developing innovative approaches to monitor radiation cancer radiotherapy outcome so as to reduce the overall radiation dose and minimize damage to the surrounding healthy cells, both aspects being of great importance in the field of radiation therapy.

  1. Application of principal component analysis for improvement of X-ray fluorescence images obtained by polycapillary-based micro-XRF technique

    NASA Astrophysics Data System (ADS)

    Aida, S.; Matsuno, T.; Hasegawa, T.; Tsuji, K.

    2017-07-01

    Micro X-ray fluorescence (micro-XRF) analysis is repeated as a means of producing elemental maps. In some cases, however, the XRF images of trace elements that are obtained are not clear due to high background intensity. To solve this problem, we applied principal component analysis (PCA) to XRF spectra. We focused on improving the quality of XRF images by applying PCA. XRF images of the dried residue of standard solution on the glass substrate were taken. The XRF intensities for the dried residue were analyzed before and after PCA. Standard deviations of XRF intensities in the PCA-filtered images were improved, leading to clear contrast of the images. This improvement of the XRF images was effective in cases where the XRF intensity was weak.
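
    One common way to use PCA as a filter on a stack of spectra is to keep only the leading components and reconstruct. The sketch below illustrates that idea on synthetic data; the abstract does not describe the authors' exact procedure, so the rank-k reconstruction, the choice k = 1, and all names and data here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_channels = 400, 64
channels = np.arange(n_channels)
peak = np.exp(-0.5 * ((channels - 30) / 2.0) ** 2)      # hypothetical fluorescence line
abundance = rng.uniform(0.0, 1.0, size=(n_pixels, 1))   # hypothetical elemental map, flattened
clean = abundance * peak                                # noise-free spectrum at every pixel
noisy = clean + 0.2 * rng.normal(size=clean.shape)      # add background noise

# PCA filter: keep the first k components of the pixel-by-channel matrix and reconstruct.
mean = noisy.mean(axis=0)
U, s, Vt = np.linalg.svd(noisy - mean, full_matrices=False)
k = 1                                                   # assumed number of retained components
filtered = (U[:, :k] * s[:k]) @ Vt[:k] + mean

rms_before = np.sqrt(((noisy - clean) ** 2).mean())
rms_after = np.sqrt(((filtered - clean) ** 2).mean())
```

    In this toy case the filtered spectra lie closer to the noise-free ones than the raw spectra do, which is the kind of contrast improvement the abstract reports for weak signals.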

  2. Applying robust variant of Principal Component Analysis as a damage detector in the presence of outliers

    NASA Astrophysics Data System (ADS)

    Gharibnezhad, Fahit; Mujica, Luis E.; Rodellar, José

    2015-01-01

    Using Principal Component Analysis (PCA) for Structural Health Monitoring (SHM) has received considerable attention over the past few years. PCA has been used not only as a direct method to identify, classify and localize damages but also as a significant primary step for other methods. Despite several positive specifications that PCA conveys, it is very sensitive to outliers. Outliers are anomalous observations that can affect the variance and the covariance as vital parts of PCA method. Therefore, the results based on PCA in the presence of outliers are not fully satisfactory. As a main contribution, this work suggests the use of robust variant of PCA not sensitive to outliers, as an effective way to deal with this problem in SHM field. In addition, the robust PCA is compared with the classical PCA in the sense of detecting probable damages. The comparison between the results shows that robust PCA can distinguish the damages much better than using classical one, and even in many cases allows the detection where classic PCA is not able to discern between damaged and non-damaged structures. Moreover, different types of robust PCA are compared with each other as well as with classical counterpart in the term of damage detection. All the results are obtained through experiments with an aircraft turbine blade using piezoelectric transducers as sensors and actuators and adding simulated damages.

  3. Principal component analysis as a tool for library design: a case study investigating natural products, brand-name drugs, natural product-like libraries, and drug-like libraries.

    PubMed

    Wenderski, Todd A; Stratton, Christopher F; Bauer, Renato A; Kopp, Felix; Tan, Derek S

    2015-01-01

    Principal component analysis (PCA) is a useful tool in the design and planning of chemical libraries. PCA can be used to reveal differences in structural and physicochemical parameters between various classes of compounds by displaying them in a convenient graphical format. Herein, we demonstrate the use of PCA to gain insight into structural features that differentiate natural products, synthetic drugs, natural product-like libraries, and drug-like libraries, and show how the results can be used to guide library design.

  4. Principal Component Analysis as a Tool for Library Design: A Case Study Investigating Natural Products, Brand-Name Drugs, Natural Product-Like Libraries, and Drug-Like Libraries

    PubMed Central

    Wenderski, Todd A.; Stratton, Christopher F.; Bauer, Renato A.; Kopp, Felix; Tan, Derek S.

    2015-01-01

    Principal component analysis (PCA) is a useful tool in the design and planning of chemical libraries. PCA can be used to reveal differences in structural and physicochemical parameters between various classes of compounds by displaying them in a convenient graphical format. Herein, we demonstrate the use of PCA to gain insight into structural features that differentiate natural products, synthetic drugs, natural product-like libraries, and drug-like libraries, and show how the results can be used to guide library design. PMID:25618349

  5. Sparse principal component analysis in medical shape modeling

    NASA Astrophysics Data System (ADS)

    Sjöstrand, Karl; Stegmann, Mikkel B.; Larsen, Rasmus

    2006-03-01

    Principal component analysis (PCA) is a widely used tool in medical image analysis for data reduction, model building, and data understanding and exploration. While PCA is a holistic approach where each new variable is a linear combination of all original variables, sparse PCA (SPCA) aims at producing easily interpreted models through sparse loadings, i.e. each new variable is a linear combination of a subset of the original variables. One of the aims of using SPCA is the possible separation of the results into isolated and easily identifiable effects. This article introduces SPCA for shape analysis in medicine. Results for three different data sets are given in relation to standard PCA and sparse PCA by simple thresholding of small loadings. Focus is on a recent algorithm for computing sparse principal components, but a review of other approaches is supplied as well. The SPCA algorithm has been implemented using Matlab and is available for download. The general behavior of the algorithm is investigated, and strengths and weaknesses are discussed. The original report on the SPCA algorithm argues that the ordering of modes is not an issue. We disagree on this point and propose several approaches to establish sensible orderings. A method that orders modes by decreasing variance and maximizes the sum of variances for all modes is presented and investigated in detail.

  6. A new statistical PCA-ICA algorithm for location of R-peaks in ECG.

    PubMed

    Chawla, M P S; Verma, H K; Kumar, Vinod

    2008-09-16

    The success of ICA in separating the independent components from the mixture depends on the properties of the electrocardiogram (ECG) recordings. This paper discusses some of the conditions of independent component analysis (ICA) that could affect the reliability of the separation, and evaluates issues related to the properties of the signals and the number of sources. Principal component analysis (PCA) scatter plots are plotted to indicate the diagnostic features in the presence and absence of base-line wander in interpreting the ECG signals. In this analysis, a statistical algorithm newly developed by the authors, based on the use of combined PCA-ICA for two correlated channels of 12-channel ECG data, is proposed. The ICA technique has been successfully implemented in identifying and removing noise and artifacts from ECG signals. Cleaned ECG signals are obtained using statistical measures like kurtosis and variance of variance after ICA processing. This paper also deals with the detection of QRS complexes in electrocardiograms using the combined PCA-ICA algorithm. The efficacy of the combined PCA-ICA algorithm lies in the fact that the location of the R-peaks is bounded from above and below by the location of the cross-over points, hence none of the peaks are ignored or missed.

  7. Optimized principal component analysis on coronagraphic images of the Fomalhaut system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meshkat, Tiffany; Kenworthy, Matthew A.; Quanz, Sascha P.

    We present the results of a study to optimize the principal component analysis (PCA) algorithm for planet detection, a new algorithm complementing angular differential imaging and locally optimized combination of images (LOCI) for increasing the contrast achievable next to a bright star. The stellar point spread function (PSF) is constructed by removing linear combinations of principal components, allowing the flux from an extrasolar planet to shine through. The number of principal components used determines how well the stellar PSF is globally modeled. Using more principal components may decrease the number of speckles in the final image, but also increases the background noise. We apply PCA to Fomalhaut Very Large Telescope NaCo images acquired at 4.05 μm with an apodized phase plate. We do not detect any companions, with a model dependent upper mass limit of 13-18 M_Jup from 4-10 AU. PCA achieves greater sensitivity than the LOCI algorithm for the Fomalhaut coronagraphic data by up to 1 mag. We make several adaptations to the PCA code and determine which of these prove the most effective at maximizing the signal-to-noise from a planet very close to its parent star. We demonstrate that optimizing the number of principal components used in PCA proves most effective for pulling out a planet signal.
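
    The PSF-subtraction idea in this abstract can be sketched in a few lines. This is a toy, noise-free illustration under assumptions that are not from the paper: frames are flattened to vectors, principal components come from a separate stack of reference frames, and the function name `pca_psf_subtract` and all data are hypothetical.

```python
import numpy as np

def pca_psf_subtract(science, references, n_components):
    """Subtract a PSF model built from the leading principal components of reference frames."""
    mean = references.mean(axis=0)
    _, _, Vt = np.linalg.svd(references - mean, full_matrices=False)
    basis = Vt[:n_components]                    # orthonormal principal components
    coeffs = basis @ (science - mean)            # projection of the science frame
    model = mean + coeffs @ basis                # modeled stellar PSF
    return science - model

rng = np.random.default_rng(0)
n_frames, n_pixels = 40, 256                     # hypothetical flattened 16x16 frames
modes = rng.normal(size=(5, n_pixels))           # hypothetical speckle patterns
references = rng.normal(size=(n_frames, 5)) @ modes
star = rng.normal(size=5) @ modes                # stellar PSF in the science frame
planet = np.zeros(n_pixels)
planet[100] = 5.0                                # hypothetical faint companion
science = star + planet

residual_1 = pca_psf_subtract(science, references, n_components=1)
residual_5 = pca_psf_subtract(science, references, n_components=5)
```

    The `n_components` argument is the tuning parameter the abstract identifies as most important: too few components leave stellar speckles in the residual, while each added component also removes a little of the planet's own flux.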

  8. A diffusion-matched principal component analysis (DM-PCA) based two-channel denoising procedure for high-resolution diffusion-weighted MRI

    PubMed Central

    Chang, Hing-Chiu; Bilgin, Ali; Bernstein, Adam; Trouard, Theodore P.

    2018-01-01

    Over the past several years, significant efforts have been made to improve the spatial resolution of diffusion-weighted imaging (DWI), aiming at better detecting subtle lesions and more reliably resolving white-matter fiber tracts. A major concern with high-resolution DWI is the limited signal-to-noise ratio (SNR), which may significantly offset the advantages of high spatial resolution. Although the SNR of DWI data can be improved by denoising in post-processing, existing denoising procedures may potentially reduce the anatomic resolvability of high-resolution imaging data. Additionally, non-Gaussian noise induced signal bias in low-SNR DWI data may not always be corrected with existing denoising approaches. Here we report an improved denoising procedure, termed diffusion-matched principal component analysis (DM-PCA), which comprises 1) identifying a group of (not necessarily neighboring) voxels that demonstrate very similar magnitude signal variation patterns along the diffusion dimension, 2) correcting low-frequency phase variations in complex-valued DWI data, 3) performing PCA along the diffusion dimension for real- and imaginary-components (in two separate channels) of phase-corrected DWI voxels with matched diffusion properties, 4) suppressing the noisy PCA components in real- and imaginary-components, separately, of phase-corrected DWI data, and 5) combining real- and imaginary-components of denoised DWI data. Our data show that the new two-channel (i.e., for real- and imaginary-components) DM-PCA denoising procedure performs reliably without noticeably compromising anatomic resolvability. Non-Gaussian noise induced signal bias could also be reduced with the new denoising method. The DM-PCA based denoising procedure should prove highly valuable for high-resolution DWI studies in research and clinical uses. PMID:29694400

  9. Fast principal component analysis for stacking seismic data

    NASA Astrophysics Data System (ADS)

    Wu, Juan; Bai, Min

    2018-04-01

    Stacking seismic data plays an indispensable role in many steps of the seismic data processing and imaging workflow. Optimal stacking of seismic data can help mitigate seismic noise and enhance the principal components to a great extent. Traditional average-based seismic stacking methods cannot obtain optimal performance when the ambient noise is extremely strong. We propose a principal component analysis (PCA) algorithm for stacking seismic data without being sensitive to noise level. Considering the computational bottleneck of the classic PCA algorithm in processing massive seismic data, we propose an efficient PCA algorithm to make the proposed method readily applicable for industrial applications. Two numerically designed examples and one real seismic data are used to demonstrate the performance of the presented method.

  10. Breast Shape Analysis With Curvature Estimates and Principal Component Analysis for Cosmetic and Reconstructive Breast Surgery.

    PubMed

    Catanuto, Giuseppe; Taher, Wafa; Rocco, Nicola; Catalano, Francesca; Allegra, Dario; Milotta, Filippo Luigi Maria; Stanco, Filippo; Gallo, Giovanni; Nava, Maurizio Bruno

    2018-03-20

    Breast shape is defined utilizing mainly qualitative assessment (full, flat, ptotic) or estimates, such as volume or distances between reference points, that cannot describe it reliably. We will quantitatively describe breast shape with two parameters derived from a statistical methodology denominated principal component analysis (PCA). We created a heterogeneous dataset of breast shapes acquired with a commercial infrared 3-dimensional scanner on which PCA was performed. We plotted on a Cartesian plane the two highest values of PCA for each breast (principal components 1 and 2). Testing of the methodology on a preoperative and postoperative surgical case and test-retest was performed by two operators. The first two principal components derived from PCA are able to characterize the shape of the breast included in the dataset. The test-retest demonstrated that different operators are able to obtain very similar values of PCA. The system is also able to identify major changes in the preoperative and postoperative stages of a two-stage reconstruction. Even minor changes were correctly detected by the system. This methodology can reliably describe the shape of a breast. An expert operator and a newly trained operator can reach similar results in a test/re-testing validation. Once developed and after further validation, this methodology could be employed as a good tool for outcome evaluation, auditing, and benchmarking.

  11. The fractal characteristic of facial anthropometric data for developing PCA fit test panels for youth born in central China.

    PubMed

    Yang, Lei; Wei, Ran; Shen, Henggen

    2017-01-01

    New principal component analysis (PCA) respirator fit test panels had been developed for current American and Chinese civilian workers based on anthropometric surveys. The PCA panels used the first two principal components (PCs) obtained from a set of 10 facial dimensions. Although the PCA panels for American and Chinese subjects adopted the bivariate framework with two PCs, the number of PCs retained in the PCA differed between Chinese subjects and Americans. For the Chinese youth group, the third PC should be retained in the PCA for developing new fit test panels. In this article, an additional number label (ANL) is used to explain the third PC in the PCA when the first two PCs are used to construct the PCA half-facepiece respirator fit test panel for the Chinese group. The three-dimensional box-counting method is proposed to estimate the ANLs by calculating fractal dimensions of the facial anthropometric data of the Chinese youth. The linear regression coefficients (R²) of the scale-free range are all over 0.960, which demonstrates that the facial anthropometric data of the Chinese youth have a fractal characteristic. The youth subjects born in Henan province have an ANL of 2.002, which is lower than that of the composite facial anthropometric data of Chinese subjects born in many provinces. Hence, Henan youth subjects have a self-similar facial anthropometric characteristic and should use the particular ANL (2.002) as an important tool along with the PCA panel. The ANL method proposed in this article not only provides a new methodology for quantifying the characteristics of facial anthropometric dimensions for any ethnic/racial group, but also extends the scope of PCA panel studies to higher dimensions.

  12. Cluster and principal component analysis based on SSR markers of Amomum tsao-ko in Jinping County of Yunnan Province

    NASA Astrophysics Data System (ADS)

    Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Xianwang, Zhou; Lu, Bingyue

    2017-08-01

    Amomum tsao-ko is a commercial plant that is used for various purposes in the medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Cluster analysis and 2-dimensional principal component analysis (PCA) were used to represent the genetic relations among Amomum tsao-ko samples using simple sequence repeat (SSR) markers. Cluster analysis clearly distinguished the sample groups. Two major clusters were formed: the first (Cluster I) consisted of 34 individuals and the second (Cluster II) consisted of 10 individuals; Cluster I, as the main group, contained multiple sub-clusters. PCA also showed 2 groups: PCA Group 1 included 29 individuals and PCA Group 2 included 12 individuals, consistent with the results of cluster analysis. The purpose of the present investigation was to provide information on the genetic relationships of Amomum tsao-ko germplasm resources in the main producing areas and to provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.

  13. Stationary Wavelet-based Two-directional Two-dimensional Principal Component Analysis for EMG Signal Classification

    NASA Astrophysics Data System (ADS)

    Ji, Yi; Sun, Shanlin; Xie, Hong-Bo

    2017-06-01

    Discrete wavelet transform (WT) followed by principal component analysis (PCA) has been a powerful approach for the analysis of biomedical signals. Wavelet coefficients at various scales and channels were usually transformed into a one-dimensional array, causing issues such as the curse of dimensionality dilemma and small sample size problem. In addition, lack of time-shift invariance of WT coefficients can be modeled as noise and degrades the classifier performance. In this study, we present a stationary wavelet-based two-directional two-dimensional principal component analysis (SW2D2PCA) method for the efficient and effective extraction of essential feature information from signals. Time-invariant multi-scale matrices are constructed in the first step. The two-directional two-dimensional principal component analysis then operates on the multi-scale matrices to reduce the dimension, rather than vectors in conventional PCA. Results are presented from an experiment to classify eight hand motions using 4-channel electromyographic (EMG) signals recorded in healthy subjects and amputees, which illustrates the efficiency and effectiveness of the proposed method for biomedical signal analysis.

  14. From measurements to metrics: PCA-based indicators of cyber anomaly

    NASA Astrophysics Data System (ADS)

    Ahmed, Farid; Johnson, Tommy; Tsui, Sonia

    2012-06-01

    We present a framework of the application of Principal Component Analysis (PCA) to automatically obtain meaningful metrics from intrusion detection measurements. In particular, we report the progress made in applying PCA to analyze the behavioral measurements of malware and provide some preliminary results in selecting dominant attributes from an arbitrary number of malware attributes. The results will be useful in formulating an optimal detection threshold in the principal component space, which can both validate and augment existing malware classifiers.

  15. RECENT APPLICATIONS OF SOURCE APPORTIONMENT METHODS AND RELATED NEEDS

    EPA Science Inventory

    Traditional receptor modeling studies have utilized factor analysis (like principal component analysis, PCA) and/or Chemical Mass Balance (CMB) to assess source influences. The limitation of these approaches is that PCA is qualitative and CMB requires the input of source pr...

  16. A measure for objects clustering in principal component analysis biplot: A case study in inter-city buses maintenance cost data

    NASA Astrophysics Data System (ADS)

    Ginanjar, Irlandia; Pasaribu, Udjianna S.; Indratno, Sapto W.

    2017-03-01

    This article presents the application of the principal component analysis (PCA) biplot for the needs of data mining. It aims to simplify and objectify the methods for clustering objects in a PCA biplot. The novelty of this paper is a measure that can be used to objectify the clustering of objects in a PCA biplot. Orthonormal eigenvectors, which are the coefficients of a principal component model, represent an association between principal components and initial variables. The existence of this association is a valid ground for clustering objects based on principal axes values; thus, if m principal axes are used in the PCA, then the objects can be classified into 2m clusters. The inter-city buses are clustered based on maintenance cost data using a two-principal-axes PCA biplot. The buses are clustered into four groups. The first group is the buses with high maintenance costs, especially for lube and brake canvass. The second group is the buses with high maintenance costs, especially for tire and filter. The third group is the buses with low maintenance costs, especially for lube and brake canvass. The fourth group is the buses with low maintenance costs, especially for tire and filter.

  17. Perturbational formulation of principal component analysis in molecular dynamics simulation.

    PubMed

    Koyama, Yohei M; Kobayashi, Tetsuya J; Tomoda, Shuji; Ueda, Hiroki R

    2008-10-01

    Conformational fluctuations of a molecule are important to its function since such intrinsic fluctuations enable the molecule to respond to the external environmental perturbations. For extracting large conformational fluctuations, which predict the primary conformational change by the perturbation, principal component analysis (PCA) has been used in molecular dynamics simulations. However, several versions of PCA, such as Cartesian coordinate PCA and dihedral angle PCA (dPCA), are limited to use with molecules with a single dominant state or proteins where the dihedral angle represents an important internal coordinate. Other PCAs with general applicability, such as the PCA using pairwise atomic distances, do not represent the physical meaning clearly. Therefore, a formulation that provides general applicability and clearly represents the physical meaning is yet to be developed. For developing such a formulation, we consider the conformational distribution change by the perturbation with arbitrary linearly independent perturbation functions. Within the second order approximation of the Kullback-Leibler divergence by the perturbation, the PCA can be naturally interpreted as a method for (1) decomposing a given perturbation into perturbations that independently contribute to the conformational distribution change or (2) successively finding the perturbation that induces the largest conformational distribution change. In this perturbational formulation of PCA, (i) the eigenvalue measures the Kullback-Leibler divergence from the unperturbed to perturbed distributions, (ii) the eigenvector identifies the combination of the perturbation functions, and (iii) the principal component determines the probability change induced by the perturbation. Based on this formulation, we propose a PCA using potential energy terms, and we designate it as potential energy PCA (PEPCA). The PEPCA provides both general applicability and clear physical meaning. 
    For demonstrating its power, we apply the PEPCA to an alanine dipeptide molecule in vacuum as a minimal model of a nonsingle dominant conformational biomolecule. The first and second principal components clearly characterize two stable states and the transition state between them. Positive and negative components with larger absolute values of the first and second eigenvectors identify the electrostatic interactions, which stabilize or destabilize each stable state and the transition state. Our result therefore indicates that PCA can be applied, by carefully selecting the perturbation functions, not only to identify the molecular conformational fluctuation but also to predict the conformational distribution change by the perturbation beyond the limitation of the previous methods.

  18. Perturbational formulation of principal component analysis in molecular dynamics simulation

    NASA Astrophysics Data System (ADS)

    Koyama, Yohei M.; Kobayashi, Tetsuya J.; Tomoda, Shuji; Ueda, Hiroki R.

    2008-10-01

    Conformational fluctuations of a molecule are important to its function since such intrinsic fluctuations enable the molecule to respond to the external environmental perturbations. For extracting large conformational fluctuations, which predict the primary conformational change by the perturbation, principal component analysis (PCA) has been used in molecular dynamics simulations. However, several versions of PCA, such as Cartesian coordinate PCA and dihedral angle PCA (dPCA), are limited to use with molecules with a single dominant state or proteins where the dihedral angle represents an important internal coordinate. Other PCAs with general applicability, such as the PCA using pairwise atomic distances, do not represent the physical meaning clearly. Therefore, a formulation that provides general applicability and clearly represents the physical meaning is yet to be developed. For developing such a formulation, we consider the conformational distribution change by the perturbation with arbitrary linearly independent perturbation functions. Within the second order approximation of the Kullback-Leibler divergence by the perturbation, the PCA can be naturally interpreted as a method for (1) decomposing a given perturbation into perturbations that independently contribute to the conformational distribution change or (2) successively finding the perturbation that induces the largest conformational distribution change. In this perturbational formulation of PCA, (i) the eigenvalue measures the Kullback-Leibler divergence from the unperturbed to perturbed distributions, (ii) the eigenvector identifies the combination of the perturbation functions, and (iii) the principal component determines the probability change induced by the perturbation. Based on this formulation, we propose a PCA using potential energy terms, and we designate it as potential energy PCA (PEPCA). The PEPCA provides both general applicability and clear physical meaning. 
    For demonstrating its power, we apply the PEPCA to an alanine dipeptide molecule in vacuum as a minimal model of a nonsingle dominant conformational biomolecule. The first and second principal components clearly characterize two stable states and the transition state between them. Positive and negative components with larger absolute values of the first and second eigenvectors identify the electrostatic interactions, which stabilize or destabilize each stable state and the transition state. Our result therefore indicates that PCA can be applied, by carefully selecting the perturbation functions, not only to identify the molecular conformational fluctuation but also to predict the conformational distribution change by the perturbation beyond the limitation of the previous methods.

  19. Principal component analysis of indocyanine green fluorescence dynamics for diagnosis of vascular diseases

    NASA Astrophysics Data System (ADS)

    Seo, Jihye; An, Yuri; Lee, Jungsul; Choi, Chulhee

    2015-03-01

    Indocyanine green (ICG), a near-infrared fluorophore, has been used in visualization of vascular structure and non-invasive diagnosis of vascular disease. Although many imaging techniques have been developed, there are still limitations in diagnosis of vascular diseases. We have recently developed a minimally invasive diagnostics system based on ICG fluorescence imaging for sensitive detection of vascular insufficiency. In this study, we used principal component analysis (PCA) to examine the ICG spatiotemporal profile and to obtain pathophysiological information from ICG dynamics. Here we demonstrated that principal components of ICG dynamics in both feet showed significant differences between normal controls and diabetic patients with vascular complications. We extracted the PCA time courses of the first three components and found a distinct pattern in diabetic patients. We propose that PCA of ICG dynamics reveals better classification performance compared to fluorescence intensity analysis. We anticipate that specific features of spatiotemporal ICG dynamics can be useful in diagnosis of various vascular diseases.

  20. Multivariate analysis for scanning tunneling spectroscopy data

    NASA Astrophysics Data System (ADS)

    Yamanishi, Junsuke; Iwase, Shigeru; Ishida, Nobuyuki; Fujita, Daisuke

    2018-01-01

    We applied principal component analysis (PCA) to two-dimensional tunneling spectroscopy (2DTS) data obtained on a Si(111)-(7 × 7) surface to explore the effectiveness of multivariate analysis for interpreting 2DTS data. We demonstrated that several components that originated mainly from specific atoms at the Si(111)-(7 × 7) surface can be extracted by PCA. Furthermore, we showed that hidden components in the tunneling spectra can be decomposed (peak separation), which is difficult to achieve with normal 2DTS analysis without the support of theoretical calculations. Our analysis showed that multivariate analysis can be an additional powerful way to analyze 2DTS data and extract hidden information from a large amount of spectroscopic data.

  1. Principal Component Analysis of Thermographic Data

    NASA Technical Reports Server (NTRS)

    Winfree, William P.; Cramer, K. Elliott; Zalameda, Joseph N.; Howell, Patricia A.; Burke, Eric R.

    2015-01-01

    Principal Component Analysis (PCA) has been shown effective for reducing thermographic NDE data. While a reliable technique for enhancing the visibility of defects in thermal data, PCA can be computationally intense and time consuming when applied to the large data sets typical in thermography. Additionally, PCA can experience problems when very large defects are present (defects that dominate the field-of-view), since the calculation of the eigenvectors is now governed by the presence of the defect, not the "good" material. To increase the processing speed and to minimize the negative effects of large defects, an alternative method of PCA is being pursued where a fixed set of eigenvectors, generated from an analytic model of the thermal response of the material under examination, is used to process the thermal data from composite materials. This method has been applied for characterization of flaws.
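    The fixed-eigenvector idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `project_onto_fixed_basis`, the two basis vectors, and the pixel time series are all hypothetical, and the basis here is hand-picked rather than derived from an analytic thermal model. The point is only that once the basis is fixed, processing reduces to dot products, with no eigen-decomposition of the measured data.

```python
def project_onto_fixed_basis(pixel_series, basis):
    """Project each pixel's time series onto a fixed set of basis vectors.

    pixel_series: list of time series, one per pixel (equal lengths).
    basis: list of precomputed basis vectors (assumed orthonormal here).
    Returns one list of coefficients per pixel, one coefficient per vector.
    """
    return [
        [sum(s * b for s, b in zip(series, vec)) for vec in basis]
        for series in pixel_series
    ]


# Hypothetical orthonormal basis over four time samples (placeholders only).
basis = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
]

# Hypothetical thermal responses for two pixels.
pixels = [
    [4.0, 4.0, 2.0, 2.0],
    [1.0, 1.0, 1.0, 1.0],
]

coefficients = project_onto_fixed_basis(pixels, basis)
print(coefficients)
```

    Because the basis does not depend on the measured frame, a large defect in the field of view cannot reshape it, which is the motivation the abstract gives for this variant.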

  2. An application of principal component analysis to the clavicle and clavicle fixation devices.

    PubMed

    Daruwalla, Zubin J; Courtis, Patrick; Fitzpatrick, Clare; Fitzpatrick, David; Mullett, Hannan

    2010-03-26

    Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories. Twenty-one high-resolution computerized tomography scans of the clavicle were reconstructed and analyzed using a specifically developed statistical software package. After performing statistical shape analysis, PCA was applied to study the factors that account for anatomical variation. The first principal component representing size accounted for 70.5 percent of anatomical variation. The addition of a further three principal components accounted for almost 87 percent. Using statistical shape analysis, clavicles in males have a greater lateral depth and are longer, wider and thicker than in females. However, the sternal angle in females is larger than in males. PCA confirmed these differences between genders but also noted that men exhibit greater variance and classified clavicles into five morphological groups. This unique approach is the first that standardizes a clavicular orientation. It provides information that is useful to both, the biomedical engineer and clinician. Other applications include implant design with regard to modifying current or designing future clavicle fixation devices. Our findings support the need for further development of clavicle fixation devices and the questioning of whether gender-specific devices are necessary.
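    The percentages quoted above (variation accounted for by the first component, then by the first four together) are cumulative explained-variance figures. A minimal sketch of how such figures are obtained from PCA eigenvalues is given below; the helper name `explained_variance` and the eigenvalues are hypothetical and are not taken from the clavicle study.

```python
def explained_variance(eigenvalues):
    """Return (ratios, cumulative) for eigenvalues sorted in descending order.

    Each ratio is one eigenvalue divided by the sum of all eigenvalues;
    cumulative[k] is the share of variance captured by the first k+1 components.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    ratios = [ev / total for ev in ordered]
    cumulative = []
    running = 0.0
    for ratio in ratios:
        running += ratio
        cumulative.append(running)
    return ratios, cumulative


# Hypothetical eigenvalues of a covariance matrix (illustrative only).
eigenvalues = [7.0, 1.0, 0.5, 0.5, 1.0]
ratios, cumulative = explained_variance(eigenvalues)
print([round(r, 3) for r in ratios])
print([round(c, 3) for c in cumulative])
```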

  3. Multilevel principal component analysis (mPCA) in shape analysis: A feasibility study in medical and dental imaging.

    PubMed

    Farnell, D J J; Popat, H; Richmond, S

    2016-06-01

    Methods used in image processing should reflect any multilevel structures inherent in the image dataset or they run the risk of functioning inadequately. We wish to test the feasibility of multilevel principal components analysis (PCA) to build active shape models (ASMs) for cases relevant to medical and dental imaging. Multilevel PCA was used to carry out model fitting to sets of landmark points and it was compared to the results of "standard" (single-level) PCA. Proof of principle was tested by applying mPCA to model basic peri-oral expressions (happy, neutral, sad) approximated to the junction between the mouth/lips. Monte Carlo simulations were used to create this data which allowed exploration of practical implementation issues such as the number of landmark points, number of images, and number of groups (i.e., "expressions" for this example). To further test the robustness of the method, mPCA was subsequently applied to a dental imaging dataset utilising landmark points (placed by different clinicians) along the boundary of mandibular cortical bone in panoramic radiographs of the face. Changes of expression that varied between groups were modelled correctly at one level of the model and changes in lip width that varied within groups at another for the Monte Carlo dataset. Extreme cases in the test dataset were modelled adequately by mPCA but not by standard PCA. Similarly, variations in the shape of the cortical bone were modelled by one level of mPCA and variations between the experts at another for the panoramic radiographs dataset. Results for mPCA were found to be comparable to those of standard PCA for point-to-point errors via miss-one-out testing for this dataset. These errors reduce with increasing number of eigenvectors/values retained, as expected. We have shown that mPCA can be used in shape models for dental and medical image processing. mPCA was found to provide more control and flexibility when compared to standard "single-level" PCA. 
    Specifically, mPCA is preferable to "standard" PCA when multiple levels occur naturally in the dataset. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  4. Comparison of multi-subject ICA methods for analysis of fMRI data

    PubMed Central

    Erhardt, Erik Barry; Rachakonda, Srinivas; Bedrick, Edward; Allen, Elena; Adali, Tülay; Calhoun, Vince D.

    2010-01-01

    Spatial independent component analysis (ICA) applied to functional magnetic resonance imaging (fMRI) data identifies functionally connected networks by estimating spatially independent patterns from their linearly mixed fMRI signals. Several multi-subject ICA approaches estimating subject-specific time courses (TCs) and spatial maps (SMs) have been developed, however there has not yet been a full comparison of the implications of their use. Here, we provide extensive comparisons of four multi-subject ICA approaches in combination with data reduction methods for simulated and fMRI task data. For multi-subject ICA, the data first undergo reduction at the subject and group levels using principal component analysis (PCA). Comparisons of subject-specific, spatial concatenation, and group data mean subject-level reduction strategies using PCA and probabilistic PCA (PPCA) show that computationally intensive PPCA is equivalent to PCA, and that subject-specific and group data mean subject-level PCA are preferred because of well-estimated TCs and SMs. Second, aggregate independent components are estimated using either noise free ICA or probabilistic ICA (PICA). Third, subject-specific SMs and TCs are estimated using back-reconstruction. We compare several direct group ICA (GICA) back-reconstruction approaches (GICA1-GICA3) and an indirect back-reconstruction approach, spatio-temporal regression (STR, or dual regression). Results show the earlier group ICA (GICA1) approximates STR, however STR has contradictory assumptions and may show mixed-component artifacts in estimated SMs. Our evidence-based recommendation is to use GICA3, introduced here, with subject-specific PCA and noise-free ICA, providing the most robust and accurate estimated SMs and TCs in addition to offering an intuitive interpretation. PMID:21162045

  5. Principal component analysis of the CT density histogram to generate parametric response maps of COPD

    NASA Astrophysics Data System (ADS)

    Zha, N.; Capaldi, D. P. I.; Pike, D.; McCormack, D. G.; Cunningham, I. A.; Parraga, G.

    2015-03-01

    Pulmonary x-ray computed tomography (CT) may be used to characterize emphysema and airways disease in patients with chronic obstructive pulmonary disease (COPD). One analysis approach, parametric response mapping (PRM), utilizes registered inspiratory and expiratory CT image volumes and CT-density-histogram thresholds, but there is no consensus regarding the threshold values used, or their clinical meaning. Principal-component-analysis (PCA) of the CT density histogram can be exploited to quantify emphysema using data-driven CT-density-histogram thresholds. Thus, the objective of this proof-of-concept demonstration was to develop a PRM approach using PCA-derived thresholds in COPD patients and ex-smokers without airflow limitation. Methods: Fifteen COPD ex-smokers and 5 normal ex-smokers were evaluated. Thoracic CT images were also acquired at full inspiration and full expiration and these images were non-rigidly co-registered. PCA was performed for the CT density histograms, from which the components with the highest eigenvalues greater than one were summed. Since the values of the principal component curve correlate directly with the variability in the sample, the maximum and minimum points on the curve were used as threshold values for the PCA-adjusted PRM technique. Results: A significant correlation was determined between conventional and PCA-adjusted PRM with 3He MRI apparent diffusion coefficient (p<0.001), with CT RA950 (p<0.0001), as well as with 3He MRI ventilation defect percent, a measurement of both small airways disease (p=0.049 and p=0.06, respectively) and emphysema (p=0.02). Conclusions: PRM generated using PCA thresholds of the CT density histogram showed significant correlations with CT and 3He MRI measurements of emphysema, but not airways disease.

  6. IMPROVED SEARCH OF PRINCIPAL COMPONENT ANALYSIS DATABASES FOR SPECTRO-POLARIMETRIC INVERSION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Casini, R.; Lites, B. W.; Ramos, A. Asensio

    2013-08-20

    We describe a simple technique for the acceleration of spectro-polarimetric inversions based on principal component analysis (PCA) of Stokes profiles. This technique involves the indexing of the database models based on the sign of the projections (PCA coefficients) of the first few relevant orders of principal components of the four Stokes parameters. In this way, each model in the database can be attributed a distinctive binary number of 2^{4n} bits, where n is the number of PCA orders used for the indexing. Each of these binary numbers (indices) identifies a group of "compatible" models for the inversion of a given set of observed Stokes profiles sharing the same index. The complete set of the binary numbers so constructed evidently determines a partition of the database. The search of the database for the PCA inversion of spectro-polarimetric data can profit greatly from this indexing. In practical cases it becomes possible to approach the ideal acceleration factor of 2^{4n} as compared to the systematic search of a non-indexed database for a traditional PCA inversion. This indexing method relies on the existence of a physical meaning in the sign of the PCA coefficients of a model. For this reason, the presence of model ambiguities and of spectro-polarimetric noise in the observations limits in practice the number n of relevant PCA orders that can be used for the indexing.
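    The sign-based indexing described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the names `sign_index` and `build_partition`, the bit ordering, the treatment of a zero coefficient as positive, and the toy database are all hypothetical. In this sketch each of the 4n coefficients contributes one sign bit, so an index can take up to 2^(4n) distinct values, and only models sharing the observation's index need to be searched.

```python
from collections import defaultdict


def sign_index(coeffs_by_stokes):
    """Pack the signs of PCA coefficients into a single integer index.

    coeffs_by_stokes: four sequences (Stokes I, Q, U, V), each holding the
    first n PCA coefficients. One bit per coefficient: 1 if the coefficient
    is non-negative, 0 if it is negative.
    """
    index = 0
    for coeffs in coeffs_by_stokes:
        for c in coeffs:
            index = (index << 1) | (1 if c >= 0 else 0)
    return index


def build_partition(database):
    """Group model identifiers by their sign index."""
    buckets = defaultdict(list)
    for model_id, coeffs in database.items():
        buckets[sign_index(coeffs)].append(model_id)
    return dict(buckets)


# Hypothetical database with n = 1 PCA order per Stokes parameter.
database = {
    "model_a": [(0.9,), (-0.2,), (0.1,), (-0.5,)],
    "model_b": [(0.7,), (-0.4,), (0.3,), (-0.1,)],
    "model_c": [(-0.3,), (0.2,), (0.6,), (0.8,)],
}
buckets = build_partition(database)

# Hypothetical observed coefficients; only the matching bucket is searched.
observed = [(0.8,), (-0.3,), (0.2,), (-0.2,)]
candidates = buckets.get(sign_index(observed), [])
print(candidates)
```

    A full inversion would then compare the observation against the candidates only, for example by a least-squares distance in coefficient space. As the abstract notes, noise can flip the sign of small coefficients, which limits how many orders n are usable for indexing.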

  7. Developing and Evaluating Creativity Gamification Rehabilitation System: The Application of PCA-ANFIS Based Emotions Model

    ERIC Educational Resources Information Center

    Su, Chung-Ho; Cheng, Ching-Hsue

    2016-01-01

    This study aims to explore the factors in a patient's rehabilitation achievement after a total knee replacement (TKR) patient exercises, using a PCA-ANFIS emotion model-based game rehabilitation system, which combines virtual reality (VR) and motion capture technology. The researchers combine a principal component analysis (PCA) and an adaptive…

  8. Contact- and distance-based principal component analysis of protein dynamics.

    PubMed

    Ernst, Matthias; Sittel, Florian; Stock, Gerhard

    2015-12-28

    To interpret molecular dynamics simulations of complex systems, systematic dimensionality reduction methods such as principal component analysis (PCA) represent a well-established and popular approach. Apart from Cartesian coordinates, internal coordinates, e.g., backbone dihedral angles or various kinds of distances, may be used as input data in a PCA. Adopting two well-known model problems, folding of villin headpiece and the functional dynamics of BPTI, a systematic study of PCA using distance-based measures is presented which employs distances between Cα-atoms as well as distances between inter-residue contacts including side chains. While this approach seems prohibitive for larger systems due to the quadratic scaling of the number of distances with the size of the molecule, it is shown that it is sufficient (and sometimes even better) to include only relatively few selected distances in the analysis. The quality of the PCA is assessed by considering the resolution of the resulting free energy landscape (to identify metastable conformational states and barriers) and the decay behavior of the corresponding autocorrelation functions (to test the time scale separation of the PCA). By comparing results obtained with distance-based, dihedral angle, and Cartesian coordinates, the study shows that the choice of input variables may drastically influence the outcome of a PCA.

  9. Contact- and distance-based principal component analysis of protein dynamics

    NASA Astrophysics Data System (ADS)

    Ernst, Matthias; Sittel, Florian; Stock, Gerhard

    2015-12-01

    To interpret molecular dynamics simulations of complex systems, systematic dimensionality reduction methods such as principal component analysis (PCA) represent a well-established and popular approach. Apart from Cartesian coordinates, internal coordinates, e.g., backbone dihedral angles or various kinds of distances, may be used as input data in a PCA. Adopting two well-known model problems, folding of villin headpiece and the functional dynamics of BPTI, a systematic study of PCA using distance-based measures is presented which employs distances between Cα-atoms as well as distances between inter-residue contacts including side chains. While this approach seems prohibitive for larger systems due to the quadratic scaling of the number of distances with the size of the molecule, it is shown that it is sufficient (and sometimes even better) to include only relatively few selected distances in the analysis. The quality of the PCA is assessed by considering the resolution of the resulting free energy landscape (to identify metastable conformational states and barriers) and the decay behavior of the corresponding autocorrelation functions (to test the time scale separation of the PCA). By comparing results obtained with distance-based, dihedral angle, and Cartesian coordinates, the study shows that the choice of input variables may drastically influence the outcome of a PCA.

  10. Subject order-independent group ICA (SOI-GICA) for functional MRI data analysis.

    PubMed

    Zhang, Han; Zuo, Xi-Nian; Ma, Shuang-Ye; Zang, Yu-Feng; Milham, Michael P; Zhu, Chao-Zhe

    2010-07-15

    Independent component analysis (ICA) is a data-driven approach to study functional magnetic resonance imaging (fMRI) data. Particularly, for group analysis on multiple subjects, temporal concatenation group ICA (TC-GICA) is intensively used. However, due to the usually limited computational capability, data reduction with principal component analysis (PCA: a standard preprocessing step of ICA decomposition) is difficult to achieve for a large dataset. To overcome this, TC-GICA employs multiple-stage PCA data reduction. Such multiple-stage PCA data reduction, however, leads to variable outputs due to different subject concatenation orders. Consequently, the ICA algorithm uses the variable multiple-stage PCA outputs and generates variable decompositions. In this study, a rigorous theoretical analysis was conducted to prove the existence of such variability. Simulated and real fMRI experiments were used to demonstrate the subject-order-induced variability of TC-GICA results using multiple PCA data reductions. To solve this problem, we propose a new subject order-independent group ICA (SOI-GICA). Both simulated and real fMRI data experiments demonstrated the high robustness and accuracy of the SOI-GICA results compared to those of traditional TC-GICA. Accordingly, we recommend SOI-GICA for group ICA-based fMRI studies, especially those with large data sets. Copyright 2010 Elsevier Inc. All rights reserved.

  11. Differential principal component analysis of ChIP-seq.

    PubMed

    Ji, Hongkai; Li, Xia; Wang, Qian-fei; Ning, Yang

    2013-04-23

    We propose differential principal component analysis (dPCA) for analyzing multiple ChIP-sequencing datasets to identify differential protein-DNA interactions between two biological conditions. dPCA integrates unsupervised pattern discovery, dimension reduction, and statistical inference into a single framework. It uses a small number of principal components to summarize concisely the major multiprotein synergistic differential patterns between the two conditions. For each pattern, it detects and prioritizes differential genomic loci by comparing the between-condition differences with the within-condition variation among replicate samples. dPCA provides a unique tool for efficiently analyzing large amounts of ChIP-sequencing data to study dynamic changes of gene regulation across different biological conditions. We demonstrate this approach through analyses of differential chromatin patterns at transcription factor binding sites and promoters as well as allele-specific protein-DNA interactions.

  12. Principal Component Analysis: A Method for Determining the Essential Dynamics of Proteins

    PubMed Central

    David, Charles C.; Jacobs, Donald J.

    2015-01-01

    It has become commonplace to employ principal component analysis to reveal the most important motions in proteins. This method is more commonly known by its acronym, PCA. While most popular molecular dynamics packages inevitably provide PCA tools to analyze protein trajectories, researchers often make inferences of their results without having insight into how to make interpretations, and they are often unaware of limitations and generalizations of such analysis. Here we review best practices for applying standard PCA, describe useful variants, discuss why one may wish to make comparison studies, and describe a set of metrics that make comparisons possible. In practice, one will be forced to make inferences about the essential dynamics of a protein without having the desired amount of samples. Therefore, considerable time is spent on describing how to judge the significance of results, highlighting pitfalls. The topic of PCA is reviewed from the perspective of many practical considerations, and useful recipes are provided. PMID:24061923

  13. Principal component analysis: a method for determining the essential dynamics of proteins.

    PubMed

    David, Charles C; Jacobs, Donald J

    2014-01-01

    It has become commonplace to employ principal component analysis to reveal the most important motions in proteins. This method is more commonly known by its acronym, PCA. While most popular molecular dynamics packages inevitably provide PCA tools to analyze protein trajectories, researchers often make inferences of their results without having insight into how to make interpretations, and they are often unaware of limitations and generalizations of such analysis. Here we review best practices for applying standard PCA, describe useful variants, discuss why one may wish to make comparison studies, and describe a set of metrics that make comparisons possible. In practice, one will be forced to make inferences about the essential dynamics of a protein without having the desired amount of samples. Therefore, considerable time is spent on describing how to judge the significance of results, highlighting pitfalls. The topic of PCA is reviewed from the perspective of many practical considerations, and useful recipes are provided.

  14. Statistical techniques applied to aerial radiometric surveys (STAARS): principal components analysis user's manual. [NURE program]

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Koch, C.D.; Pirkle, F.L.; Schmidt, J.S.

    1981-01-01

    A Principal Components Analysis (PCA) has been written to aid in the interpretation of multivariate aerial radiometric data collected by the US Department of Energy (DOE) under the National Uranium Resource Evaluation (NURE) program. The variations exhibited by these data have been reduced and classified into a number of linear combinations by using the PCA program. The PCA program then generates histograms and outlier maps of the individual variates. Black and white plots can be made on a Calcomp plotter by the application of follow-up programs. All programs referred to in this guide were written for a DEC-10. From this analysis a geologist may begin to interpret the data structure. Insight into geological processes underlying the data may be obtained.

  15. Adaptive online monitoring for ICU patients by combining just-in-time learning and principal component analysis.

    PubMed

    Li, Xuejian; Wang, Youqing

    2016-12-01

    Offline general-type models are widely used for patients' monitoring in intensive care units (ICUs), which are developed by using past collected datasets consisting of thousands of patients. However, these models may fail to adapt to the changing states of ICU patients. Thus, to be more robust and effective, the monitoring models should be adaptable to individual patients. A novel combination of just-in-time learning (JITL) and principal component analysis (PCA), referred to as learning-type PCA (L-PCA), was proposed for adaptive online monitoring of patients in ICUs. JITL was used to gather the most relevant data samples for adaptive modeling of complex physiological processes. PCA was used to build an online individual-type model and calculate monitoring statistics, and then to judge whether the patient's status is normal or not. The adaptability of L-PCA lies in the usage of individual data and the continuous updating of the training dataset. Twelve subjects were selected from the Physiobank's Multi-parameter Intelligent Monitoring for Intensive Care II (MIMIC II) database, and five vital signs of each subject were chosen. The proposed method was compared with the traditional PCA and fast moving-window PCA (Fast MWPCA). The experimental results demonstrated that the fault detection rates respectively increased by 20 % and 47 % compared with PCA and Fast MWPCA. L-PCA is first introduced into ICU patients monitoring and achieves the best monitoring performance in terms of adaptability to changes in patient status and sensitivity for abnormality detection.

  16. Classification of alloys using laser induced breakdown spectroscopy with principle component analysis

    NASA Astrophysics Data System (ADS)

    Syuhada Mangsor, Aneez; Haider Rizvi, Zuhaib; Chaudhary, Kashif; Safwan Aziz, Muhammad

    2018-05-01

    The study of atomic spectroscopy has contributed to a wide range of scientific applications. In principle, laser induced breakdown spectroscopy (LIBS) method has been used to analyse various types of matter regardless of its physical state, either it is solid, liquid or gas because all elements emit light of characteristic frequencies when it is excited to sufficiently high energy. The aim of this work was to analyse the signature spectrums of each element contained in three different types of samples. Metal alloys of Aluminium, Titanium and Brass with the purities of 75%, 80%, 85%, 90% and 95% were used as the manipulated variable and their LIBS spectra were recorded. The characteristic emission lines of main elements were identified from the spectra as well as its corresponding contents. Principal component analysis (PCA) was carried out using the data from LIBS spectra. Three obvious clusters were observed in 3-dimensional PCA plot which corresponding to the different group of alloys. Findings from this study showed that LIBS technology with the help of principle component analysis could conduct the variety discrimination of alloys demonstrating the capability of LIBS-PCA method in field of spectro-analysis. Thus, LIBS-PCA method is believed to be an effective method for classifying alloys with different percentage of purifications, which was high-cost and time-consuming before.

  17. Principal Component Analysis in the Spectral Analysis of the Dynamic Laser Speckle Patterns

    NASA Astrophysics Data System (ADS)

    Ribeiro, K. M.; Braga, R. A., Jr.; Horgan, G. W.; Ferreira, D. D.; Safadi, T.

    2014-02-01

    Dynamic laser speckle is a phenomenon that interprets an optical patterns formed by illuminating a surface under changes with coherent light. Therefore, the dynamic change of the speckle patterns caused by biological material is known as biospeckle. Usually, these patterns of optical interference evolving in time are analyzed by graphical or numerical methods, and the analysis in frequency domain has also been an option, however involving large computational requirements which demands new approaches to filter the images in time. Principal component analysis (PCA) works with the statistical decorrelation of data and it can be used as a data filtering. In this context, the present work evaluated the PCA technique to filter in time the data from the biospeckle images aiming the reduction of time computer consuming and improving the robustness of the filtering. It was used 64 images of biospeckle in time observed in a maize seed. The images were arranged in a data matrix and statistically uncorrelated by PCA technique, and the reconstructed signals were analyzed using the routine graphical and numerical methods to analyze the biospeckle. Results showed the potential of the PCA tool in filtering the dynamic laser speckle data, with the definition of markers of principal components related to the biological phenomena and with the advantage of fast computational processing.

  18. Guided filter and principal component analysis hybrid method for hyperspectral pansharpening

    NASA Astrophysics Data System (ADS)

    Qu, Jiahui; Li, Yunsong; Dong, Wenqian

    2018-01-01

    Hyperspectral (HS) pansharpening aims to generate a fused HS image with high spectral and spatial resolution through integrating an HS image with a panchromatic (PAN) image. A guided filter (GF) and principal component analysis (PCA) hybrid HS pansharpening method is proposed. First, the HS image is interpolated and the PCA transformation is performed on the interpolated HS image. The first principal component (PC1) channel concentrates on the spatial information of the HS image. Different from the traditional PCA method, the proposed method sharpens the PAN image and utilizes the GF to obtain the spatial information difference between the HS image and the enhanced PAN image. Then, in order to reduce spectral and spatial distortion, an appropriate tradeoff parameter is defined and the spatial information difference is injected into the PC1 channel through multiplying by this tradeoff parameter. Once the new PC1 channel is obtained, the fused image is finally generated by the inverse PCA transformation. Experiments performed on both synthetic and real datasets show that the proposed method outperforms other several state-of-the-art HS pansharpening methods in both subjective and objective evaluations.

  19. Application of principal component analysis to multispectral imaging data for evaluation of pigmented skin lesions

    NASA Astrophysics Data System (ADS)

    Jakovels, Dainis; Lihacova, Ilze; Kuzmina, Ilona; Spigulis, Janis

    2013-11-01

    Non-invasive and fast primary diagnostics of pigmented skin lesions is required due to frequent incidence of skin cancer - melanoma. Diagnostic potential of principal component analysis (PCA) for distant skin melanoma recognition is discussed. Processing of the measured clinical multi-spectral images (31 melanomas and 94 nonmalignant pigmented lesions) in the wavelength range of 450-950 nm by means of PCA resulted in 87 % sensitivity and 78 % specificity for separation between malignant melanomas and pigmented nevi.

  20. AlleleCoder: a PERL script for coding codominant polymorphism data for PCA analysis

    USDA-ARS's Scientific Manuscript database

    A useful biological interpretation of diploid heterozygotes is in terms of the dose of the common allele (0, 1 or 2 copies). We have developed a PERL script that converts FASTA files into coded spreadsheets suitable for Principal Component Analysis (PCA). In combination with R and R Commander, two- ...
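
    The dose coding described above (0, 1 or 2 copies of the common allele) can be sketched in a few lines. This is a Python illustration under stated assumptions, not the AlleleCoder PERL script: genotypes are assumed to arrive as two-character strings for a single locus, and the function name is hypothetical.

```python
from collections import Counter


def code_common_allele_dose(genotypes):
    """Code diploid genotypes at one locus as 0, 1 or 2 copies of the common allele.

    genotypes: list of two-character strings such as "AA", "AG", "GG".
    Ties in allele frequency are broken alphabetically (an arbitrary choice).
    """
    counts = Counter(allele for g in genotypes for allele in g)
    # Most frequent allele first; alphabetical order decides ties.
    common = min(counts, key=lambda a: (-counts[a], a))
    return [g.count(common) for g in genotypes]


# Hypothetical locus with four individuals.
doses = code_common_allele_dose(["AA", "AG", "GG", "AA"])
```

    Repeating this per locus gives one numeric column per polymorphic site, which is the kind of coded table a PCA routine can take as input.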

  1. Combination of PCA and LORETA for sources analysis of ERP data: an emotional processing study

    NASA Astrophysics Data System (ADS)

    Hu, Jin; Tian, Jie; Yang, Lei; Pan, Xiaohong; Liu, Jiangang

    2006-03-01

    The purpose of this paper is to study spatiotemporal patterns of neuronal activity in emotional processing by analysis of ERP data. 108 pictures (categorized as positive, negative and neutral) were presented to 24 healthy, right-handed subjects while 128-channel EEG data were recorded. An analysis of two steps was applied to the ERP data. First, principal component analysis was performed to obtain significant ERP components. Then LORETA was applied to each component to localize their brain sources. The first six principal components were extracted, each of which showed different spatiotemporal patterns of neuronal activity. The results agree with other emotional study by fMRI or PET. The combination of PCA and LORETA can be used to analyze spatiotemporal patterns of ERP data in emotional processing.

  2. Exploring high-affinity binding properties of octamer peptides by principal component analysis of tetramer peptides.

    PubMed

    Kume, Akiko; Kawai, Shun; Kato, Ryuji; Iwata, Shinmei; Shimizu, Kazunori; Honda, Hiroyuki

    2017-02-01

    To investigate the binding properties of a peptide sequence, we conducted principal component analysis (PCA) of the physicochemical features of a tetramer peptide library comprised of 512 peptides, and the variables were reduced to two principal components. We selected IL-2 and IgG as model proteins and the binding affinity to these proteins was assayed using the 512 peptides mentioned above. PCA of binding affinity data showed that 16 and 18 variables were suitable for localizing IL-2 and IgG high-affinity binding peptides, respectively, into a restricted region of the PCA plot. We then investigated whether the binding affinity of octamer peptide libraries could be predicted using the identified region in the tetramer PCA. The results show that octamer high-affinity binding peptides were also concentrated in the tetramer high-affinity binding region of both IL-2 and IgG. The average fluorescence intensity of high-affinity binding peptides was 3.3- and 2.1-fold higher than that of low-affinity binding peptides for IL-2 and IgG, respectively. We conclude that PCA may be used to identify octamer peptides with high- or low-affinity binding properties from data from a tetramer peptide library. Copyright © 2016 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  3. Scalable Robust Principal Component Analysis Using Grassmann Averages.

    PubMed

    Hauberg, Søren; Feragen, Aasa; Enficiaud, Raffi; Black, Michael J

    2016-11-01

    In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.
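
    The idea of averaging one-dimensional subspaces can be sketched as a sign-aligned average: each observation is flipped to point the same way as the current estimate, the flipped observations are summed, and the sum is normalized. The sketch below assumes zero-mean data with at least one nonzero observation; the function name, starting point and stopping rule are illustrative choices, and it does not reproduce the trimmed (TGA) variant or the authors' released code.

```python
import math


def grassmann_average(data, n_iter=100, tol=1e-12):
    """Estimate a leading direction of zero-mean data by sign-aligned averaging."""
    dim = len(data[0])
    # Start from the observation with the largest norm (an arbitrary choice).
    q = max(data, key=lambda x: sum(v * v for v in x))
    norm = math.sqrt(sum(v * v for v in q))
    q = [v / norm for v in q]
    for _ in range(n_iter):
        acc = [0.0] * dim
        for x in data:
            dot = sum(a * b for a, b in zip(x, q))
            sign = 1.0 if dot >= 0 else -1.0  # flip x into q's half-space
            for k in range(dim):
                acc[k] += sign * x[k]
        norm = math.sqrt(sum(v * v for v in acc))
        if norm == 0.0:
            break
        new_q = [v / norm for v in acc]
        change = max(abs(a - b) for a, b in zip(new_q, q))
        q = new_q
        if change < tol:
            break
    return q


# Hypothetical zero-mean points spread mostly along the first axis.
points = [(2.0, 0.1), (-2.0, -0.1), (1.0, -0.1), (-1.0, 0.1), (3.0, 0.0), (-3.0, 0.0)]
direction = grassmann_average(points)
```

    Each pass touches every observation once, which is consistent with the linear cost the abstract emphasizes.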

  4. An Intelligent Architecture Based on Field Programmable Gate Arrays Designed to Detect Moving Objects by Using Principal Component Analysis

    PubMed Central

    Bravo, Ignacio; Mazo, Manuel; Lázaro, José L.; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel

    2010-01-01

    This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices. PMID:22163406
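
    The diagonalization stage named in the abstract can be illustrated in software. This is a minimal sketch of the cyclic Jacobi rotation method for a small symmetric matrix held in Python lists; it is not the FPGA design described in the paper, and `jacobi_eigen` and the example matrix are hypothetical.

```python
import math


def jacobi_eigen(a, sweeps=50, tol=1e-12):
    """Eigen-decomposition of a symmetric matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, v), where column j of v is the eigenvector
    belonging to eigenvalues[j].
    """
    n = len(a)
    a = [row[:] for row in a]  # work on a copy
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q], kept within +/- pi/4.
                if a[p][p] == a[q][q]:
                    theta = math.copysign(math.pi / 4, a[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors: V <- V J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p] = c * vkp - s * vkq
                    v[k][q] = s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


# Hypothetical 3x3 symmetric matrix standing in for a correlation matrix.
matrix = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
eigenvalues, eigenvectors = jacobi_eigen(matrix)
```

    Projecting an image onto the columns of `eigenvectors` with the largest eigenvalues gives the subspace representation against which new frames would be compared.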

  5. An intelligent architecture based on Field Programmable Gate Arrays designed to detect moving objects by using Principal Component Analysis.

    PubMed

    Bravo, Ignacio; Mazo, Manuel; Lázaro, José L; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel

    2010-01-01

    This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices.

  6. Using principal component analysis and annual seasonal trend analysis to assess karst rocky desertification in southwestern China.

    PubMed

    Zhang, Zhiming; Ouyang, Zhiyun; Xiao, Yi; Xiao, Yang; Xu, Weihua

    2017-06-01

    Increasing exploitation of karst resources is causing severe environmental degradation because of the fragility and vulnerability of karst areas. By integrating principal component analysis (PCA) with annual seasonal trend analysis (ASTA), this study assessed karst rocky desertification (KRD) within a spatial context. We first produced fractional vegetation cover (FVC) data from a moderate-resolution imaging spectroradiometer normalized difference vegetation index using a dimidiate pixel model. Then, we generated three main components of the annual FVC data using PCA. Subsequently, we generated the slope image of the annual seasonal trends of FVC using median trend analysis. Finally, we combined the three PCA components and annual seasonal trends of FVC with the incidence of KRD for each type of carbonate rock to classify KRD into one of four categories based on K-means cluster analysis: high, moderate, low, and none. The results of accuracy assessments indicated that this combination approach produced greater accuracy and more reasonable KRD mapping than the average FVC based on the vegetation coverage standard. The KRD map for 2010 indicated that the total area of KRD was 78.76 × 10^3 km^2, which constitutes about 4.06% of the eight southwest provinces of China. The largest KRD areas were found in Yunnan province. The combined PCA and ASTA approach was demonstrated to be an easily implemented, robust, and flexible method for the mapping and assessment of KRD, which can be used to enhance regional KRD management schemes or to address assessment of other environmental issues.

  7. Variability search in M 31 using principal component analysis and the Hubble Source Catalogue

    NASA Astrophysics Data System (ADS)

    Moretti, M. I.; Hatzidimitriou, D.; Karampelas, A.; Sokolovsky, K. V.; Bonanos, A. Z.; Gavras, P.; Yang, M.

    2018-06-01

    Principal component analysis (PCA) is being extensively used in Astronomy but not yet exhaustively exploited for variability search. The aim of this work is to investigate the effectiveness of using the PCA as a method to search for variable stars in large photometric data sets. We apply PCA to variability indices computed for light curves of 18 152 stars in three fields in M 31 extracted from the Hubble Source Catalogue. The projection of the data into the principal components is used as a stellar variability detection and classification tool, capable of distinguishing between RR Lyrae stars, long-period variables (LPVs) and non-variables. This projection recovered more than 90 per cent of the known variables and revealed 38 previously unknown variable stars (about 30 per cent more), all LPVs except for one object of uncertain variability type. We conclude that this methodology can indeed successfully identify candidate variable stars.

  8. Demixed principal component analysis of neural population data.

    PubMed

    Kobak, Dmitry; Brendel, Wieland; Constantinidis, Christos; Feierstein, Claudia E; Kepecs, Adam; Mainen, Zachary F; Qi, Xue-Lian; Romo, Ranulfo; Uchida, Naoshige; Machens, Christian K

    2016-04-12

    Neurons in higher cortical areas, such as the prefrontal cortex, are often tuned to a variety of sensory and motor variables, and are therefore said to display mixed selectivity. This complexity of single neuron responses can obscure what information these areas represent and how it is represented. Here we demonstrate the advantages of a new dimensionality reduction technique, demixed principal component analysis (dPCA), that decomposes population activity into a few components. In addition to systematically capturing the majority of the variance of the data, dPCA also exposes the dependence of the neural representation on task parameters such as stimuli, decisions, or rewards. To illustrate our method we reanalyze population data from four datasets comprising different species, different cortical areas and different experimental tasks. In each case, dPCA provides a concise way of visualizing the data that summarizes the task-dependent features of the population response in a single figure.

  9. A Genealogical Interpretation of Principal Components Analysis

    PubMed Central

    McVean, Gil

    2009-01-01

    Principal components analysis, PCA, is a statistical method commonly used in population genetics to identify structure in the distribution of genetic variation across geographical location and ethnic background. However, while the method is often used to inform about historical demographic processes, little is known about the relationship between fundamental demographic parameters and the projection of samples onto the primary axes. Here I show that for SNP data the projection of samples onto the principal components can be obtained directly from considering the average coalescent times between pairs of haploid genomes. The result provides a framework for interpreting PCA projections in terms of underlying processes, including migration, geographical isolation, and admixture. I also demonstrate a link between PCA and Wright's fst and show that SNP ascertainment has a largely simple and predictable effect on the projection of samples. Using examples from human genetics, I discuss the application of these results to empirical data and the implications for inference. PMID:19834557

  10. Ionospheric total electron content anomalies due to Typhoon Nakri on 29 May 2008: A nonlinear principal component analysis

    NASA Astrophysics Data System (ADS)

    Lin, Jyh-Woei

    2012-09-01

    This paper uses Nonlinear Principal Component Analysis (NLPCA) and Principal Component Analysis (PCA) to determine Total Electron Content (TEC) anomalies in the ionosphere for the Nakri Typhoon on 29 May, 2008 (UTC). NLPCA, PCA and image processing are applied to the global ionospheric map (GIM) with transforms conducted for the time period 12:00-14:00 UT on 29 May 2008 when the wind was most intense. Results show that at a height of approximately 150-200 km the TEC anomaly using NLPCA is more localized; however its intensity increases with height and becomes more widespread. The TEC anomalies are not found by PCA. Potential causes of the results are discussed with emphasis given to vertical acoustic gravity waves. The approximate position of the typhoon's eye can be detected if the GIM is divided into fine enough maps with adequate spatial-resolution at GPS-TEC receivers. This implies that the trace of the typhoon in the regional GIM is caught using NLPCA.

  11. Fast Steerable Principal Component Analysis

    PubMed Central

    Zhao, Zhizhen; Shkolnisky, Yoel; Singer, Amit

    2016-01-01

    Cryo-electron microscopy nowadays often requires the analysis of hundreds of thousands of 2-D images as large as a few hundred pixels in each direction. Here, we introduce an algorithm that efficiently and accurately performs principal component analysis (PCA) for a large set of 2-D images, and, for each image, the set of its uniform rotations in the plane and their reflections. For a dataset consisting of n images of size L × L pixels, the computational complexity of our algorithm is O(nL^3 + L^4), while existing algorithms take O(nL^4). The new algorithm computes the expansion coefficients of the images in a Fourier–Bessel basis efficiently using the nonuniform fast Fourier transform. We compare the accuracy and efficiency of the new algorithm with traditional PCA and existing algorithms for steerable PCA. PMID:27570801
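
    To get a feel for the two complexity figures quoted above, the leading-order operation counts can be compared directly. This sketch ignores constant factors and lower-order terms, so the ratio is only indicative; the function name and the example sizes are hypothetical.

```python
def steerable_pca_cost_ratio(n_images: int, side: int) -> float:
    """Ratio of leading-order costs: n*L^4 for existing methods over n*L^3 + L^4."""
    existing = n_images * side ** 4
    proposed = n_images * side ** 3 + side ** 4
    return existing / proposed


# Hypothetical dataset: 100,000 images of 100 x 100 pixels.
ratio = steerable_pca_cost_ratio(100_000, 100)
```

    Algebraically the ratio is nL / (n + L), so for n much larger than L it approaches L: the larger the images, the greater the asymptotic saving.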

  12. Raman signatures of ferroic domain walls captured by principal component analysis.

    PubMed

    Nataf, G F; Barrett, N; Kreisel, J; Guennou, M

    2018-01-24

    Ferroic domain walls are currently investigated by several state-of-the-art techniques in order to get a better understanding of their distinct, functional properties. Here, principal component analysis (PCA) of Raman maps is used to study ferroelectric domain walls (DWs) in LiNbO3 and ferroelastic DWs in NdGaO3. It is shown that PCA allows us to quickly and reliably identify small Raman peak variations at ferroelectric DWs and that the value of a peak shift can be deduced, accurately and without a priori, from a first order Taylor expansion of the spectra. The ability of PCA to separate the contribution of ferroelastic domains and DWs to Raman spectra is emphasized. More generally, our results provide a novel route for the statistical analysis of any property mapped across a DW.

  13. A comparison of the usefulness of canonical analysis, principal components analysis, and band selection for extraction of features from TMS data for landcover analysis

    NASA Technical Reports Server (NTRS)

    Boyd, R. K.; Brumfield, J. O.; Campbell, W. J.

    1984-01-01

    Three feature extraction methods, canonical analysis (CA), principal component analysis (PCA), and band selection, have been applied to Thematic Mapper Simulator (TMS) data in order to evaluate the relative performance of the methods. The results obtained show that CA is capable of providing a transformation of TMS data which leads to better classification results than provided by all seven bands, by PCA, or by band selection. A second conclusion drawn from the study is that TMS bands 2, 3, 4, and 7 (thermal) are most important for landcover classification.

  14. Fourier Transform Infrared Spectroscopy (FTIR) and Multivariate Analysis for Identification of Different Vegetable Oils Used in Biodiesel Production

    PubMed Central

    Mueller, Daniela; Ferrão, Marco Flôres; Marder, Luciano; da Costa, Adilson Ben; de Cássia de Souza Schneider, Rosana

    2013-01-01

    The main objective of this study was to use infrared spectroscopy to identify vegetable oils used as raw material for biodiesel production and apply multivariate analysis to the data. Six different vegetable oil sources—canola, cotton, corn, palm, sunflower and soybeans—were used to produce biodiesel batches. The spectra were acquired by Fourier transform infrared spectroscopy using a universal attenuated total reflectance sensor (FTIR-UATR). For the multivariate analysis principal component analysis (PCA), hierarchical cluster analysis (HCA), interval principal component analysis (iPCA) and soft independent modeling of class analogy (SIMCA) were used. The results indicate that is possible to develop a methodology to identify vegetable oils used as raw material in the production of biodiesel by FTIR-UATR applying multivariate analysis. It was also observed that the iPCA found the best spectral range for separation of biodiesel batches using FTIR-UATR data, and with this result, the SIMCA method classified 100% of the soybean biodiesel samples. PMID:23539030

  15. Research on distributed heterogeneous data PCA algorithm based on cloud platform

    NASA Astrophysics Data System (ADS)

    Zhang, Jin; Huang, Gang

    2018-05-01

    Principal component analysis (PCA) of heterogeneous data sets can solve the problem that centralized data scalability is limited. In order to reduce the generation of intermediate data and error components of distributed heterogeneous data sets, a principal component analysis algorithm based on heterogeneous data sets under cloud platform is proposed. The algorithm performs eigenvalue processing by using Householder tridiagonalization and QR factorization to calculate the error component of the heterogeneous database associated with the public key to obtain the intermediate data set and the lost information. Experiments on distributed DBM heterogeneous datasets show that the model method has the feasibility and reliability in terms of execution time and accuracy.

  16. Free-energy landscape of RNA hairpins constructed via dihedral angle principal component analysis.

    PubMed

    Riccardi, Laura; Nguyen, Phuong H; Stock, Gerhard

    2009-12-31

    To systematically construct a low-dimensional free-energy landscape of RNA systems from a classical molecular dynamics simulation, various versions of the principal component analysis (PCA) are compared: the cPCA using the Cartesian coordinates of all atoms, the dPCA using the sine/cosine-transformed six backbone dihedral angles as well as the glycosidic torsional angle chi and the pseudorotational angle P, the aPCA which ignores the circularity of the 6 + 2 dihedral angles of the RNA, and the dPCA(etatheta), which approximates the 6 backbone dihedral angles by 2 pseudotorsional angles eta and theta. As representative examples, a 10-nucleotide UUCG hairpin and the 36-nucleotide segment SL1 of the Psi site of HIV-1 are studied by classical molecular dynamics simulation, using the Amber all-atom force field and explicit solvent. It is shown that the conformational heterogeneity of the RNA hairpins can only be resolved by an angular PCA such as the dPCA but not by the cPCA using Cartesian coordinates. Apart from possible artifacts due to the coupling of overall and internal motion, this is because the details of hydrogen bonding and stacking interactions but also of global structural rearrangements of the RNA are better discriminated by dihedral angles. In line with recent experiments, it is found that the free energy landscape of RNA hairpins is quite rugged and contains various metastable conformational states which may serve as an intermediate for unfolding.
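
    The sine/cosine transformation that distinguishes the dPCA from the aPCA in the abstract above can be sketched as follows. Each angle is mapped to a (cos, sin) pair so that values on either side of the ±180° boundary end up close together. The function names and example angles are hypothetical, and the sketch covers only the feature transformation, not the full analysis.

```python
import math


def dihedral_features(angles_deg):
    """Map each dihedral angle to a (cos, sin) pair to respect circularity."""
    feats = []
    for a in angles_deg:
        r = math.radians(a)
        feats.extend((math.cos(r), math.sin(r)))
    return feats


def euclidean(u, v):
    """Plain Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two nearly identical conformations that straddle the +/-180 degree boundary.
raw_gap = abs(179.0 - (-179.0))
transformed_gap = euclidean(dihedral_features([179.0]), dihedral_features([-179.0]))
```

    On raw angles the two conformations appear 358° apart, whereas after the transformation they are nearly coincident; this is the artifact that a PCA ignoring circularity would inherit.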

  17. Spectral discrimination of serum from liver cancer and liver cirrhosis using Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Yang, Tianyue; Li, Xiaozhou; Yu, Ting; Sun, Ruomin; Li, Siqi

    2011-07-01

    In this paper, Raman spectra of human serum were measured using Raman spectroscopy, and the spectra were then analyzed by principal component analysis (PCA), a multivariate statistical method. Linear discriminant analysis (LDA) was then utilized to differentiate the loading score of different diseases as the diagnosing algorithm. An artificial neural network (ANN) was used for cross-validation. The diagnostic sensitivity and specificity of PCA-LDA are 88% and 79%, while those of PCA-ANN are 89% and 95%. These results indicate that modern multivariate analysis methods are useful tools for analyzing serum spectra to diagnose diseases.

  18. Selection of solubility parameters for characterization of pharmaceutical excipients.

    PubMed

    Adamska, Katarzyna; Voelkel, Adam; Héberger, Károly

    2007-11-09

    The solubility parameter (delta(2)), corrected solubility parameter (delta(T)) and its components (delta(d), delta(p), delta(h)) were determined for series of pharmaceutical excipients by using inverse gas chromatography (IGC). Principal component analysis (PCA) was applied for the selection of the solubility parameters which assure the complete characterization of examined materials. Application of PCA suggests that complete description of examined materials is achieved with four solubility parameters, i.e. delta(2) and Hansen solubility parameters (delta(d), delta(p), delta(h)). Selection of the excipients through PCA of their solubility parameters data can be used for prediction of their behavior in a multi-component system, e.g. for selection of the best materials to form stable pharmaceutical liquid mixtures or stable coating formulation.

  19. A new statistic for identifying batch effects in high-throughput genomic data that uses guided principal component analysis.

    PubMed

    Reese, Sarah E; Archer, Kellie J; Therneau, Terry M; Atkinson, Elizabeth J; Vachon, Celine M; de Andrade, Mariza; Kocher, Jean-Pierre A; Eckel-Passow, Jeanette E

    2013-11-15

    Batch effects are due to probe-specific systematic variation between groups of samples (batches) resulting from experimental features that are not of biological interest. Principal component analysis (PCA) is commonly used as a visual tool to determine whether batch effects exist after applying a global normalization method. However, PCA yields linear combinations of the variables that contribute maximum variance and thus will not necessarily detect batch effects if they are not the largest source of variability in the data. We present an extension of PCA to quantify the existence of batch effects, called guided PCA (gPCA). We describe a test statistic that uses gPCA to test whether a batch effect exists. We apply our proposed test statistic derived using gPCA to simulated data and to two copy number variation case studies: the first study consisted of 614 samples from a breast cancer family study using Illumina Human 660 bead-chip arrays, whereas the second case study consisted of 703 samples from a family blood pressure study that used Affymetrix SNP Array 6.0. We demonstrate that our statistic has good statistical properties and is able to identify significant batch effects in two copy number variation case studies. We developed a new statistic that uses gPCA to identify whether batch effects exist in high-throughput genomic data. Although our examples pertain to copy number data, gPCA is general and can be used on other data types as well. The gPCA R package (Available via CRAN) provides functionality and data to perform the methods in this article. reesese@vcu.edu
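
    The abstract does not give the formula for the test statistic, so the following is only a sketch under an assumption: that the statistic is the ratio of the variance of the data projected on the first guided component (from the SVD of Y'X, with Y a batch indicator matrix) to the variance on the first unguided component, with a permutation p-value. The function names are hypothetical and this is not the gPCA R package API.

```python
import numpy as np

def gpca_delta(X, batch):
    """Ratio of PC1 variance, guided by batch vs. unguided (assumed form)."""
    X = np.asarray(X, dtype=float)
    batch = np.asarray(batch)
    Xc = X - X.mean(axis=0)                       # centre each variable
    labels = np.unique(batch)
    Y = (batch[:, None] == labels[None, :]).astype(float)  # n x b indicator
    _, _, vt_u = np.linalg.svd(Xc, full_matrices=False)        # unguided PCA
    _, _, vt_g = np.linalg.svd(Y.T @ Xc, full_matrices=False)  # guided PCA
    var_u = np.var(Xc @ vt_u[0], ddof=1)
    var_g = np.var(Xc @ vt_g[0], ddof=1)
    return var_g / var_u

def gpca_permutation_pvalue(X, batch, n_perm=200, seed=0):
    """Permutation p-value: shuffle batch labels and recompute the ratio."""
    rng = np.random.default_rng(seed)
    observed = gpca_delta(X, batch)
    perms = np.array([gpca_delta(X, rng.permutation(batch))
                      for _ in range(n_perm)])
    p_value = (np.sum(perms >= observed) + 1) / (n_perm + 1)
    return observed, p_value

# Simulated data (illustrative assumption): 60 samples, 20 variables,
# with a shift added to five variables in the second batch.
rng = np.random.default_rng(1)
n, p = 60, 20
batch = np.repeat([0, 1], n // 2)
X_null = rng.standard_normal((n, p))
X_batch = X_null.copy()
X_batch[batch == 1, :5] += 3.0

delta_batch, p_batch = gpca_permutation_pvalue(X_batch, batch)
delta_null, p_null = gpca_permutation_pvalue(X_null, batch)
```

    Because the unguided first component maximizes variance, the ratio cannot exceed 1; values close to 1 suggest that the batch direction coincides with the dominant source of variation.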

  20. [Identification of varieties of textile fibers by using Vis/NIR infrared spectroscopy technique].

    PubMed

    Wu, Gui-Fang; He, Yong

    2010-02-01

    The aim of the present paper was to provide new insight into Vis/NIR spectroscopic analysis of textile fibers. In order to achieve rapid identification of the varieties of fibers, the authors selected 5 kinds of fibers (cotton, flax, wool, silk and tencel) for a study with Vis/NIR spectroscopy. Firstly, the spectra of each kind of fiber were scanned by spectrometer, and the principal component analysis (PCA) method was used to analyze the characteristics of the pattern of Vis/NIR spectra. The principal component scores scatter plot (PC1 x PC2 x PC3) of the fibers indicated the classification effect of the five varieties of fibers. The first 6 principal components (PCs) were selected according to the quantity and size of PCs. The PCA classification model was optimized by using the least-squares support vector machines (LS-SVM) method. The authors used the 6 PCs extracted by PCA as the inputs of LS-SVM, and a PCA-LS-SVM model was built to achieve varieties validation as well as mathematical model building and optimization analysis. Two hundred samples (40 samples for each variety of fibers) of five varieties of fibers were used for calibration of the PCA-LS-SVM model, and the other 50 samples (10 samples for each variety of fibers) were used for validation. The result of validation showed that the Vis/NIR spectroscopy technique based on PCA-LS-SVM had a powerful classification capability. It provides a new method for identifying varieties of fibers rapidly and in real time, so it has important significance for protecting the rights of consumers, ensuring the quality of textiles, and rationalizing the production and trade of textile materials and their products.

  1. TARGETED PRINCIPLE COMPONENT ANALYSIS: A NEW MOTION ARTIFACT CORRECTION APPROACH FOR NEAR-INFRARED SPECTROSCOPY

    PubMed Central

    YÜCEL, MERYEM A.; SELB, JULIETTE; COOPER, ROBERT J.; BOAS, DAVID A.

    2014-01-01

    As near-infrared spectroscopy (NIRS) broadens its application area to different age and disease groups, motion artifacts in the NIRS signal due to subject movement are becoming an important challenge. Motion artifacts generally produce signal fluctuations that are larger than physiological NIRS signals, thus it is crucial to correct for them before obtaining an estimate of stimulus evoked hemodynamic responses. There are various methods for correction such as principle component analysis (PCA), wavelet-based filtering and spline interpolation. Here, we introduce a new approach to motion artifact correction, targeted principle component analysis (tPCA), which incorporates a PCA filter only on the segments of data identified as motion artifacts. It is expected that this will overcome the issue of filtering desired signals that plagues standard PCA filtering of entire data sets. We compared the new approach with the most effective motion artifact correction algorithms on a set of data acquired simultaneously with a collodion-fixed probe (low motion artifact content) and a standard Velcro probe (high motion artifact content). Our results show that tPCA gives statistically better results in recovering the hemodynamic response function (HRF) as compared to wavelet-based filtering and spline interpolation for the Velcro probe. It results in a significant reduction in mean-squared error (MSE) and significant enhancement in Pearson’s correlation coefficient to the true HRF. The collodion-fixed fiber probe with no motion correction performed better than the Velcro probe corrected for motion artifacts in terms of MSE and Pearson’s correlation coefficient. Thus, if the experimental study permits, the use of a collodion-fixed fiber probe may be desirable. If the use of a collodion-fixed probe is not feasible, then we suggest the use of tPCA in the processing of motion artifact contaminated data. PMID:25360181
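
    The core idea of applying a PCA filter only to flagged segments can be sketched as follows. This is a simplified illustration, not the published tPCA implementation: the function name `targeted_pca_filter` is hypothetical, the motion mask is assumed to be given by some separate detection step, and any realignment of segment boundaries after filtering is omitted.

```python
import numpy as np

def targeted_pca_filter(signals, motion_mask, n_remove=1):
    """Remove the top principal components only from flagged samples.

    signals: (n_samples, n_channels) array.
    motion_mask: boolean array of length n_samples, True where motion
        was detected (assumed to come from a separate detection step).
    """
    signals = np.asarray(signals, dtype=float)
    motion_mask = np.asarray(motion_mask, dtype=bool)
    cleaned = signals.copy()
    if not motion_mask.any():
        return cleaned
    segment = signals[motion_mask]
    mean = segment.mean(axis=0)
    u, s, vt = np.linalg.svd(segment - mean, full_matrices=False)
    s_filtered = s.copy()
    s_filtered[:n_remove] = 0.0                    # drop dominant components
    cleaned[motion_mask] = (u * s_filtered) @ vt + mean
    return cleaned

# Synthetic example (illustrative assumption): four channels carrying small
# sinusoids, plus a large artifact shared across channels in samples 100-129.
t = np.arange(300)
phases = np.array([0.0, 0.5, 1.0, 1.5])
clean = 0.1 * np.sin(2 * np.pi * t[:, None] / 50 + phases)
mask = np.zeros(300, dtype=bool)
mask[100:130] = True
weights = np.array([1.0, 0.8, 1.2, 0.9])
artifact = np.zeros_like(clean)
artifact[mask] = 5.0 * np.sin(2 * np.pi * np.arange(30) / 10)[:, None] * weights
noisy = clean + artifact

cleaned = targeted_pca_filter(noisy, mask, n_remove=1)
mse_before = np.mean((noisy - clean) ** 2)
mse_after = np.mean((cleaned - clean) ** 2)
```

    Samples outside the mask are returned untouched, which is the property that separates this targeted filter from a PCA filter applied to the whole recording.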

  2. Detecting most influencing courses on students grades using block PCA

    NASA Astrophysics Data System (ADS)

    Othman, Osama H.; Gebril, Rami Salah

    2014-12-01

    One of the modern solutions adopted in dealing with the problem of a large number of variables in statistical analyses is Block Principal Component Analysis (Block PCA). This modified technique can be used to reduce the vertical dimension (variables) of the data matrix X (n × p) by selecting a smaller number of variables (say m) containing most of the statistical information. These selected variables can then be employed in further investigations and analyses. Block PCA is an adapted multistage technique of the original PCA. It involves the application of Cluster Analysis (CA) and variable selection through sub-principal component scores (PCs). The application of Block PCA in this paper is a modified version of the original work of Liu et al (2002). The main objective was to apply PCA on each group of variables (established using cluster analysis), instead of involving the whole large pack of variables, which proved to be unreliable. In this work, Block PCA is used to reduce the size of a huge data matrix ((n = 41) × (p = 251)) consisting of the Grade Point Average (GPA) of the students in 251 courses (variables) in the faculty of science in Benghazi University. In other words, we are constructing a smaller analytical data matrix of the GPAs of the students with fewer variables containing most of the variation (statistical information) in the original database. By applying Block PCA, 12 courses were found to `absorb' most of the variation or influence from the original data matrix, and hence are worth keeping for future statistical exploration and analytical studies. In addition, the course Independent Study (Math.) was found to be the most influential course on students' GPA among the 12 selected courses.

  3. Understanding deformation mechanisms during powder compaction using principal component analysis of compression data.

    PubMed

    Roopwani, Rahul; Buckner, Ira S

    2011-10-14

    Principal component analysis (PCA) was applied to pharmaceutical powder compaction. A solid fraction parameter (SF(c/d)) and a mechanical work parameter (W(c/d)) representing irreversible compression behavior were determined as functions of applied load. Multivariate analysis of the compression data was carried out using PCA. The first principal component (PC1) showed loadings for the solid fraction and work values that agreed with changes in the relative significance of plastic deformation to consolidation at different pressures. The PC1 scores showed the same rank order as the relative plasticity ranking derived from the literature for common pharmaceutical materials. The utility of PC1 in understanding deformation was extended to binary mixtures using a subset of the original materials. Combinations of brittle and plastic materials were characterized using the PCA method. The relationships between PC1 scores and the weight fractions of the mixtures were typically linear showing ideal mixing in their deformation behaviors. The mixture consisting of two plastic materials was the only combination to show a consistent positive deviation from ideality. The application of PCA to solid fraction and mechanical work data appears to be an effective means of predicting deformation behavior during compaction of simple powder mixtures. Copyright © 2011 Elsevier B.V. All rights reserved.

  4. How Many Separable Sources? Model Selection In Independent Components Analysis

    PubMed Central

    Woods, Roger P.; Hansen, Lars Kai; Strother, Stephen

    2015-01-01

    Unlike mixtures consisting solely of non-Gaussian sources, mixtures including two or more Gaussian components cannot be separated using standard independent components analysis methods that are based on higher order statistics and independent observations. The mixed Independent Components Analysis/Principal Components Analysis (mixed ICA/PCA) model described here accommodates one or more Gaussian components in the independent components analysis model and uses principal components analysis to characterize contributions from this inseparable Gaussian subspace. Information theory can then be used to select from among potential model categories with differing numbers of Gaussian components. Based on simulation studies, the assumptions and approximations underlying the Akaike Information Criterion do not hold in this setting, even with a very large number of observations. Cross-validation is a suitable, though computationally intensive alternative for model selection. Application of the algorithm is illustrated using Fisher's iris data set and Howells' craniometric data set. Mixed ICA/PCA is of potential interest in any field of scientific investigation where the authenticity of blindly separated non-Gaussian sources might otherwise be questionable. Failure of the Akaike Information Criterion in model selection also has relevance in traditional independent components analysis where all sources are assumed non-Gaussian. PMID:25811988

  5. Short-term PV/T module temperature prediction based on PCA-RBF neural network

    NASA Astrophysics Data System (ADS)

    Li, Jiyong; Zhao, Zhendong; Li, Yisheng; Xiao, Jing; Tang, Yunfeng

    2018-02-01

    To address the non-linearity and large inertia of temperature control in PV/T systems, short-term temperature prediction of the PV/T module is proposed, so that the PV/T system controller can act ahead on the short-term forecast and improve control performance. Based on the analysis of the correlation between PV/T module temperature and meteorological factors, and the temperature of adjacent time series, the principal component analysis (PCA) method is used to pre-process the original input sample data. Simulation results with an RBF neural network show that PCA pre-processing gives higher prediction accuracy and stronger generalization performance than an RBF neural network without principal component extraction.

  6. Two worlds collide: Image analysis methods for quantifying structural variation in cluster molecular dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steenbergen, K. G., E-mail: kgsteen@gmail.com; Gaston, N.

    2014-02-14

    Inspired by methods of remote sensing image analysis, we analyze structural variation in cluster molecular dynamics (MD) simulations through a unique application of the principal component analysis (PCA) and Pearson Correlation Coefficient (PCC). The PCA analysis characterizes the geometric shape of the cluster structure at each time step, yielding a detailed and quantitative measure of structural stability and variation at finite temperature. Our PCC analysis captures bond structure variation in MD, which can be used to both supplement the PCA analysis as well as compare bond patterns between different cluster sizes. Relying only on atomic position data, without requirement for a priori structural input, PCA and PCC can be used to analyze both classical and ab initio MD simulations for any cluster composition or electronic configuration. Taken together, these statistical tools represent powerful new techniques for quantitative structural characterization and isomer identification in cluster MD.

  7. Two worlds collide: image analysis methods for quantifying structural variation in cluster molecular dynamics.

    PubMed

    Steenbergen, K G; Gaston, N

    2014-02-14

    Inspired by methods of remote sensing image analysis, we analyze structural variation in cluster molecular dynamics (MD) simulations through a unique application of the principal component analysis (PCA) and Pearson Correlation Coefficient (PCC). The PCA analysis characterizes the geometric shape of the cluster structure at each time step, yielding a detailed and quantitative measure of structural stability and variation at finite temperature. Our PCC analysis captures bond structure variation in MD, which can be used to both supplement the PCA analysis as well as compare bond patterns between different cluster sizes. Relying only on atomic position data, without requirement for a priori structural input, PCA and PCC can be used to analyze both classical and ab initio MD simulations for any cluster composition or electronic configuration. Taken together, these statistical tools represent powerful new techniques for quantitative structural characterization and isomer identification in cluster MD.

  8. SESNPCA: Principal Component Analysis Applied to Stripped-Envelope Core-Collapse Supernovae

    NASA Astrophysics Data System (ADS)

    Williamson, Marc; Bianco, Federica; Modjaz, Maryam

    2018-01-01

    In the new era of time-domain astronomy, it will become increasingly important to have rigorous, data driven models for classifying transients, including supernovae (SNe). We present the first application of principal component analysis (PCA) to stripped-envelope core-collapse supernovae (SESNe). Previous studies of SNe types Ib, IIb, Ic, and broad-line Ic (Ic-BL) focus only on specific spectral features, while our PCA algorithm uses all of the information contained in each spectrum. We use one of the largest compiled datasets of SESNe, containing over 150 SNe, each with spectra taken at multiple phases. Our work focuses on 49 SNe with spectra taken 15 ± 5 days after maximum V-band light where better distinctions can be made between SNe type Ib and Ic spectra. We find that spectra of SNe type IIb and Ic-BL are separable from the other types in PCA space, indicating that PCA is a promising option for developing a purely data driven model for SESNe classification.

  9. Identification and apportionment of hazardous elements in the sediments in the Yangtze River estuary.

    PubMed

    Wang, Jiawei; Liu, Ruimin; Wang, Haotian; Yu, Wenwen; Xu, Fei; Shen, Zhenyao

    2015-12-01

    In this study, positive matrix factorization (PMF) and principal components analysis (PCA) were combined to identify and apportion pollution-based sources of hazardous elements in the surface sediments in the Yangtze River estuary (YRE). Source identification analysis indicated that PC1, including Al, Fe, Mn, Cr, Ni, As, Cu, and Zn, can be defined as a sewage component; PC2, including Pb and Sb, can be considered as an atmospheric deposition component; and PC3, containing Cd and Hg, can be considered as an agricultural nonpoint component. To better identify the sources and quantitatively apportion the concentrations to their sources, eight sources were identified with PMF: agricultural/industrial sewage mixed (18.6 %), mining wastewater (15.9 %), agricultural fertilizer (14.5 %), atmospheric deposition (12.8 %), agricultural nonpoint (10.6 %), industrial wastewater (9.8 %), marine activity (9.0 %), and nickel plating industry (8.8 %). Overall, the hazardous element content seems to be more connected to anthropogenic activity instead of natural sources. The PCA results laid the foundation for the PMF analysis by providing a general classification of sources. PMF resolves more factors with a higher explained variance than PCA; PMF provided both the internal analysis and the quantitative analysis. The combination of the two methods can provide more reasonable and reliable results.

  10. Visualizing Hyolaryngeal Mechanics in Swallowing Using Dynamic MRI

    PubMed Central

    Pearson, William G.; Zumwalt, Ann C.

    2013-01-01

    Introduction: Coordinates of anatomical landmarks are captured using dynamic MRI to explore whether a proposed two-sling mechanism underlies hyolaryngeal elevation in pharyngeal swallowing. A principal components analysis (PCA) is applied to coordinates to determine the covariant function of the proposed mechanism. Methods: Dynamic MRI (dMRI) data were acquired from eleven healthy subjects during a repeated swallows task. Coordinates mapping the proposed mechanism are collected from each dynamic (frame) of a dynamic MRI swallowing series of a randomly selected subject in order to demonstrate shape changes in a single subject. Coordinates representing minimum and maximum hyolaryngeal elevation of all 11 subjects were also mapped to demonstrate shape changes of the system among all subjects. MorphoJ software was used to perform PCA and determine vectors of shape change (eigenvectors) for elements of the two-sling mechanism of hyolaryngeal elevation. Results: For both single subject and group PCAs, hyolaryngeal elevation accounted for the first principal component of variation. For the single subject PCA, the first principal component accounted for 81.5% of the variance. For the between subjects PCA, the first principal component accounted for 58.5% of the variance. Eigenvectors and shape changes associated with this first principal component are reported. Discussion: Eigenvectors indicate that two-muscle slings and associated skeletal elements function as components of a covariant mechanism to elevate the hyolaryngeal complex. Morphological analysis is useful to model shape changes in the two-sling mechanism of hyolaryngeal elevation. PMID:25090608
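
    How a figure such as "the first principal component accounted for 81.5% of the variance" is obtained from landmark coordinates can be sketched as below. This is an illustration on synthetic data, not the study's analysis: the function name and the toy landmarks are hypothetical, and the Procrustes superimposition that shape-analysis software typically performs before PCA is omitted.

```python
import numpy as np

def explained_variance_ratio(shapes):
    """Fraction of total variance per principal component.

    shapes: (n_observations, n_landmarks, 2) landmark coordinates.
    Returns (ratios, components); each row of components is an eigenvector
    of shape change in the flattened coordinate space.
    """
    shapes = np.asarray(shapes, dtype=float)
    flat = shapes.reshape(shapes.shape[0], -1)     # one row per observation
    centered = flat - flat.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variances = s ** 2 / (flat.shape[0] - 1)
    return variances / variances.sum(), vt

# Toy data (illustrative assumption): five landmarks move together along
# one fixed direction of shape change, plus small measurement noise.
rng = np.random.default_rng(0)
n_frames, n_landmarks = 20, 5
rest = rng.uniform(0, 10, size=(n_landmarks, 2))
direction = rng.standard_normal((n_landmarks, 2))
direction /= np.linalg.norm(direction)
elevation = np.sin(np.linspace(0, np.pi, n_frames))
noise = 0.01 * rng.standard_normal((n_frames, n_landmarks, 2))
shapes = rest + elevation[:, None, None] * direction + noise

ratio, components = explained_variance_ratio(shapes)
alignment = abs(components[0] @ direction.ravel())
```

    When a single coordinated movement dominates, the first ratio is close to 1 and the first eigenvector lines up with the direction of that movement.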

  11. Protein-RNA specificity by high-throughput principal component analysis of NMR spectra.

    PubMed

    Collins, Katherine M; Oregioni, Alain; Robertson, Laura E; Kelly, Geoff; Ramos, Andres

    2015-03-31

    Defining the RNA target selectivity of the proteins regulating mRNA metabolism is a key issue in RNA biology. Here we present a novel use of principal component analysis (PCA) to extract the RNA sequence preference of RNA binding proteins. We show that PCA can be used to compare the changes in the nuclear magnetic resonance (NMR) spectrum of a protein upon binding a set of quasi-degenerate RNAs and define the nucleobase specificity. We couple this application of PCA to an automated NMR spectra recording and processing protocol and obtain an unbiased and high-throughput NMR method for the analysis of nucleobase preference in protein-RNA interactions. We test the method on the RNA binding domains of three important regulators of RNA metabolism. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Derivation of Boundary Manikins: A Principal Component Analysis

    NASA Technical Reports Server (NTRS)

    Young, Karen; Margerum, Sarah; Barr, Abbe; Ferrer, Mike A.; Rajulu, Sudhakar

    2008-01-01

    When designing any human-system interface, it is critical to provide realistic anthropometry to properly represent how a person fits within a given space. This study aimed to identify a minimum number of boundary manikins, or representative models of subjects' anthropometry from a target population, which would realistically represent the population. The boundary manikin anthropometry was derived using Principal Component Analysis (PCA). PCA is a statistical approach to reduce a multi-dimensional dataset using eigenvectors and eigenvalues. The measurements used in the PCA were identified as those measurements critical for suit and cockpit design. The PCA yielded a total of 26 manikins per gender, as well as their anthropometry from the target population. Reduction techniques were implemented to reduce this number further with a final result of 20 female and 22 male subjects. The anthropometry of the boundary manikins was then used to create 3D digital models (to be discussed in subsequent papers) intended for use by designers to test components of their space suit design, to verify that the requirements specified in the Human Systems Integration Requirements (HSIR) document are met. The end-goal is to allow designers to generate suits which accommodate the diverse anthropometry of the user population.

  13. An algorithm for separation of mixed sparse and Gaussian sources

    PubMed Central

    Akkalkotkar, Ameya

    2017-01-01

    Independent component analysis (ICA) is a ubiquitous method for decomposing complex signal mixtures into a small set of statistically independent source signals. However, in cases in which the signal mixture consists of both nongaussian and Gaussian sources, the Gaussian sources will not be recoverable by ICA and will pollute estimates of the nongaussian sources. Therefore, it is desirable to have methods for mixed ICA/PCA which can separate mixtures of Gaussian and nongaussian sources. For mixtures of purely Gaussian sources, principal component analysis (PCA) can provide a basis for the Gaussian subspace. We introduce a new method for mixed ICA/PCA which we call Mixed ICA/PCA via Reproducibility Stability (MIPReSt). Our method uses a repeated estimations technique to rank sources by reproducibility, combined with decomposition of multiple subsamplings of the original data matrix. These multiple decompositions allow us to assess component stability as the size of the data matrix changes, which can be used to determine the dimension of the nongaussian subspace in a mixture. We demonstrate the utility of MIPReSt for signal mixtures consisting of simulated sources and real-world (speech) sources, as well as mixtures of unknown composition. PMID:28414814

  14. An algorithm for separation of mixed sparse and Gaussian sources.

    PubMed

    Akkalkotkar, Ameya; Brown, Kevin Scott

    2017-01-01

    Independent component analysis (ICA) is a ubiquitous method for decomposing complex signal mixtures into a small set of statistically independent source signals. However, in cases in which the signal mixture consists of both nongaussian and Gaussian sources, the Gaussian sources will not be recoverable by ICA and will pollute estimates of the nongaussian sources. Therefore, it is desirable to have methods for mixed ICA/PCA which can separate mixtures of Gaussian and nongaussian sources. For mixtures of purely Gaussian sources, principal component analysis (PCA) can provide a basis for the Gaussian subspace. We introduce a new method for mixed ICA/PCA which we call Mixed ICA/PCA via Reproducibility Stability (MIPReSt). Our method uses a repeated estimations technique to rank sources by reproducibility, combined with decomposition of multiple subsamplings of the original data matrix. These multiple decompositions allow us to assess component stability as the size of the data matrix changes, which can be used to determine the dimension of the nongaussian subspace in a mixture. We demonstrate the utility of MIPReSt for signal mixtures consisting of simulated sources and real-world (speech) sources, as well as mixtures of unknown composition.

  15. Exploring functional data analysis and wavelet principal component analysis on ecstasy (MDMA) wastewater data.

    PubMed

    Salvatore, Stefania; Bramness, Jørgen G; Røislien, Jo

    2016-07-12

    Wastewater-based epidemiology (WBE) is a novel approach in drug use epidemiology which aims to monitor the extent of use of various drugs in a community. In this study, we investigate functional principal component analysis (FPCA) as a tool for analysing WBE data and compare it to traditional principal component analysis (PCA) and to wavelet principal component analysis (WPCA) which is more flexible temporally. We analysed temporal wastewater data from 42 European cities collected daily over one week in March 2013. The main temporal features of ecstasy (MDMA) were extracted using FPCA using both Fourier and B-spline basis functions with three different smoothing parameters, along with PCA and WPCA with different mother wavelets and shrinkage rules. The stability of FPCA was explored through bootstrapping and analysis of sensitivity to missing data. The first three principal components (PCs), functional principal components (FPCs) and wavelet principal components (WPCs) explained 87.5-99.6 % of the temporal variation between cities, depending on the choice of basis and smoothing. The extracted temporal features from PCA, FPCA and WPCA were consistent. FPCA using Fourier basis and common-optimal smoothing was the most stable and least sensitive to missing data. FPCA is a flexible and analytically tractable method for analysing temporal changes in wastewater data, and is robust to missing data. WPCA did not reveal any rapid temporal changes in the data not captured by FPCA. Overall the results suggest FPCA with Fourier basis functions and common-optimal smoothing parameter as the most accurate approach when analysing WBE data.

  16. Independent components analysis to increase efficiency of discriminant analysis methods (FDA and LDA): Application to NMR fingerprinting of wine.

    PubMed

    Monakhova, Yulia B; Godelmann, Rolf; Kuballa, Thomas; Mushtakova, Svetlana P; Rutledge, Douglas N

    2015-08-15

    Discriminant analysis (DA) methods, such as linear discriminant analysis (LDA) or factorial discriminant analysis (FDA), are well-known chemometric approaches for solving classification problems in chemistry. In most applications, principal components analysis (PCA) is used as the first step to generate orthogonal eigenvectors, and the corresponding sample scores are utilized to generate discriminant features for the discrimination. Independent components analysis (ICA) based on the minimization of mutual information can be used as an alternative to PCA as a preprocessing tool for LDA and FDA classification. To illustrate the performance of this ICA/DA methodology, four representative nuclear magnetic resonance (NMR) data sets of wine samples were used. The classification was performed regarding grape variety, year of vintage and geographical origin. The average increase for ICA/DA in comparison with PCA/DA in the percentage of correct classification varied between 6±1% and 8±2%. The maximum increase in classification efficiency of 11±2% was observed for discrimination of the year of vintage (ICA/FDA) and geographical origin (ICA/LDA). The procedure to determine the number of extracted features (PCs, ICs) for the optimum DA models was discussed. The use of independent components (ICs) instead of principal components (PCs) resulted in improved classification performance of DA methods. The ICA/LDA method is preferable to ICA/FDA for recognition tasks based on NMR spectroscopic measurements. Copyright © 2015 Elsevier B.V. All rights reserved.

  17. Principal component analysis of TOF-SIMS spectra, images and depth profiles: an industrial perspective

    NASA Astrophysics Data System (ADS)

    Pacholski, Michaeleen L.

    2004-06-01

    Principal component analysis (PCA) has been successfully applied to time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra, images and depth profiles. Although SIMS spectral data sets can be small (in comparison to datasets typically discussed in literature from other analytical techniques such as gas or liquid chromatography), each spectrum has thousands of ions, resulting in what can be a difficult comparison of samples. Analysis of industrially-derived samples means the identities of most surface species are unknown a priori and samples must be analyzed rapidly to satisfy customer demands. PCA enables rapid assessment of spectral differences (or lack thereof) between samples and identification of chemically different areas on sample surfaces for images. Depth profile analysis helps define interfaces and identify low-level components in the system.

  18. [Determination of the Plant Origin of Licorice Oil Extract, a Natural Food Additive, by Principal Component Analysis Based on Chemical Components].

    PubMed

    Tada, Atsuko; Ishizuki, Kyoko; Sugimoto, Naoki; Yoshimatsu, Kayo; Kawahara, Nobuo; Suematsu, Takako; Arifuku, Kazunori; Fukai, Toshio; Tamura, Yukiyoshi; Ohtsuki, Takashi; Tahara, Maiko; Yamazaki, Takeshi; Akiyama, Hiroshi

    2015-01-01

    "Licorice oil extract" (LOE) (antioxidant agent) is described in the notice of Japanese food additive regulations as a material obtained from the roots and/or rhizomes of Glycyrrhiza uralensis, G. inflata or G. glabra. In this study, we aimed to identify the original Glycyrrhiza species of eight food additive products using LC/MS. Glabridin, a characteristic compound in G. glabra, was specifically detected in seven products, and licochalcone A, a characteristic compound in G. inflata, was detected in one product. In addition, Principal Component Analysis (PCA) (a kind of multivariate analysis) using the data of LC/MS or (1)H-NMR analysis was performed. The data of thirty-one samples, including LOE products used as food additives, ethanol extracts of various Glycyrrhiza species and commercially available Glycyrrhiza species-derived products were assessed. Based on the PCA results, the majority of LOE products was confirmed to be derived from G. glabra. This study suggests that PCA using (1)H-NMR analysis data is a simple and useful method to identify the plant species of origin of natural food additive products.

  19. Analysis and Evaluation of the Characteristic Taste Components in Portobello Mushroom.

    PubMed

    Wang, Jinbin; Li, Wen; Li, Zhengpeng; Wu, Wenhui; Tang, Xueming

    2018-05-10

    To identify the characteristic taste components of the common cultivated mushroom (brown; Portobello), Agaricus bisporus, taste components in the stipe and pileus of Portobello mushroom harvested at different growth stages were extracted and identified, and principal component analysis (PCA) and taste active value (TAV) were used to reveal the characteristic taste components during each of the growth stages of Portobello mushroom. In the stipe and pileus, 20 and 14 different principal taste components were identified, respectively, and they were considered as the principal taste components of Portobello mushroom fruit bodies, which included most amino acids and 5'-nucleotides. Some taste components that were found at high levels, such as lactic acid and citric acid, were not detected as Portobello mushroom principal taste components through PCA. However, due to their high content, Portobello mushroom could be used as a source of organic acids. The PCA and TAV results revealed that 5'-GMP, glutamic acid, malic acid, alanine, proline, leucine, and aspartic acid were the characteristic taste components of Portobello mushroom fruit bodies. Portobello mushroom was also found to be rich in protein and amino acids, so it might also be useful in the formulation of nutraceuticals and functional food. The results in this article could provide a theoretical basis for understanding and regulating the characteristic flavor components synthesis process of Portobello mushroom. © 2018 Institute of Food Technologists®.

  20. Principal component analysis for protein folding dynamics.

    PubMed

    Maisuradze, Gia G; Liwo, Adam; Scheraga, Harold A

    2009-01-09

    Protein folding is considered here by studying the dynamics of the folding of the triple beta-strand WW domain from the Formin-binding protein 28. Starting from the unfolded state and ending either in the native or nonnative conformational states, trajectories are generated with the coarse-grained united residue (UNRES) force field. The effectiveness of principal components analysis (PCA), an already established mathematical technique for finding global, correlated motions in atomic simulations of proteins, is evaluated here for coarse-grained trajectories. The problems related to PCA and their solutions are discussed. The folding and nonfolding of proteins are examined with free-energy landscapes. Detailed analyses of many folding and nonfolding trajectories at different temperatures show that PCA is very efficient for characterizing the general folding and nonfolding features of proteins. It is shown that the first principal component captures and describes in detail the dynamics of a system. Anomalous diffusion in the folding/nonfolding dynamics is examined by the mean-square displacement (MSD) and the fractional diffusion and fractional kinetic equations. The collisionless (or ballistic) behavior of a polypeptide undergoing Brownian motion along the first few principal components is accounted for.

  1. Discrimination of healthy and osteoarthritic articular cartilage by Fourier transform infrared imaging and Fisher’s discriminant analysis

    PubMed Central

    Mao, Zhi-Hua; Yin, Jian-Hua; Zhang, Xue-Xi; Wang, Xiao; Xia, Yang

    2016-01-01

    Fourier transform infrared spectroscopic imaging (FTIRI) can be used to obtain quantitative information on the content and spatial distribution of principal components in cartilage when combined with chemometric methods. In this study, FTIRI combined with principal component analysis (PCA) and Fisher’s discriminant analysis (FDA) was applied to identify healthy and osteoarthritic (OA) articular cartilage samples. Ten 10-μm thick sections of canine cartilage were imaged at 6.25 μm/pixel by FTIRI. The infrared spectra extracted from the FTIR images were imported into SPSS software for PCA and FDA. Based on the PCA result of 2 principal components, the healthy and OA cartilage samples were effectively discriminated by the FDA with a high accuracy of 94% for the initial samples (training set) and cross validation, as well as 86.67% for the prediction group. The study showed that cartilage degeneration gradually weakened with increasing depth. FTIRI combined with chemometrics may become an effective method for distinguishing healthy and OA cartilage in the future. PMID:26977354
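
    The PCA-then-FDA pipeline described in the abstract above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' SPSS workflow: the "spectra", the band position, and the helper names `pca_scores` and `fisher_direction` are all assumptions made for the example, and the accuracy it reports is a training accuracy on toy data that says nothing about the study's results.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project mean-centered rows of X onto the leading principal axes."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def fisher_direction(Z, y):
    """Two-class Fisher discriminant: w = Sw^-1 (m1 - m0), with a midpoint threshold."""
    Z0, Z1 = Z[y == 0], Z[y == 1]
    m0, m1 = Z0.mean(axis=0), Z1.mean(axis=0)
    # Pooled within-class scatter matrix.
    Sw = (Z0 - m0).T @ (Z0 - m0) + (Z1 - m1).T @ (Z1 - m1)
    w = np.linalg.solve(Sw, m1 - m0)
    threshold = 0.5 * (m0 + m1) @ w
    return w, threshold

# Synthetic stand-in for spectra: 60 rows per class, 200 channels (hypothetical).
rng = np.random.default_rng(0)
n, p = 60, 200
band = np.exp(-0.5 * ((np.arange(p) - 80) / 10.0) ** 2)  # hypothetical absorption band
class0 = 1.0 * band + 0.05 * rng.standard_normal((n, p))
class1 = 0.7 * band + 0.05 * rng.standard_normal((n, p))
X = np.vstack([class0, class1])
y = np.array([0] * n + [1] * n)

Z = pca_scores(X, n_components=2)          # reduce to 2 principal components
w, threshold = fisher_direction(Z, y)      # discriminant in the reduced space
pred = (Z @ w > threshold).astype(int)
accuracy = (pred == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

    Reducing to a few components first keeps the within-class scatter matrix small and invertible, which is one common reason to place PCA ahead of a discriminant step when there are far more spectral channels than samples.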

  2. Extracting spectral contrast in Landsat Thematic Mapper image data using selective principal component analysis

    USGS Publications Warehouse

    Chavez, P.S.; Kwarteng, A.Y.

    1989-01-01

    A challenge encountered with Landsat Thematic Mapper (TM) data, which includes data from six reflective spectral bands, is displaying as much information as possible in a three-image set for color compositing or digital analysis. Principal component analysis (PCA) applied to the six TM bands simultaneously is often used to address this problem. However, two problems that can be encountered using the PCA method are that information of interest might be mathematically mapped to one of the unused components and that a color composite can be difficult to interpret. "Selective" PCA can be used to minimize both of these problems. The spectral contrast among several spectral regions was mapped for a northern Arizona site using Landsat TM data. Field investigations determined that most of the spectral contrast seen in this area was due to one of the following: the amount of iron and hematite in the soils and rocks, vegetation differences, standing and running water, or the presence of gypsum, which has a higher moisture retention capability than do the surrounding soils and rocks. -from Authors
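
    One way to picture the "selective" idea is to run PCA on only a pair of correlated bands, so that the shared brightness lands in the first component and the band-to-band contrast in the second. The sketch below assumes that reading of the abstract; the two bands, their statistics, and the variable names are synthetic and hypothetical, not Landsat TM data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pixels = 5000
# Hypothetical scene: shared brightness dominates both bands,
# while a small band-specific signal is the "contrast" of interest.
brightness = rng.normal(100.0, 20.0, n_pixels)
contrast = rng.normal(0.0, 2.0, n_pixels)
band_a = brightness + contrast
band_b = brightness - contrast

X = np.column_stack([band_a, band_b])
Xc = X - X.mean(axis=0)
# Eigen-decomposition of the 2x2 band covariance matrix.
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]          # sort by descending variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
scores = Xc @ eigvecs

# PC1 should track the shared brightness, PC2 the band-to-band contrast.
r_pc1 = abs(np.corrcoef(scores[:, 0], brightness)[0, 1])
r_pc2 = abs(np.corrcoef(scores[:, 1], contrast)[0, 1])
print(f"|corr(PC1, brightness)| = {r_pc1:.3f}")
print(f"|corr(PC2, contrast)|   = {r_pc2:.3f}")
```

    With only two input bands there are only two components, so the contrast cannot be pushed into a discarded higher-order component, which is the failure mode the abstract attributes to PCA on all six bands at once.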

  3. Kernel Principal Component Analysis for dimensionality reduction in fMRI-based diagnosis of ADHD.

    PubMed

    Sidhu, Gagan S; Asgarian, Nasimeh; Greiner, Russell; Brown, Matthew R G

    2012-01-01

    This study explored various feature extraction methods for use in automated diagnosis of Attention-Deficit Hyperactivity Disorder (ADHD) from functional Magnetic Resonance Image (fMRI) data. Each participant's data consisted of a resting state fMRI scan as well as phenotypic data (age, gender, handedness, IQ, and site of scanning) from the ADHD-200 dataset. We used machine learning techniques to produce support vector machine (SVM) classifiers that attempted to differentiate between (1) all ADHD patients vs. healthy controls and (2) ADHD combined (ADHD-c) type vs. ADHD inattentive (ADHD-i) type vs. controls. In different tests, we used only the phenotypic data, only the imaging data, or else both the phenotypic and imaging data. For feature extraction on fMRI data, we tested the Fast Fourier Transform (FFT), different variants of Principal Component Analysis (PCA), and combinations of FFT and PCA. PCA variants included PCA over time (PCA-t), PCA over space and time (PCA-st), and kernelized PCA (kPCA-st). Baseline chance accuracy was 64.2% produced by guessing healthy control (the majority class) for all participants. Using only phenotypic data produced 72.9% accuracy on two class diagnosis and 66.8% on three class diagnosis. Diagnosis using only imaging data did not perform as well as phenotypic-only approaches. Using both phenotypic and imaging data with combined FFT and kPCA-st feature extraction yielded accuracies of 76.0% on two class diagnosis and 68.6% on three class diagnosis, better than phenotypic-only approaches. Our results demonstrate the potential of using FFT and kPCA-st with resting-state fMRI data as well as phenotypic data for automated diagnosis of ADHD. These results are encouraging given known challenges of learning ADHD diagnostic classifiers using the ADHD-200 dataset (see Brown et al., 2012).
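
    For readers unfamiliar with kernelized PCA, a generic version can be sketched in a few lines. The abstract does not state which kernel or parameters were used, so the RBF kernel, the `gamma` value, the function name `rbf_kernel_pca`, and the two-ring toy data are all assumptions for illustration only.

```python
import numpy as np

def rbf_kernel_pca(X, n_components, gamma):
    """Kernel PCA with an RBF kernel; returns projections of the training rows."""
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-gamma * d2)
    # Center the kernel matrix, i.e. center the data in feature space.
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # Leading eigenpairs of the centered kernel matrix.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[idx], eigvecs[:, idx]
    # Projection of training point i on component k is sqrt(lambda_k) * v_k[i].
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))

# Toy data: two noisy concentric rings, a structure linear PCA cannot unfold.
rng = np.random.default_rng(4)
theta = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
X = np.vstack([0.3 * circle, 1.0 * circle]) + 0.02 * rng.standard_normal((200, 2))

Z = rbf_kernel_pca(X, n_components=2, gamma=5.0)
print(Z.shape)
```

    The only change from ordinary PCA is that the eigen-decomposition is applied to a centered kernel matrix instead of a covariance matrix, so the cost scales with the number of samples and not with the number of features.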

  4. Non-linear principal component analysis applied to Lorenz models and to North Atlantic SLP

    NASA Astrophysics Data System (ADS)

    Russo, A.; Trigo, R. M.

    2003-04-01

    A non-linear generalisation of Principal Component Analysis (PCA), denoted Non-Linear Principal Component Analysis (NLPCA), is introduced and applied to the analysis of three data sets. Non-Linear Principal Component Analysis allows for the detection and characterisation of low-dimensional non-linear structure in multivariate data sets. This method is implemented using a 5-layer feed-forward neural network introduced originally in the chemical engineering literature (Kramer, 1991). The method is described and details of its implementation are addressed. Non-Linear Principal Component Analysis is first applied to a data set sampled from the Lorenz attractor (1963). It is found that the NLPCA approximations are more representative of the data than are the corresponding PCA approximations. The same methodology was applied to the less known Lorenz attractor (1984). However, the results obtained were not as good as those attained with the famous 'Butterfly' attractor. Further work with this model is underway in order to assess if NLPCA techniques can be more representative of the data characteristics than are the corresponding PCA approximations. The application of NLPCA to relatively 'simple' dynamical systems, such as those proposed by Lorenz, is well understood. However, the application of NLPCA to a large climatic data set is much more challenging. Here, we have applied NLPCA to the sea level pressure (SLP) field for the entire North Atlantic area and the results show a slight increment in the associated explained variance. Finally, directions for future work are presented.

  5. Demixed principal component analysis of neural population data

    PubMed Central

    Kobak, Dmitry; Brendel, Wieland; Constantinidis, Christos; Feierstein, Claudia E; Kepecs, Adam; Mainen, Zachary F; Qi, Xue-Lian; Romo, Ranulfo; Uchida, Naoshige; Machens, Christian K

    2016-01-01

    Neurons in higher cortical areas, such as the prefrontal cortex, are often tuned to a variety of sensory and motor variables, and are therefore said to display mixed selectivity. This complexity of single neuron responses can obscure what information these areas represent and how it is represented. Here we demonstrate the advantages of a new dimensionality reduction technique, demixed principal component analysis (dPCA), that decomposes population activity into a few components. In addition to systematically capturing the majority of the variance of the data, dPCA also exposes the dependence of the neural representation on task parameters such as stimuli, decisions, or rewards. To illustrate our method we reanalyze population data from four datasets comprising different species, different cortical areas and different experimental tasks. In each case, dPCA provides a concise way of visualizing the data that summarizes the task-dependent features of the population response in a single figure. DOI: http://dx.doi.org/10.7554/eLife.10989.001 PMID:27067378

  6. Classification of fMRI resting-state maps using machine learning techniques: A comparative study

    NASA Astrophysics Data System (ADS)

    Gallos, Ioannis; Siettos, Constantinos

    2017-11-01

    We compare the efficiency of Principal Component Analysis (PCA) and nonlinear manifold learning algorithms (ISOMAP and Diffusion maps) for classifying brain maps between groups of schizophrenia patients and healthy controls from fMRI scans during a resting-state experiment. After a standard pre-processing pipeline, we applied spatial Independent component analysis (ICA) to reduce (a) noise and (b) spatial-temporal dimensionality of fMRI maps. On the cross-correlation matrix of the ICA components, we applied PCA, ISOMAP and Diffusion Maps to find an embedded low-dimensional space. Finally, support-vector-machines (SVM) and k-NN algorithms were used to evaluate the performance of the algorithms in classifying between the two groups.

  7. InterFace: A software package for face image warping, averaging, and principal components analysis.

    PubMed

    Kramer, Robin S S; Jenkins, Rob; Burton, A Mike

    2017-12-01

    We describe InterFace, a software package for research in face recognition. The package supports image warping, reshaping, averaging of multiple face images, and morphing between faces. It also supports principal components analysis (PCA) of face images, along with tools for exploring the "face space" produced by PCA. The package uses a simple graphical user interface, allowing users to perform these sophisticated image manipulations without any need for programming knowledge. The program is available for download in the form of an app, which requires that users also have access to the (freely available) MATLAB Runtime environment.

  8. Identification of regional activation by factorization of high-density surface EMG signals: A comparison of Principal Component Analysis and Non-negative Matrix factorization.

    PubMed

    Gallina, Alessio; Garland, S Jayne; Wakeling, James M

    2018-05-22

    In this study, we investigated whether principal component analysis (PCA) and non-negative matrix factorization (NMF) perform similarly for the identification of regional activation within the human vastus medialis. EMG signals from 64 locations over the VM were collected from twelve participants while performing a low-force isometric knee extension. The envelope of the EMG signal of each channel was calculated by low-pass filtering (8 Hz) the monopolar EMG signal after rectification. The data matrix was factorized using PCA and NMF, and up to 5 factors were considered for each algorithm. Associations in explained variance, spatial weights and temporal scores between the two algorithms were compared using Pearson correlation. For both PCA and NMF, a single factor explained approximately 70% of the variance of the signal, while two and three factors explained just over 85% and 90%, respectively. The variance explained by PCA and NMF was highly comparable (R > 0.99). Spatial weights and temporal scores extracted with non-negative reconstruction of PCA and NMF were highly associated (all p < 0.001, mean R > 0.97). Regional VM activation can be identified using high-density surface EMG and factorization algorithms. Regional activation explains up to 30% of the variance of the signal, as identified through both PCA and NMF. Copyright © 2018 Elsevier Ltd. All rights reserved.

  9. An Exploratory Study on Using Principal-Component Analysis and Confirmatory Factor Analysis to Identify Bolt-On Dimensions: The EQ-5D Case Study.

    PubMed

    Finch, Aureliano Paolo; Brazier, John Edward; Mukuria, Clara; Bjorner, Jakob Bue

    2017-12-01

    Generic preference-based measures such as the EuroQol five-dimensional questionnaire (EQ-5D) are used in economic evaluation, but may not be appropriate for all conditions. When this happens, a possible solution is adding bolt-ons to expand their descriptive systems. Using review-based methods, studies published to date claimed the relevance of bolt-ons in the presence of poor psychometric results. This approach does not identify the specific dimensions missing from the Generic preference-based measure core descriptive system, and is inappropriate for identifying dimensions that might improve the measure generically. This study explores the use of principal-component analysis (PCA) and confirmatory factor analysis (CFA) for bolt-on identification in the EQ-5D. Data were drawn from the international Multi-Instrument Comparison study, which is an online survey on health and well-being measures in five countries. Analysis was based on a pool of 92 items from nine instruments. Initial content analysis provided a theoretical framework for PCA results interpretation and CFA model development. PCA was used to investigate the underlying dimensional structure and whether EQ-5D items were represented in the identified constructs. CFA was used to confirm the structure. CFA was cross-validated in random halves of the sample. PCA suggested a nine-component solution, which was confirmed by CFA. This included psychological symptoms, physical functioning, and pain, which were covered by the EQ-5D, and satisfaction, speech/cognition, relationships, hearing, vision, and energy/sleep which were not. These latter factors may represent relevant candidate bolt-ons. PCA and CFA appear useful methods for identifying potential bolt-on dimensions for an instrument such as the EQ-5D. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  10. ToF-SIMS PCA analysis of Myrtus communis L.

    NASA Astrophysics Data System (ADS)

    Piras, F. M.; Dettori, M. F.; Magnani, A.

    2009-06-01

    Nowadays there is a growing interest among researchers in applying sophisticated analytical techniques in conjunction with statistical data analysis methods to the characterization of natural products to assure their authenticity and quality, and in the possibility of direct analysis of food to obtain maximum information. In this work, time-of-flight secondary ion mass spectrometry (ToF-SIMS) in conjunction with principal components analysis (PCA) is applied to study the chemical composition and variability of Sardinian myrtle (Myrtus communis L.) through the analysis of both alcoholic extracts of the berries and the berry epicarp. ToF-SIMS spectra of the berry epicarp show that the epicuticular waxes consist mainly of carboxylic acids with chain length ranging from C20 to C30, or identical species formed from fragmentation of long-chain esters. PCA of ToF-SIMS data from the myrtle berry epicarp distinguishes two groups characterized by a different surface concentration of triacontanoic acid. Variability in anthocyanin, flavonol, α-tocopherol, and myrtucommulone contents is shown by ToF-SIMS PCA analysis of alcoholic extracts of myrtle berries.

  11. Principal component analysis of phenolic acid spectra

    USDA-ARS's Scientific Manuscript database

    Phenolic acids are common plant metabolites that exhibit bioactive properties and have applications in functional food and animal feed formulations. The ultraviolet (UV) and infrared (IR) spectra of four closely related phenolic acid structures were evaluated by principal component analysis (PCA) to...

  12. Image restoration for three-dimensional fluorescence microscopy using an orthonormal basis for efficient representation of depth-variant point-spread functions

    PubMed Central

    Patwary, Nurmohammed; Preza, Chrysanthe

    2015-01-01

    A depth-variant (DV) image restoration algorithm for wide field fluorescence microscopy, using an orthonormal basis decomposition of DV point-spread functions (PSFs), is investigated in this study. The efficient PSF representation is based on a previously developed principal component analysis (PCA), which is computationally intensive. We present an approach developed to reduce the number of DV PSFs required for the PCA computation, thereby making the PCA-based approach computationally tractable for thick samples. Restoration results from both synthetic and experimental images show consistency and that the proposed algorithm addresses efficiently depth-induced aberration using a small number of principal components. Comparison of the PCA-based algorithm with a previously-developed strata-based DV restoration algorithm demonstrates that the proposed method improves performance by 50% in terms of accuracy and simultaneously reduces the processing time by 64% using comparable computational resources. PMID:26504634

  13. Principal component analysis vs. self-organizing maps combined with hierarchical clustering for pattern recognition in volcano seismic spectra

    NASA Astrophysics Data System (ADS)

    Unglert, K.; Radić, V.; Jellinek, A. M.

    2016-06-01

    Variations in the spectral content of volcano seismicity related to changes in volcanic activity are commonly identified manually in spectrograms. However, long time series of monitoring data at volcano observatories require tools to facilitate automated and rapid processing. Techniques such as self-organizing maps (SOM) and principal component analysis (PCA) can help to quickly and automatically identify important patterns related to impending eruptions. For the first time, we evaluate the performance of SOM and PCA on synthetic volcano seismic spectra constructed from observations during two well-studied eruptions at Kīlauea Volcano, Hawai'i, that include features observed in many volcanic settings. In particular, our objective is to test which of the techniques can best retrieve a set of three spectral patterns that we used to compose a synthetic spectrogram. We find that, without a priori knowledge of the given set of patterns, neither SOM nor PCA can directly recover the spectra. We thus test hierarchical clustering, a commonly used method, to investigate whether clustering in the space of the principal components and on the SOM, respectively, can retrieve the known patterns. Our clustering method applied to the SOM fails to detect the correct number and shape of the known input spectra. In contrast, clustering of the data reconstructed by the first three PCA modes reproduces these patterns and their occurrence in time more consistently. This result suggests that PCA in combination with hierarchical clustering is a powerful practical tool for automated identification of characteristic patterns in volcano seismic spectra. Our results indicate that, in contrast to PCA, common clustering algorithms may not be ideal to group patterns on the SOM and that it is crucial to evaluate the performance of these tools on a control dataset prior to their application to real data.

  14. Distribution of a low dose compound within pharmaceutical tablet by using multivariate curve resolution on Raman hyperspectral images.

    PubMed

    Boiret, Mathieu; de Juan, Anna; Gorretta, Nathalie; Ginot, Yves-Michel; Roger, Jean-Michel

    2015-01-25

    In this work, Raman hyperspectral images and multivariate curve resolution-alternating least squares (MCR-ALS) are used to study the distribution of actives and excipients within a pharmaceutical drug product. This article is mainly focused on the distribution of a low dose constituent. Different approaches are compared, using initially filtered or non-filtered data, or using a column-wise augmented dataset before starting the MCR-ALS iterative process including appended information on the low dose component. In the studied formulation, magnesium stearate is used as a lubricant to improve powder flowability. With a theoretical concentration of 0.5% (w/w) in the drug product, the spectral variance contained in the data is weak. By using a principal component analysis (PCA) filtered dataset as a first step of the MCR-ALS approach, the lubricant information is lost in the non-explained variance and its associated distribution in the tablet cannot be highlighted. A sufficient number of components to generate the PCA noise-filtered matrix has to be used in order to keep the lubricant variability within the data set analyzed or, otherwise, work with the raw non-filtered data. Different models are built using an increasing number of components to perform the PCA reduction. It is shown that the magnesium stearate information can be extracted from a PCA model using a minimum of 20 components. In the last part, a column-wise augmented matrix, including a reference spectrum of the lubricant, is used before starting MCR-ALS process. PCA reduction is performed on the augmented matrix, so the magnesium stearate contribution is included within the MCR-ALS calculations. By using an appropriate PCA reduction, with a sufficient number of components, or by using an augmented dataset including appended information on the low dose component, the distribution of the two actives, the two main excipients and the low dose lubricant are correctly recovered. 
    Copyright © 2014 Elsevier B.V. All rights reserved.
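
    The point about losing a weak constituent when too few components are kept can be reproduced on a toy mixture. The sketch below is noise-free and uses three synthetic Gaussian "spectra" with made-up concentrations, so it needs only 3 components where the real Raman data needed at least 20; the helper names `pca_filter` and `peak` are hypothetical.

```python
import numpy as np

def pca_filter(X, n_components):
    """Rebuild X from its first n_components principal components."""
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    k = n_components
    return mean + (U[:, :k] * s[:k]) @ Vt[:k]

rng = np.random.default_rng(2)
n_pixels, n_channels = 400, 120
axis = np.arange(n_channels)

def peak(center):
    """Hypothetical pure spectrum: a single Gaussian band."""
    return np.exp(-0.5 * ((axis - center) / 4.0) ** 2)

# Two major constituents and one low-dose constituent (all values assumed).
spectra = np.vstack([peak(30), peak(60), peak(95)])
conc = np.column_stack([
    rng.uniform(0.3, 0.7, n_pixels),       # major constituent 1
    rng.uniform(0.3, 0.7, n_pixels),       # major constituent 2
    rng.uniform(0.0, 0.01, n_pixels),      # low-dose constituent
])
X = conc @ spectra                          # noise-free mixture, for clarity

# Pixel-to-pixel variation carried by the low-dose constituent alone.
minor_signal = np.outer(conc[:, 2] - conc[:, 2].mean(), spectra[2])
lost = {}
for k in (2, 3):
    residual = X - pca_filter(X, k)
    # Size of the discarded residual relative to the low-dose signal.
    lost[k] = np.linalg.norm(residual) / np.linalg.norm(minor_signal)
    print(f"{k} components: residual / minor-signal norm = {lost[k]:.3f}")
```

    With two components the discarded residual is about as large as the low-dose signal itself, meaning that signal has been filtered out along with the noise; with three it is retained.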

  15. PEM-PCA: a parallel expectation-maximization PCA face recognition architecture.

    PubMed

    Rujirakul, Kanokmon; So-In, Chakchai; Arnonkijpanich, Banchar

    2014-01-01

    Principal component analysis (PCA) has traditionally been used as a feature extraction technique in face recognition systems, yielding high accuracy while requiring only a small number of features. However, the covariance matrix and eigenvalue decomposition stages cause high computational complexity, especially for a large database. Thus, this research presents an alternative approach utilizing an Expectation-Maximization algorithm to reduce the determinant matrix manipulation, resulting in a reduction of the stages' complexity. To improve the computational time, a novel parallel architecture was employed to utilize the benefits of parallelization of matrix computation during the feature extraction and classification stages, including parallel preprocessing, and their combinations, a so-called Parallel Expectation-Maximization PCA architecture. Compared to traditional PCA and its derivatives, the results indicate lower complexity with an insignificant difference in recognition precision, leading to high-speed face recognition systems, that is, speed-ups of over nine and three times over PCA and Parallel PCA, respectively.
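
    A commonly used EM iteration for PCA avoids forming the covariance matrix or running a full eigen-decomposition. The serial sketch below shows that iteration under the assumption that it resembles the EM stage the abstract refers to; it does not reproduce the paper's parallel architecture, and the function name `em_pca`, the iteration count, and the synthetic data are choices made for this example.

```python
import numpy as np

def em_pca(Y, n_components, n_iter=50, seed=0):
    """EM iterations for the principal subspace of Y (features x samples, centered)."""
    rng = np.random.default_rng(seed)
    d = Y.shape[0]
    C = rng.standard_normal((d, n_components))    # random initial basis
    for _ in range(n_iter):
        # E-step: latent coordinates X = (C^T C)^-1 C^T Y for the current basis.
        X = np.linalg.solve(C.T @ C, C.T @ Y)
        # M-step: basis C = Y X^T (X X^T)^-1 that best reconstructs Y.
        C = np.linalg.solve(X @ X.T, X @ Y.T).T
    # Orthonormalize; the columns of Q span the estimated principal subspace.
    Q, _ = np.linalg.qr(C)
    return Q

# Synthetic data: three strong directions plus weak isotropic noise.
rng = np.random.default_rng(5)
d, n, k = 50, 300, 3
basis, _ = np.linalg.qr(rng.standard_normal((d, k)))
latent = rng.standard_normal((k, n)) * np.array([[10.0], [5.0], [3.0]])
Y = basis @ latent + 0.1 * rng.standard_normal((d, n))
Y -= Y.mean(axis=1, keepdims=True)

Q = em_pca(Y, n_components=k)
U, _, _ = np.linalg.svd(Y, full_matrices=False)
# Compare the two subspaces through their projection matrices.
gap = np.linalg.norm(Q @ Q.T - U[:, :k] @ U[:, :k].T)
print(f"subspace difference vs. SVD: {gap:.2e}")
```

    Each iteration only needs products with small k-by-k matrices and passes over the data, which is the kind of matrix work that lends itself to parallelization.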

  16. Strategies for reducing large fMRI data sets for independent component analysis.

    PubMed

    Wang, Ze; Wang, Jiongjiong; Calhoun, Vince; Rao, Hengyi; Detre, John A; Childress, Anna R

    2006-06-01

    In independent component analysis (ICA), principal component analysis (PCA) is generally used to reduce the raw data to a few principal components (PCs) through eigenvector decomposition (EVD) on the data covariance matrix. Although this works for spatial ICA (sICA) on moderately sized fMRI data, it is intractable for temporal ICA (tICA), since typical fMRI data have a high spatial dimension, resulting in an unmanageable data covariance matrix. To solve this problem, two practical data reduction methods are presented in this paper. The first solution is to calculate the PCs of tICA from the PCs of sICA. This approach works well for moderately sized fMRI data; however, it is highly computationally intensive, even intractable, when the number of scans increases. The second solution proposed is to perform PCA decomposition via a cascade recursive least squared (CRLS) network, which provides a uniform data reduction solution for both sICA and tICA. Without the need to calculate the covariance matrix, CRLS extracts PCs directly from the raw data, and the PC extraction can be terminated after computing an arbitrary number of PCs without the need to estimate the whole set of PCs. Moreover, when the whole data set becomes too large to be loaded into the machine memory, CRLS-PCA can save data retrieval time by reading the data once, while the conventional PCA requires numerous data retrieval steps for both covariance matrix calculation and PC extractions. Real fMRI data were used to evaluate the PC extraction precision, computational expense, and memory usage of the presented methods.
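
    When the number of scans is much smaller than the number of voxels, one standard way to obtain voxel-space principal components without the huge covariance matrix is to decompose the small scans-by-scans matrix and map its eigenvectors across. Whether this matches the paper's first solution exactly is an assumption, and the sketch is not the CRLS network; the matrix sizes and random data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n_scans, n_voxels = 40, 5000                # hypothetical shape, scans << voxels
X = rng.standard_normal((n_scans, n_voxels))
X -= X.mean(axis=0)                          # center each voxel's time course

k = 5
# Small problem: eigen-decompose the n_scans x n_scans matrix X X^T.
evals, U = np.linalg.eigh(X @ X.T)
order = np.argsort(evals)[::-1][:k]
evals, U = evals[order], U[:, order]

# Map each small eigenvector u to the matching voxel-space eigenvector
# of X^T X via v = X^T u / sigma, where sigma = sqrt(eigenvalue).
V = (X.T @ U) / np.sqrt(evals)

# V has orthonormal columns and satisfies (X^T X) v = lambda v,
# checked here without ever forming the 5000 x 5000 matrix.
ortho_err = np.abs(V.T @ V - np.eye(k)).max()
eig_err = np.abs(X.T @ (X @ V) - V * evals).max()
print(f"orthonormality error: {ortho_err:.2e}, eigen-equation error: {eig_err:.2e}")
```

    The cost of this route grows with the square of the number of scans, which is consistent with the abstract's remark that it becomes expensive as the number of scans increases.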

  17. Liquid chromatography tandem mass spectrometry determination of chemical markers and principal component analysis of Vitex agnus-castus L. fruits (Verbenaceae) and derived food supplements.

    PubMed

    Mari, Angela; Montoro, Paola; Pizza, Cosimo; Piacente, Sonia

    2012-11-01

    A validated analytical method for the quantitative determination of seven chemical markers occurring in a hydroalcoholic extract of Vitex agnus-castus fruits by liquid chromatography electrospray triple quadrupole tandem mass spectrometry (LC/ESI/(QqQ)MSMS) is reported. To carry out a comparative study, five commercial food supplements corresponding to hydroalcoholic extracts of V. agnus-castus fruits were analysed under the same chromatographic conditions of the crude extract. Principal component analysis (PCA), based only on the variation of the amount of the seven chemical markers, was applied in order to find similarities between the hydroalcoholic extract and the food supplements. A second PCA analysis was carried out considering the whole spectroscopic data deriving from liquid chromatography electrospray linear ion trap mass spectrometry (LC/ESI/(LIT)MS) analysis. High similarity between the two PCA was observed, showing the possibility to select one of these two approaches for future applications in the field of comparative analysis of food supplements and quality control procedures. Copyright © 2012 Elsevier B.V. All rights reserved.

  18. Investigation of cell wall composition related to stem lodging resistance in wheat (Triticum aestivum L.) by FTIR spectroscopy.

    PubMed

    Wang, Jian; Zhu, Jinmao; Huang, RuZhu; Yang, YuSheng

    2012-07-01

    We explored the rapid qualitative analysis of wheat cultivars with good lodging resistances by Fourier transform infrared resonance (FTIR) spectroscopy and multivariate statistical analysis. FTIR imaging showed that wheat stem cell walls were mainly composed of cellulose, pectin, protein, and lignin. Principal components analysis (PCA) was used to eliminate multicollinearity among multiple peak absorptions. PCA revealed the developmental internodes of wheat stems could be distributed from low to high along the load of the second principal component, which was consistent with the corresponding bands of cellulose in the FTIR spectra of the cell walls. Furthermore, four distinct stem populations could also be identified by spectral features related to their corresponding mechanical properties via PCA and cluster analysis. Histochemical staining of four types of wheat stems with various abilities to resist lodging revealed that cellulose contributed more than lignin to the ability to resist lodging. These results strongly suggested that the main cell wall component responsible for these differences was cellulose. Therefore, the combination of multivariate analysis and FTIR could rapidly screen wheat cultivars with good lodging resistance. Furthermore, the application of these methods to a much wider range of cultivars of unknown mechanical properties promises to be of interest.

  19. Ripening-dependent metabolic changes in the volatiles of pineapple (Ananas comosus (L.) Merr.) fruit: II. Multivariate statistical profiling of pineapple aroma compounds based on comprehensive two-dimensional gas chromatography-mass spectrometry.

    PubMed

    Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg

    2015-03-01

    Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and describe the development of volatiles during postharvest storage, respectively. Hereby, substantial chemical markers allowing for class separation were revealed. The workflow permitted the rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. Volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest ripened for 6 days after simulated sea freight export, after PCA with only two principal components. However, PCA considering also the third principal component allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.

  20. Evaluation of Staining-Dependent Colour Changes in Resin Composites Using Principal Component Analysis

    PubMed Central

    Manojlovic, D.; Lenhardt, L.; Milićević, B.; Antonov, M.; Miletic, V.; Dramićanin, M. D.

    2015-01-01

    Colour changes in Gradia Direct™ composite after immersion in tea, coffee, red wine, Coca-Cola, Colgate mouthwash, and distilled water were evaluated using principal component analysis (PCA) and the CIELAB colour coordinates. The reflection spectra of the composites were used as input data for the PCA. The output data (scores and loadings) provided information about the magnitude and origin of the surface reflection changes after exposure to the staining solutions. The reflection spectra of the stained samples generally exhibited lower reflection in the blue spectral range, which was manifested in the lower content of the blue shade for the samples. Both analyses demonstrated the high staining abilities of tea, coffee, and red wine, which produced total colour changes of 4.31, 6.61, and 6.22, respectively, according to the CIELAB analysis. PCA revealed subtle changes in the reflection spectra of composites immersed in Coca-Cola, demonstrating Coca-Cola’s ability to stain the composite to a small degree. PMID:26450008

  1. Evaluation of Staining-Dependent Colour Changes in Resin Composites Using Principal Component Analysis.

    PubMed

    Manojlovic, D; Lenhardt, L; Milićević, B; Antonov, M; Miletic, V; Dramićanin, M D

    2015-10-09

    Colour changes in Gradia Direct™ composite after immersion in tea, coffee, red wine, Coca-Cola, Colgate mouthwash, and distilled water were evaluated using principal component analysis (PCA) and the CIELAB colour coordinates. The reflection spectra of the composites were used as input data for the PCA. The output data (scores and loadings) provided information about the magnitude and origin of the surface reflection changes after exposure to the staining solutions. The reflection spectra of the stained samples generally exhibited lower reflection in the blue spectral range, which was manifested in the lower content of the blue shade for the samples. Both analyses demonstrated the high staining abilities of tea, coffee, and red wine, which produced total colour changes of 4.31, 6.61, and 6.22, respectively, according to the CIELAB analysis. PCA revealed subtle changes in the reflection spectra of composites immersed in Coca-Cola, demonstrating Coca-Cola's ability to stain the composite to a small degree.

  2. Principal component and normal mode analysis of proteins; a quantitative comparison using the GroEL subunit.

    PubMed

    Skjaerven, Lars; Martinez, Aurora; Reuter, Nathalie

    2011-01-01

    Principal component analysis (PCA) and normal mode analysis (NMA) have emerged as two invaluable tools for studying conformational changes in proteins. To compare these approaches for studying protein dynamics, we have used a subunit of the GroEL chaperone, whose dynamics is well characterized. We first show that both PCA on trajectories from molecular dynamics (MD) simulations and NMA reveal a general dynamical behavior in agreement with what has previously been described for GroEL. We thus compare the reproducibility of PCA on independent MD runs and subsequently investigate the influence of the length of the MD simulations. We show that there is a relatively poor one-to-one correspondence between eigenvectors obtained from two independent runs and conclude that caution should be taken when analyzing principal components individually. We also observe that increasing the simulation length does not improve the agreement with the experimental structural difference. In fact, relatively short MD simulations are sufficient for this purpose. We observe a rapid convergence of the eigenvectors (after ca. 6 ns). Although there is not always a clear one-to-one correspondence, there is a qualitatively good agreement between the movements described by the first five modes obtained with the three different approaches; PCA, all-atoms NMA, and coarse-grained NMA. It is particularly interesting to relate this to the computational cost of the three methods. The results we obtain on the GroEL subunit contribute to the generalization of robust and reproducible strategies for the study of protein dynamics, using either NMA or PCA of trajectories from MD simulations. © 2010 Wiley-Liss, Inc.

  3. Dihedral angle principal component analysis of molecular dynamics simulations.

    PubMed

    Altis, Alexandros; Nguyen, Phuong H; Hegger, Rainer; Stock, Gerhard

    2007-06-28

    It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles {phi(n)} to the metric coordinate space {x(n)=cos phi(n),y(n)=sin phi(n)} was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which N angular variables naturally lead to N eigenvalues and eigenvectors. Applying the methodology to the construction of the free energy landscape of decaalanine from a 300 ns molecular dynamics simulation, a critical comparison of the various methods is given.
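
    The transformation described in this abstract, from dihedral angles {phi(n)} to the coordinates {cos phi(n), sin phi(n)} followed by an ordinary PCA, can be illustrated with a minimal sketch. The function name, the synthetic two-state angle data, and the use of NumPy are assumptions made for illustration only; this is not the authors' code and it does not cover the complex dPCA variant.

```python
import numpy as np

def dihedral_pca(angles):
    """PCA on dihedral angles mapped to (cos, sin) pairs.

    angles: array of shape (n_frames, n_angles), in radians.
    Returns eigenvalues (descending), eigenvectors (as columns),
    and the projection of every frame onto the eigenvectors.
    """
    angles = np.asarray(angles, dtype=float)
    # Map each angle phi_n to the metric coordinates (cos phi_n, sin phi_n).
    features = np.concatenate([np.cos(angles), np.sin(angles)], axis=1)
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals, eigvecs, centered @ eigvecs

rng = np.random.default_rng(0)
# Synthetic toy data (made up): 3 angles hopping together between -60 and +60 degrees.
state = rng.integers(0, 2, size=500)
base = np.where(state[:, None] == 0, -np.pi / 3, np.pi / 3)
angles = base + 0.1 * rng.standard_normal((500, 3))
eigvals, eigvecs, proj = dihedral_pca(angles)
```

    Because only cosines and sines enter the covariance matrix, shifting any angle by a full turn leaves the result unchanged, which is the point of the circular-statistics transformation. In this real-valued sketch, N angles produce 2N eigenvalues.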

  4. Dihedral angle principal component analysis of molecular dynamics simulations

    NASA Astrophysics Data System (ADS)

    Altis, Alexandros; Nguyen, Phuong H.; Hegger, Rainer; Stock, Gerhard

    2007-06-01

    It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles {φn} to the metric coordinate space {xn=cosφn,yn=sinφn} was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which N angular variables naturally lead to N eigenvalues and eigenvectors. Applying the methodology to the construction of the free energy landscape of decaalanine from a 300ns molecular dynamics simulation, a critical comparison of the various methods is given.

  5. Application of Hyperspectral Imaging and Chemometric Calibrations for Variety Discrimination of Maize Seeds

    PubMed Central

    Zhang, Xiaolei; Liu, Fei; He, Yong; Li, Xiaoli

    2012-01-01

    Hyperspectral imaging in the visible and near infrared (VIS-NIR) region was used to develop a novel method for discriminating different varieties of commodity maize seeds. Firstly, hyperspectral images of 330 samples of six varieties of maize seeds were acquired using a hyperspectral imaging system in the 380–1,030 nm wavelength range. Secondly, principal component analysis (PCA) and kernel principal component analysis (KPCA) were used to explore the internal structure of the spectral data. Thirdly, three optimal wavelengths (523, 579 and 863 nm) were selected by implementing PCA directly on each image. Then four textural variables including contrast, homogeneity, energy and correlation were extracted from gray level co-occurrence matrix (GLCM) of each monochromatic image based on the optimal wavelengths. Finally, several models for maize seeds identification were established by least squares-support vector machine (LS-SVM) and back propagation neural network (BPNN) using four different combinations of principal components (PCs), kernel principal components (KPCs) and textural features as input variables, respectively. The recognition accuracy achieved in the PCA-GLCM-LS-SVM model (98.89%) was the most satisfactory one. We conclude that hyperspectral imaging combined with texture analysis can be implemented for fast classification of different varieties of maize seeds. PMID:23235456

  6. Principal Components Analysis Studies of Martian Clouds

    NASA Astrophysics Data System (ADS)

    Klassen, D. R.; Bell, J. F., III

    2001-11-01

    We present the principal components analysis (PCA) of absolutely calibrated multi-spectral images of Mars as a function of Martian season. The PCA technique is a mathematical rotation and translation of the data from a brightness/wavelength space to a vector space of principal "traits" that lie along the directions of maximal variance. The first of these traits, accounting for over 90% of the data variance, is overall brightness and is represented by an average Mars spectrum. Interpretation of the remaining traits, which account for the remaining ~10% of the variance, is not always the same: it depends upon what other components are in the scene and thus varies with Martian season. For example, during seasons with large amounts of water ice in the scene, the second trait correlates with the ice and anti-correlates with temperature. We will investigate the interpretation of the second and successive important PCA traits. Although these PCA traits are orthogonal in their own vector space, it is unlikely that any one trait represents a singular, mineralogic, spectral endmember. It is more likely that there are many spectral endmembers that vary so nearly identically, to within the noise level, that the PCA technique will not be able to distinguish them. Another possibility is that similar absorption features among spectral endmembers may be tied to one PCA trait, for example "amount of 2 μm absorption". We thus attempt to extract spectral endmembers by matching linear combinations of the PCA traits to USGS, JHU, and JPL spectral libraries as acquired through the JPL Aster project. The recovered spectral endmembers are then linearly combined to model the multi-spectral image set. We present here the spectral abundance maps of the water ice/frost endmember, which allow us to track Martian clouds and ground frosts. This work was supported in part through NASA Planetary Astronomy Grant NAG5-6776. All data were gathered at the NASA Infrared Telescope Facility in collaboration with the telescope operators and with thanks to the support staff and day crew.
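
    The description of PCA as a translation (mean removal) followed by a rotation onto the directions of maximal variance can be sketched as below. The synthetic "pixels by wavelengths" matrix, in which a single spectral shape is scaled by a brightness factor, is a hypothetical stand-in for real image data, and the function name is an assumption; it only illustrates why an overall-brightness trait can dominate the variance.

```python
import numpy as np

def pca_svd(data):
    """Centre the data (translation) and rotate onto principal axes via SVD."""
    mean = data.mean(axis=0)
    centered = data - mean
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)   # fraction of variance per component
    scores = u * s                    # same as centered @ vt.T
    return mean, vt, scores, explained

rng = np.random.default_rng(1)
# Hypothetical data: 400 pixels, 6 wavelengths, one spectral shape
# scaled by a per-pixel brightness, plus a little noise.
shape = np.linspace(1.0, 2.0, 6)
brightness = rng.uniform(0.5, 1.5, size=(400, 1))
spectra = brightness * shape + 0.01 * rng.standard_normal((400, 6))
mean, axes, scores, explained = pca_svd(spectra)
```

    Here `explained[0]` is the fraction of variance carried by the first trait, and `scores @ axes + mean` recovers the input, which shows that the transformation itself loses no information.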

  7. [Vis-NIR spectroscopic pattern recognition combined with SG smoothing applied to breed screening of transgenic sugarcane].

    PubMed

    Liu, Gui-Song; Guo, Hao-Song; Pan, Tao; Wang, Ji-Hua; Cao, Gan

    2014-10-01

    Based on Savitzky-Golay (SG) smoothing screening, principal component analysis (PCA) combined separately with supervised linear discriminant analysis (LDA) and unsupervised hierarchical clustering analysis (HCA) was used for non-destructive visible and near-infrared (Vis-NIR) detection for breed screening of transgenic sugarcane. A random and stability-dependent framework of calibration, prediction, and validation was proposed. A total of 456 sugarcane leaf samples at the elongating stage were collected from the field, comprising 306 transgenic (positive) samples containing the Bt and Bar genes and 150 non-transgenic (negative) samples. A total of 156 samples (negative 50 and positive 106) were randomly selected as the validation set; the remaining samples (negative 100 and positive 200, a total of 300 samples) were used as the modeling set, and the modeling set was then subdivided 50 times into calibration (negative 50 and positive 100, a total of 150 samples) and prediction sets (negative 50 and positive 100, a total of 150 samples). The number of SG smoothing points was expanded, while some modes of higher derivative were removed because of their small absolute values, and a total of 264 smoothing modes were used for screening. The pairwise combinations of the first three principal components were used, and the optimal combination of principal components was then selected according to the model effect. Based on all divisions of calibration and prediction sets and all SG smoothing modes, the SG-PCA-LDA and SG-PCA-HCA models were established, and the model parameters were optimized based on the average prediction effect over all divisions to produce modeling stability. Finally, model validation was performed using the validation set. With SG smoothing, the modeling accuracy and stability of PCA-LDA and PCA-HCA were significantly improved. For the optimal SG-PCA-LDA model, the recognition rates of positive and negative validation samples were 94.3% and 96.0%, respectively; for the optimal SG-PCA-HCA model they were 92.5% and 98.0%, respectively. Vis-NIR spectroscopic pattern recognition combined with SG smoothing could be used for accurate recognition of transgenic sugarcane leaves, and provided a convenient screening method for transgenic sugarcane breeding.

  8. Tracing and separating plasma components causing matrix effects in hydrophilic interaction chromatography-electrospray ionization mass spectrometry.

    PubMed

    Ekdahl, Anja; Johansson, Maria C; Ahnoff, Martin

    2013-04-01

    Matrix effects on electrospray ionization were investigated for plasma samples analysed by hydrophilic interaction chromatography (HILIC) in gradient elution mode, and HILIC columns of different chemistries were tested for separation of plasma components and model analytes. By combining mass spectral data with post-column infusion traces, the following components of protein-precipitated plasma were identified and found to have a significant effect on ionization: urea, creatinine, phosphocholine, lysophosphocholine, sphingomyelin, sodium ion, chloride ion, choline and proline betaine. The observed effect on ionization was both matrix-component and analyte dependent. The separation of identified plasma components and model analytes on eight columns was compared, using pair-wise linear correlation analysis and principal component analysis (PCA). Large changes in selectivity could be obtained by changing the column, while smaller changes were seen when the mobile phase buffer was changed from ammonium formate pH 3.0 to ammonium acetate pH 4.5. While results from PCA and linear correlation analysis were largely in accord, linear correlation analysis was judged to be more straightforward to conduct and interpret.

  9. Age-related differences in early novelty processing: Using PCA to parse the overlapping anterior P2 and N2 components

    PubMed Central

    Daffner, Kirk R.; Alperin, Brittany R.; Mott, Katherine K.; Tusch, Erich; Holcomb, Phillip J.

    2015-01-01

    Previous work demonstrated age-associated increases in the anterior P2 and age-related decreases in the anterior N2 in response to novel stimuli. Principal component analysis (PCA) was used to determine if the inverse relationship between these components was due to their temporal and spatial overlap. PCA revealed an early anterior P2, sensitive to task relevance, and a late anterior P2, responsive to novelty, both exhibiting age-related amplitude increases. A PCA factor representing the anterior N2, sensitive to novelty, exhibited age-related amplitude decreases. The late P2 and N2 to novels inversely correlated. Larger late P2 amplitude to novels was associated with better behavioral performance. Age-related differences in the anterior P2 and N2 to novel stimuli likely represent age-associated changes in independent cognitive operations. Enhanced anterior P2 activity (indexing augmentation in motivational salience) may be a compensatory mechanism for diminished anterior N2 activity (indexing reduced ability of older adults to process ambiguous representations). PMID:25596483

  10. Detection of compatibility between baclofen and excipients with aid of infrared spectroscopy and chemometry

    NASA Astrophysics Data System (ADS)

    Rojek, Barbara; Wesolowski, Marek; Suchacz, Bogdan

    2013-12-01

    In this paper, infrared (IR) spectroscopy and two multivariate exploration techniques, principal component analysis (PCA) and cluster analysis (CA), were applied as supportive methods for the detection of physicochemical incompatibilities between baclofen and excipients. In the course of the research, the most useful rotational strategy in PCA proved to be normalized varimax, while in CA Ward's hierarchical agglomeration with the Euclidean distance measure yielded the most interpretable results. Chemometric calculations confirmed the suitability of PCA and CA as auxiliary methods for the interpretation of infrared spectra in order to recognize whether compatibilities or incompatibilities between the active substance and excipients occur. On the basis of IR spectra and the results of PCA and CA, it was possible to demonstrate that lactose, β-cyclodextrin and meglumine in binary mixtures interact with baclofen. The results were verified using differential scanning calorimetry, differential thermal analysis, thermogravimetry/differential thermogravimetry and X-ray powder diffraction analyses.

  11. Principal component analysis to assess the efficiency and mechanism for enhanced coagulation of natural algae-laden water using a novel dual coagulant system.

    PubMed

    Ou, Hua-Se; Wei, Chao-Hai; Deng, Yang; Gao, Nai-Yun; Ren, Yuan; Hu, Yun

    2014-02-01

    A novel dual coagulant system of polyaluminum chloride sulfate (PACS) and polydiallyldimethylammonium chloride (PDADMAC) was used to treat natural algae-laden water from Meiliang Gulf, Lake Taihu. PACS (Alₙ(OH)ₘCl₃ₙ₋ₘ₋₂ₖ(SO₄)ₖ) has a mass ratio of 10%, a SO₄²⁻/Al³⁺ mole ratio of 0.0664, and an OH/Al mole ratio of 2. The PDADMAC ([C₈H₁₆NCl]ₘ) has a MW which ranges from 5 × 10⁵ to 20 × 10⁵ Da. The variations of contaminants in water samples during treatments were estimated in the form of principal component analysis (PCA) factor scores and conventional variables (turbidity, DOC, etc.). Parallel factor analysis determined four chromophoric dissolved organic matter (CDOM) components, and PCA identified four integrated principal factors. PCA factor 1 had significant correlations with chlorophyll-a (r=0.718), protein-like CDOM C1 (0.689), and C2 (0.756). Factor 2 correlated with UV254 (0.672), humic-like CDOM component C3 (0.716), and C4 (0.758). Factors 3 and 4 had correlations with NH3-N (0.748) and T-P (0.769), respectively. The variations of PCA factor scores revealed that PACS contributed less aluminum dissolution than PAC to obtain equivalent removal efficiency of contaminants. This might be due to the high cationic charge and pre-hydrolyzation of PACS. Compared with PACS coagulation (20 mg L⁻¹), the removal of PCA factors 1, 2, and 4 increased by 45, 33, and 12%, respectively, in combined PACS-PDADMAC treatment (0.8 mg L⁻¹ + 20 mg L⁻¹). Since PAC contained more Al (0.053 g/1 g) than PACS (0.028 g/1 g), the results indicated that PACS contributed less Al dissolution into the water to obtain equivalent removal efficiency.

  12. Multiscale 3D Shape Analysis using Spherical Wavelets

    PubMed Central

    Nain, Delphine; Haker, Steven; Bobick, Aaron; Tannenbaum, Allen

    2013-01-01

    Shape priors attempt to represent biological variations within a population. When variations are global, Principal Component Analysis (PCA) can be used to learn major modes of variation, even from a limited training set. However, when significant local variations exist, PCA typically cannot represent such variations from a small training set. To address this issue, we present a novel algorithm that learns shape variations from data at multiple scales and locations using spherical wavelets and spectral graph partitioning. Our results show that when the training set is small, our algorithm significantly improves the approximation of shapes in a testing set over PCA, which tends to oversmooth data. PMID:16685992

  13. Multiscale 3D shape analysis using spherical wavelets.

    PubMed

    Nain, Delphine; Haker, Steven; Bobick, Aaron; Tannenbaum, Allen R

    2005-01-01

    Shape priors attempt to represent biological variations within a population. When variations are global, Principal Component Analysis (PCA) can be used to learn major modes of variation, even from a limited training set. However, when significant local variations exist, PCA typically cannot represent such variations from a small training set. To address this issue, we present a novel algorithm that learns shape variations from data at multiple scales and locations using spherical wavelets and spectral graph partitioning. Our results show that when the training set is small, our algorithm significantly improves the approximation of shapes in a testing set over PCA, which tends to oversmooth data.

  14. GO-PCA: An Unsupervised Method to Explore Gene Expression Data Using Prior Knowledge

    PubMed Central

    Wagner, Florian

    2015-01-01

    Method: Genome-wide expression profiling is a widely used approach for characterizing heterogeneous populations of cells, tissues, biopsies, or other biological specimen. The exploratory analysis of such data typically relies on generic unsupervised methods, e.g. principal component analysis (PCA) or hierarchical clustering. However, generic methods fail to exploit prior knowledge about the molecular functions of genes. Here, I introduce GO-PCA, an unsupervised method that combines PCA with nonparametric GO enrichment analysis, in order to systematically search for sets of genes that are both strongly correlated and closely functionally related. These gene sets are then used to automatically generate expression signatures with functional labels, which collectively aim to provide a readily interpretable representation of biologically relevant similarities and differences. The robustness of the results obtained can be assessed by bootstrapping. Results: I first applied GO-PCA to datasets containing diverse hematopoietic cell types from human and mouse, respectively. In both cases, GO-PCA generated a small number of signatures that represented the majority of lineages present, and whose labels reflected their respective biological characteristics. I then applied GO-PCA to human glioblastoma (GBM) data, and recovered signatures associated with four out of five previously defined GBM subtypes. My results demonstrate that GO-PCA is a powerful and versatile exploratory method that reduces an expression matrix containing thousands of genes to a much smaller set of interpretable signatures. In this way, GO-PCA aims to facilitate hypothesis generation, design of further analyses, and functional comparisons across datasets. PMID:26575370

  15. GO-PCA: An Unsupervised Method to Explore Gene Expression Data Using Prior Knowledge.

    PubMed

    Wagner, Florian

    2015-01-01

    Genome-wide expression profiling is a widely used approach for characterizing heterogeneous populations of cells, tissues, biopsies, or other biological specimen. The exploratory analysis of such data typically relies on generic unsupervised methods, e.g. principal component analysis (PCA) or hierarchical clustering. However, generic methods fail to exploit prior knowledge about the molecular functions of genes. Here, I introduce GO-PCA, an unsupervised method that combines PCA with nonparametric GO enrichment analysis, in order to systematically search for sets of genes that are both strongly correlated and closely functionally related. These gene sets are then used to automatically generate expression signatures with functional labels, which collectively aim to provide a readily interpretable representation of biologically relevant similarities and differences. The robustness of the results obtained can be assessed by bootstrapping. I first applied GO-PCA to datasets containing diverse hematopoietic cell types from human and mouse, respectively. In both cases, GO-PCA generated a small number of signatures that represented the majority of lineages present, and whose labels reflected their respective biological characteristics. I then applied GO-PCA to human glioblastoma (GBM) data, and recovered signatures associated with four out of five previously defined GBM subtypes. My results demonstrate that GO-PCA is a powerful and versatile exploratory method that reduces an expression matrix containing thousands of genes to a much smaller set of interpretable signatures. In this way, GO-PCA aims to facilitate hypothesis generation, design of further analyses, and functional comparisons across datasets.

  16. Decomposing the Apoptosis Pathway Into Biologically Interpretable Principal Components

    PubMed Central

    Wang, Min; Kornblau, Steven M; Coombes, Kevin R

    2018-01-01

    Principal component analysis (PCA) is one of the most common techniques in the analysis of biological data sets, but applying PCA raises 2 challenges. First, one must determine the number of significant principal components (PCs). Second, because each PC is a linear combination of genes, it rarely has a biological interpretation. Existing methods to determine the number of PCs are either subjective or computationally extensive. We review several methods and describe a new R package, PCDimension, that implements additional methods, the most important being an algorithm that extends and automates a graphical Bayesian method. Using simulations, we compared the methods. Our newly automated procedure is competitive with the best methods when considering both accuracy and speed and is the most accurate when the number of objects is small compared with the number of attributes. We applied the method to a proteomics data set from patients with acute myeloid leukemia. Proteins in the apoptosis pathway could be explained using 6 PCs. By clustering the proteins in PC space, we were able to replace the PCs by 6 “biological components,” 3 of which could be immediately interpreted from the current literature. We expect this approach combining PCA with clustering to be widely applicable. PMID:29881252
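
    As an illustration of the first challenge named in this abstract, choosing how many PCs to keep, the sketch below applies the broken-stick rule, one simple and widely known criterion. It is offered only as an example of such a rule: it is not the graphical Bayesian method the abstract describes, and nothing here reflects the interface of the PCDimension package. The function name and the example eigenvalues are assumptions.

```python
import numpy as np

def broken_stick_count(eigenvalues):
    """Number of leading components whose variance share exceeds the
    broken-stick expectation b_k = (1/p) * sum_{i=k}^{p} 1/i."""
    eigenvalues = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    p = eigenvalues.size
    observed = eigenvalues / eigenvalues.sum()
    expected = np.array(
        [np.sum(1.0 / np.arange(k, p + 1)) / p for k in range(1, p + 1)]
    )
    count = 0
    for obs, exp in zip(observed, expected):
        if obs <= exp:      # stop at the first component that falls short
            break
        count += 1
    return count

# Made-up eigenvalue spectrum with two dominant components.
n_keep = broken_stick_count([5.0, 3.0, 0.5, 0.5, 0.5, 0.5])
```

    The rule compares each observed variance share with the share expected if the total variance were split at random, so a flat eigenvalue spectrum yields zero retained components.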

  17. In Situ Aerosol Profile Measurements and Comparisons with SAGE 3 Aerosol Extinction and Surface Area Profiles at 68 deg North

    NASA Technical Reports Server (NTRS)

    2005-01-01

    Under funding from this proposal three in situ profile measurements of stratospheric sulfate aerosol and ozone were completed from balloon-borne platforms. The measured quantities are aerosol size resolved number concentration and ozone. The one derived product is aerosol size distribution, from which aerosol moments, such as surface area, volume, and extinction can be calculated for comparison with SAGE III measurements and SAGE III derived products, such as surface area. The analysis of these profiles and comparison with SAGE III extinction measurements and SAGE III derived surface areas are provided in Yongxiao (2005), which comprised the research thesis component of Mr. Jian Yongxiao's M.S. degree in Atmospheric Science at the University of Wyoming. In addition, analysis continues on using principal component analysis (PCA) to derive aerosol surface area from the 9 wavelength extinction measurements available from SAGE III. This paper will present PCA components to calculate surface area from SAGE III measurements and compare these derived surface areas with those available directly from in situ size distribution measurements, as well as surface areas which would be derived from PCA and Thomason's algorithm applied to the four wavelength SAGE II extinction measurements.

  18. Application of principal component analysis in the pollution assessment with heavy metals of vegetable food chain in the old mining areas

    PubMed Central

    2012-01-01

    Background: The aim of the paper is to assess by principal components analysis (PCA) the heavy metal contamination of soil and vegetables widely used as food by people who live in areas contaminated by heavy metals (HMs) due to long-lasting mining activities. This chemometric technique allowed us to select the best model for determining the risk of HMs on the food chain as well as on people's health. Results: Many PCA models were computed with different variables: heavy metal contents and some agro-chemical parameters which characterize the soil samples from contaminated and uncontaminated areas, HM contents of different types of vegetables grown and consumed in these areas, and the complex parameter target hazard quotients (THQ). Results were discussed in terms of principal component analysis. Conclusion: There were two major benefits in processing the data by PCA: firstly, it helped in optimizing the number and type of data that best render the HM contamination of the soil and vegetables. Secondly, it was valuable for selecting the vegetable species which present the highest/minimum risk of a negative impact on the food chain and human health. PMID:23234365

  19. Spatial assessment of water quality using chemometrics in the Pearl River Estuary, China

    NASA Astrophysics Data System (ADS)

    Wu, Meilin; Wang, Youshao; Dong, Junde; Sun, Fulin; Wang, Yutu; Hong, Yiguo

    2017-03-01

    A cruise was commissioned in the summer of 2009 to evaluate water quality in the Pearl River Estuary (PRE). Chemometrics such as Principal Component Analysis (PCA), Cluster analysis (CA) and Self-Organizing Map (SOM) were employed to identify anthropogenic and natural influences on estuary water quality. The scores of stations in the surface layer in the first principal component (PC1) were related to NH4-N, PO4-P, NO2-N, NO3-N, TP, and Chlorophyll a, while those in the second principal component (PC2) were related to salinity, turbidity, and SiO3-Si. Similarly, the scores of stations in the bottom layers in PC1 were related to PO4-P, NO2-N, NO3-N, and TP, while those in PC2 were related to salinity, Chlorophyll a, NH4-N, and SiO3-Si. Results of the PCA identified the spatial distribution of the surface and bottom water quality, namely the Guangzhou urban reach, Middle reach, and Lower reach of the estuary. Cluster analysis and PCA produced similar results. The self-organizing map delineated the Guangzhou urban reach of the Pearl River, which was mainly influenced by human activities. The middle and lower reaches of the PRE were mainly influenced by the waters of the South China Sea. The information extracted by PCA, CA, and SOM would be very useful to regional agencies in developing a strategy to carry out scientific plans for resource use based on marine system functions.

  20. Hyperspectral Image Denoising Using a Nonlocal Spectral Spatial Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Li, D.; Xu, L.; Peng, J.; Ma, J.

    2018-04-01

    Hyperspectral image (HSI) denoising is a critical research area in image processing due to its importance in improving the quality of HSIs, since noise has a negative impact on object detection, classification, and similar tasks. In this paper, we develop a noise reduction method based on principal component analysis (PCA) for hyperspectral imagery, which depends on the assumption that the noise can be removed by selecting the leading principal components. The main contribution of the paper is to introduce the spectral spatial structure and nonlocal similarity of the HSIs into the PCA denoising model. PCA with spectral spatial structure can exploit the spectral correlation and spatial correlation of an HSI by using 3D blocks instead of 2D patches. Nonlocal similarity means the similarity between the referenced pixel and other pixels in a nonlocal area, where the Mahalanobis distance is used to estimate the spatial spectral similarity by calculating the distance in 3D blocks. The proposed method is tested on both simulated and real hyperspectral images, and the results demonstrate that the proposed method is superior to several other popular methods in HSI denoising.
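
    The assumption stated in this abstract, that noise can be removed by keeping only the leading principal components, can be sketched in its plainest form as below. This is basic truncated-PCA denoising on a flattened "pixels by bands" matrix, not the nonlocal spectral spatial method the paper proposes; the function names and the synthetic two-endmember cube are assumptions for illustration.

```python
import numpy as np

def pca_denoise(pixels, n_components):
    """Reconstruct a (n_pixels, n_bands) matrix from its leading components."""
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = n_components
    # Keep the k leading components and discard the rest as noise.
    return (u[:, :k] * s[:k]) @ vt[:k] + mean

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

rng = np.random.default_rng(2)
# Synthetic cube: 32 x 32 pixels, 20 bands, mixed from 2 hypothetical endmembers.
endmembers = rng.uniform(0.0, 1.0, size=(2, 20))
abundances = rng.uniform(0.0, 1.0, size=(32 * 32, 2))
clean = abundances @ endmembers
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
denoised = pca_denoise(noisy, n_components=2)
```

    Comparing `rmse(noisy, clean)` with `rmse(denoised, clean)` shows the effect on this toy data. The number of retained components is chosen by hand here, whereas a real pipeline would need some criterion for it.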

  1. Facilitating in vivo tumor localization by principal component analysis based on dynamic fluorescence molecular imaging

    NASA Astrophysics Data System (ADS)

    Gao, Yang; Chen, Maomao; Wu, Junyu; Zhou, Yuan; Cai, Chuangjian; Wang, Daliang; Luo, Jianwen

    2017-09-01

    Fluorescence molecular imaging has been used to target tumors in mice with xenograft tumors. However, tumor imaging is largely distorted by the aggregation of fluorescent probes in the liver. A principal component analysis (PCA)-based strategy was applied on the in vivo dynamic fluorescence imaging results of three mice with xenograft tumors to facilitate tumor imaging, with the help of a tumor-specific fluorescent probe. Tumor-relevant features were extracted from the original images by PCA and represented by the principal component (PC) maps. The second principal component (PC2) map represented the tumor-related features, and the first principal component (PC1) map retained the original pharmacokinetic profiles, especially of the liver. The distribution patterns of the PC2 map of the tumor-bearing mice were in good agreement with the actual tumor location. The tumor-to-liver ratio and contrast-to-noise ratio were significantly higher on the PC2 map than on the original images, thus distinguishing the tumor from its nearby fluorescence noise of liver. The results suggest that the PC2 map could serve as a bioimaging marker to facilitate in vivo tumor localization, and dynamic fluorescence molecular imaging with PCA could be a valuable tool for future studies of in vivo tumor metabolism and progression.

  2. Performance comparisons between PCA-EA-LBG and PCA-LBG-EA approaches in VQ codebook generation for image compression

    NASA Astrophysics Data System (ADS)

    Tsai, Jinn-Tsong; Chou, Ping-Yi; Chou, Jyh-Horng

    2015-11-01

    The aim of this study is to generate vector quantisation (VQ) codebooks by integrating the principal component analysis (PCA) algorithm, the Linde-Buzo-Gray (LBG) algorithm, and evolutionary algorithms (EAs). The EAs include the genetic algorithm (GA), particle swarm optimisation (PSO), honey bee mating optimisation (HBMO), and the firefly algorithm (FF). The study provides performance comparisons between PCA-EA-LBG and PCA-LBG-EA approaches. The PCA-EA-LBG approaches comprise PCA-GA-LBG, PCA-PSO-LBG, PCA-HBMO-LBG, and PCA-FF-LBG, while the PCA-LBG-EA approaches comprise PCA-LBG, PCA-LBG-GA, PCA-LBG-PSO, PCA-LBG-HBMO, and PCA-LBG-FF. All training vectors of test images are grouped according to PCA. PCA-EA-LBG uses the vectors grouped by PCA as initial individuals, and the best solution found by the EAs is passed to LBG to find a codebook. The PCA-LBG approach uses PCA to select vectors as initial individuals for LBG to find a codebook. PCA-LBG-EA uses the final result of PCA-LBG as an initial individual for the EAs to find a codebook. The search scheme in PCA-EA-LBG first uses global search and then applies local search, while that in PCA-LBG-EA first uses local search and then employs global search. The results verify that PCA-EA-LBG indeed gains superior results compared to PCA-LBG-EA, because PCA-EA-LBG explores a global area to find a solution and then exploits a better one from the local area of that solution. Furthermore, the proposed PCA-EA-LBG approaches to designing VQ codebooks outperform existing approaches reported in the literature.

  3. Time-oriented hierarchical method for computation of principal components using subspace learning algorithm.

    PubMed

    Jankovic, Marko; Ogawa, Hidemitsu

    2004-10-01

    Principal Component Analysis (PCA) and Principal Subspace Analysis (PSA) are classic techniques in statistical data analysis, feature extraction and data compression. Given a set of multivariate measurements, PCA and PSA provide a smaller set of "basis vectors" with less redundancy, and a subspace spanned by them, respectively. Artificial neurons and neural networks have been shown to perform PSA and PCA when gradient ascent (descent) learning rules are used, which is related to the constrained maximization (minimization) of statistical objective functions. Due to their low complexity, such algorithms and their implementation in neural networks are potentially useful for tracking slow changes of correlations in the input data or for updating eigenvectors with new samples. In this paper we propose a PCA learning algorithm that is fully homogeneous with respect to neurons. The algorithm is obtained by modifying one of the most famous PSA learning algorithms, the Subspace Learning Algorithm (SLA). The modification is based on the Time-Oriented Hierarchical Method (TOHM). The method uses two distinct time scales. On the faster time scale, the PSA algorithm is responsible for the "behavior" of all output neurons. On the slower time scale, output neurons compete for fulfillment of their "own interests". On this scale, basis vectors in the principal subspace are rotated toward the principal eigenvectors. At the end of the paper, we briefly analyze how (or why) the time-oriented hierarchical method can be used to transform any existing neural network PSA method into a PCA method.

  4. [Identification of varieties of cashmere by Vis/NIR spectroscopy technology based on PCA-SVM].

    PubMed

    Wu, Gui-Fang; He, Yong

    2009-06-01

    A hybrid algorithm combining principal component analysis (PCA) and support vector machine (SVM) was presented to discriminate cashmere varieties. Cashmere fiber is threadlike, soft, and glossy, with high tensile strength. The quality characteristics and economic value of each breed of cashmere differ considerably. To safeguard consumers' rights and guarantee the quality of cashmere products, identifying cashmere quickly, efficiently and correctly is of great significance to the production and trade of cashmere material. The present research adopts Vis/NIR diffuse spectroscopy techniques to collect the spectral data of cashmere. The near infrared fingerprint of cashmere was acquired by PCA, and SVM methods were used to further identify the cashmere material. In the PCA results, a score map was drawn from the scores of PC1, PC2 and PC3, and 10 principal components (PCs) were selected as the input of the SVM based on the reliabilities of PCs of 99.99%. One hundred cashmere samples were used for calibration and the remaining 75 cashmere samples were used for validation. A one-against-all multi-class SVM model was built, the capabilities of SVMs with different kernel functions were compared, and the results showed that the SVM with the Gaussian kernel function has the best identification capability, with an accuracy of 100%. This research indicated that the PCA-SVM data mining method has a good identification effect and can serve as a new method for rapid identification of cashmere material varieties.

  5. Enlightening discriminative network functional modules behind Principal Component Analysis separation in differential-omic science studies

    PubMed Central

    Ciucci, Sara; Ge, Yan; Durán, Claudio; Palladini, Alessandra; Jiménez-Jiménez, Víctor; Martínez-Sánchez, Luisa María; Wang, Yuting; Sales, Susanne; Shevchenko, Andrej; Poser, Steven W.; Herbig, Maik; Otto, Oliver; Androutsellis-Theotokis, Andreas; Guck, Jochen; Gerl, Mathias J.; Cannistraci, Carlo Vittorio

    2017-01-01

    Omic science is rapidly growing and one of the most employed techniques to explore differential patterns in omic datasets is principal component analysis (PCA). However, a method to enlighten the network of omic features that mostly contribute to the sample separation obtained by PCA is missing. An alternative is to build correlation networks between univariately-selected significant omic features, but this neglects the multivariate unsupervised feature compression responsible for the PCA sample segregation. Biologists and medical researchers often prefer effective methods that offer an immediate interpretation to complicated algorithms that in principle promise an improvement but in practice are difficult to be applied and interpreted. Here we present PC-corr: a simple algorithm that associates to any PCA segregation a discriminative network of features. Such network can be inspected in search of functional modules useful in the definition of combinatorial and multiscale biomarkers from multifaceted omic data in systems and precision biomedicine. We offer proofs of PC-corr efficacy on lipidomic, metagenomic, developmental genomic, population genetic, cancer promoteromic and cancer stem-cell mechanomic data. Finally, PC-corr is a general functional network inference approach that can be easily adopted for big data exploration in computer science and analysis of complex systems in physics. PMID:28287094

  6. Extended principle component analysis - a useful tool to understand processes governing water quality at catchment scales

    NASA Astrophysics Data System (ADS)

    Selle, B.; Schwientek, M.

    2012-04-01

    Water quality of ground and surface waters in catchments is typically driven by many complex and interacting processes. While small scale processes are often studied in great detail, their relevance and interplay at catchment scales often remain poorly understood. For many catchments, extensive monitoring data on water quality have been collected for different purposes. These heterogeneous data sets contain valuable information on catchment scale processes but are rarely analysed using integrated methods. Principal component analysis (PCA) has previously been applied to this kind of data set. However, a detailed analysis of scores, which are an important result of a PCA, is often missing. Mathematically, PCA expresses measured variables on water quality, e.g. nitrate concentrations, as linear combinations of independent, not directly observable key processes. These computed key processes are represented by principal components. Their scores are interpretable as process intensities which vary in space and time. Subsequently, scores can be correlated with other key variables and catchment characteristics, such as water travel times and land use, that were not considered in the PCA. This detailed analysis of scores represents an extension of the commonly applied PCA which could considerably improve the understanding of processes governing water quality at catchment scales. In this study, we investigated the 170 km2 Ammer catchment in SW Germany, which is characterised by an above average proportion of agricultural (71%) and urban (17%) areas. The Ammer River is mainly fed by karstic springs. For PCA, we separately analysed concentrations from (a) surface waters of the Ammer River and its tributaries, (b) spring waters from the main aquifers and (c) deep groundwater from production wells. This analysis was extended by a detailed analysis of scores. We analysed measured concentrations of major ions and selected organic micropollutants. Additionally, redox-sensitive variables and environmental tracers indicating groundwater age were analysed for deep groundwater from production wells. For deep groundwater, we found that microbial turnover was more strongly influenced by local availability of energy sources than by travel times of groundwater to the wells. Groundwater quality primarily reflected the input of pollutants determined by land use, e.g. agrochemicals. We concluded that for water quality in the Ammer catchment, conservative mixing of waters of different origin is more important than reactive transport processes along the flow path.
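
    The extension described in this abstract, correlating PCA scores with a variable that was not part of the PCA, might be sketched as follows. The data are synthetic, and the latent "process", the four measured variables, and the external variable are hypothetical stand-ins rather than anything taken from the Ammer study.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 200
# Hypothetical latent "process intensity" driving three measured concentrations.
process = rng.normal(size=n_samples)
X = np.column_stack([
    2.0 * process + 0.3 * rng.normal(size=n_samples),
    1.5 * process + 0.3 * rng.normal(size=n_samples),
    -1.0 * process + 0.3 * rng.normal(size=n_samples),
    rng.normal(size=n_samples),  # a variable unrelated to the process
])
# Hypothetical external variable that is NOT included in the PCA.
external = 0.8 * process + 0.2 * rng.normal(size=n_samples)

# Standardise the variables, then obtain scores from the SVD of the data matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
scores = U * s  # column j holds the scores on PC(j+1)

# The sign of a principal component is arbitrary, so report the absolute value.
r = np.corrcoef(scores[:, 0], external)[0, 1]
print(abs(r))
```

    A strong correlation here only reflects how the synthetic data were built; with field data it would be one piece of evidence for interpreting a component as a process, not a proof.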

  7. Principal component analysis-based pattern analysis of dose-volume histograms and influence on rectal toxicity.

    PubMed

    Söhn, Matthias; Alber, Markus; Yan, Di

    2007-09-01

    The variability of dose-volume histogram (DVH) shapes in a patient population can be quantified using principal component analysis (PCA). We applied this to rectal DVHs of prostate cancer patients and investigated the correlation of the PCA parameters with late bleeding. PCA was applied to the rectal wall DVHs of 262 patients, who had been treated with a four-field box, conformal adaptive radiotherapy technique. The correlated changes in the DVH pattern were revealed as "eigenmodes," which were ordered by their importance to represent data set variability. Each DVH is uniquely characterized by its principal components (PCs). The correlation of the first three PCs and chronic rectal bleeding of Grade 2 or greater was investigated with uni- and multivariate logistic regression analyses. Rectal wall DVHs in four-field conformal RT can primarily be represented by the first two or three PCs, which describe approximately 94% or 96% of the DVH shape variability, respectively. The first eigenmode models the total irradiated rectal volume; thus, PC1 correlates to the mean dose. Mode 2 describes the interpatient differences of the relative rectal volume in the two- or four-field overlap region. Mode 3 reveals correlations of volumes with intermediate doses ( approximately 40-45 Gy) and volumes with doses >70 Gy; thus, PC3 is associated with the maximal dose. According to univariate logistic regression analysis, only PC2 correlated significantly with toxicity. However, multivariate logistic regression analysis with the first two or three PCs revealed an increased probability of bleeding for DVHs with more than one large PC. PCA can reveal the correlation structure of DVHs for a patient population as imposed by the treatment technique and provide information about its relationship to toxicity. It proves useful for augmenting normal tissue complication probability modeling approaches.
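
    As a rough illustration of how the eigenmodes, the PCs, and the explained shape variability mentioned above relate to each other, here is a small NumPy sketch. The curves are synthetic decreasing sigmoids, not clinical DVHs, and the dose grid, patient count, and parameter ranges are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
dose = np.linspace(0.0, 80.0, 81)  # hypothetical dose grid in Gy
# Synthetic DVH-like curves: decreasing sigmoids with a curve-specific
# midpoint and steepness (100 hypothetical patients).
mid = rng.normal(40.0, 8.0, size=(100, 1))
width = rng.uniform(4.0, 10.0, size=(100, 1))
dvh = 1.0 / (1.0 + np.exp((dose - mid) / width))

centered = dvh - dvh.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / np.sum(s**2)  # fraction of variance per eigenmode
cumulative = np.cumsum(explained)
pcs = centered @ Vt.T            # PCs (scores) of each curve; rows of Vt are eigenmodes

print(np.round(cumulative[:3], 3))
```

    The first entries of `cumulative` play the role of the "94% or 96%" figures quoted in the abstract, although the values for this synthetic set will differ.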

  8. Dynamic Responses in Brain Networks to Social Feedback: A Dual EEG Acquisition Study in Adolescent Couples

    PubMed Central

    Kuo, Ching-Chang; Ha, Thao; Ebbert, Ashley M.; Tucker, Don M.; Dishion, Thomas J.

    2017-01-01

    Adolescence is a sensitive period for the development of romantic relationships. During this period the maturation of frontolimbic networks is particularly important for the capacity to regulate emotional experiences. In previous research, both functional magnetic resonance imaging (fMRI) and dense array electroencephalography (dEEG) measures have suggested that responses in limbic regions are enhanced in adolescents experiencing social rejection. In the present research, we examined social acceptance and rejection from romantic partners as they engaged in a Chatroom Interact Task. Dual 128-channel dEEG systems were used to record neural responses to acceptance and rejection from both adolescent romantic partners and unfamiliar peers (N = 75). We employed a two-step temporal principal component analysis (PCA) and spatial independent component analysis (ICA) approach to statistically identify the neural components related to social feedback. Results revealed that the early (288 ms) discrimination between acceptance and rejection reflected by the P3a component was significant for the romantic partner but not the unfamiliar peer. In contrast, the later (364 ms) P3b component discriminated between acceptance and rejection for both partners and peers. The two-step approach (PCA then ICA) was better able than either PCA or ICA alone in separating these components of the brain's electrical activity that reflected both temporal and spatial phases of the brain's processing of social feedback. PMID:28620292

  9. Principal Component Analysis for pulse-shape discrimination of scintillation radiation detectors

    NASA Astrophysics Data System (ADS)

    Alharbi, T.

    2016-01-01

    In this paper, we report on the application of principal component analysis (PCA) for pulse-shape discrimination (PSD) of scintillation radiation detectors. The details of the method are described and its performance is experimentally examined by discriminating between neutrons and gamma-rays with a liquid scintillation detector in a mixed radiation field. The performance of the method is also compared against that of the conventional charge-comparison method, demonstrating its superior performance, particularly in the low light output range. PCA has the important advantage of automatically extracting the pulse-shape characteristics, which makes the PSD method directly applicable to various scintillation detectors without the need to adjust a PSD parameter.

  10. Decision tree and PCA-based fault diagnosis of rotating machinery

    NASA Astrophysics Data System (ADS)

    Sun, Weixiang; Chen, Jin; Li, Jiaqing

    2007-04-01

    After analysing the flaws of conventional fault diagnosis methods, data mining technology is introduced to the fault diagnosis field, and a new method based on the C4.5 decision tree and principal component analysis (PCA) is proposed. In this method, PCA is used to reduce features after data collection, preprocessing and feature extraction. Then, C4.5 is trained on the samples to generate a decision tree model with diagnosis knowledge. Finally, the tree model is used for diagnosis analysis. To validate the proposed method, six kinds of running states (normal or without any defect, unbalance, rotor radial rub, oil whirl, shaft crack and a simultaneous state of unbalance and radial rub) are simulated on a Bently Rotor Kit RK4 to test the C4.5 and PCA-based method and a back-propagation neural network (BPNN). The results show that the C4.5 and PCA-based diagnosis method has higher accuracy and needs less training time than the BPNN.

  11. Quantitative structure-activity relationship study of P2X7 receptor inhibitors using combination of principal component analysis and artificial intelligence methods.

    PubMed

    Ahmadi, Mehdi; Shahlaei, Mohsen

    2015-01-01

    P2X7 antagonist activity for a set of 49 molecules of the P2X7 receptor antagonists, derivatives of purine, was modeled with the aid of chemometric and artificial intelligence techniques. The activity of these compounds was estimated by means of combination of principal component analysis (PCA), as a well-known data reduction method, genetic algorithm (GA), as a variable selection technique, and artificial neural network (ANN), as a non-linear modeling method. First, a linear regression, combined with PCA, (principal component regression) was operated to model the structure-activity relationships, and afterwards a combination of PCA and ANN algorithm was employed to accurately predict the biological activity of the P2X7 antagonist. PCA preserves as much of the information as possible contained in the original data set. Seven most important PC's to the studied activity were selected as the inputs of ANN box by an efficient variable selection method, GA. The best computational neural network model was a fully-connected, feed-forward model with 7-7-1 architecture. The developed ANN model was fully evaluated by different validation techniques, including internal and external validation, and chemical applicability domain. All validations showed that the constructed quantitative structure-activity relationship model suggested is robust and satisfactory.
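
    The principal component regression step named in this abstract (linear regression on PCA scores) can be sketched in a few lines of NumPy. This is a generic illustration on synthetic data, not the authors' model: only the sample count of 49 comes from the abstract, while the helper `pcr_fit`, the 20 descriptors, and the 4 latent factors are assumptions.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression: least squares on the leading PC scores."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:n_components].T
    T = Xc @ V_k  # scores on the retained components
    gamma, *_ = np.linalg.lstsq(T, y - y_mean, rcond=None)
    beta = V_k @ gamma  # coefficients mapped back to descriptor space
    intercept = y_mean - x_mean @ beta
    return beta, intercept

rng = np.random.default_rng(3)
# Hypothetical data: 49 "molecules", 20 correlated descriptors from 4 latent factors.
latent = rng.normal(size=(49, 4))
X = latent @ rng.normal(size=(4, 20)) + 0.05 * rng.normal(size=(49, 20))
y = latent @ np.array([1.0, -0.5, 0.3, 0.8]) + 0.05 * rng.normal(size=49)

beta, intercept = pcr_fit(X, y, n_components=4)
pred = X @ beta + intercept
r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(r2)
```

    The `r2` printed here is a training-set fit only; the internal and external validation the abstract refers to would require held-out data.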

  12. Quantitative structure–activity relationship study of P2X7 receptor inhibitors using combination of principal component analysis and artificial intelligence methods

    PubMed Central

    Ahmadi, Mehdi; Shahlaei, Mohsen

    2015-01-01

    P2X7 antagonist activity for a set of 49 molecules of the P2X7 receptor antagonists, derivatives of purine, was modeled with the aid of chemometric and artificial intelligence techniques. The activity of these compounds was estimated by means of combination of principal component analysis (PCA), as a well-known data reduction method, genetic algorithm (GA), as a variable selection technique, and artificial neural network (ANN), as a non-linear modeling method. First, a linear regression, combined with PCA, (principal component regression) was operated to model the structure–activity relationships, and afterwards a combination of PCA and ANN algorithm was employed to accurately predict the biological activity of the P2X7 antagonist. PCA preserves as much of the information as possible contained in the original data set. Seven most important PC's to the studied activity were selected as the inputs of ANN box by an efficient variable selection method, GA. The best computational neural network model was a fully-connected, feed-forward model with 7−7−1 architecture. The developed ANN model was fully evaluated by different validation techniques, including internal and external validation, and chemical applicability domain. All validations showed that the constructed quantitative structure–activity relationship model suggested is robust and satisfactory. PMID:26600858

  13. PCA leverage: outlier detection for high-dimensional functional magnetic resonance imaging data.

    PubMed

    Mejia, Amanda F; Nebel, Mary Beth; Eloyan, Ani; Caffo, Brian; Lindquist, Martin A

    2017-07-01

    Outlier detection for high-dimensional (HD) data is a popular topic in modern statistical research. However, one source of HD data that has received relatively little attention is functional magnetic resonance imaging (fMRI) data, which consist of hundreds of thousands of measurements sampled at hundreds of time points. At a time when the availability of fMRI data is rapidly growing (primarily through large, publicly available grassroots datasets), automated quality control and outlier detection methods are greatly needed. We propose principal components analysis (PCA) leverage and demonstrate how it can be used to identify outlying time points in an fMRI run. Furthermore, PCA leverage is a measure of the influence of each observation on the estimation of principal components, which are often of interest in fMRI data. We also propose an alternative measure, PCA robust distance, which is less sensitive to outliers and has controllable statistical properties. The proposed methods are validated through simulation studies and are shown to be highly accurate. We also conduct a reliability study using resting-state fMRI data from the Autism Brain Imaging Data Exchange and find that removal of outliers using the proposed methods results in more reliable estimation of subject-level resting-state networks using independent components analysis. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
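
    One way to read "PCA leverage" is as ordinary regression leverage computed in the space of the leading principal components, i.e. the diagonal of the projection matrix formed from the retained left singular vectors. The sketch below follows that reading on synthetic data. The function name `pca_leverage`, the data dimensions, the number of components, and the "4 times the median" flagging rule are assumptions for illustration and may differ from the paper's exact definitions.

```python
import numpy as np

def pca_leverage(X, n_components):
    """Leverage of each row of X within the span of the leading PCs."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    U_k = U[:, :n_components]
    # Diagonal of the projection ("hat") matrix U_k U_k^T.
    return np.sum(U_k**2, axis=1)

rng = np.random.default_rng(4)
# Hypothetical run: 200 time points x 1000 voxels, low-rank signal plus noise.
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 1000))
X += rng.normal(size=X.shape)
X[120] += 25.0 * rng.normal(size=1000)  # one artificially corrupted time point

lev = pca_leverage(X, n_components=6)
# Assumed rule: flag time points whose leverage exceeds a multiple of the median.
flagged = np.flatnonzero(lev > 4.0 * np.median(lev))
print(flagged)
```

    Ordinary time points can occasionally cross such a threshold by chance, so the multiplier trades sensitivity against false alarms.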

  14. [Research on spectra recognition method for cabbages and weeds based on PCA and SIMCA].

    PubMed

    Zu, Qin; Deng, Wei; Wang, Xiu; Zhao, Chun-Jiang

    2013-10-01

    In order to improve the accuracy and efficiency of weed identification, differences in spectral reflectance were employed to distinguish between crops and weeds. Firstly, different combinations of Savitzky-Golay (SG) convolutional derivation and the multiplicative scattering correction (MSC) method were applied to preprocess the raw spectral data. Then the clustering analysis of various types of plants was completed using the principal component analysis (PCA) method, and the feature wavelengths that were sensitive for classifying various types of plants were extracted according to the corresponding loading plots of the optimal principal components in the PCA results. Finally, with the feature wavelengths as the input variables, the soft independent modeling of class analogy (SIMCA) classification method was used to identify the various types of plants. The experimental results of classifying cabbages and weeds showed that, on the basis of the optimal pretreatment combining MSC and SG convolutional derivation (with the SG parameters set to a 1st-order derivative, a 3rd-degree polynomial and 51 smoothing points), 23 feature wavelengths were extracted in accordance with the top three principal components in the PCA results. When the SIMCA method was used for classification with the previously selected 23 feature wavelengths as the input variables, the classification rates of the modeling set and the prediction set reached 98.6% and 100%, respectively.

  15. Self-aggregation in scaled principal component space

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ding, Chris H.Q.; He, Xiaofeng; Zha, Hongyuan

    2001-10-05

    Automatic grouping of voluminous data into meaningful structures is a challenging task frequently encountered in broad areas of science, engineering and information processing. These data clustering tasks are frequently performed in Euclidean space or a subspace chosen from principal component analysis (PCA). Here we describe a space obtained by a nonlinear scaling of PCA in which data objects self-aggregate automatically into clusters. Projection into this space gives sharp distinctions among clusters. Gene expression profiles of cancer tissue subtypes, Web hyperlink structure and Internet newsgroups are analyzed to illustrate interesting properties of the space.

  16. Principal Component Analysis for Enhancement of Infrared Spectra Monitoring

    NASA Astrophysics Data System (ADS)

    Haney, Ricky Lance

    The issue of air quality within the aircraft cabin is receiving increasing attention from both pilot and flight attendant unions. This is due to exposure events caused by poor air quality that in some cases may have contained toxic oil components due to bleed air that flows from outside the aircraft and then through the engines into the aircraft cabin. Significant short and long-term medical issues for aircraft crew have been attributed to exposure. The need for air quality monitoring is especially evident in the fact that currently within an aircraft there are no sensors to monitor the air quality and potentially harmful gas levels (detect-to-warn sensors), much less systems to monitor and purify the air (detect-to-treat sensors) within the aircraft cabin. The specific purpose of this research is to utilize a mathematical technique called principal component analysis (PCA) in conjunction with principal component regression (PCR) and proportionality constant calculations (PCC) to simplify complex, multi-component infrared (IR) spectra data sets into a reduced data set used for determination of the concentrations of the individual components. Use of PCA can significantly simplify data analysis as well as improve the ability to determine concentrations of individual target species in gas mixtures where significant band overlap occurs in the IR spectrum region. Application of this analytical numerical technique to IR spectrum analysis is important in improving performance of commercial sensors that airlines and aircraft manufacturers could potentially use in an aircraft cabin environment for multi-gas component monitoring. The approach of this research is two-fold, consisting of a PCA application to compare simulation and experimental results with the corresponding PCR and PCC to determine quantitatively the component concentrations within a mixture. The experimental data sets consist of both two and three component systems that could potentially be present as air contaminants in an aircraft cabin. In addition, experimental data sets are analyzed for a hydrogen peroxide (H2O2) aqueous solution mixture to determine H2O2 concentrations at various levels that could be produced during use of a vapor phase hydrogen peroxide (VPHP) decontamination system. After the PCA application to two and three component systems, the analysis technique is further expanded to include the monitoring of potential bleed air contaminants from engine oil combustion. Simulation data sets created from database spectra were utilized to predict gas components and concentrations in unknown engine oil samples at high temperatures as well as time-evolved gases from the heating of engine oils.

  17. Population Analysis of Disabled Children by Departments in France

    NASA Astrophysics Data System (ADS)

    Meidatuzzahra, Diah; Kuswanto, Heri; Pech, Nicolas; Etchegaray, Amélie

    2017-06-01

    In this study, a statistical analysis is performed by modelling the variations of the disabled population aged 0-19 years among French departments. The aim is to classify the departments according to their profile determinants (socioeconomic and behavioural profiles). The analysis focuses on two methods, principal component analysis (PCA) and multiple correspondence factorial analysis (MCA), to determine which is better for interpreting the correlation between the determinants of disability (independent variables). The PCA proves to be the better method for this purpose. The PCA reduces 14 determinants of disability to 4 axes, keeps 80% of total information, and classifies them into 7 classes. The MCA reduces the determinants to 3 axes, retains only 30% of information, and classifies them into 4 classes.

  18. Item response theory and factor analysis as a mean to characterize occurrence of response shift in a longitudinal quality of life study in breast cancer patients

    PubMed Central

    2014-01-01

    Background: The occurrence of response shift (RS) in longitudinal health-related quality of life (HRQoL) studies, reflecting patient adaptation to disease, has already been demonstrated. Several methods have been developed to detect the three different types of RS, i.e. 1) recalibration RS, 2) reprioritization RS, and 3) reconceptualization RS. We investigated two complementary methods that characterize the occurrence of RS: factor analysis, comprising Principal Component Analysis (PCA) and Multiple Correspondence Analysis (MCA), and a method of Item Response Theory (IRT). Methods: Breast cancer patients (n = 381) completed the EORTC QLQ-C30 and EORTC QLQ-BR23 questionnaires at baseline, immediately following surgery, and three and six months after surgery, according to the “then-test/post-test” design. Recalibration was explored using MCA and a model of IRT, called the Linear Logistic Model with Relaxed Assumptions (LLRA), using the then-test method. PCA was used to explore reconceptualization and reprioritization. Results: MCA highlighted the main profiles of recalibration: patients with a high HRQoL level report a slightly worse HRQoL level retrospectively and vice versa. The LLRA model indicated a downward or upward recalibration for each dimension. At six months, the recalibration effect was statistically significant for 11/22 dimensions of the QLQ-C30 and BR23 according to the LLRA model (p ≤ 0.001). Regarding the QLQ-C30, PCA indicated a reprioritization of symptom scales and reconceptualization via an increased correlation between functional scales. Conclusions: Our findings demonstrate the usefulness of these analyses in characterizing the occurrence of RS. The MCA and IRT models gave results convergent with the then-test method in characterizing the recalibration component of RS. PCA is an indirect method for investigating the reprioritization and reconceptualization components of RS. PMID:24606836

  19. Application of Principal Component Analysis to NIR Spectra of Phyllosilicates: A Technique for Identifying Phyllosilicates on Mars

    NASA Technical Reports Server (NTRS)

    Rampe, E. B.; Lanza, N. L.

    2012-01-01

    Orbital near-infrared (NIR) reflectance spectra of the martian surface from the OMEGA and CRISM instruments have identified a variety of phyllosilicates in Noachian terrains. The types of phyllosilicates present on Mars have important implications for the aqueous environments in which they formed, and, thus, for recognizing locales that may have been habitable. Current identifications of phyllosilicates from martian NIR data are based on the positions of spectral absorptions relative to laboratory data of well-characterized samples and from spectral ratios; however, some phyllosilicates can be difficult to distinguish from one another with these methods (i.e. illite vs. muscovite). Here we employ a multivariate statistical technique, principal component analysis (PCA), to differentiate between spectrally similar phyllosilicate minerals. PCA is commonly used in a variety of industries (pharmaceutical, agricultural, viticultural) to discriminate between samples. Previous work using PCA to analyze raw NIR reflectance data from mineral mixtures has shown that this is a viable technique for identifying mineral types, abundances, and particle sizes. Here, we evaluate PCA of second-derivative NIR reflectance data as a method for classifying phyllosilicates and test whether this method can be used to identify phyllosilicates on Mars.

  20. Characterizing Variability of Modular Brain Connectivity with Constrained Principal Component Analysis

    PubMed Central

    Hirayama, Jun-ichiro; Hyvärinen, Aapo; Kiviniemi, Vesa; Kawanabe, Motoaki; Yamashita, Okito

    2016-01-01

    Characterizing the variability of resting-state functional brain connectivity across subjects and/or over time has recently attracted much attention. Principal component analysis (PCA) serves as a fundamental statistical technique for such analyses. However, performing PCA on high-dimensional connectivity matrices yields complicated “eigenconnectivity” patterns, for which systematic interpretation is a challenging issue. Here, we overcome this issue with a novel constrained PCA method for connectivity matrices by extending the idea of the previously proposed orthogonal connectivity factorization method. Our new method, modular connectivity factorization (MCF), explicitly introduces the modularity of brain networks as a parametric constraint on eigenconnectivity matrices. In particular, MCF analyzes the variability in both intra- and inter-module connectivities, simultaneously finding network modules in a principled, data-driven manner. The parametric constraint provides a compact module-based visualization scheme with which the result can be intuitively interpreted. We develop an optimization algorithm to solve the constrained PCA problem and validate our method in simulation studies and with a resting-state functional connectivity MRI dataset of 986 subjects. The results show that the proposed MCF method successfully reveals the underlying modular eigenconnectivity patterns in more general situations and is a promising alternative to existing methods. PMID:28002474

  1. Comparison of discrete Fourier transform (DFT) and principal component analysis/DFT as forecasting tools for absorbance time series received by UV-visible probes installed in urban sewer systems.

    PubMed

    Plazas-Nossa, Leonardo; Torres, Andrés

    2014-01-01

    The objective of this work is to introduce a forecasting method for UV-Vis spectrometry time series that combines principal component analysis (PCA) and discrete Fourier transform (DFT), and to compare the results obtained with those obtained by using DFT. Three time series for three different study sites were used: (i) Salitre wastewater treatment plant (WWTP) in Bogotá; (ii) Gibraltar pumping station in Bogotá; and (iii) San Fernando WWTP in Itagüí (in the south part of Medellín). Each of these time series had an equal number of samples (1051). In general terms, the results obtained are hardly generalizable, as they seem to be highly dependent on specific water system dynamics; however, some trends can be outlined: (i) for UV range, DFT and PCA/DFT forecasting accuracy were almost the same; (ii) for visible range, the PCA/DFT forecasting procedure proposed gives systematically lower forecasting errors and variability than those obtained with the DFT procedure; and (iii) for short forecasting times the PCA/DFT procedure proposed is more suitable than the DFT procedure, according to processing times obtained.

  2. Comparison of water extraction methods in Tibet based on GF-1 data

    NASA Astrophysics Data System (ADS)

    Jia, Lingjun; Shang, Kun; Liu, Jing; Sun, Zhongqing

    2018-03-01

    In this study, we compared four different water extraction methods with GF-1 data according to different water types in Tibet, including Support Vector Machine (SVM), Principal Component Analysis (PCA), Decision Tree Classifier based on False Normalized Difference Water Index (FNDWI-DTC), and PCA-SVM. The results show that all four methods can extract large water bodies, but only SVM and PCA-SVM obtain satisfactory extraction results for small water bodies. The methods were evaluated by both overall accuracy (OAA) and Kappa coefficient (KC). The OAAs of PCA-SVM, SVM, FNDWI-DTC, and PCA are 96.68%, 94.23%, 93.99%, and 93.01%, and the KCs are 0.9308, 0.8995, 0.8962, and 0.8842, respectively, consistent with visual inspection. In summary, SVM is better for extracting narrow rivers and PCA-SVM is suitable for extracting water of various types. As for dark blue lakes, the methods using PCA extract more quickly and accurately.

  3. ERP Go/NoGo condition effects are better detected with separate PCAs.

    PubMed

    Barry, Robert J; De Blasio, Frances M; Fogarty, Jack S; Karamacoska, Diana

    2016-08-01

    We explored the separation of Go and NoGo effects in the ERP components elicited in an equiprobable Go/NoGo task, using different forms of temporal Principal Components Analysis (PCA). Following exploratory simulation studies assessing the PCA impact of latency jitter and between-condition latency differences in the P3 latency range, an empirical study compared results of a Combined PCA carried out using both Go and NoGo ERPs together as input, with those from two Separate PCAs carried out on the Go and NoGo ERPs separately. The simulation studies indicated that Separate PCAs provide adequate component recovery in the presence of P3 latency jitter, and that Combined PCAs provide good separation of components only when systematic condition-related latency differences are sufficiently large (here ~110 ms). In the empirical data, broadly similar components were obtained from the Combined and Separate PCAs, supporting previous findings from Combined PCA investigations, and the consequent interpretations of the sequential processing involved. However, the Separate PCAs generated latency differences for components in the Go and NoGo processing chains that better matched the late Go/NoGo ERP peaks, and produced better-defined and larger components that fitted the stages in a hypothetical processing schema developed for this paradigm. Overall, the Separate PCAs yielded a better partitioning of the ERP variance associated with the Go and NoGo conditions, and should be considered as the first choice in future investigations if systematic component or subcomponent latency differences are present or suspected. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. Algorithms for accelerated convergence of adaptive PCA.

    PubMed

    Chatterjee, C; Kang, Z; Roychowdhury, V P

    2000-01-01

    We derive and discuss new adaptive algorithms for principal component analysis (PCA) that are shown to converge faster than the traditional PCA algorithms due to Oja, Sanger, and Xu. It is well known that traditional PCA algorithms that are derived by using gradient descent on an objective function are slow to converge. Furthermore, the convergence of these algorithms depends on appropriate choices of the gain sequences. Since online applications demand faster convergence and an automatic selection of gains, we present new adaptive algorithms to solve these problems. We first present an unconstrained objective function, which can be minimized to obtain the principal components. We derive adaptive algorithms from this objective function by using: 1) gradient descent; 2) steepest descent; 3) conjugate direction; and 4) Newton-Raphson methods. Although gradient descent produces Xu's LMSER algorithm, the steepest descent, conjugate direction, and Newton-Raphson methods produce new adaptive algorithms for PCA. We also provide a discussion on the landscape of the objective function, and present a global convergence proof of the adaptive gradient descent PCA algorithm using stochastic approximation theory. Extensive experiments with stationary and nonstationary multidimensional Gaussian sequences show faster convergence of the new algorithms over the traditional gradient descent methods. We also compare the steepest descent adaptive algorithm with state-of-the-art methods on stationary and nonstationary sequences.
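
    For orientation, the traditional baseline mentioned in this abstract (Oja's rule) can be sketched in a few lines. This is a minimal illustration of that classical gradient-type update only, not the accelerated algorithms the authors propose; the function name `oja_first_component`, the fixed gain `lr`, and the synthetic data are assumptions made for the example.

```python
import numpy as np

def oja_first_component(X, lr=0.005, epochs=20, seed=0):
    """Estimate the leading principal direction of X (rows = samples) with Oja's rule."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                 # PCA assumes centered data
    w = rng.normal(size=Xc.shape[1])
    w /= np.linalg.norm(w)                  # start from a random unit vector
    for _ in range(epochs):
        for x in Xc:
            y = w @ x                       # projection of the sample onto w
            w += lr * y * (x - y * w)       # Oja update: Hebbian term plus decay
    return w / np.linalg.norm(w)

# Synthetic 2-D data stretched along the direction (1, 1) / sqrt(2) (illustrative only)
rng = np.random.default_rng(1)
latent = rng.normal(size=(500, 1)) * 2.0
noise = rng.normal(size=(500, 2)) * 0.3
X = latent @ np.array([[1.0, 1.0]]) / np.sqrt(2) + noise

w = oja_first_component(X)

# Reference direction from a batch eigendecomposition of the covariance matrix
evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
v = evecs[:, -1]
alignment = abs(w @ v)                      # 1.0 means identical direction up to sign
```

    The fixed gain `lr` has to be chosen by hand here, which is exactly the dependence on gain sequences that the abstract identifies as a weakness of the traditional algorithms.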

  5. Total Electron Content forecast model over Australia

    NASA Astrophysics Data System (ADS)

    Bouya, Zahra; Terkildsen, Michael; Francis, Matthew

    Ionospheric perturbations can cause serious propagation errors in modern radio systems such as Global Navigation Satellite Systems (GNSS). Forecasting ionospheric parameters is helpful to estimate potential degradation of the performance of these systems. Our purpose is to establish an Australian Regional Total Electron Content (TEC) forecast model at IPS. In this work we present an approach based on the combined use of the Principal Component Analysis (PCA) and Artificial Neural Network (ANN) to predict future TEC values. PCA is used to reduce the dimensionality of the original TEC data by mapping it into its eigen-space. In this process the top 5 eigenvectors are chosen to reflect the directions of the maximum variability. An ANN approach was then used for the multicomponent prediction. We outline the design of the ANN model with its parameters. A number of activation functions along with different spectral ranges and different numbers of Principal Components (PCs) were tested to find the PCA-ANN models reaching the best results. Keywords: GNSS, Space Weather, Regional, Forecast, PCA, ANN.
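
    The dimensionality-reduction step described above (mapping the data into its eigen-space and keeping the top 5 eigenvectors) might look roughly like the following sketch. The helper `pca_project` and the synthetic array `maps`, a stand-in for gridded TEC data, are hypothetical; the ANN stage is not shown.

```python
import numpy as np

def pca_project(X, k):
    """Project rows of X onto the k leading eigenvectors of the sample covariance."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = np.cov(Xc, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]          # indices of the k largest eigenvalues
    components = evecs[:, order]                 # shape (n_features, k)
    scores = Xc @ components                     # shape (n_samples, k)
    return scores, components, mean

# Synthetic stand-in: 200 epochs x 50 grid points, rank-5 signal plus weak noise
rng = np.random.default_rng(0)
basis = rng.normal(size=(5, 50))
maps = rng.normal(size=(200, 5)) @ basis + 0.01 * rng.normal(size=(200, 50))

scores, components, mean = pca_project(maps, k=5)

# Map the reduced representation back to the original space to check the loss
reconstruction = scores @ components.T + mean
rel_error = np.linalg.norm(maps - reconstruction) / np.linalg.norm(maps)
```

    In a setup like the one described, the columns of `scores` would be the time series handed to the predictor, and a forecast in score space would be mapped back with `components.T` and `mean`.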

  6. Principal Components Analysis of a JWST NIRSpec Detector Subsystem

    NASA Technical Reports Server (NTRS)

    Arendt, Richard G.; Fixsen, D. J.; Greenhouse, Matthew A.; Lander, Matthew; Lindler, Don; Loose, Markus; Moseley, S. H.; Mott, D. Brent; Rauscher, Bernard J.; Wen, Yiting

    2013-01-01

    We present principal component analysis (PCA) of a flight-representative James Webb Space Telescope Near-Infrared Spectrograph (NIRSpec) Detector Subsystem. Although our results are specific to NIRSpec and its T ~ 40 K SIDECAR ASICs and 5 μm cutoff H2RG detector arrays, the underlying technical approach is more general. We describe how we measured the system's response to small environmental perturbations by modulating a set of bias voltages and temperature. We used this information to compute the system's principal noise components. Together with information from the astronomical scene, we show how the zeroth principal component can be used to calibrate out the effects of small thermal and electrical instabilities to produce cosmetically cleaner images with significantly less correlated noise. Alternatively, if one were designing a new instrument, one could use a similar PCA approach to inform a set of environmental requirements (temperature stability, electrical stability, etc.) that enabled the planned instrument to meet performance requirements.

  7. Common mode error in Antarctic GPS coordinate time series on its effect on bedrock-uplift estimates

    NASA Astrophysics Data System (ADS)

    Liu, Bin; King, Matt; Dai, Wujiao

    2018-05-01

    Spatially-correlated common mode error always exists in regional, or-larger, GPS networks. We applied independent component analysis (ICA) to GPS vertical coordinate time series in Antarctica from 2010 to 2014 and made a comparison with the principal component analysis (PCA). Using PCA/ICA, the time series can be decomposed into a set of temporal components and their spatial responses. We assume the components with common spatial responses are common mode error (CME). An average reduction of ~40% in the RMS values was achieved in both PCA and ICA filtering. However, the common mode components obtained from the two approaches have different spatial and temporal features. ICA time series present interesting correlations with modeled atmospheric and non-tidal ocean loading displacements. A white noise (WN) plus power law noise (PL) model was adopted in the GPS velocity estimation using maximum likelihood estimation (MLE) analysis, with ~55% reduction of the velocity uncertainties after filtering using ICA. Meanwhile, spatiotemporal filtering reduces the amplitude of PL and periodic terms in the GPS time series. Finally, we compare the GPS uplift velocities, after correction for elastic effects, with recent models of glacial isostatic adjustment (GIA). The agreements of the GPS observed velocities and four GIA models are generally improved after the spatiotemporal filtering, with a mean reduction of ~0.9 mm/yr of the WRMS values, possibly allowing for more confident separation of various GIA model predictions.
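
    The decomposition into temporal components and spatial responses can be illustrated with a small PCA filter. The sketch below treats only the first principal component as common mode error, which is a simplifying assumption of this example (the abstract selects components by their spatial response and also uses ICA); `remove_leading_pc` and the synthetic station series are hypothetical.

```python
import numpy as np

def remove_leading_pc(residuals):
    """Subtract the first principal component from a (n_epochs, n_stations) array."""
    centered = residuals - residuals.mean(axis=0)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    # U[:, 0] * s[0] is the temporal component, Vt[0] its spatial response
    common_mode = s[0] * np.outer(U[:, 0], Vt[0])
    return centered - common_mode, common_mode

# Synthetic network: one signal shared by all stations plus station-specific noise
rng = np.random.default_rng(0)
n_epochs, n_stations = 1000, 12
shared = rng.normal(scale=3.0, size=(n_epochs, 1))
local = rng.normal(scale=1.0, size=(n_epochs, n_stations))
series = shared + local

filtered, cme = remove_leading_pc(series)

rms_before = np.sqrt(np.mean((series - series.mean(axis=0)) ** 2))
rms_after = np.sqrt(np.mean(filtered ** 2))
```

    With a strongly shared signal, as constructed here, the spatial response `Vt[0]` is nearly uniform across stations, which is the property used in the abstract to label a component as common mode.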

  8. Esophageal cancer detection based on tissue surface-enhanced Raman spectroscopy and multivariate analysis

    NASA Astrophysics Data System (ADS)

    Feng, Shangyuan; Lin, Juqiang; Huang, Zufang; Chen, Guannan; Chen, Weisheng; Wang, Yue; Chen, Rong; Zeng, Haishan

    2013-01-01

    The capability of using silver nanoparticle based near-infrared surface enhanced Raman scattering (SERS) spectroscopy combined with principal component analysis (PCA) and linear discriminant analysis (LDA) to differentiate esophageal cancer tissue from normal tissue was presented. Significant differences in Raman intensities of prominent SERS bands were observed between normal and cancer tissues. PCA-LDA multivariate analysis of the measured tissue SERS spectra achieved diagnostic sensitivity of 90.9% and specificity of 97.8%. This exploratory study demonstrated great potential for developing label-free tissue SERS analysis into a clinical tool for esophageal cancer detection.

  9. Method of Real-Time Principal-Component Analysis

    NASA Technical Reports Server (NTRS)

    Duong, Tuan; Duong, Vu

    2005-01-01

    Dominant-element-based gradient descent and dynamic initial learning rate (DOGEDYN) is a method of sequential principal-component analysis (PCA) that is well suited for such applications as data compression and extraction of features from sets of data. In comparison with a prior method of gradient-descent-based sequential PCA, this method offers a greater rate of learning convergence. Like the prior method, DOGEDYN can be implemented in software. However, the main advantage of DOGEDYN over the prior method lies in the facts that it requires less computation and can be implemented in simpler hardware. It should be possible to implement DOGEDYN in compact, low-power, very-large-scale integrated (VLSI) circuitry that could process data in real time.

  10. Study of ionospheric anomalies due to impact of typhoon using Principal Component Analysis and image processing

    NASA Astrophysics Data System (ADS)

    LIN, JYH-WOEI

    2012-08-01

    Principal Component Analysis (PCA) and image processing are used to determine Total Electron Content (TEC) anomalies in the F-layer of the ionosphere relating to Typhoon Nakri for 29 May, 2008 (UTC). PCA and image processing are applied to the global ionospheric map (GIM) with transforms conducted for the time period 12:00-14:00 UT on 29 May, 2008 when the wind was most intense. Results show that at a height of approximately 150-200 km the TEC anomaly is highly localized; however, it becomes more intense and widespread with height. Potential causes of these results are discussed with emphasis given to acoustic gravity waves caused by wind force.

  11. A probability index for surface zonda wind occurrence at Mendoza city through vertical sounding principal components analysis

    NASA Astrophysics Data System (ADS)

    Otero, Federico; Norte, Federico; Araneo, Diego

    2018-01-01

    The aim of this work is to obtain an index for predicting the probability of occurrence of a zonda event at surface level from sounding data at Mendoza city, Argentina. To accomplish this goal, surface zonda wind events were first identified with an objective classification method (OCM) considering only the surface station values. Once the dates and onset time of each event were obtained, the closest prior sounding for each event was used to perform a principal component analysis (PCA) that identifies the leading patterns of the vertical structure of the atmosphere prior to a zonda wind event. These components were used to construct the index model. For the PCA, an input matrix of temperature (T) and dew point temperature (Td) anomalies for the standard levels between 850 and 300 hPa was built. The analysis yielded six significant components explaining 94 % of the variance, and the leading patterns of favorable weather conditions for the development of the phenomenon were obtained. A zonda/non-zonda indicator c can be estimated by a multiple logistic regression on the PCA component loadings, giving a zonda probability index ĉ that is calculable from T and Td profiles and depends on the climatological features of the region. The index showed 74.7 % efficiency. The same analysis was performed adding surface values of T and Td from Mendoza Aero station, increasing the index efficiency to 87.8 %. The results revealed four significantly correlated PCs with a major improvement in differentiating zonda cases and a reduction of the uncertainty interval.

  12. Plaque echodensity and textural features are associated with histologic carotid plaque instability.

    PubMed

    Doonan, Robert J; Gorgui, Jessica; Veinot, Jean P; Lai, Chi; Kyriacou, Efthyvoulos; Corriveau, Marc M; Steinmetz, Oren K; Daskalopoulou, Stella S

    2016-09-01

    Carotid plaque echodensity and texture features predict cerebrovascular symptomatology. Our purpose was to determine the association of echodensity and textural features obtained from a digital image analysis (DIA) program with histologic features of plaque instability as well as to identify the specific morphologic characteristics of unstable plaques. Patients scheduled to undergo carotid endarterectomy were recruited and underwent carotid ultrasound imaging. DIA was performed to extract echodensity and textural features using Plaque Texture Analysis software (LifeQ Medical Ltd, Nicosia, Cyprus). Carotid plaque surgical specimens were obtained and analyzed histologically. Principal component analysis (PCA) was performed to reduce imaging variables. Logistic regression models were used to determine if PCA variables and individual imaging variables predicted histologic features of plaque instability. Image analysis data from 160 patients were analyzed. Individual imaging features of plaque echolucency and homogeneity were associated with a more unstable plaque phenotype on histology. These results were independent of age, sex, and degree of carotid stenosis. PCA reduced 39 individual imaging variables to five PCA variables. PCA1 and PCA2 were significantly associated with overall plaque instability on histology (both P = .02), whereas PCA3 did not achieve statistical significance (P = .07). DIA features of carotid plaques are associated with histologic plaque instability as assessed by multiple histologic features. Importantly, unstable plaques on histology appear more echolucent and homogeneous on ultrasound imaging. These results are independent of stenosis, suggesting that image analysis may have a role in refining the selection of patients who undergo carotid endarterectomy. Copyright © 2016 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.

  13. Binding Isotherms and Time Courses Readily from Magnetic Resonance.

    PubMed

    Xu, Jia; Van Doren, Steven R

    2016-08-16

    Evidence is presented that binding isotherms, simple or biphasic, can be extracted directly from noninterpreted, complex 2D NMR spectra using principal component analysis (PCA) to reveal the largest trend(s) across the series. This approach renders peak picking unnecessary for tracking population changes. In 1:1 binding, the first principal component captures the binding isotherm from NMR-detected titrations in fast, slow, and even intermediate and mixed exchange regimes, as illustrated for phospholigand associations with proteins. Although the sigmoidal shifts and line broadening of intermediate exchange distort binding isotherms constructed conventionally, applying PCA directly to these spectra along with Pareto scaling overcomes the distortion. Applying PCA to time-domain NMR data also yields binding isotherms from titrations in fast or slow exchange. The algorithm readily extracts from magnetic resonance imaging movie time courses such as breathing and heart rate in chest imaging. Similarly, two-step binding processes detected by NMR are easily captured by principal components 1 and 2. PCA obviates the customary focus on specific peaks or regions of images. Applying it directly to a series of complex data will easily delineate binding isotherms, equilibrium shifts, and time courses of reactions or fluctuations.
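
    As a rough illustration of the idea, the sketch below applies Pareto scaling (mean-centering each variable and dividing by the square root of its standard deviation) followed by PCA to a simulated slow-exchange titration. The 1-D Gaussian "spectra", the hyperbolic isotherm with `kd = 1.0`, and the helper names are assumptions for the example and do not reproduce the authors' data or software.

```python
import numpy as np

def pareto_scale(X):
    """Mean-center each column and divide it by the square root of its standard deviation."""
    centered = X - X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0                        # leave constant columns unscaled
    return centered / np.sqrt(sd)

def first_pc_scores(X):
    """Scores of each row of X on the first principal component."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, 0] * s[0]

# Simulated titration: the free-state peak fades while the bound-state peak grows
ligand = np.linspace(0.0, 10.0, 15)
kd = 1.0
fraction_bound = ligand / (kd + ligand)      # simple 1:1 isotherm, ligand in excess
axis = np.linspace(0.0, 1.0, 200)
free_peak = np.exp(-((axis - 0.3) / 0.02) ** 2)
bound_peak = np.exp(-((axis - 0.7) / 0.02) ** 2)

rng = np.random.default_rng(0)
spectra = np.outer(1 - fraction_bound, free_peak) + np.outer(fraction_bound, bound_peak)
spectra += 0.01 * rng.normal(size=spectra.shape)

scores = first_pc_scores(pareto_scale(spectra))
corr = abs(np.corrcoef(scores, fraction_bound)[0, 1])   # sign of a PC is arbitrary
```

    No peak is picked at any point: the whole spectrum of each titration step enters as one row, and the PC1 scores trace the bound fraction up to sign and scale.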

  14. Intelligence, Surveillance, and Reconnaissance Fusion for Coalition Operations

    DTIC Science & Technology

    2008-07-01

    classification of the targets of interest. The MMI features extracted in this manner have two properties that provide a sound justification for...are generalizations of well- known feature extraction methods such as Principal Components Analysis (PCA) and Independent Component Analysis (ICA...augment (without degrading performance) a large class of generic fusion processes. Ontologies Classifications Feature extraction Feature analysis

  15. Analysis of environmental variation in a Great Plains reservoir using principal components analysis and geographic information systems

    USGS Publications Warehouse

    Long, J.M.; Fisher, W.L.

    2006-01-01

    We present a method for spatial interpretation of environmental variation in a reservoir that integrates principal components analysis (PCA) of environmental data with geographic information systems (GIS). To illustrate our method, we used data from a Great Plains reservoir (Skiatook Lake, Oklahoma) with longitudinal variation in physicochemical conditions. We measured 18 physicochemical features, mapped them using GIS, and then calculated and interpreted four principal components. Principal component 1 (PC1) was readily interpreted as longitudinal variation in water chemistry, but the other principal components (PC2-4) were difficult to interpret. Site scores for PC1-4 were calculated in GIS by summing weighted overlays of the 18 measured environmental variables, with the factor loadings from the PCA as the weights. PC1-4 were then ordered into a landscape hierarchy, an emergent property of this technique, which enabled their interpretation. PC1 was interpreted as a reservoir scale change in water chemistry, PC2 was a microhabitat variable of rip-rap substrate, PC3 identified coves/embayments and PC4 consisted of shoreline microhabitats related to slope. The use of GIS improved our ability to interpret the more obscure principal components (PC2-4), which made the spatial variability of the reservoir environment more apparent. This method is applicable to a variety of aquatic systems, can be accomplished using commercially available software programs, and allows for improved interpretation of the geographic environmental variability of a system compared to using typical PCA plots. © Copyright by the North American Lake Management Society 2006.
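
    The weighted-overlay calculation of site scores is, in matrix terms, a product of the standardized variables with the PCA weights. The sketch below shows the equivalence on synthetic numbers; it assumes unit-length eigenvector coefficients of the correlation matrix as the weights (factor loadings are sometimes additionally scaled by the square root of the eigenvalue), and all array names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sites, n_vars = 40, 18                     # 18 variables as in the abstract; values are synthetic
data = rng.normal(size=(n_sites, n_vars))

# Standardize each variable, then run PCA on the correlation matrix
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
corr = np.corrcoef(z, rowvar=False)
evals, evecs = np.linalg.eigh(corr)
order = np.argsort(evals)[::-1][:4]          # keep the four leading components
weights = evecs[:, order]                    # shape (n_vars, 4)

# "Weighted overlay": add one weighted variable layer at a time
site_scores = np.zeros((n_sites, 4))
for j in range(n_vars):
    site_scores += np.outer(z[:, j], weights[j])

# The same result as a single matrix product
site_scores_direct = z @ weights
```

    In a GIS each `z[:, j]` would be a raster layer rather than a column, but the arithmetic per cell is the same weighted sum.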

  16. Fixed Eigenvector Analysis of Thermographic NDE Data

    NASA Technical Reports Server (NTRS)

    Cramer, K. Elliott; Winfree, William P.

    2011-01-01

    Principal Component Analysis (PCA) has been shown effective for reducing thermographic NDE data. This paper will discuss an alternative method of analysis that has been developed where a predetermined set of eigenvectors is used to process the thermal data from both reinforced carbon-carbon (RCC) and graphite-epoxy honeycomb materials. These eigenvectors can be generated either from an analytic model of the thermal response of the material system under examination, or from a large set of experimental data. This paper provides the details of the analytic model, an overview of the PCA process, as well as a quantitative signal-to-noise comparison of the results of performing both conventional PCA and fixed eigenvector analysis on thermographic data from two specimens, one Reinforced Carbon-Carbon with flat bottom holes and the second a sandwich construction with graphite-epoxy face sheets and aluminum honeycomb core.
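
    The difference between conventional PCA and a fixed-eigenvector approach is that the basis is computed once, from model curves, and then reused for every data set. The sketch below illustrates this with a made-up family of exponential cooling curves; this family is purely a placeholder and is not the analytic thermal model described in the paper, and the 8 x 8 pixel "specimen" is synthetic.

```python
import numpy as np

t = np.linspace(0.1, 5.0, 60)                          # time samples (arbitrary units)

# Placeholder model family: exponential decays with a range of rates
rates = np.linspace(0.2, 2.0, 30)
model_curves = np.exp(-np.outer(rates, t))             # shape (30, 60)

# Fixed basis: leading right singular vectors of the centered model curves
_, _, Vt = np.linalg.svd(model_curves - model_curves.mean(axis=0), full_matrices=False)
fixed_vectors = Vt[:2]                                 # shape (2, 60), computed once

# Synthetic image sequence with a small region that cools at a different rate
rng = np.random.default_rng(0)
pixel_rates = np.full((8, 8), 0.5)
mask = np.zeros((8, 8), dtype=bool)
mask[3:5, 3:5] = True
pixel_rates[mask] = 1.5
frames = np.exp(-pixel_rates[..., None] * t) + 0.01 * rng.normal(size=(8, 8, 60))

# Project every pixel's time history onto the fixed vectors
pixels = frames.reshape(-1, t.size)
coeff_images = (pixels @ fixed_vectors.T).reshape(8, 8, 2)

first = coeff_images[..., 0]
contrast = abs(first[mask].mean() - first[~mask].mean())
```

    Because `fixed_vectors` does not depend on the measured frames, the projection is a single matrix product per data set and the resulting coefficient images are directly comparable between specimens.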

  17. Multi-Centrality Graph Spectral Decompositions and Their Application to Cyber Intrusion Detection

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Pin-Yu; Choudhury, Sutanay; Hero, Alfred

    Many modern datasets can be represented as graphs and hence spectral decompositions such as graph principal component analysis (PCA) can be useful. Distinct from previous graph decomposition approaches based on subspace projection of a single topological feature, e.g., the centered graph adjacency matrix (graph Laplacian), we propose spectral decomposition approaches to graph PCA and graph dictionary learning that integrate multiple features, including graph walk statistics, centrality measures and graph distances to reference nodes. In this paper we propose a new PCA method for single graph analysis, called multi-centrality graph PCA (MC-GPCA), and a new dictionary learning method for ensembles of graphs, called multi-centrality graph dictionary learning (MC-GDL), both based on spectral decomposition of multi-centrality matrices. As an application to cyber intrusion detection, MC-GPCA can be an effective indicator of anomalous connectivity pattern and MC-GDL can provide discriminative basis for attack classification.

  18. Mapping brain activity in gradient-echo functional MRI using principal component analysis

    NASA Astrophysics Data System (ADS)

    Khosla, Deepak; Singh, Manbir; Don, Manuel

    1997-05-01

    The detection of sites of brain activation in functional MRI has been a topic of immense research interest and many techniques have been proposed to this end. Recently, principal component analysis (PCA) has been applied to extract the activated regions and their time course of activation. This method is based on the assumption that the activation is orthogonal to other signal variations such as brain motion, physiological oscillations and other uncorrelated noises. A distinct advantage of this method is that it does not require any knowledge of the time course of the true stimulus paradigm. This technique is well suited to EPI image sequences where the sampling rate is high enough to capture the effects of physiological oscillations. In this work, we propose and apply two methods that are based on PCA to conventional gradient-echo images and investigate their usefulness as tools to extract reliable information on brain activation. The first method is a conventional technique where a single image sequence with alternating on and off stages is subject to a principal component analysis. The second method is a PCA-based approach called the common spatial factor analysis technique (CSF). As the name suggests, this method relies on common spatial factors between the above fMRI image sequence and a background fMRI. We have applied these methods to identify active brain areas during visual stimulation and motor tasks. The results from these methods are compared to those obtained by using the standard cross-correlation technique. We found good agreement in the areas identified as active across all three techniques. The results suggest that PCA and CSF methods have good potential in detecting the true stimulus correlated changes in the presence of other interfering signals.

  19. Systematic study of anharmonic features in a principal component analysis of gramicidin A.

    PubMed

    Kurylowicz, Martin; Yu, Ching-Hsing; Pomès, Régis

    2010-02-03

    We use principal component analysis (PCA) to detect functionally interesting collective motions in molecular-dynamics simulations of membrane-bound gramicidin A. We examine the statistical and structural properties of all PCA eigenvectors and eigenvalues for the backbone and side-chain atoms. All eigenvalue spectra show two distinct power-law scaling regimes, quantitatively separating large from small covariance motions. Time trajectories of the largest PCs converge to Gaussian distributions at long timescales, but groups of small-covariance PCs, which are usually ignored as noise, have subdiffusive distributions. These non-Gaussian distributions imply anharmonic motions on the free-energy surface. We characterize the anharmonic components of motion by analyzing the mean-square displacement for all PCs. The subdiffusive components reveal picosecond-scale oscillations in the mean-square displacement at frequencies consistent with infrared measurements. In this regime, the slowest backbone mode exhibits tilting of the peptide planes, which allows carbonyl oxygen atoms to provide surrogate solvation for water and cation transport in the channel lumen. Higher-frequency modes are also apparent, and we describe their vibrational spectra. Our findings expand the utility of PCA for quantifying the essential features of motion on the anharmonic free-energy surface made accessible by atomistic molecular-dynamics simulations. Copyright (c) 2010 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  20. ClustVis: a web tool for visualizing clustering of multivariate data using Principal Component Analysis and heatmap

    PubMed Central

    Metsalu, Tauno; Vilo, Jaak

    2015-01-01

    The Principal Component Analysis (PCA) is a widely used method of reducing the dimensionality of high-dimensional data, often followed by visualizing two of the components on the scatterplot. Although widely used, the method is lacking an easy-to-use web interface that scientists with little programming skills could use to make plots of their own data. The same applies to creating heatmaps: it is possible to add conditional formatting for Excel cells to show colored heatmaps, but for more advanced features such as clustering and experimental annotations, more sophisticated analysis tools have to be used. We present a web tool called ClustVis that aims to have an intuitive user interface. Users can upload data from a simple delimited text file that can be created in a spreadsheet program. It is possible to modify data processing methods and the final appearance of the PCA and heatmap plots by using drop-down menus, text boxes, sliders etc. Appropriate defaults are given to reduce the time needed by the user to specify input parameters. As an output, users can download PCA plot and heatmap in one of the preferred file formats. This web server is freely available at http://biit.cs.ut.ee/clustvis/. PMID:25969447

  1. A structural investigation into the compaction behavior of pharmaceutical composites using powder X-ray diffraction and total scattering analysis.

    PubMed

    Moore, Michael D; Steinbach, Alison M; Buckner, Ira S; Wildfong, Peter L D

    2009-11-01

    To use advanced powder X-ray diffraction (PXRD) to characterize the structure of anhydrous theophylline following compaction, alone, and as part of a binary mixture with either alpha-lactose monohydrate or microcrystalline cellulose. Compacts formed from (1) pure theophylline and (2) each type of binary mixture were analyzed intact using PXRD. A novel mathematical technique was used to accurately separate multi-component diffraction patterns. The pair distribution function (PDF) of isolated theophylline diffraction data was employed to assess structural differences induced by consolidation and evaluated by principal components analysis (PCA). Changes induced in PXRD patterns by increasing compaction pressure were amplified by the PDF. Simulated data suggest PDF dampening is attributable to molecular deviations from average crystalline position. Samples compacted at different pressures were identified and differentiated using PCA. Samples compacted at common pressures exhibited similar inter-atomic correlations, where excipient concentration factored in the analyses involving lactose. Practical real-space structural analysis of PXRD data by PDF was accomplished for intact, compacted crystalline drug with and without excipient. PCA was used to compare multiple PDFs and successfully differentiated pattern changes consistent with compaction-induced disordering of theophylline as a single component and in the presence of another material.

  2. Classification of time-of-flight secondary ion mass spectrometry spectra from complex Cu-Fe sulphides by principal component analysis and artificial neural networks.

    PubMed

    Kalegowda, Yogesh; Harmer, Sarah L

    2013-01-08

    Artificial neural network (ANN) and a hybrid principal component analysis-artificial neural network (PCA-ANN) classifiers have been successfully implemented for classification of static time-of-flight secondary ion mass spectrometry (ToF-SIMS) mass spectra collected from complex Cu-Fe sulphides (chalcopyrite, bornite, chalcocite and pyrite) at different flotation conditions. ANNs are very good pattern classifiers because of: their ability to learn and generalise patterns that are not linearly separable; their fault and noise tolerance capability; and high parallelism. In the first approach, fragments from the whole ToF-SIMS spectrum were used as input to the ANN, the model yielded high overall correct classification rates of 100% for feed samples, 88% for conditioned feed samples and 91% for Eh modified samples. In the second approach, the hybrid pattern classifier PCA-ANN was integrated. PCA is a very effective multivariate data analysis tool applied to enhance species features and reduce data dimensionality. Principal component (PC) scores which accounted for 95% of the raw spectral data variance, were used as input to the ANN, the model yielded high overall correct classification rates of 88% for conditioned feed samples and 95% for Eh modified samples. Copyright © 2012 Elsevier B.V. All rights reserved.
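
    The selection step described above, keeping the principal component scores that together account for 95% of the variance, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name components_for_variance and the eigenvalues in the example are hypothetical, and the sketch assumes the eigenvalues of the covariance matrix have already been computed.

```python
from itertools import accumulate


def components_for_variance(eigenvalues, target=0.95):
    """Return the smallest number of leading PCs whose cumulative
    explained-variance ratio reaches `target`."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    for k, running in enumerate(accumulate(ordered), start=1):
        # Small tolerance guards against floating-point round-off.
        if running / total >= target - 1e-12:
            return k
    return len(ordered)


# Hypothetical covariance eigenvalues, for illustration only.
example_eigenvalues = [6.0, 2.5, 1.0, 0.3, 0.2]
n_keep = components_for_variance(example_eigenvalues, target=0.95)  # -> 3
```

    Only the scores on the first n_keep components would then be passed to the classifier, which is what reduces the input dimensionality.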

  3. Interpretation of data on the aggregate composition of typical chernozems under different land use by cluster and principal component analyses

    NASA Astrophysics Data System (ADS)

    Kholodov, V. A.; Yaroslavtseva, N. V.; Lazarev, V. I.; Frid, A. S.

    2016-09-01

    Cluster analysis and principal component analysis (PCA) have been used for the interpretation of dry sieving data. Chernozems from the treatments of long-term field experiments with different land-use patterns— annually mowed steppe, continuous potato culture, permanent black fallow, and untilled fallow since 1998 after permanent black fallow—have been used. Analysis of dry sieving data by PCA has shown that the treatments of untilled fallow after black fallow and annually mowed steppe differ most in the series considered; the content of dry aggregates of 10-7 mm makes the largest contribution to the distribution of objects along the first principal component. This fraction has been sieved in water and analyzed by PCA. In contrast to dry sieving data, the wet sieving data showed the closest mathematical distance between the treatment of untilled fallow after black fallow and the undisturbed treatment of annually mowed steppe, while the untilled fallow after black fallow and the permanent black fallow were the most distant treatments. Thus, it may be suggested that the water stability of structure is first restored after the removal of destructive anthropogenic load. However, the restoration of the distribution of structural separates to the parameters characteristic of native soils is a significantly longer process.

  4. Model Reduction via Principal Component Analysis and Markov Chain Monte Carlo (MCMC) Methods

    NASA Astrophysics Data System (ADS)

    Gong, R.; Chen, J.; Hoversten, M. G.; Luo, J.

    2011-12-01

    Geophysical and hydrogeological inverse problems often include a large number of unknown parameters, ranging from hundreds to millions, depending on the parameterization and the problem undertaken. This makes inverse estimation and uncertainty quantification very challenging, especially for problems in two- or three-dimensional spatial domains. Model reduction techniques have the potential to mitigate the curse of dimensionality by reducing the total number of unknowns while describing the complex subsurface systems adequately. In this study, we explore the use of principal component analysis (PCA) and Markov chain Monte Carlo (MCMC) sampling methods for model reduction through the use of synthetic datasets. We compare the performances of three different but closely related model reduction approaches: (1) PCA methods with geometric sampling (referred to as 'Method 1'), (2) PCA methods with MCMC sampling (referred to as 'Method 2'), and (3) PCA methods with MCMC sampling and inclusion of random effects (referred to as 'Method 3'). We consider a simple convolution model with five unknown parameters, as our goal is to understand and visualize the advantages and disadvantages of each method by comparing their inversion results with the corresponding analytical solutions. We generated synthetic data with noise added and inverted them under two different situations: (1) the noisy data and the covariance matrix for PCA are consistent (referred to as the unbiased case), and (2) the noisy data and the covariance matrix are inconsistent (referred to as the biased case). In the unbiased case, comparison between the analytical solutions and the inversion results shows that all three methods provide good estimates of the true values and that Method 1 is computationally more efficient. In terms of uncertainty quantification, Method 1 performs poorly because of the relatively small number of samples obtained, Method 2 performs best, and Method 3 overestimates uncertainty due to the inclusion of random effects. However, in the biased case, only Method 3 correctly estimates all the unknown parameters, and both Methods 1 and 2 provide wrong values for the biased parameters. The synthetic case study demonstrates that if the covariance matrix for PCA is inconsistent with the true models, the PCA methods with geometric or MCMC sampling will provide incorrect estimates.

  5. Multivariate Analysis of Fruit Antioxidant Activities of Blackberry Treated with 1-Methylcyclopropene or Vacuum Precooling

    PubMed Central

    Li, Jian; Ma, Guowei; Ma, Lin; Bao, Xiaolin; Li, Liping; Zhao, Qian

    2018-01-01

    Effects of 1-methylcyclopropene (1-MCP) and vacuum precooling on quality and antioxidant properties of blackberries (Rubus spp.) were evaluated using one-way analysis of variance, principal component analysis (PCA), partial least squares (PLS), and path analysis. Results showed that the activities of antioxidant enzymes were enhanced by both 1-MCP treatment and vacuum precooling. PCA could discriminate between the 1-MCP treated fruit and the vacuum precooled fruit and showed that the radical-scavenging activities in vacuum precooled fruit were higher than those in 1-MCP treated fruit. The scores of PCA showed that H2O2 content was the most important variable of blackberry fruit. PLSR results showed that peroxidase (POD) activity negatively correlated with H2O2 content. The results of path coefficient analysis indicated that glutathione (GSH) also had an indirect effect on H2O2 content. PMID:29487622

  6. Detecting phase separation of freeze-dried binary amorphous systems using pair-wise distribution function and multivariate data analysis.

    PubMed

    Chieng, Norman; Trnka, Hjalte; Boetker, Johan; Pikal, Michael; Rantanen, Jukka; Grohganz, Holger

    2013-09-15

    The purpose of this study is to investigate the use of multivariate data analysis for powder X-ray diffraction-pair-wise distribution function (PXRD-PDF) data to detect phase separation in freeze-dried binary amorphous systems. Polymer-polymer and polymer-sugar binary systems at various ratios were freeze-dried. All samples were analyzed by PXRD, transformed to PDF and analyzed by principal component analysis (PCA). These results were validated by differential scanning calorimetry (DSC) through characterization of the glass transition of the maximally freeze-concentrated solute (Tg'). Analysis of PXRD-PDF data using PCA provides a clearer 'miscible' or 'phase separated' interpretation through the distribution pattern of samples on a score plot presentation compared to the residual plot method. In a phase separated system, samples were found to be evenly distributed around the theoretical PDF profile. For systems that were miscible, a clear deviation of samples away from the theoretical PDF profile was observed. Moreover, PCA allows simultaneous analysis of replicate samples. Comparatively, the phase behavior analysis from the PXRD-PDF-PCA method was in agreement with the DSC results. Overall, the combined PXRD-PDF-PCA approach improves the clarity of the PXRD-PDF results and can be used as an alternative explorative data analytical tool in detecting phase separation in freeze-dried binary amorphous systems. Copyright © 2013 Elsevier B.V. All rights reserved.

  7. Principal component analysis of dynamic fluorescence images for diagnosis of diabetic vasculopathy

    NASA Astrophysics Data System (ADS)

    Seo, Jihye; An, Yuri; Lee, Jungsul; Ku, Taeyun; Kang, Yujung; Ahn, Chulwoo; Choi, Chulhee

    2016-04-01

    Indocyanine green (ICG) fluorescence imaging has been clinically used for noninvasive visualizations of vascular structures. We have previously developed a diagnostic system based on dynamic ICG fluorescence imaging for sensitive detection of vascular disorders. However, because high-dimensional raw data were used, the analysis of the ICG dynamics proved difficult. We used principal component analysis (PCA) in this study to extract important elements without significant loss of information. We examined ICG spatiotemporal profiles and identified critical features related to vascular disorders. PCA time courses of the first three components showed a distinct pattern in diabetic patients. Among the major components, the second principal component (PC2) represented arterial-like features. The explained variance of PC2 in diabetic patients was significantly lower than in normal controls. To visualize the spatial pattern of PCs, pixels were mapped with red, green, and blue channels. The PC2 score showed an inverse pattern between normal controls and diabetic patients. We propose that PC2 can be used as a representative bioimaging marker for the screening of vascular diseases. It may also be useful in simple extractions of arterial-like features.

  8. Differentiating Organic and Conventional Sage by Chromatographic and Mass Spectrometry Flow-Injection Fingerprints Combined with Principal Component Analysis

    PubMed Central

    Gao, Boyan; Lu, Yingjian; Sheng, Yi; Chen, Pei; Yu, Liangli (Lucy)

    2013-01-01

    High performance liquid chromatography (HPLC) and flow injection electrospray ionization with ion trap mass spectrometry (FIMS) fingerprints combined with the principal component analysis (PCA) were examined for their potential in differentiating commercial organic and conventional sage samples. The individual components in the sage samples were also characterized with an ultra-performance liquid chromatography with a quadrupole-time of flight mass spectrometer (UPLC Q-TOF MS). The results suggested that both HPLC and FIMS fingerprints combined with PCA could differentiate organic and conventional sage samples effectively. FIMS may serve as a quick test capable of distinguishing organic and conventional sages in 1 min, and could potentially be developed for high-throughput applications; whereas HPLC fingerprints could provide more chemical composition information with a longer analytical time. PMID:23464755

  9. Discrimination of rectal cancer through human serum using surface-enhanced Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Li, Xiaozhou; Yang, Tianyue; Li, Siqi; Zhang, Su; Jin, Lili

    2015-05-01

    In this paper, surface-enhanced Raman spectroscopy (SERS) was used to detect the changes in blood serum components that accompany rectal cancer. The differences in serum SERS data between rectal cancer patients and healthy controls were examined. Postoperative rectal cancer patients also participated in the comparison to monitor the effects of cancer treatments. The results show that there are significant variations at certain wavenumbers which indicates alteration of corresponding biological substances. Principal component analysis (PCA) and parameters of intensity ratios were used on the original SERS spectra for the extraction of featured variables. These featured variables then underwent linear discriminant analysis (LDA) and classification and regression tree (CART) for the discrimination analysis. Accuracies of 93.5 and 92.4 % were obtained for PCA-LDA and parameter-CART, respectively.

  10. A Principal Components Analysis and Validation of the Coping with the College Environment Scale (CWCES)

    ERIC Educational Resources Information Center

    Ackermann, Margot Elise; Morrow, Jennifer Ann

    2008-01-01

    The present study describes the development and initial validation of the Coping with the College Environment Scale (CWCES). Participants included 433 college students who took an online survey. Principal Components Analysis (PCA) revealed six coping strategies: planning and self-management, seeking support from institutional resources, escaping…

  11. Principal Component Analysis: Resources for an Essential Application of Linear Algebra

    ERIC Educational Resources Information Center

    Pankavich, Stephen; Swanson, Rebecca

    2015-01-01

    Principal Component Analysis (PCA) is a highly useful topic within an introductory Linear Algebra course, especially since it can be used to incorporate a number of applied projects. This method represents an essential application and extension of the Spectral Theorem and is commonly used within a variety of fields, including statistics,…

  12. Learning Principal Component Analysis by Using Data from Air Quality Networks

    ERIC Educational Resources Information Center

    Perez-Arribas, Luis Vicente; Leon-González, María Eugenia; Rosales-Conrado, Noelia

    2017-01-01

    With the final objective of using computational and chemometrics tools in the chemistry studies, this paper shows the methodology and interpretation of the Principal Component Analysis (PCA) using pollution data from different cities. This paper describes how students can obtain data on air quality and process such data for additional information…

  13. Real time on-chip sequential adaptive principal component analysis for data feature extraction and image compression

    NASA Technical Reports Server (NTRS)

    Duong, T. A.

    2004-01-01

    In this paper, we present a new, simple, and optimized hardware-architecture sequential learning technique for adaptive Principal Component Analysis (PCA), which helps to optimize the hardware implementation in VLSI and to overcome the difficulties of traditional gradient descent in learning convergence and hardware implementation.

  14. Principal component analysis on a torus: Theory and application to protein dynamics.

    PubMed

    Sittel, Florian; Filk, Thomas; Stock, Gerhard

    2017-12-28

    A dimensionality reduction method for high-dimensional circular data is developed, which is based on a principal component analysis (PCA) of data points on a torus. Adopting a geometrical view of PCA, various distance measures on a torus are introduced and the associated problem of projecting data onto the principal subspaces is discussed. The main idea is that the (periodicity-induced) projection error can be minimized by transforming the data such that the maximal gap of the sampling is shifted to the periodic boundary. In a second step, the covariance matrix and its eigendecomposition can be computed in a standard manner. Adopting molecular dynamics simulations of two well-established biomolecular systems (Aib9 and villin headpiece), the potential of the method to analyze the dynamics of backbone dihedral angles is demonstrated. The new approach allows for a robust and well-defined construction of metastable states and provides low-dimensional reaction coordinates that accurately describe the free energy landscape. Moreover, it offers a direct interpretation of covariances and principal components in terms of the angular variables. Apart from its application to PCA, the method of maximal gap shifting is general and can be applied to any other dimensionality reduction method for circular data.
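
    The idea of shifting the maximal sampling gap to the periodic boundary can be illustrated for a single angular coordinate with the sketch below. It is a hedged reading of the abstract rather than the authors' code: the function name shift_maximal_gap is hypothetical, and placing the cut at the midpoint of the largest gap is an assumption made here for simplicity. For several dihedral angles, the same shift would be applied to each coordinate separately before computing the covariance matrix.

```python
import math

TWO_PI = 2.0 * math.pi


def shift_maximal_gap(angles):
    """Shift circular data (radians) so that the largest unsampled gap
    lies on the periodic boundary. Returns (shifted_angles, cut)."""
    if len(angles) < 2:
        raise ValueError("need at least two angles")
    wrapped = [a % TWO_PI for a in angles]  # map everything to [0, 2*pi)
    ordered = sorted(wrapped)
    n = len(ordered)
    best_gap, best_i = -1.0, 0
    for i in range(n):
        # Gap to the next sorted point; the last gap wraps around the circle.
        nxt = ordered[(i + 1) % n] + (TWO_PI if i == n - 1 else 0.0)
        gap = nxt - ordered[i]
        if gap > best_gap:
            best_gap, best_i = gap, i
    # Assumption: cut the circle in the middle of the largest gap.
    cut = (ordered[best_i] + 0.5 * best_gap) % TWO_PI
    # Measure each angle from the cut, then recentre to [-pi, pi).
    shifted = [((a - cut) % TWO_PI) - math.pi for a in wrapped]
    return shifted, cut


# Illustrative angles clustered around +/-pi, where a naive linear
# treatment would split one cluster into two distant groups.
raw = [3.0, 3.1, -3.1, -3.0]
shifted, cut = shift_maximal_gap(raw)
raw_spread = max(raw) - min(raw)
shifted_spread = max(shifted) - min(shifted)
```

    In this example the raw values span almost the full period, whereas the shifted values span only the true width of the cluster, so an ordinary covariance matrix computed afterwards is no longer inflated by the periodic boundary.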

  15. Principal component analysis on a torus: Theory and application to protein dynamics

    NASA Astrophysics Data System (ADS)

    Sittel, Florian; Filk, Thomas; Stock, Gerhard

    2017-12-01

    A dimensionality reduction method for high-dimensional circular data is developed, which is based on a principal component analysis (PCA) of data points on a torus. Adopting a geometrical view of PCA, various distance measures on a torus are introduced and the associated problem of projecting data onto the principal subspaces is discussed. The main idea is that the (periodicity-induced) projection error can be minimized by transforming the data such that the maximal gap of the sampling is shifted to the periodic boundary. In a second step, the covariance matrix and its eigendecomposition can be computed in a standard manner. Adopting molecular dynamics simulations of two well-established biomolecular systems (Aib9 and villin headpiece), the potential of the method to analyze the dynamics of backbone dihedral angles is demonstrated. The new approach allows for a robust and well-defined construction of metastable states and provides low-dimensional reaction coordinates that accurately describe the free energy landscape. Moreover, it offers a direct interpretation of covariances and principal components in terms of the angular variables. Apart from its application to PCA, the method of maximal gap shifting is general and can be applied to any other dimensionality reduction method for circular data.

  16. A novel principal component analysis for spatially misaligned multivariate air pollution data.

    PubMed

    Jandarov, Roman A; Sheppard, Lianne A; Sampson, Paul D; Szpiro, Adam A

    2017-01-01

    We propose novel methods for predictive (sparse) PCA with spatially misaligned data. These methods identify principal component loading vectors that explain as much variability in the observed data as possible, while also ensuring the corresponding principal component scores can be predicted accurately by means of spatial statistics at locations where air pollution measurements are not available. This will make it possible to identify important mixtures of air pollutants and to quantify their health effects in cohort studies, where currently available methods cannot be used. We demonstrate the utility of predictive (sparse) PCA in simulated data and apply the approach to annual averages of particulate matter speciation data from national Environmental Protection Agency (EPA) regulatory monitors.

  17. Chemical information obtained from Auger depth profiles by means of advanced factor analysis (MLCFA)

    NASA Astrophysics Data System (ADS)

    De Volder, P.; Hoogewijs, R.; De Gryse, R.; Fiermans, L.; Vennik, J.

    1993-01-01

    The advanced multivariate statistical technique "maximum likelihood common factor analysis (MLCFA)" is shown to be superior to "principal component analysis (PCA)" for decomposing overlapping peaks into their individual component spectra of which neither the number of components nor the peak shape of the component spectra is known. An examination of the maximum resolving power of both techniques, MLCFA and PCA, by means of artificially created series of multicomponent spectra confirms this finding unambiguously. Substantial progress in the use of AES as a chemical-analysis technique is accomplished through the implementation of MLCFA. Chemical information from Auger depth profiles is extracted by investigating the variation of the line shape of the Auger signal as a function of the changing chemical state of the element. In particular, MLCFA combined with Auger depth profiling has been applied to problems related to steelcord-rubber tyre adhesion. MLCFA allows one to elucidate the precise nature of the interfacial layer of reaction products between natural rubber vulcanized on a thin brass layer. This study reveals many interesting chemical aspects of the oxi-sulfidation of brass undetectable with classical AES.

  18. Prostate Cancer Associated Lipid Signatures in Serum Studied by ESI-Tandem Mass Spectrometry as Potential New Biomarkers.

    PubMed

    Duscharla, Divya; Bhumireddy, Sudarshana Reddy; Lakshetti, Sridhar; Pospisil, Heike; Murthy, P V L N; Walther, Reinhard; Sripadi, Prabhakar; Ummanni, Ramesh

    2016-01-01

    Prostate cancer (PCa) is one of the most common cancers in western men. The incidence rate of PCa is on the rise worldwide. The present study deals with the serum lipidome profiling of patients diagnosed with PCa to identify potential new biomarkers. We employed ESI-MS/MS and GC-MS for identification of significantly altered lipids in cancer patients' serum compared to controls. Lipidomic data revealed that 24 lipids are significantly altered in cancer patients' serum (n = 18) compared to normal serum (n = 18) with no history of PCa. By using hierarchical clustering and principal component analysis (PCA) we could clearly separate cancer patients from the control group. Correlation and partition analysis along with Formal Concept Analysis (FCA) have identified that PC (39:6) and FA (22:3) could classify samples with higher certainty. Both lipids, PC (39:6) and FA (22:3), could influence the cataloging of patients with 100% sensitivity (all 18 control samples are classified correctly) and 77.7% specificity (of 18 tumor samples 4 samples are misclassified) with a p-value of 1.612×10⁻⁶ in Fisher's exact test. Further, we performed GC-MS to identify fatty acids altered in PCa patients and found that alpha-linolenic acid (ALA) levels are altered in PCa. We also performed an in vitro proliferation assay to determine the effect of ALA on the survival of the classical human PCa cell lines LNCaP and PC3. We hereby report that the altered lipids PC (39:6) and FA (22:3) offer a new set of biomarkers in addition to the existing diagnostic tests that could significantly improve sensitivity and specificity in PCa diagnosis.

  19. Prostate Cancer Associated Lipid Signatures in Serum Studied by ESI-Tandem Mass Spectrometry as Potential New Biomarkers

    PubMed Central

    Duscharla, Divya; Bhumireddy, Sudarshana Reddy; Lakshetti, Sridhar; Pospisil, Heike; Murthy, P. V. L. N.; Walther, Reinhard; Sripadi, Prabhakar; Ummanni, Ramesh

    2016-01-01

    Prostate cancer (PCa) is one of the most common cancers in western men. The incidence rate of PCa is on the rise worldwide. The present study deals with the serum lipidome profiling of patients diagnosed with PCa to identify potential new biomarkers. We employed ESI-MS/MS and GC-MS for identification of significantly altered lipids in cancer patients’ serum compared to controls. Lipidomic data revealed that 24 lipids are significantly altered in cancer patients’ serum (n = 18) compared to normal serum (n = 18) with no history of PCa. By using hierarchical clustering and principal component analysis (PCA) we could clearly separate cancer patients from the control group. Correlation and partition analysis along with Formal Concept Analysis (FCA) have identified that PC (39:6) and FA (22:3) could classify samples with higher certainty. Both lipids, PC (39:6) and FA (22:3), could influence the cataloging of patients with 100% sensitivity (all 18 control samples are classified correctly) and 77.7% specificity (of 18 tumor samples 4 samples are misclassified) with a p-value of 1.612×10⁻⁶ in Fisher’s exact test. Further, we performed GC-MS to identify fatty acids altered in PCa patients and found that alpha-linolenic acid (ALA) levels are altered in PCa. We also performed an in vitro proliferation assay to determine the effect of ALA on the survival of the classical human PCa cell lines LNCaP and PC3. We hereby report that the altered lipids PC (39:6) and FA (22:3) offer a new set of biomarkers in addition to the existing diagnostic tests that could significantly improve sensitivity and specificity in PCa diagnosis. PMID:26958841

  20. PCA as a practical indicator of OPLS-DA model reliability.

    PubMed

    Worley, Bradley; Powers, Robert

    Principal Component Analysis (PCA) and Orthogonal Projections to Latent Structures Discriminant Analysis (OPLS-DA) are powerful statistical modeling tools that provide insights into separations between experimental groups based on high-dimensional spectral measurements from NMR, MS or other analytical instrumentation. However, when used without validation, these tools may lead investigators to statistically unreliable conclusions. This danger is especially real for Partial Least Squares (PLS) and OPLS, which aggressively force separations between experimental groups. As a result, OPLS-DA is often used as an alternative method when PCA fails to expose group separation, but this practice is highly dangerous. Without rigorous validation, OPLS-DA can easily yield statistically unreliable group separation. A Monte Carlo analysis of PCA group separations and OPLS-DA cross-validation metrics was performed on NMR datasets with statistically significant separations in scores-space. A linearly increasing amount of Gaussian noise was added to each data matrix followed by the construction and validation of PCA and OPLS-DA models. With increasing added noise, the PCA scores-space distance between groups rapidly decreased and the OPLS-DA cross-validation statistics simultaneously deteriorated. A decrease in correlation between the estimated loadings (added noise) and the true (original) loadings was also observed. While the validity of the OPLS-DA model diminished with increasing added noise, the group separation in scores-space remained basically unaffected. Supported by the results of Monte Carlo analyses of PCA group separations and OPLS-DA cross-validation metrics, we provide practical guidelines and cross-validatory recommendations for reliable inference from PCA and OPLS-DA models.
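
    The noise-addition part of such a Monte Carlo analysis can be sketched on synthetic data as below. This is an illustrative sketch under stated assumptions, not the procedure of the study: the two-group data set, the helper names (first_pc, separation, noisy_separation), the power-iteration PCA and the separation measure (difference of group means on PC1 divided by the pooled standard deviation) are all choices made here, and the OPLS-DA cross-validation side is not covered. A real analysis would also repeat each noise level many times rather than drawing once.

```python
import random
import statistics


def first_pc(rows, iters=200):
    """Leading principal axis and scores of mean-centred rows (power iteration)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0 / d ** 0.5] * d  # arbitrary unit start vector
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centred]
    return v, scores


def separation(scores, labels):
    """Distance between group means on PC1 in units of pooled standard deviation."""
    a = [s for s, g in zip(scores, labels) if g == 0]
    b = [s for s, g in zip(scores, labels) if g == 1]
    pooled = ((statistics.variance(a) + statistics.variance(b)) / 2) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) / pooled


def noisy_separation(data, labels, sigma, rng):
    """Add Gaussian noise of standard deviation sigma, refit PCA, measure separation."""
    noisy = [[x + rng.gauss(0.0, sigma) for x in row] for row in data]
    _, scores = first_pc(noisy)
    return separation(scores, labels)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
labels = [0] * 30 + [1] * 30
# Hypothetical data: two groups that differ only in the first of five variables.
data = [[rng.gauss(4.0 * g if j == 0 else 0.0, 0.5) for j in range(5)]
        for g in labels]
# Linearly increasing noise levels, one draw per level.
curve = [(sigma, noisy_separation(data, labels, sigma, rng))
         for sigma in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0)]
```

    Plotting the second element of each pair in curve against the noise level gives a simple picture of how quickly the PCA scores-space separation degrades as noise is added.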

  1. A composite measure to explore visual disability in primary progressive multiple sclerosis.

    PubMed

    Poretto, Valentina; Petracca, Maria; Saiote, Catarina; Mormina, Enricomaria; Howard, Jonathan; Miller, Aaron; Lublin, Fred D; Inglese, Matilde

    2017-01-01

    Optical coherence tomography (OCT) and magnetic resonance imaging (MRI) can provide complementary information on visual system damage in multiple sclerosis (MS). The objective of this paper is to determine whether a composite OCT/MRI score, reflecting cumulative damage along the entire visual pathway, can predict visual deficits in primary progressive multiple sclerosis (PPMS). Twenty-five PPMS patients and 20 age-matched controls underwent neuro-ophthalmologic evaluation, spectral-domain OCT, and 3T brain MRI. Differences between groups were assessed by a univariate general linear model, and principal component analysis (PCA) grouped instrumental variables into main components. Linear regression analysis was used to assess the relationship between low-contrast visual acuity (LCVA), OCT/MRI-derived metrics and PCA-derived composite scores. PCA identified four main components explaining 80.69% of data variance. Considering each variable independently, LCVA 1.25% was significantly predicted by ganglion cell-inner plexiform layer (GCIPL) thickness, thalamic volume and optic radiation (OR) lesion volume (adjusted R² 0.328, p = 0.00004; adjusted R² 0.187, p = 0.002 and adjusted R² 0.180, p = 0.002). The PCA composite score of global visual pathway damage independently predicted both LCVA 1.25% (adjusted R² value 0.361, p = 0.00001) and LCVA 2.50% (adjusted R² value 0.323, p = 0.00003). A multiparametric score represents a more comprehensive and effective tool to explain visual disability than a single instrumental metric in PPMS.

  2. Activation of Beta-Catenin Signaling in Androgen Receptor–Negative Prostate Cancer Cells

    PubMed Central

    Wan, Xinhai; Liu, Jie; Lu, Jing-Fang; Tzelepi, Vassiliki; Yang, Jun; Starbuck, Michael W.; Diao, Lixia; Wang, Jing; Efstathiou, Eleni; Vazquez, Elba S.; Troncoso, Patricia; Maity, Sankar N.; Navone, Nora M.

    2012-01-01

    Purpose: To study Wnt/beta-catenin in castrate-resistant prostate cancer (CRPC) and understand its function independently of the beta-catenin–androgen receptor (AR) interaction. Experimental Design: We performed beta-catenin immunocytochemical analysis, evaluated TOP-flash reporter activity (a reporter of beta-catenin–mediated transcription), and sequenced the beta-catenin gene in MDA PCa 118a, MDA PCa 118b, MDA PCa 2b, and PC-3 prostate cancer (PCa) cells. We knocked down beta-catenin in AR-negative MDA PCa 118b cells and performed comparative gene-array analysis. We also immunohistochemically analyzed beta-catenin and AR in 27 bone metastases of human CRPCs. Results: Beta-catenin nuclear accumulation and TOP-flash reporter activity were high in MDA PCa 118b but not in MDA PCa 2b or PC-3 cells. MDA PCa 118a and 118b cells carry a mutated beta-catenin at codon 32 (D32G). Ten genes were expressed differently (false discovery rate, 0.05) in MDA PCa 118b cells with downregulated beta-catenin. One such gene, hyaluronan synthase 2 (HAS2), synthesizes hyaluronan, a core component of the extracellular matrix. We confirmed HAS2 upregulation in PC-3 cells transfected with D32G-mutant beta-catenin. Finally, we found nuclear localization of beta-catenin in 10 of 27 human tissue specimens; this localization was inversely associated with AR expression (P = 0.056, Fisher’s exact test), suggesting that reduced AR expression enables Wnt/beta-catenin signaling. Conclusion: We identified a previously unknown downstream target of beta-catenin, HAS2, in PCa, and found that high beta-catenin nuclear localization and low or no AR expression may define a subpopulation of men with bone-metastatic PCa. These findings may guide physicians in managing these patients. PMID:22298898

  3. Direct analysis in real time mass spectrometry and multivariate data analysis: a novel approach to rapid identification of analytical markers for quality control of traditional Chinese medicine preparation.

    PubMed

    Zeng, Shanshan; Wang, Lu; Chen, Teng; Wang, Yuefei; Mo, Huanbiao; Qu, Haibin

    2012-07-06

    The paper presents a novel strategy to identify analytical markers of traditional Chinese medicine preparation (TCMP) rapidly via direct analysis in real time mass spectrometry (DART-MS). A commonly used TCMP, Danshen injection, was employed as a model. The optimal analysis conditions were achieved by measuring the contribution of various experimental parameters to the mass spectra. Salvianolic acids and saccharides were simultaneously determined within a single 1-min DART-MS run. Furthermore, spectra of Danshen injections supplied by five manufacturers were processed with principal component analysis (PCA). Obvious clustering was observed in the PCA score plot, and candidate markers were recognized from the contribution plots of PCA. The suitability of potential markers was then confirmed by contrasting with the results of traditional analysis methods. Using this strategy, fructose, glucose, sucrose, protocatechuic aldehyde and salvianolic acid A were rapidly identified as the markers of Danshen injections. The combination of DART-MS with PCA provides a reliable approach to the identification of analytical markers for quality control of TCMP. Copyright © 2012 Elsevier B.V. All rights reserved.

  4. Energy resolution improvement of CdTe detectors by using the principal component analysis technique

    NASA Astrophysics Data System (ADS)

    Alharbi, T.

    2018-02-01

    In this paper, we report on the application of the Principal Component Analysis (PCA) technique for the improvement of the γ-ray energy resolution of CdTe detectors. The PCA technique is used to estimate the amount of charge-trapping effect which is reflected in the shape of each detector pulse, thereby correcting for the charge-trapping effect. The details of the method are described and the results obtained with a CdTe detector are shown. We have achieved an energy resolution of 1.8 % (FWHM) at 662 keV with full detection efficiency from a 1 mm thick CdTe detector which gives an energy resolution of 4.5 % (FWHM) by using the standard pulse processing method.

  5. The utilization of Depth Invariant Index and Principle Component Analysis for mapping seagrass ecosystem of Kotok Island and Karang Bongkok, Indonesia

    NASA Astrophysics Data System (ADS)

    Manuputty, Agnestesya; Lumban Gaol, Jonson; Bahri Agus, Syamsul; Wayan Nurjaya, I.

    2017-01-01

    Seagrasses perform a variety of functions within ecosystems and have both economic and ecological value; they therefore need to be managed sustainably. One of the steps in preserving seagrass ecosystems is monitoring that makes accurate use of spatial data. The purpose of the study was to assess and compare the accuracy of the DII and PCA transformations for mapping seagrass ecosystems. The field study was carried out in Karang Bongkok and Kotok Island waters in August 2014 and March 2015. A WorldView-2 image acquired on 5 October 2013 was used in the study. The image-processing transformations were Depth Invariant Index (DII) and Principal Component Analysis (PCA), with Support Vector Machine (SVM) classification. The results show that benthic habitat mapping of Karang Bongkok using the DII and PCA transformations achieved overall accuracies of 72% and 81%, respectively, whereas for Kotok Island the overall accuracies were 83% and 84%, respectively. Seven benthic habitat types were found in Karang Bongkok and Kotok Island waters: seagrass, sand, rubble, coral, lagoon, sand mixed with seagrass, and sand mixed with rubble. The PCA transformation was effective in improving the accuracy of seagrass mapping in Kotok Island and Karang Bongkok.

  6. Forecasting of UV-Vis absorbance time series using artificial neural networks combined with principal component analysis.

    PubMed

    Plazas-Nossa, Leonardo; Hofer, Thomas; Gruber, Günter; Torres, Andres

    2017-02-01

    This work proposes a methodology for forecasting online water quality data provided by UV-Vis spectrometry. The methodology combines principal component analysis (PCA), to reduce the dimensionality of a data set, with artificial neural networks (ANNs) for forecasting. The results were compared with those obtained by using the discrete Fourier transform (DFT). The proposed methodology was applied to four absorbance time series data sets comprising a total of 5705 UV-Vis spectra. Absolute percentage errors obtained with the proposed PCA/ANN methodology vary between 10% and 13% for all four study sites. In general terms, the results were hardly generalizable, as they appeared to be highly dependent on the specific dynamics of the water system; however, some trends can be outlined. The PCA/ANN methodology gives better results than the PCA/DFT forecasting procedure when using a specific spectral range under the following conditions: (i) for Salitre wastewater treatment plant (WWTP) (first hour) and Graz West R05 (first 18 min), from the last part of the UV range to the whole visible range; (ii) for Gibraltar pumping station (first 6 min), for all UV-Vis absorbance spectra; and (iii) for San Fernando WWTP (first 24 min), from the whole UV range to the middle part of the visible range.

  7. A novel method for qualitative analysis of edible oil oxidation using an electronic nose.

    PubMed

    Xu, Lirong; Yu, Xiuzhu; Liu, Lei; Zhang, Rui

    2016-07-01

    An electronic nose (E-nose) was used for rapid assessment of the degree of oxidation in edible oils. Peroxide and acid values of edible oil samples were analyzed using data obtained by the American Oil Chemists' Society (AOCS) Official Method for reference. Qualitative discrimination between non-oxidized and oxidized oils was conducted using the E-nose technique developed in combination with cluster analysis (CA), principal component analysis (PCA), and linear discriminant analysis (LDA). The results from CA, PCA and LDA indicated that the E-nose technique could be used for differentiation of non-oxidized and oxidized oils. LDA produced slightly better results than CA and PCA. The proposed approach can be used as an alternative to AOCS Official Method as an innovative tool for rapid detection of edible oil oxidation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. Burst and Principal Components Analyses of MEA Data Separates Chemicals by Class

    EPA Science Inventory

    Microelectrode arrays (MEAs) detect drug and chemical induced changes in action potential "spikes" in neuronal networks and can be used to screen chemicals for neurotoxicity. Analytical "fingerprinting," using Principal Components Analysis (PCA) on spike trains recorded from prim...

  9. Using both principal component analysis and reduced rank regression to study dietary patterns and diabetes in Chinese adults.

    PubMed

    Batis, Carolina; Mendez, Michelle A; Gordon-Larsen, Penny; Sotres-Alvarez, Daniela; Adair, Linda; Popkin, Barry

    2016-02-01

    We examined the association between dietary patterns and diabetes using the strengths of two methods: principal component analysis (PCA) to identify the eating patterns of the population and reduced rank regression (RRR) to derive a pattern that explains the variation in glycated Hb (HbA1c), homeostasis model assessment of insulin resistance (HOMA-IR) and fasting glucose. We measured diet over a 3 d period with 24 h recalls and a household food inventory in 2006 and used it to derive PCA and RRR dietary patterns. The outcomes were measured in 2009. Adults (n 4316) from the China Health and Nutrition Survey. The adjusted odds ratio for diabetes prevalence (HbA1c≥6·5 %), comparing the highest dietary pattern score quartile with the lowest, was 1·26 (95 % CI 0·76, 2·08) for a modern high-wheat pattern (PCA; wheat products, fruits, eggs, milk, instant noodles and frozen dumplings), 0·76 (95 % CI 0·49, 1·17) for a traditional southern pattern (PCA; rice, meat, poultry and fish) and 2·37 (95 % CI 1·56, 3·60) for the pattern derived with RRR. By comparing the dietary pattern structures of RRR and PCA, we found that the RRR pattern was also behaviourally meaningful. It combined the deleterious effects of the modern high-wheat pattern (high intakes of wheat buns and breads, deep-fried wheat and soya milk) with the deleterious effects of consuming the opposite of the traditional southern pattern (low intakes of rice, poultry and game, fish and seafood). Our findings suggest that using both PCA and RRR provided useful insights when studying the association of dietary patterns with diabetes.

  10. Using both Principal Component Analysis and Reduced Rank Regression to Study Dietary Patterns and Diabetes in Chinese Adults

    PubMed Central

    Batis, Carolina; Mendez, Michelle A.; Gordon-Larsen, Penny; Sotres-Alvarez, Daniela; Adair, Linda; Popkin, Barry

    2014-01-01

    Objective We examined the association between dietary patterns and diabetes using the strengths of two methods: principal component analysis (PCA) to identify the eating patterns of the population and reduced rank regression (RRR) to derive a pattern that explains the variation in hemoglobin A1c (HbA1c), homeostasis model of insulin resistance (HOMA-IR), and fasting glucose. Design We measured diet over a 3-day period with 24-hour recalls and a household food inventory in 2006 and used it to derive PCA and RRR dietary patterns. The outcomes were measured in 2009. Setting Adults (n = 4,316) from the China Health and Nutrition Survey. Results The adjusted odds ratio for diabetes prevalence (HbA1c ≥ 6.5%), comparing the highest dietary pattern score quartile to the lowest, was 1.26 (0.76, 2.08) for a modern high-wheat pattern (PCA; wheat products, fruits, eggs, milk, instant noodles and frozen dumplings), 0.76 (0.49, 1.17) for a traditional southern pattern (PCA; rice, meat, poultry, and fish), and 2.37 (1.56, 3.60) for the pattern derived with RRR. By comparing the dietary pattern structures of RRR and PCA, we found that the RRR pattern was also behaviorally meaningful. It combined the deleterious effects of the modern high-wheat (high intake of wheat buns and breads, deep-fried wheat, and soy milk) with the deleterious effects of consuming the opposite of the traditional southern (low intake of rice, poultry and game, fish and seafood). Conclusions Our findings suggest that using both PCA and RRR provided useful insights when studying the association of dietary patterns with diabetes. PMID:26784586

  11. Characterization of Leaf Extracts of Schinus terebinthifolius Raddi by GC-MS and Chemometric Analysis

    PubMed Central

    Carneiro, Fabíola B.; Lopes, Pablo Q.; Ramalho, Ricardo C.; Scotti, Marcus T.; Santos, Sócrates G.; Soares, Luiz A. L.

    2017-01-01

    Background: Schinus terebinthifolius Raddi belongs to Anacardiacea family and is widely known as “aroeira.” This species originates from South America, and its extracts are used in folk medicine due to its therapeutic properties, which include antimicrobial, anti-inflammatory, and antipyretic effects. The complexity and variability of the chemical constitution of the herbal raw material establishes the quality of the respective herbal medicine products. Objective: Thus, the purpose of this study was to investigate the variability of the volatile compounds from leaves of S. terebinthifolius. Materials and Methods: The samples were collected from different states of the Northeast region of Brazil and analyzed with a gas chromatograph coupled to a mass spectrometer (GC-MS). The collected data were analyzed using multivariate data analysis. Results: The samples’ chromatograms, obtained by GC-MS, showed similar chemical profiles in a number of peaks, but some differences were observed in the intensity of these analytical markers. The chromatographic fingerprints obtained by GC-MS were suitable for discrimination of the samples; these results along with a statistical treatment (principal component analysis [PCA]) were used as a tool for comparative analysis between the different samples of S. terebinthifolius. Conclusion: The experimental data show that the PCA used in this study clustered the samples into groups with similar chemical profiles, which builds an appropriate approach to evaluate the similarity in the phytochemical pattern found in the different leaf samples. 
    SUMMARY The leaf extracts of Schinus terebinthifolius were obtained by turbo-extraction. The extracts were partitioned with hexane and analyzed by GC-MS. The chromatographic data were analyzed using principal component analysis (PCA). The PCA plots showed the main compounds (phellandrene, limonene, and carene), which were used to group the samples from different geographical locations according to their chemical similarity. Abbreviations used: AL: Alagoas, BA: Bahia, CE: Ceará, CPETEC: Center for Weather Forecasting and Climate Studies, GC-MS: Gas chromatograph coupled to a mass spectrometer, MA: Maranhão, MVA: Multivariate data analysis, PB: Paraíba, PC1: Direction that describes the maximum variance of the original data, PC2: Maximum direction variance of the data in the subspace orthogonal to PC1, PCA: Principal component analysis, PE: Pernambuco, PI: Piauí, RN: Rio Grande do Norte, SE: Sergipe. PMID:29142431

  12. A two-stage linear discriminant analysis via QR-decomposition.

    PubMed

    Ye, Jieping; Li, Qi

    2005-06-01

    Linear Discriminant Analysis (LDA) is a well-known method for feature extraction and dimension reduction. It has been used widely in many applications involving high-dimensional data, such as image and text classification. An intrinsic limitation of classical LDA is the so-called singularity problems; that is, it fails when all scatter matrices are singular. Many LDA extensions were proposed in the past to overcome the singularity problems. Among these extensions, PCA+LDA, a two-stage method, received relatively more attention. In PCA+LDA, the LDA stage is preceded by an intermediate dimension reduction stage using Principal Component Analysis (PCA). Most previous LDA extensions are computationally expensive, and not scalable, due to the use of Singular Value Decomposition or Generalized Singular Value Decomposition. In this paper, we propose a two-stage LDA method, namely LDA/QR, which aims to overcome the singularity problems of classical LDA, while achieving efficiency and scalability simultaneously. The key difference between LDA/QR and PCA+LDA lies in the first stage, where LDA/QR applies QR decomposition to a small matrix involving the class centroids, while PCA+LDA applies PCA to the total scatter matrix involving all training data points. We further justify the proposed algorithm by showing the relationship among LDA/QR and previous LDA methods. Extensive experiments on face images and text documents are presented to show the effectiveness of the proposed algorithm.
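
    The two-stage structure described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `lda_qr`, the `ridge` parameter, and the synthetic data are all assumptions, and the second stage uses the standard Fisher criterion with a small ridge term, which may differ from the paper's exact formulation. It also assumes the number of features is at least the number of classes.

```python
import numpy as np

def lda_qr(X, y, ridge=1e-6):
    """Two-stage LDA sketch: QR on class centroids, then LDA in the reduced space.

    X: (n_samples, n_features) data matrix; y: integer class labels.
    Returns a (n_features, k - 1) projection matrix for k classes.
    """
    classes = np.unique(y)
    k = len(classes)
    # Stage 1: QR decomposition of the small (n_features x k) centroid matrix.
    C = np.stack([X[y == c].mean(axis=0) for c in classes], axis=1)
    Q, _ = np.linalg.qr(C)            # reduced QR: Q is (n_features, k)
    Z = X @ Q                         # data expressed in the k-dimensional subspace
    # Stage 2: classical LDA on the k-dimensional data.
    mu = Z.mean(axis=0)
    Sw = np.zeros((k, k))             # within-class scatter
    Sb = np.zeros((k, k))             # between-class scatter
    for c in classes:
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        Sb += len(Zc) * np.outer(mc - mu, mc - mu)
    # Whiten with the Cholesky factor of Sw (ridge is an assumed regularizer),
    # then solve a symmetric eigenproblem for the discriminant directions.
    L = np.linalg.cholesky(Sw + ridge * np.eye(k))
    Linv = np.linalg.inv(L)
    _, V = np.linalg.eigh(Linv @ Sb @ Linv.T)
    W = Linv.T @ V[:, ::-1][:, :k - 1]  # top k - 1 directions
    return Q @ W

# Hypothetical high-dimensional example: 3 classes, 50 features, 10 samples each.
rng = np.random.default_rng(0)
means = 5.0 * rng.normal(size=(3, 50))
y = np.repeat(np.arange(3), 10)
X = means[y] + rng.normal(size=(30, 50))
G = lda_qr(X, y)
P = X @ G                              # samples projected onto 2 discriminant axes
```

    The point of the first stage is that the only decomposition applied to high-dimensional data is a QR of a matrix with one column per class, so the scatter matrices handled in the second stage are only k by k.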

  13. Increasing the Reliability of Ability-Achievement Difference Scores: An Example Using the Kaufman Assessment Battery for Children.

    ERIC Educational Resources Information Center

    Caruso, John C.; Witkiewitz, Katie

    2002-01-01

    As an alternative to equally weighted difference scores, examined an orthogonal reliable component analysis (RCA) solution and an oblique principal components analysis (PCA) solution for the standardization sample of the Kaufman Assessment Battery for Children (KABC; A. Kaufman and N. Kaufman, 1983). Discusses the practical implications of the…

  14. Variable Neighborhood Search Heuristics for Selecting a Subset of Variables in Principal Component Analysis

    ERIC Educational Resources Information Center

    Brusco, Michael J.; Singh, Renu; Steinley, Douglas

    2009-01-01

    The selection of a subset of variables from a pool of candidates is an important problem in several areas of multivariate statistics. Within the context of principal component analysis (PCA), a number of authors have argued that subset selection is crucial for identifying those variables that are required for correct interpretation of the…

  15. Quantitative descriptive analysis and principal component analysis for sensory characterization of Indian milk product cham-cham.

    PubMed

    Puri, Ritika; Khamrui, Kaushik; Khetra, Yogesh; Malhotra, Ravinder; Devraja, H C

    2016-02-01

    Promising development and expansion in the market of cham-cham, a traditional Indian dairy product is expected in the coming future with the organized production of this milk product by some large dairies. The objective of this study was to document the extent of variation in sensory properties of market samples of cham-cham collected from four different locations known for their excellence in cham-cham production and to find out the attributes that govern much of variation in sensory scores of this product using quantitative descriptive analysis (QDA) and principal component analysis (PCA). QDA revealed significant (p < 0.05) difference in sensory attributes of cham-cham among the market samples. PCA identified four significant principal components that accounted for 72.4 % of the variation in the sensory data. Factor scores of each of the four principal components which primarily correspond to sweetness/shape/dryness of interior, surface appearance/surface dryness, rancid and firmness attributes specify the location of each market sample along each of the axes in 3-D graphs. These findings demonstrate the utility of quantitative descriptive analysis for identifying and measuring attributes of cham-cham that contribute most to its sensory acceptability.

  16. Quantitative analysis of NMR spectra with chemometrics

    NASA Astrophysics Data System (ADS)

    Winning, H.; Larsen, F. H.; Bro, R.; Engelsen, S. B.

    2008-01-01

    The number of applications of chemometrics to series of NMR spectra is rapidly increasing due to an emerging interest for quantitative NMR spectroscopy e.g. in the pharmaceutical and food industries. This paper gives an analysis of advantages and limitations of applying the two most common chemometric procedures, Principal Component Analysis (PCA) and Multivariate Curve Resolution (MCR), to a designed set of 231 simple alcohol mixture (propanol, butanol and pentanol) 1H 400 MHz spectra. The study clearly demonstrates that the major advantage of chemometrics is the visualisation of larger data structures which adds a new exploratory dimension to NMR research. While robustness and powerful data visualisation and exploration are the main qualities of the PCA method, the study demonstrates that the bilinear MCR method is an even more powerful method for resolving pure component NMR spectra from mixtures when certain conditions are met.

  17. Facilitating Neuronal Connectivity Analysis of Evoked Responses by Exposing Local Activity with Principal Component Analysis Preprocessing: Simulation of Evoked MEG

    PubMed Central

    Gao, Lin; Zhang, Tongsheng; Wang, Jue; Stephen, Julia

    2014-01-01

    When connectivity analysis is carried out for event-related EEG and MEG, strong spatial correlations from spontaneous background activity may mask the local neuronal evoked activity and lead to spurious connections. In this paper, we hypothesized that PCA decomposition could be used to diminish the background activity and thereby improve the performance of connectivity analysis in event-related experiments. The idea was tested using simulation, where we found that, for the 306-channel Elekta Neuromag system, the first 4 PCs represent the dominant background activity, and the source connectivity pattern after preprocessing is consistent with the true connectivity pattern designed in the simulation. Discarding the first few PCs improves the signal-to-noise ratio of the evoked responses and increases coherences at major physiological frequency bands. Furthermore, the evoked information was maintained after PCA preprocessing. In conclusion, it is demonstrated that the first few PCs represent background activity, and PCA decomposition can be employed to remove it to expose the evoked activity for the channels under investigation. Therefore, PCA can be applied as a preprocessing approach to improve neuronal connectivity analysis for event-related data. PMID:22918837
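
    The preprocessing idea in this abstract, discarding the leading principal components before further analysis, can be illustrated with a short sketch. Everything named here (`remove_leading_pcs`, the 20-channel synthetic data, removing a single PC) is a hypothetical stand-in; the abstract reports the first 4 PCs for a 306-channel system, and the choice of how many PCs to drop would depend on the data.

```python
import numpy as np

def remove_leading_pcs(data, n_remove):
    """Zero out the first n_remove principal components of a channels x samples array."""
    mean = data.mean(axis=1, keepdims=True)
    centered = data - mean
    # Columns of u are spatial components, ordered by decreasing variance.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    s_kept = s.copy()
    s_kept[:n_remove] = 0.0           # drop the dominant (assumed background) components
    return (u * s_kept) @ vt + mean

# Synthetic example: one strong spatially correlated background source
# plus a weak local signal on a single channel and sensor noise.
rng = np.random.default_rng(0)
n_channels, n_samples = 20, 500
background = np.outer(rng.normal(size=n_channels), 10.0 * rng.normal(size=n_samples))
local = np.zeros((n_channels, n_samples))
local[3] = np.sin(np.linspace(0.0, 20.0, n_samples))
data = background + local + 0.1 * rng.normal(size=(n_channels, n_samples))

cleaned = remove_leading_pcs(data, n_remove=1)
```

    After removal, the cleaned data has no variance left along the discarded spatial component, which is what reduces the spatially widespread correlations in this toy setting.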

  18. Facilitating neuronal connectivity analysis of evoked responses by exposing local activity with principal component analysis preprocessing: simulation of evoked MEG.

    PubMed

    Gao, Lin; Zhang, Tongsheng; Wang, Jue; Stephen, Julia

    2013-04-01

    When connectivity analysis is carried out for event-related EEG and MEG, strong spatial correlations from spontaneous background activity may mask the local neuronal evoked activity and lead to spurious connections. In this paper, we hypothesized that PCA decomposition could be used to diminish the background activity and thereby improve the performance of connectivity analysis in event-related experiments. The idea was tested using simulation, where we found that, for the 306-channel Elekta Neuromag system, the first 4 PCs represent the dominant background activity, and the source connectivity pattern after preprocessing is consistent with the true connectivity pattern designed in the simulation. Discarding the first few PCs improves the signal-to-noise ratio of the evoked responses and increases coherences at major physiological frequency bands. Furthermore, the evoked information was maintained after PCA preprocessing. In conclusion, it is demonstrated that the first few PCs represent background activity, and PCA decomposition can be employed to remove it to expose the evoked activity for the channels under investigation. Therefore, PCA can be applied as a preprocessing approach to improve neuronal connectivity analysis for event-related data.

  19. Portable XRF and principal component analysis for bill characterization in forensic science.

    PubMed

    Appoloni, C R; Melquiades, F L

    2014-02-01

    Several modern techniques have been applied to prevent counterfeiting of money bills. The objective of this study was to demonstrate the potential of the Portable X-ray Fluorescence (PXRF) technique and the multivariate method of Principal Component Analysis (PCA) for classifying bills for use in forensic science. Bills of Dollar, Euro and Real (Brazilian currency) were measured directly at different colored regions, without any prior preparation. Spectral interpretation allowed the identification of Ca, Ti, Fe, Cu, Sr, Y, Zr and Pb. PCA separated the bills into three groups, with subgroups among the Brazilian currency. In conclusion, the samples were classified according to their origin, identifying the elements responsible for differentiation and the basic pigment composition. PXRF allied with multivariate discriminant methods is a promising technique for rapid and non-destructive identification of false bills in forensic science. Copyright © 2013 Elsevier Ltd. All rights reserved.

  20. Towards the identification of plant and animal binders on Australian stone knives.

    PubMed

    Blee, Alisa J; Walshe, Keryn; Pring, Allan; Quinton, Jamie S; Lenehan, Claire E

    2010-07-15

    There is limited information regarding the nature of plant and animal residues used as adhesives, fixatives and pigments found on Australian Aboriginal artefacts. This paper reports the use of FTIR in combination with the chemometric tools principal component analysis (PCA) and hierarchical clustering (HC) for the analysis and identification of Australian plant and animal fixatives on Australian stone artefacts. Ten different plant and animal residues were able to be discriminated from each other at a species level by combining FTIR spectroscopy with the chemometric data analysis methods, principal component analysis (PCA) and hierarchical clustering (HC). Application of this method to residues from three broken stone knives from the collections of the South Australian Museum indicated that two of the handles of knives were likely to have contained beeswax as the fixative whilst Spinifex resin was the probable binder on the third. Copyright 2010 Elsevier B.V. All rights reserved.

  1. Principle component analysis and linear discriminant analysis of multi-spectral autofluorescence imaging data for differentiating basal cell carcinoma and healthy skin

    NASA Astrophysics Data System (ADS)

    Chernomyrdin, Nikita V.; Zaytsev, Kirill I.; Lesnichaya, Anastasiya D.; Kudrin, Konstantin G.; Cherkasova, Olga P.; Kurlov, Vladimir N.; Shikunova, Irina A.; Perchik, Alexei V.; Yurchenko, Stanislav O.; Reshetov, Igor V.

    2016-09-01

    In the present paper, the ability to differentiate basal cell carcinoma (BCC) from healthy skin by combining multi-spectral autofluorescence imaging, principal component analysis (PCA), and linear discriminant analysis (LDA) is demonstrated. For this purpose, an experimental setup comprising excitation and detection branches was assembled. The excitation branch utilizes a mercury arc lamp equipped with a 365-nm narrow-linewidth excitation filter, a beam homogenizer, and a mechanical chopper. The detection branch employs a digital camera and a set of bandpass filters with central wavelengths of spectral transparency of λ = 400, 450, 500, and 550 nm. The setup was used to study three samples of freshly excised BCC. PCA and LDA were applied to analyze the multi-spectral fluorescence imaging data. The results of this pilot study highlight the advantages of the proposed imaging technique for skin cancer diagnosis.

  2. Standardized processing of MALDI imaging raw data for enhancement of weak analyte signals in mouse models of gastric cancer and Alzheimer's disease.

    PubMed

    Schwartz, Matthias; Meyer, Björn; Wirnitzer, Bernhard; Hopf, Carsten

    2015-03-01

    Conventional mass spectrometry image preprocessing methods used for denoising, such as the Savitzky-Golay smoothing or discrete wavelet transformation, typically do not only remove noise but also weak signals. Recently, memory-efficient principal component analysis (PCA) in conjunction with random projections (RP) has been proposed for reversible compression and analysis of large mass spectrometry imaging datasets. It considers single-pixel spectra in their local context and consequently offers the prospect of using information from the spectra of adjacent pixels for denoising or signal enhancement. However, little systematic analysis of key RP-PCA parameters has been reported so far, and the utility and validity of this method for context-dependent enhancement of known medically or pharmacologically relevant weak analyte signals in linear-mode matrix-assisted laser desorption/ionization (MALDI) mass spectra has not been explored yet. Here, we investigate MALDI imaging datasets from mouse models of Alzheimer's disease and gastric cancer to systematically assess the importance of selecting the right number of random projections k and of principal components (PCs) L for reconstructing reproducibly denoised images after compression. We provide detailed quantitative data for comparison of RP-PCA-denoising with the Savitzky-Golay and wavelet-based denoising in these mouse models as a resource for the mass spectrometry imaging community. Most importantly, we demonstrate that RP-PCA preprocessing can enhance signals of low-intensity amyloid-β peptide isoforms such as Aβ1-26 even in sparsely distributed Alzheimer's β-amyloid plaques and that it enables enhanced imaging of multiply acetylated histone H4 isoforms in response to pharmacological histone deacetylase inhibition in vivo. We conclude that RP-PCA denoising may be a useful preprocessing step in biomarker discovery workflows.
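
    As a rough illustration of combining random projections with PCA for denoising, the sketch below uses a randomized range finder with one power iteration, followed by truncation to L components. This is a generic randomized-PCA stand-in and not necessarily the pipeline evaluated in the paper: the function name `rp_pca_denoise`, the power iteration, and the synthetic low-rank data are assumptions. The parameters `k` and `L` mirror the abstract's number of random projections and number of retained PCs.

```python
import numpy as np

def rp_pca_denoise(X, k, L, n_iter=1, seed=0):
    """Denoise X (pixels x channels) by keeping L PCs found via k random projections."""
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0, keepdims=True)
    Xc = X - mean
    # Compress each spectrum to k numbers with a Gaussian random projection.
    Y = Xc @ rng.normal(size=(Xc.shape[1], k))
    # Assumed refinement: power iterations sharpen the captured subspace.
    for _ in range(n_iter):
        Y = Xc @ (Xc.T @ Y)
    Q, _ = np.linalg.qr(Y)                    # orthonormal basis, (pixels, k)
    B = Q.T @ Xc                              # small (k, channels) matrix
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    # Reconstruct from the leading L components only.
    return ((Q @ U[:, :L]) * s[:L]) @ Vt[:L] + mean

# Hypothetical data: a rank-3 "signal" image stack plus unit-variance noise.
rng = np.random.default_rng(1)
signal = rng.normal(size=(400, 3)) @ rng.normal(size=(3, 300))
noisy = signal + rng.normal(size=signal.shape)
denoised = rp_pca_denoise(noisy, k=20, L=3)

err_noisy = np.linalg.norm(noisy - signal)
err_denoised = np.linalg.norm(denoised - signal)
```

    In this toy setting the choice of L matters most: too few components discard signal, too many let noise back in, which is the kind of trade-off the abstract says it assesses systematically.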

  3. Utility of Metabolomics toward Assessing the Metabolic Basis of Quality Traits in Apple Fruit with an Emphasis on Antioxidants

    PubMed Central

    Cuthbertson, Daniel; Andrews, Preston K.; Reganold, John P.; Davies, Neal M.; Lange, B. Markus

    2012-01-01

    A gas chromatography–mass spectrometry approach was employed to evaluate the use of metabolite patterns to differentiate fruit from six commercially grown apple cultivars harvested in 2008. Principal component analysis (PCA) of apple fruit peel and flesh data indicated that individual cultivar replicates clustered together and were separated from all other cultivar samples. An independent metabolomics investigation with fruit harvested in 2003 confirmed the separate clustering of fruit from different cultivars. Further evidence for cultivar separation was obtained using a hierarchical clustering analysis. An evaluation of PCA component loadings revealed specific metabolite classes that contributed the most to each principal component, whereas a correlation analysis demonstrated that specific metabolites correlate directly with quality traits such as antioxidant activity, total phenolics, and total anthocyanins, which are important parameters in the selection of breeding germplasm. These data sets lay the foundation for elucidating the metabolic basis of commercially important fruit quality traits. PMID:22881116

  4. Targeted and non-targeted detection of lemon juice adulteration by LC-MS and chemometrics.

    PubMed

    Wang, Zhengfang; Jablonski, Joseph E

    2016-01-01

    Economically motivated adulteration (EMA) of lemon juice was detected by LC-MS and principal component analysis (PCA). Twenty-two batches of freshly squeezed lemon juice were adulterated by adding an aqueous solution containing 5% citric acid and 6% sucrose to pure lemon juice to obtain 30%, 60% and 100% lemon juice samples. Their total titratable acidities, °Brix and pH values were measured, and then all the lemon juice samples were subject to LC-MS analysis. Concentrations of hesperidin and eriocitrin, major phenolic components of lemon juice, were quantified. The PCA score plots for LC-MS datasets were used to preview the classification of pure and adulterated lemon juice samples. Results showed a large inherent variability in the chemical properties among 22 batches of 100% lemon juice samples. Measurement or quantitation of one or several chemical properties (targeted detection) was not effective in detecting lemon juice adulteration. However, by using the LC-MS datasets, including both chromatographic and mass spectrometric information, 100% lemon juice samples were successfully differentiated from adulterated samples containing 30% lemon juice in the PCA score plot. LC-MS coupled with chemometric analysis can be a complement to existing methods for detecting juice adulteration.

  5. Principal elementary mode analysis (PEMA).

    PubMed

    Folch-Fortuny, Abel; Marques, Rodolfo; Isidro, Inês A; Oliveira, Rui; Ferrer, Alberto

    2016-03-01

    Principal component analysis (PCA) has been widely applied in fluxomics to compress data into a few latent structures in order to simplify the identification of metabolic patterns. These latent structures lack a direct biological interpretation due to the intrinsic constraints associated with a PCA model. Here we introduce a new method that significantly improves the interpretability of the principal components with a direct link to metabolic pathways. This method, called principal elementary mode analysis (PEMA), establishes a bridge between a PCA-like model, aimed at explaining the maximum variance in flux data, and the set of elementary modes (EMs) of a metabolic network. It provides an easy way to identify metabolic patterns in large fluxomics datasets in terms of the simplest pathways of the organism metabolism. The results using a real metabolic model of Escherichia coli show the ability of PEMA to identify the EMs that generated the different simulated flux distributions. Actual flux data of E. coli and Pichia pastoris cultures confirm the results observed in the simulated study, providing a biologically meaningful model to explain flux data of both organisms in terms of the EM activation. The PEMA toolbox is freely available for non-commercial purposes on http://mseg.webs.upv.es.

  6. Morphological analysis of Trichomycterus areolatus Valenciennes, 1846 from southern Chilean rivers using a truss-based system (Siluriformes, Trichomycteridae).

    PubMed

    Colihueque, Nelson; Corrales, Olga; Yáñez, Miguel

    2017-01-01

    Trichomycterus areolatus Valenciennes, 1846 is a small endemic catfish inhabiting the Andean river basins of Chile. In this study, the morphological variability of three T. areolatus populations, collected in two river basins from southern Chile, was assessed with multivariate analyses, including principal component analysis (PCA) and discriminant function analysis (DFA). It is hypothesized that populations must segregate morphologically from each other based on the river basin that they were sampled from, since each basin presents relatively particular hydrological characteristics. Significant morphological differences among the three populations were found with PCA (ANOSIM test, r = 0.552, p < 0.0001) and DFA (Wilks's λ = 0.036, p < 0.01). PCA accounted for a total variation of 56.16% by the first two principal components. The first Principal Component (PC1) and PC2 explained 34.72 and 21.44% of the total variation, respectively. The scatter-plot of the first two discriminant functions (DF1 on DF2) also validated the existence of three different populations. In group classification using DFA, 93.3% of the specimens were correctly-classified into their original populations. Of the total of 22 transformed truss measurements, 17 exhibited highly significant ( p < 0.01) differences among populations. The data support the existence of T. areolatus morphological variation across different rivers in southern Chile, likely reflecting the geographic isolation underlying population structure of the species.
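
    The explained-variance percentages quoted in this abstract are the kind of quantity obtained from the eigenvalues of the covariance matrix. A minimal sketch of that calculation is given below; the function name and the random data (60 specimens by 22 measurements) are hypothetical and do not reproduce the study's values. The last lines only check that the reported PC1 and PC2 shares add up to the reported total.

```python
import numpy as np

def explained_variance_percent(X):
    """Percentage of total variance carried by each principal component of X (rows = samples)."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    var = s ** 2                              # proportional to covariance eigenvalues
    return 100.0 * var / var.sum()

# Hypothetical data set: 60 specimens, 22 measurements with unequal spread.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 22)) * np.linspace(3.0, 0.5, 22)
pct = explained_variance_percent(X)
first_two = pct[:2].sum()                     # share explained by PC1 + PC2

# Consistency check on the figures reported in the abstract.
reported_total = 34.72 + 21.44                # PC1 + PC2, in percent
```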

  7. Fast and Accurate Radiative Transfer Calculations Using Principal Component Analysis for (Exo-)Planetary Retrieval Models

    NASA Astrophysics Data System (ADS)

    Kopparla, P.; Natraj, V.; Shia, R. L.; Spurr, R. J. D.; Crisp, D.; Yung, Y. L.

    2015-12-01

    Radiative transfer (RT) computations form the engine of atmospheric retrieval codes. However, full treatment of RT processes is computationally expensive, prompting usage of two-stream approximations in current exoplanetary atmospheric retrieval codes [Line et al., 2013]. Natraj et al. [2005, 2010] and Spurr and Natraj [2013] demonstrated the ability of a technique using principal component analysis (PCA) to speed up RT computations. In the PCA method for RT performance enhancement, empirical orthogonal functions are developed for binned sets of inherent optical properties that possess some redundancy; costly multiple-scattering RT calculations are only done for those few optical states corresponding to the most important principal components, and correction factors are applied to approximate radiation fields. Kopparla et al. [2015, in preparation] extended the PCA method to a broadband spectral region from the ultraviolet to the shortwave infrared (0.3-3 micron), accounting for major gas absorptions in this region. Here, we apply the PCA method to some typical (exo-)planetary retrieval problems. Comparisons between the new model, called the Universal Principal Component Analysis Radiative Transfer (UPCART) model, two-stream models and line-by-line RT models are performed for spectral radiances, spectral fluxes and broadband fluxes. Each of these is calculated at the top of the atmosphere for several scenarios with varying aerosol types, extinction and scattering optical depth profiles, and stellar and viewing geometries. We demonstrate that very accurate radiance and flux estimates can be obtained, with better than 1% accuracy in all spectral regions and better than 0.1% in most cases, as compared to a numerically exact line-by-line RT model. The accuracy is enhanced when the results are convolved to typical instrument resolutions. The operational speed and accuracy of UPCART can be further improved by optimizing binning schemes and parallelizing the codes, work on which is under way.

  8. Identification of Surface Water Quality along the Coast of Sanya, South China Sea

    PubMed Central

    Wu, Zhen-Zhen; Che, Zhi-Wei; Wang, You-Shao; Dong, Jun-De; Wu, Mei-Lin

    2015-01-01

    Principal component analysis (PCA) and cluster analysis (CA) are utilized to identify the effects caused by human activities on water quality along the coast of Sanya, South China Sea. PCA and CA identify the seasonality of water quality (dry and wet seasons) and polluted status (polluted area). The seasonality of water quality is related to climate change and Southeast monsoons. The spatial pattern is mainly related to anthropogenic activities (especially land-based input of pollutants). PCA reveals the characteristics underlying the generation of coastal water quality. The temporal and spatial variation of the trophic status along the coast of Sanya is governed by hydrodynamics and human activities. The results provide a novel typological understanding of seasonal trophic status in a shallow, tropical, open marine bay. PMID:25894980

  9. Novel Three-Component Phenazine-1-Carboxylic Acid 1,2-Dioxygenase in Sphingomonas wittichii DP58

    PubMed Central

    Zhao, Qiang; Wang, Wei; Huang, Xian-Qing; Zhang, Xue-Hong

    2017-01-01

    ABSTRACT Phenazine-1-carboxylic acid, the main component of shenqinmycin, is widely used in southern China for the prevention of rice sheath blight. However, the fate of phenazine-1-carboxylic acid in soil remains uncertain. Sphingomonas wittichii DP58 can use phenazine-1-carboxylic acid as its sole carbon and nitrogen sources for growth. In this study, dioxygenase-encoding genes, pcaA1A2, were found using transcriptome analysis to be highly upregulated upon phenazine-1-carboxylic acid biodegradation. PcaA1 shares 68% amino acid sequence identity with the large oxygenase subunit of anthranilate 1,2-dioxygenase from Rhodococcus maanshanensis DSM 44675. The dioxygenase was coexpressed in Escherichia coli with its adjacent reductase-encoding gene, pcaA3, and ferredoxin-encoding gene, pcaA4, and showed phenazine-1-carboxylic acid consumption. The dioxygenase-, ferredoxin-, and reductase-encoding genes were expressed in Pseudomonas putida KT2440 or E. coli BL21, and the three recombinant proteins were purified. A phenazine-1-carboxylic acid conversion capability occurred in vitro only when all three components were present. However, P. putida KT2440 transformed with pcaA1A2 obtained phenazine-1-carboxylic acid degradation ability, suggesting that phenazine-1-carboxylic acid 1,2-dioxygenase has low specificities for its ferredoxin and reductase. This was verified by replacing PcaA3 with RedA2 in the in vitro enzyme assay. High-performance liquid chromatography–mass spectrometry (HPLC-MS) and nuclear magnetic resonance (NMR) analysis showed that phenazine-1-carboxylic acid was converted to 1,2-dihydroxyphenazine through decarboxylation and hydroxylation, indicating that PcaA1A2A3A4 constitutes the initial phenazine-1-carboxylic acid 1,2-dioxygenase. This study fills a gap in our understanding of the biodegradation of phenazine-1-carboxylic acid and illustrates a new dioxygenase for decarboxylation. 
IMPORTANCE Phenazine-1-carboxylic acid is widely used in southern China as a key fungicide to prevent rice sheath blight. However, the degradation characteristics of phenazine-1-carboxylic acid and the environmental consequences of the long-term application are not clear. S. wittichii DP58 can use phenazine-1-carboxylic acid as its sole carbon and nitrogen sources. In this study, a three-component dioxygenase, PcaA1A2A3A4, was determined to be the initial dioxygenase for phenazine-1-carboxylic acid degradation in S. wittichii DP58. Phenazine-1-carboxylic acid was converted to 1,2-dihydroxyphenazine through decarboxylation and hydroxylation. This finding may help us discover the pathway for phenazine-1-carboxylic acid degradation. PMID:28188209

  10. Discrimination of Rhizoma Gastrodiae (Tianma) using 3D synchronous fluorescence spectroscopy coupled with principal component analysis

    NASA Astrophysics Data System (ADS)

    Fan, Qimeng; Chen, Chaoyin; Huang, Zaiqiang; Zhang, Chunmei; Liang, Pengjuan; Zhao, Shenglan

    2015-02-01

    Rhizoma Gastrodiae (Tianma) of different variants and geographical origins differs markedly in quality and physiological efficacy. This paper focused on the classification and identification of six types of Tianma (two variants from three different geographical origins) using three-dimensional synchronous fluorescence spectroscopy (3D-SFS) coupled with principal component analysis (PCA). 3D-SF spectra of aqueous extracts obtained from the six types of Tianma were measured with an LS-50B luminescence spectrofluorometer. The experimental results showed that the characteristic fluorescent spectral regions of the 3D-SF spectra were similar, while the intensities of the characteristic regions differed significantly. By coupling these differences in peak intensities with PCA, the six types of Tianma could be discriminated successfully. In conclusion, 3D-SFS coupled with PCA is effective, specific, rapid and non-polluting, which gives it an edge in discriminating similar Chinese herbal medicines. The proposed methodology is a useful tool for classifying and identifying Tianma of different variants and geographical origins.

  11. Identifying natural and anthropogenic sources of metals in urban and rural soils using GIS-based data, PCA, and spatial interpolation

    PubMed Central

    Davis, Harley T.; Aelion, C. Marjorie; McDermott, Suzanne; Lawson, Andrew B.

    2009-01-01

    Determining sources of neurotoxic metals in rural and urban soils is important for mitigating human exposure. Surface soil from four areas with significant clusters of mental retardation and developmental delay (MR/DD) in children, and one control site were analyzed for nine metals and characterized by soil type, climate, ecological region, land use and industrial facilities using readily-available GIS-based data. Kriging, principal component analysis (PCA) and cluster analysis (CA) were used to identify commonalities of metal distribution. Three MR/DD areas (one rural and two urban) had similar soil types and significantly higher soil metal concentrations. PCA and CA results suggested that Ba, Be and Mn were consistently from natural sources; Pb and Hg from anthropogenic sources; and As, Cr, Cu, and Ni from both sources. Arsenic had low commonality estimates, was highly associated with a third PCA factor, and had a complex distribution, complicating mitigation strategies to minimize concentrations and exposures. PMID:19361902

  12. Regional and local background ozone in Houston during Texas Air Quality Study 2006

    NASA Astrophysics Data System (ADS)

    Langford, A. O.; Senff, C. J.; Banta, R. M.; Hardesty, R. M.; Alvarez, R. J.; Sandberg, Scott P.; Darby, Lisa S.

    2009-04-01

    Principal Component Analysis (PCA) is used to isolate the common modes of behavior in the daily maximum 8-h average ozone mixing ratios measured at 30 Continuous Ambient Monitoring Stations in the Houston-Galveston-Brazoria area during the Second Texas Air Quality Study field intensive (1 August to 15 October 2006). Three principal components suffice to explain 93% of the total variance. Nearly 84% is explained by the first component, which is attributed to changes in the "regional background" determined primarily by the large-scale winds. The second component (6%) is attributed to changes in the "local background," that is, ozone photochemically produced in the Houston area and spatially and temporally averaged by local circulations. Finally, the third component (3.5%) is attributed to short-lived plumes containing high ozone originating from industrial areas along Galveston Bay and the Houston Ship Channel. Regional background ozone concentrations derived using the first component compare well with mean ozone concentrations measured above the Gulf of Mexico by the tunable profiler for aerosols and ozone lidar aboard the NOAA Twin Otter. The PCA regional background values also agree well with background values derived using the lowest daily 8-h maximum method of Nielsen-Gammon et al. (2005), provided the Galveston Airport data (C34) are omitted from that analysis. The differences found when Galveston is included are caused by the sea breeze, which depresses ozone at Galveston relative to sites further inland. PCA removes the effects of this and other local circulations to obtain a regional background value representative of the greater Houston area.

  13. Pepper seed variety identification based on visible/near-infrared spectral technology

    NASA Astrophysics Data System (ADS)

    Li, Cuiling; Wang, Xiu; Meng, Zhijun; Fan, Pengfei; Cai, Jichen

    2016-11-01

    Pepper is an important fruit vegetable, and with the expansion of the hybrid pepper planting area, detection of pepper seed purity is especially important. This research used visible/near-infrared (VIS/NIR) spectral technology to detect the variety of single pepper seeds, and chose the hybrid pepper seeds "Zhuo Jiao NO.3", "Zhuo Jiao NO.4" and "Zhuo Jiao NO.5" as research samples. VIS/NIR spectral data of 80 "Zhuo Jiao NO.3", 80 "Zhuo Jiao NO.4" and 80 "Zhuo Jiao NO.5" pepper seeds were collected, and the original spectral data were pretreated with standard normal variate (SNV) transform, first derivative (FD), and Savitzky-Golay (SG) convolution smoothing methods. Principal component analysis (PCA) was adopted to reduce the dimension of the spectral data and extract principal components. The distribution areas of the three varieties of pepper seeds were delimited in each two-dimensional plane formed by the first and second principal components (PC1 and PC2), by PC1 and the third principal component (PC3), and by PC2 and PC3, and the discriminant accuracy of PCA was tested by observing the distribution of the validation-set samples' principal components. This study combined PCA and linear discriminant analysis (LDA) to identify single pepper seed varieties. Results showed that with the FD preprocessing method, the discriminant accuracy of pepper seed varieties was 98% for the validation set. It is concluded that VIS/NIR spectral technology is feasible for identification of single pepper seed varieties.
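
    The SNV and first-derivative pretreatments named in this abstract can be illustrated with a minimal sketch. The abstract gives no parameters, so the reflectance values below are hypothetical, and the derivative is a plain finite difference rather than the Savitzky-Golay filter the authors may have used.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it to unit variance."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(v - m) / s for v in spectrum]


def first_derivative(spectrum):
    """Finite-difference first derivative between adjacent wavelength points."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


# Hypothetical reflectance values for one seed at six wavelengths
# (illustrative only, not data from the study).
raw = [0.52, 0.55, 0.61, 0.70, 0.66, 0.58]
scaled = snv(raw)               # zero mean, unit standard deviation
deriv = first_derivative(scaled)  # one value fewer than the input
```

    SNV is applied per spectrum, so each seed is normalised independently of the others before PCA sees the data.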

  14. Improved estimation of parametric images of cerebral glucose metabolic rate from dynamic FDG-PET using volume-wise principle component analysis

    NASA Astrophysics Data System (ADS)

    Dai, Xiaoqian; Tian, Jie; Chen, Zhe

    2010-03-01

    Parametric images can represent both spatial distribution and quantification of the biological and physiological parameters of tracer kinetics. The linear least squares (LLS) method is a well-established linear regression method for generating parametric images by fitting compartment models with good computational efficiency. However, bias exists in LLS-based parameter estimates, owing to the noise present in tissue time activity curves (TTACs) that propagates as correlated error in the LLS linearized equations. To address this problem, a volume-wise principal component analysis (PCA) based method is proposed. In this method, dynamic PET data are first pre-transformed to standardize noise variance, as PCA is a data-driven technique and cannot itself separate signals from noise. Second, the volume-wise PCA is applied to the PET data. The signals can be mostly represented by the first few principal components (PCs), and the noise is left in the subsequent PCs. The noise-reduced data are then obtained from the first few PCs by applying 'inverse PCA'. These data are also transformed back according to the pre-transformation method used in the first step to maintain the scale of the original data set. Finally, the new data set is used to generate parametric images using the LLS estimation method. Compared with other noise-removal methods, the proposed method can achieve high statistical reliability in the generated parametric images. The effectiveness of the method is demonstrated both with computer simulation and with a clinical dynamic FDG PET study.
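
    The "keep the first few PCs, then invert" step can be sketched as follows. This is a dependency-free illustration under stated assumptions, not the authors' implementation: it omits the noise-variance pre-transformation, finds eigenvectors by power iteration with deflation (a library routine would normally be used), and runs on a synthetic rank-one data set whose noise is constructed orthogonal to the signal. All function names are hypothetical.

```python
import math


def center(rows):
    """Subtract the column means; return the centred rows and the means."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows], means


def covariance(centered):
    """Sample covariance matrix (d x d) of centred rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_components(cov, k, iters=200):
    """Top-k eigenvectors of a symmetric matrix: power iteration with deflation."""
    d = len(cov)
    work = [row[:] for row in cov]
    components = []
    for _ in range(k):
        v = [1.0 / (j + 1) for j in range(d)]  # fixed, non-symmetric start vector
        for _ in range(iters):
            w = [sum(work[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * work[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        for i in range(d):  # deflate: remove the component just found
            for j in range(d):
                work[i][j] -= lam * v[i] * v[j]
    return components


def pca_denoise(rows, k):
    """Project onto the first k PCs and map back ('inverse PCA')."""
    centered, means = center(rows)
    comps = leading_components(covariance(centered), k)
    out = []
    for r in centered:
        scores = [sum(a * b for a, b in zip(r, c)) for c in comps]
        out.append([means[j] + sum(s * c[j] for s, c in zip(scores, comps))
                    for j in range(len(means))])
    return out


def sq_error(a, b):
    """Total squared difference between two equally shaped lists of rows."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Synthetic example: a rank-one signal plus small alternating noise.
signal = (1.0, 2.0, 3.0)
noise_dir = (1.0, -2.0, 1.0)  # chosen orthogonal to the signal direction
clean = [[t * s for s in signal] for t in range(6)]
noisy = [[c + 0.05 * (-1) ** t * n for c, n in zip(row, noise_dir)]
         for t, row in enumerate(clean)]
denoised = pca_denoise(noisy, k=1)
```

    On this toy data the one-component reconstruction lies closer to the clean signal than the noisy input does, which is the effect the abstract relies on when it discards the later PCs.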

  15. Landslides Identification Using Airborne Laser Scanning Data Derived Topographic Terrain Attributes and Support Vector Machine Classification

    NASA Astrophysics Data System (ADS)

    Pawłuszek, Kamila; Borkowski, Andrzej

    2016-06-01

    Since high-resolution Airborne Laser Scanning (ALS) data became available, substantial progress has been made in geomorphological research, especially in landslide analysis. First and second order derivatives of the Digital Terrain Model (DTM) have become a popular and powerful tool in landslide inventory mapping. Nevertheless, automatic landslide mapping based on sophisticated classifiers including Support Vector Machine (SVM), Artificial Neural Network or Random Forests is often computationally time consuming. The objective of this research is to explore in depth the topographic information provided by ALS data and to overcome the computational time limitation. For this reason, an extended set of topographic features and Principal Component Analysis (PCA) were used to reduce redundant information. The proposed novel approach was tested on a susceptible area affected by more than 50 landslides located at Rożnów Lake in the Carpathian Mountains, Poland. The initial seven PCA components, accounting for 90% of the total variability in the original topographic attributes, were used for SVM classification. Comparing results with the landslide inventory map, the average user's accuracy (UA), producer's accuracy (PA), and overall accuracy (OA) were calculated for two models according to the classification results. For the PCA-feature-reduced model, UA, PA, and OA were found to be 72%, 76%, and 72%, respectively. Similarly, UA, PA, and OA in the non-reduced original topographic model were 74%, 77% and 74%, respectively. Using the initial seven PCA components instead of the twenty original topographic attributes does not significantly change identification accuracy but reduces computational time.
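
    Choosing how many components to keep for a target share of variance, as in the 90% cut-off reported here, can be sketched as below. The eigenvalues are hypothetical and do not come from the study, and `components_needed` is an illustrative helper name.

```python
from itertools import accumulate


def components_needed(eigenvalues, target=0.90):
    """Smallest number of leading components whose cumulative share of
    the total variance reaches the target fraction."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for k, running in enumerate(accumulate(ordered), start=1):
        if running / total >= target:
            return k
    return len(ordered)


# Hypothetical eigenvalues of a covariance matrix (not values from the study).
eigenvalues = [5.0, 2.0, 1.5, 0.8, 0.4, 0.2, 0.1]
k = components_needed(eigenvalues, target=0.90)
```

    Only the first `k` component scores would then be passed to the classifier, which is where the reduction in computing time would come from.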

  16. Recurrence quantification analysis applied to spatiotemporal pattern analysis in high-density mapping of human atrial fibrillation.

    PubMed

    Zeemering, Stef; Bonizzi, Pietro; Maesen, Bart; Peeters, Ralf; Schotten, Ulrich

    2015-01-01

    Spatiotemporal complexity of atrial fibrillation (AF) patterns is often quantified by annotated intracardiac contact mapping. We introduce a new approach that applies recurrence plot (RP) construction followed by recurrence quantification analysis (RQA) to epicardial atrial electrograms, recorded with a high-density grid of electrodes. In 32 patients with no history of AF (aAF, n=11), paroxysmal AF (PAF, n=12) and persistent AF (persAF, n=9), RPs were constructed using a phase space electrogram embedding dimension equal to the estimated AF cycle length. Spatial information was incorporated by 1) averaging the recurrence over all electrodes, and 2) by applying principal component analysis (PCA) to the matrix of embedded electrograms and selecting the first principal component as a representation of spatial diversity. Standard RQA parameters were computed on the constructed RPs and correlated to the number of fibrillation waves per AF cycle (NW). Averaged RP RQA parameters showed no correlation with NW. Correlations improved when applying PCA, with maximum correlation achieved between RP threshold and NW (RR1%, r=0.68, p < 0.001) and RP determinism (DET, r=-0.64, p < 0.001). All studied RQA parameters based on the PCA RP were able to discriminate between persAF and aAF/PAF (DET persAF 0.40 ± 0.11 vs. 0.59 ± 0.14/0.62 ± 0.16, p < 0.01). RP construction and RQA combined with PCA provide a quick and reliable tool to visualize dynamical behaviour and to assess the complexity of contact mapping patterns in AF.
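
    Recurrence plot construction from an embedded signal can be sketched in a few lines. This is an illustration under assumptions: it uses a single synthetic periodic signal, a unit-lag embedding and a fixed distance threshold, whereas the abstract suggests the threshold was tuned to a target recurrence rate and does not give the remaining settings.

```python
import math


def embed(signal, dim):
    """Time-delay embedding with unit lag: one dim-long window per start index."""
    return [signal[i:i + dim] for i in range(len(signal) - dim + 1)]


def recurrence_matrix(vectors, eps):
    """R[i][j] = 1 when embedded states i and j lie within eps of each other."""
    return [[1 if math.dist(a, b) <= eps else 0 for b in vectors]
            for a in vectors]


def recurrence_rate(matrix):
    """Fraction of recurrent points in the plot (main diagonal included)."""
    n = len(matrix)
    return sum(map(sum, matrix)) / (n * n)


# Synthetic signal with period 4, embedded with dimension equal to the period.
signal = [0.0, 1.0, 0.0, -1.0] * 4
states = embed(signal, dim=4)
rp = recurrence_matrix(states, eps=0.1)
rr = recurrence_rate(rp)
```

    For a strictly periodic signal the plot consists of diagonal lines spaced one period apart; irregular activity breaks those lines up, which is what measures such as determinism quantify.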

  17. Health status monitoring for ICU patients based on locally weighted principal component analysis.

    PubMed

    Ding, Yangyang; Ma, Xin; Wang, Youqing

    2018-03-01

    Intelligent status monitoring for critically ill patients can help medical staff quickly discover and assess changes in disease and then devise an appropriate treatment strategy. However, the general-type monitoring model now widely used has difficulty adapting to changes in intensive care unit (ICU) patients' status due to its fixed pattern, and a more robust, efficient and fast monitoring model should be developed for the individual. A data-driven learning approach combining locally weighted projection regression (LWPR) and principal component analysis (PCA) is proposed for the first time and applied to monitor the nonlinear process of patients' health status in the ICU. LWPR is used to approximate the complex nonlinear process with local linear models, in which PCA can be further applied to status monitoring, and finally a global weighted statistic is acquired for detecting possible abnormalities. Moreover, some improved versions are developed, such as LWPR-MPCA and LWPR-JPCA, which also have superior performance. Eighteen subjects were selected from the Physiobank's Multi-parameter Intelligent Monitoring for Intensive Care II (MIMIC II) database, and two vital signs of each subject were chosen for online monitoring. The proposed method was compared with several existing methods including traditional PCA, partial least squares (PLS), just-in-time learning combined with modified PCA (L-PCA), and kernel PCA (KPCA). The experimental results demonstrated that the mean fault detection rate (FDR) of PCA can be improved by 41.7% after adding LWPR. The mean FDR of LWPR-MPCA was increased by 8.3% compared with the latest reported method, L-PCA. Meanwhile, LWPR spent less training time than the others, especially KPCA. LWPR is introduced into ICU patient monitoring for the first time and achieves the best monitoring performance, including adaptability to changes in patient status, sensitivity for abnormality detection, fast learning speed and low computational complexity. The algorithm is an excellent approach to establishing a personalized model for patients, which is the mainstream direction of modern medicine, as well as improving the global monitoring performance. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.

  18. Empirical evaluation of grouping of lower urinary tract symptoms: principal component analysis of Tampere Ageing Male Urological Study data.

    PubMed

    Pöyhönen, Antti; Häkkinen, Jukka T; Koskimäki, Juha; Hakama, Matti; Tammela, Teuvo L J; Auvinen, Anssi

    2013-03-01

    WHAT'S KNOWN ON THE SUBJECT? AND WHAT DOES THE STUDY ADD?: The ICS has divided LUTS into three groups: storage, voiding and post-micturition symptoms. The classification is based on anatomical, physiological and urodynamic considerations of a theoretical nature. We used principal component analysis (PCA) to determine the inter-correlations of various LUTS, which is a novel approach to research and can strengthen existing knowledge of the phenomenology of LUTS. After we had completed our analyses, another study was published that used a similar approach and results were very similar to those of the present study. We evaluated the constellation of LUTS using PCA of the data from a population-based study that included >4000 men. In our analysis, three components emerged from the 12 LUTS: voiding, storage and incontinence components. Our results indicated that incontinence may be separate from the other storage symptoms and post-micturition symptoms should perhaps be regarded as voiding symptoms. To determine how lower urinary tract symptoms (LUTS) relate to each other and assess if the classification proposed by the International Continence Society (ICS) is consistent with empirical findings. The information on urinary symptoms for this population-based study was collected using a self-administered postal questionnaire in 2004. The questionnaire was sent to 7470 men, aged 30-80 years, from Pirkanmaa County (Finland), of whom 4384 (58.7%) returned the questionnaire. The Danish Prostatic Symptom Score-1 questionnaire was used to evaluate urinary symptoms. Principal component analysis (PCA) was used to evaluate the inter-correlations among various urinary symptoms. The PCA produced a grouping of 12 LUTS into three categories consisting of voiding, storage and incontinence symptoms. Post-micturition symptoms were related to voiding symptoms, but incontinence symptoms were separate from storage symptoms. 
In the analyses by age group, similar categorization was found at ages 40, 50, 60 and 80 years, but only two groups of symptoms emerged among men aged 70 years. The prevalence among men aged 30 was too low for meaningful analysis. This population-based study suggests that LUTS can be divided into three subgroups consisting of voiding, storage and incontinence symptoms based on their inter-correlations. Our empirical findings suggest an alternative grouping of LUTS. The potential utility of such an approach requires careful consideration. © 2012 BJU International.

  19. Preliminary identification of unicellular algal genus by using combined confocal resonance Raman spectroscopy with PCA and DPLS analysis

    NASA Astrophysics Data System (ADS)

    He, Shixuan; Xie, Wanyi; Zhang, Ping; Fang, Shaoxi; Li, Zhe; Tang, Peng; Gao, Xia; Guo, Jinsong; Tlili, Chaker; Wang, Deqiang

    2018-02-01

    The analysis of algae and of the dominant alga plays an important role in ecological and environmental fields, since it can be used to forecast water blooms and control their potential deleterious effects. Herein, we combine in vivo confocal resonance Raman spectroscopy with multivariate analysis methods to preliminarily identify three algal genera in water blooms at the unicellular scale. Statistical analysis of characteristic Raman peaks demonstrates that certain shifts and different normalized intensities, resulting from the composition of different carotenoids, exist in the Raman spectra of the three algal cells. Principal component analysis (PCA) scores and corresponding loading weights show some differences in Raman spectral characteristics, which are caused by vibrations of carotenoids in unicellular algae. Then, the discriminant partial least squares (DPLS) classification method is used to verify the effectiveness of algal identification with confocal resonance Raman spectroscopy. Our results show that confocal resonance Raman spectroscopy combined with PCA and DPLS can handle the preliminary identification of the dominant alga for forecasting and controlling water blooms.

  20. Application of Principal Component Analysis (PCA) to Reduce Multicollinearity Exchange Rate Currency of Some Countries in Asia Period 2004-2014

    ERIC Educational Resources Information Center

    Rahayu, Sri; Sugiarto, Teguh; Madu, Ludiro; Holiawati; Subagyo, Ahmad

    2017-01-01

    This study aims to apply the principal component analysis model to reduce multicollinearity among the exchange rates against the US Dollar of the currencies of eight countries in Asia: the Yen (Japan), Won (South Korea), Dollar (Hong Kong), Yuan (China), Baht (Thailand), Rupiah (Indonesia), Ringgit (Malaysia), and Dollar (Singapore). It looks at yield…

  1. Low-rank plus sparse decomposition for exoplanet detection in direct-imaging ADI sequences. The LLSG algorithm

    NASA Astrophysics Data System (ADS)

    Gomez Gonzalez, C. A.; Absil, O.; Absil, P.-A.; Van Droogenbroeck, M.; Mawet, D.; Surdej, J.

    2016-05-01

    Context. Data processing constitutes a critical component of high-contrast exoplanet imaging. Its role is almost as important as the choice of a coronagraph or a wavefront control system, and it is intertwined with the chosen observing strategy. Among the data processing techniques for angular differential imaging (ADI), the most recent is the family of principal component analysis (PCA) based algorithms. It is a widely used statistical tool developed during the first half of the past century. PCA serves, in this case, as a subspace projection technique for constructing a reference point spread function (PSF) that can be subtracted from the science data for boosting the detectability of potential companions present in the data. Unfortunately, when building this reference PSF from the science data itself, PCA comes with certain limitations such as the sensitivity of the lower dimensional orthogonal subspace to non-Gaussian noise. Aims: Inspired by recent advances in machine learning algorithms such as robust PCA, we aim to propose a localized subspace projection technique that surpasses current PCA-based post-processing algorithms in terms of the detectability of companions at near real-time speed, a quality that will be useful for future direct imaging surveys. Methods: We used randomized low-rank approximation methods recently proposed in the machine learning literature, coupled with entry-wise thresholding to decompose an ADI image sequence locally into low-rank, sparse, and Gaussian noise components (LLSG). This local three-term decomposition separates the starlight and the associated speckle noise from the planetary signal, which mostly remains in the sparse term. We tested the performance of our new algorithm on a long ADI sequence obtained on β Pictoris with VLT/NACO. Results: Compared to a standard PCA approach, LLSG decomposition reaches a higher signal-to-noise ratio and has an overall better performance in the receiver operating characteristic space. 
This three-term decomposition brings a detectability boost compared to the full-frame standard PCA approach, especially in the small inner working angle region where complex speckle noise prevents PCA from discerning true companions from noise.

  2. Extracting factors for interest rate scenarios

    NASA Astrophysics Data System (ADS)

    Molgedey, L.; Galic, E.

    2001-04-01

    Factor-based interest rate models are widely used for risk management purposes, for option pricing and for identifying and capturing yield curve anomalies. The movements of a term structure of interest rates are commonly assumed to be driven by a small number of orthogonal factors such as SHIFT, TWIST and BUTTERFLY (BOW). These factors are usually obtained by a Principal Component Analysis (PCA) of historical bond prices (interest rates). Although PCA diagonalizes the covariance matrix of either the interest rates or the interest rate changes, it does not use both covariance matrices simultaneously. Furthermore, higher linear and nonlinear correlations are neglected. These correlations, as well as the mean-reverting properties of the interest rates, become crucial if one is interested in a longer time horizon (infrequent hedging or trading). We will show that Independent Component Analysis (ICA) is a more appropriate tool than PCA, since ICA uses the covariance matrix of the interest rates as well as the covariance matrix of the interest rate changes simultaneously. Additionally, higher linear and nonlinear correlations may be easily incorporated. The resulting factors are uncorrelated for various time delays, approximately independent but nonorthogonal. This is in contrast to the factors obtained from the PCA, which are orthogonal and uncorrelated for identical times only. Although factors from the ICA are nonorthogonal, it is sufficient to consider only a few factors in order to explain most of the variation in the original data. Finally, we will present examples showing that ICA-based hedges outperform PCA-based hedges, specifically if the portfolio is sensitive to structural changes of the yield curve.
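
    The statement that PCA diagonalizes the covariance matrix can be made concrete in two dimensions, where the principal axes follow from a single rotation angle. The sketch below is illustrative only: the two series are hypothetical rate changes, the names are assumptions, and it covers the equal-time PCA case, not the ICA method proposed in the abstract.

```python
import math
from statistics import mean


def cov(xs, ys):
    """Sample covariance of two equally long series."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def pca_2d(xs, ys):
    """Rotate two series onto their principal axes; returns (angle, pc1, pc2)."""
    sxx, syy, sxy = cov(xs, xs), cov(ys, ys), cov(xs, ys)
    # Angle that zeroes the off-diagonal covariance: tan(2*theta) = 2*sxy / (sxx - syy)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    mx, my = mean(xs), mean(ys)
    pc1 = [c * (x - mx) + s * (y - my) for x, y in zip(xs, ys)]
    pc2 = [-s * (x - mx) + c * (y - my) for x, y in zip(xs, ys)]
    return theta, pc1, pc2


# Hypothetical daily changes (basis points) of a short and a long rate.
short = [2.0, -1.0, 3.0, 0.0, -2.0, 1.0]
long_ = [1.5, -0.5, 2.0, 0.5, -1.5, 1.0]
theta, pc1, pc2 = pca_2d(short, long_)
```

    The two score series have zero sample covariance and together carry the same total variance as the inputs, with the larger share on the first axis. This holds only for observations taken at the same time, which is the limitation the abstract contrasts with ICA.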

  3. Obesity, metabolic syndrome, impaired fasting glucose, and microvascular dysfunction: a principal component analysis approach.

    PubMed

    Panazzolo, Diogo G; Sicuro, Fernando L; Clapauch, Ruth; Maranhão, Priscila A; Bouskela, Eliete; Kraemer-Aguiar, Luiz G

    2012-11-13

    We aimed to evaluate the multivariate association between functional microvascular variables and clinical-laboratorial-anthropometrical measurements. Data from 189 female subjects (34.0 ± 15.5 years, 30.5 ± 7.1 kg/m2), who were non-smokers, non-regular drug users, without a history of diabetes and/or hypertension, were analyzed by principal component analysis (PCA). PCA is a classical multivariate exploratory tool because it highlights common variation between variables allowing inferences about possible biological meaning of associations between them, without pre-establishing cause-effect relationships. In total, 15 variables were used for PCA: body mass index (BMI), waist circumference, systolic and diastolic blood pressure (BP), fasting plasma glucose, levels of total cholesterol, high-density lipoprotein cholesterol (HDL-c), low-density lipoprotein cholesterol (LDL-c), triglycerides (TG), insulin, C-reactive protein (CRP), and functional microvascular variables measured by nailfold videocapillaroscopy. Nailfold videocapillaroscopy was used for direct visualization of nutritive capillaries, assessing functional capillary density, red blood cell velocity (RBCV) at rest and peak after 1 min of arterial occlusion (RBCV(max)), and the time taken to reach RBCV(max) (TRBCV(max)). A total of 35% of subjects had metabolic syndrome, 77% were overweight/obese, and 9.5% had impaired fasting glucose. PCA was able to recognize that functional microvascular variables and clinical-laboratorial-anthropometrical measurements had a similar variation. The first five principal components explained most of the intrinsic variation of the data. For example, principal component 1 was associated with BMI, waist circumference, systolic BP, diastolic BP, insulin, TG, CRP, and TRBCV(max) varying in the same way. Principal component 1 also showed a strong association among HDL-c, RBCV, and RBCV(max), but in the opposite way. 
Principal component 3 was associated only with microvascular variables in the same way (functional capillary density, RBCV and RBCV(max)). Fasting plasma glucose appeared to be related to principal component 4 and did not show any association with microvascular reactivity. In non-diabetic female subjects, a multivariate scenario of associations between classic clinical variables strictly related to obesity and metabolic syndrome suggests a significant relationship between these diseases and microvascular reactivity.

  4. An inter-comparison of PM10 source apportionment using PCA and PMF receptor models in three European sites.

    PubMed

    Cesari, Daniela; Amato, F; Pandolfi, M; Alastuey, A; Querol, X; Contini, D

    2016-08-01

    Source apportionment of aerosol is an important approach to investigate aerosol formation and transformation processes as well as to assess appropriate mitigation strategies and to investigate causes of non-compliance with air quality standards (Directive 2008/50/CE). Receptor models (RMs) based on chemical composition of aerosol measured at specific sites are a useful, and widely used, tool to perform source apportionment. However, an analysis of available studies in the scientific literature reveals heterogeneities in the approaches used, in terms of "working variables" such as the number of samples in the dataset and the number of chemical species used as well as in the modeling tools used. In this work, an inter-comparison of PM10 source apportionment results obtained at three European measurement sites is presented, using two receptor models: principal component analysis coupled with multi-linear regression analysis (PCA-MLRA) and positive matrix factorization (PMF). The inter-comparison focuses on source identification, quantification of source contribution to PM10, robustness of the results, and how these are influenced by the number of chemical species available in the datasets. Results show very similar component/factor profiles identified by PCA and PMF, with some discrepancies in the number of factors. The PMF model appears to be more suitable to separate secondary sulfate and secondary nitrate with respect to PCA at least in the datasets analyzed. Further, some difficulties have been observed with PCA in separating industrial and heavy oil combustion contributions. Commonly at all sites, the crustal contributions found with PCA were larger than those found with PMF, and the secondary inorganic aerosol contributions found by PCA were lower than those found by PMF. Site-dependent differences were also observed for traffic and marine contributions. 
The inter-comparison of source apportionment performed on complete datasets (using the full range of available chemical species) and incomplete datasets (with a reduced number of chemical species) made it possible to investigate the sensitivity of source apportionment (SA) results to the working variables used in the RMs. Results show that, at both sites, the profiles and the contributions of the different sources calculated with PMF are comparable within the estimated uncertainties, indicating good stability and robustness of the PMF results. In contrast, PCA outputs are more sensitive to the chemical species present in the datasets. In PCA, the crustal contributions are higher in the incomplete datasets and the traffic contributions are significantly lower for incomplete datasets.

  5. Efficient principal component analysis for multivariate 3D voxel-based mapping of brain functional imaging data sets as applied to FDG-PET and normal aging.

    PubMed

    Zuendorf, Gerhard; Kerrouche, Nacer; Herholz, Karl; Baron, Jean-Claude

    2003-01-01

    Principal component analysis (PCA) is a well-known technique for reduction of dimensionality of functional imaging data. PCA can be looked at as the projection of the original images onto a new orthogonal coordinate system with lower dimensions. The new axes explain the variance in the images in decreasing order of importance, showing correlations between brain regions. We used an efficient, stable and analytical method to work out the PCA of Positron Emission Tomography (PET) images of 74 normal subjects using [(18)F]fluoro-2-deoxy-D-glucose (FDG) as a tracer. Principal components (PCs) and their relation to age effects were investigated. Correlations between the projections of the images on the new axes and the age of the subjects were carried out. The first two PCs could be identified as being the only PCs significantly correlated to age. The first principal component, which explained 10% of the data set variance, was reduced only in subjects of age 55 or older and was related to loss of signal in and adjacent to ventricles and basal cisterns, reflecting expected age-related brain atrophy with enlarging CSF spaces. The second principal component, which accounted for 8% of the total variance, had high loadings from prefrontal, posterior parietal and posterior cingulate cortices and showed the strongest correlation with age (r = -0.56), entirely consistent with previously documented age-related declines in brain glucose utilization. Thus, our method showed that the effect of aging on brain metabolism has at least two independent dimensions. This method should have widespread applications in multivariate analysis of brain functional images. Copyright 2002 Wiley-Liss, Inc.
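
    The projection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the voxel count, the age range, the noise level and the name pca_svd are assumptions made for illustration, and only the number of subjects (74) is taken from the abstract. The sketch computes the new orthogonal axes by singular value decomposition, reports the share of variance explained by each axis, and correlates the projections (scores) with age.

```python
import numpy as np

def pca_svd(X):
    """PCA of X (samples x features) via SVD of the centered data."""
    Xc = X - X.mean(axis=0)                    # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * s                             # projections of each sample on the new axes
    var = s**2 / (X.shape[0] - 1)              # variance along each axis, sorted descending
    return scores, Vt, var / var.sum()

# Synthetic stand-in for image data: 74 subjects (as in the abstract),
# 500 "voxels" and a single age-related spatial pattern (both hypothetical).
rng = np.random.default_rng(0)
n_subjects, n_voxels = 74, 500
age = rng.uniform(20, 80, n_subjects)
pattern = rng.normal(size=n_voxels)
age_z = (age - age.mean()) / age.std()
X = np.outer(-age_z, pattern) + rng.normal(scale=2.0, size=(n_subjects, n_voxels))

scores, components, ratio = pca_svd(X)

# Correlation between the projections on the first three axes and age.
# The sign of each component is arbitrary, so only |r| is meaningful here.
r = [np.corrcoef(scores[:, k], age)[0, 1] for k in range(3)]
for k in range(3):
    print(f"PC{k + 1}: explained variance {ratio[k]:.1%}, r with age {r[k]:+.2f}")
```

    Because the sign of a principal component is arbitrary, a negative correlation with age, such as the r = -0.56 reported above, only becomes interpretable once the loadings are inspected.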

  6. Psychometric Measurement Models and Artificial Neural Networks

    ERIC Educational Resources Information Center

    Sese, Albert; Palmer, Alfonso L.; Montano, Juan J.

    2004-01-01

    The study of measurement models in psychometrics by means of dimensionality reduction techniques such as Principal Components Analysis (PCA) is a very common practice. In recent times, an upsurge of interest in the study of artificial neural networks apt to computing a principal component extraction has been observed. Despite this interest, the…

  7. An analytics of electricity consumption characteristics based on principal component analysis

    NASA Astrophysics Data System (ADS)

    Feng, Junshu

    2018-02-01

    More detailed analysis of electricity consumption characteristics can make demand side management (DSM) much more targeted. In this paper, an analysis of electricity consumption characteristics based on principal component analysis (PCA) is given, in which the PCA method is used to extract the main typical characteristics of electricity consumers. Then, an electricity consumption characteristics matrix is designed, which allows a comparison of typical electricity consumption characteristics between different types of consumers, such as industrial consumers, commercial consumers and residents. In our case study, the electricity consumption has been mainly divided into four characteristics: extreme peak using, peak using, peak-shifting using and others. Moreover, it has been found that industrial consumers shift their peak load often, while commercial and residential consumers have more peak-time consumption. The conclusions can provide decision support of DSM for the government and power providers.

  8. Discovering phases, phase transitions, and crossovers through unsupervised machine learning: A critical examination

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hu, Wenjian; Singh, Rajiv R. P.; Scalettar, Richard T.

    Here, we apply unsupervised machine learning techniques, mainly principal component analysis (PCA), to compare and contrast the phase behavior and phase transitions in several classical spin models - the square and triangular-lattice Ising models, the Blume-Capel model, a highly degenerate biquadratic-exchange spin-one Ising (BSI) model, and the 2D XY model, and examine critically what machine learning is teaching us. We find that quantified principal components from PCA not only allow exploration of different phases and symmetry-breaking, but can distinguish phase transition types and locate critical points. We show that the corresponding weight vectors have a clear physical interpretation, which is particularly interesting in the frustrated models such as the triangular antiferromagnet, where they can point to incipient orders. Unlike the other well-studied models, the properties of the BSI model are less well known. Using both PCA and conventional Monte Carlo analysis, we demonstrate that the BSI model shows an absence of phase transition and macroscopic ground-state degeneracy. The failure to capture the 'charge' correlations (vorticity) in the BSI model (XY model) from raw spin configurations points to some of the limitations of PCA. Finally, we employ a nonlinear unsupervised machine learning procedure, the 'autoencoder method', and demonstrate that it too can be trained to capture phase transitions and critical points.

  9. Principal component analysis to assess the composition and fate of impurities in a large river-embedded reservoir: Qingcaosha Reservoir.

    PubMed

    Ou, Hua-Se; Wei, Chao-Hai; Deng, Yang; Gao, Nai-Yun

    2013-08-01

    Qingcaosha Reservoir (QR) is the largest river-embedded reservoir in east China, which receives its source water from the Yangtze River (YR). The temporal and spatial variations in dissolved organic matter (DOM), chromophoric DOM (CDOM), nitrogen, phosphorus and phytoplankton biomass were investigated from June to September in 2012 and were integrated by principal component analysis (PCA). Three PCA factors were identified: (1) phytoplankton related factor 1, (2) total DOM related factor 2, and (3) eutrophication related factor 3. Factor 1 was a lake-type parameter which correlated with chlorophyll-a and protein-like CDOM (r = 0.793 and r = 0.831, respectively). Factor 2 was a river-type parameter which correlated with total DOC and humic-like CDOM (r = 0.668 and r = 0.726, respectively). Factor 3 correlated with total nitrogen and phosphorus (r = 0.864 and r = 0.621, respectively). The low flow speed, self-sedimentation and nutrient accumulation in QR resulted in increases in PCA factor 1 scores (phytoplankton biomass and derived CDOM) in the spatial scale, indicating a change of river-type water (YR) to lake-type water (QR). In summer, the water temperature variation induced a growth-bloom-decay process of phytoplankton combined with the increase of PCA factor 2 (humic-like CDOM) in the QR, which was absent in the YR.

  10. Discovering phases, phase transitions, and crossovers through unsupervised machine learning: A critical examination

    DOE PAGES

    Hu, Wenjian; Singh, Rajiv R. P.; Scalettar, Richard T.

    2017-06-19

    Here, we apply unsupervised machine learning techniques, mainly principal component analysis (PCA), to compare and contrast the phase behavior and phase transitions in several classical spin models - the square and triangular-lattice Ising models, the Blume-Capel model, a highly degenerate biquadratic-exchange spin-one Ising (BSI) model, and the 2D XY model, and examine critically what machine learning is teaching us. We find that quantified principal components from PCA not only allow exploration of different phases and symmetry-breaking, but can distinguish phase transition types and locate critical points. We show that the corresponding weight vectors have a clear physical interpretation, which is particularly interesting in the frustrated models such as the triangular antiferromagnet, where they can point to incipient orders. Unlike the other well-studied models, the properties of the BSI model are less well known. Using both PCA and conventional Monte Carlo analysis, we demonstrate that the BSI model shows an absence of phase transition and macroscopic ground-state degeneracy. The failure to capture the 'charge' correlations (vorticity) in the BSI model (XY model) from raw spin configurations points to some of the limitations of PCA. Finally, we employ a nonlinear unsupervised machine learning procedure, the 'autoencoder method', and demonstrate that it too can be trained to capture phase transitions and critical points.

  11. Discovering phases, phase transitions, and crossovers through unsupervised machine learning: A critical examination

    NASA Astrophysics Data System (ADS)

    Hu, Wenjian; Singh, Rajiv R. P.; Scalettar, Richard T.

    2017-06-01

    We apply unsupervised machine learning techniques, mainly principal component analysis (PCA), to compare and contrast the phase behavior and phase transitions in several classical spin models (the square- and triangular-lattice Ising models, the Blume-Capel model, a highly degenerate biquadratic-exchange spin-1 Ising (BSI) model, and the two-dimensional XY model), and we examine critically what machine learning is teaching us. We find that quantified principal components from PCA not only allow the exploration of different phases and symmetry-breaking, but they can distinguish phase-transition types and locate critical points. We show that the corresponding weight vectors have a clear physical interpretation, which is particularly interesting in the frustrated models such as the triangular antiferromagnet, where they can point to incipient orders. Unlike the other well-studied models, the properties of the BSI model are less well known. Using both PCA and conventional Monte Carlo analysis, we demonstrate that the BSI model shows an absence of phase transition and macroscopic ground-state degeneracy. The failure to capture the "charge" correlations (vorticity) in the BSI model (XY model) from raw spin configurations points to some of the limitations of PCA. Finally, we employ a nonlinear unsupervised machine learning procedure, the "autoencoder method," and we demonstrate that it too can be trained to capture phase transitions and critical points.

  12. Study of support vector machine and serum surface-enhanced Raman spectroscopy for noninvasive esophageal cancer detection

    NASA Astrophysics Data System (ADS)

    Li, Shao-Xin; Zeng, Qiu-Yao; Li, Lin-Fang; Zhang, Yan-Jiao; Wan, Ming-Ming; Liu, Zhi-Ming; Xiong, Hong-Lian; Guo, Zhou-Yi; Liu, Song-Hao

    2013-02-01

    The ability of serum surface-enhanced Raman spectroscopy (SERS) combined with a support vector machine (SVM) to improve the classification of esophageal cancer patients versus normal volunteers is investigated. Two groups of serum SERS spectra based on silver nanoparticles (AgNPs) are obtained: one group from patients with pathologically confirmed esophageal cancer (n=30) and the other group from healthy volunteers (n=31). Principal components analysis (PCA), conventional SVM (C-SVM) and conventional SVM in combination with PCA (PCA-SVM) methods are implemented to classify the same spectral dataset. Results show that a diagnostic accuracy of 77.0% is acquired for the PCA technique, while diagnostic accuracies of 83.6% and 85.2% are obtained for the C-SVM and PCA-SVM methods based on radial basis function (RBF) models. The results indicate that RBF SVM models are superior to the PCA algorithm in classifying serum SERS spectra. The study demonstrates that serum SERS in combination with the SVM technique has great potential to provide an effective and accurate diagnostic schema for noninvasive detection of esophageal cancer.

  13. Biochemical signatures of in vitro radiation response in human lung, breast and prostate tumour cells observed with Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Matthews, Q.; Jirasek, A.; Lum, J. J.; Brolo, A. G.

    2011-11-01

    This work applies noninvasive single-cell Raman spectroscopy (RS) and principal component analysis (PCA) to analyze and correlate radiation-induced biochemical changes in a panel of human tumour cell lines that vary by tissue of origin, p53 status and intrinsic radiosensitivity. Six human tumour cell lines, derived from prostate (DU145, PC3 and LNCaP), breast (MDA-MB-231 and MCF7) and lung (H460), were irradiated in vitro with single fractions (15, 30 or 50 Gy) of 6 MV photons. Remaining live cells were harvested for RS analysis at 0, 24, 48 and 72 h post-irradiation, along with unirradiated controls. Single-cell Raman spectra were acquired from 20 cells per sample utilizing a 785 nm excitation laser. All spectra (200 per cell line) were individually post-processed using established methods and the total data set for each cell line was analyzed with PCA using standard algorithms. One radiation-induced PCA component was detected for each cell line by identification of statistically significant changes in the PCA score distributions for irradiated samples, as compared to unirradiated samples, in the first 24-72 h post-irradiation. These RS response signatures arise from radiation-induced changes in cellular concentrations of aromatic amino acids, conformational protein structures and certain nucleic acid and lipid functional groups. Correlation analysis between the radiation-induced PCA components separates the cell lines into three distinct RS response categories: R1 (H460 and MCF7), R2 (MDA-MB-231 and PC3) and R3 (DU145 and LNCaP). These RS categories partially segregate according to radiosensitivity, as the R1 and R2 cell lines are radioresistant (SF2 > 0.6) and the R3 cell lines are radiosensitive (SF2 < 0.5). The R1 and R2 cell lines further segregate according to p53 gene status, corroborated by cell cycle analysis post-irradiation. 
Potential radiation-induced biochemical response mechanisms underlying our RS observations are proposed, such as (1) the regulated synthesis and degradation of structured proteins and (2) the expression of anti-apoptosis factors or other survival signals. This study demonstrates the utility of RS for noninvasive radiobiological analysis of tumour cell radiation response, and indicates the potential for future RS studies designed to investigate, monitor or predict radiation response.

  14. Coupling of on-column trypsin digestion-peptide mapping and principal component analysis for stability and biosimilarity assessment of recombinant human growth hormone.

    PubMed

    Shatat, Sara M; Eltanany, Basma M; Mohamed, Abeer A; Al-Ghobashy, Medhat A; Fathalla, Faten A; Abbas, Samah S

    2018-01-01

    Peptide mapping (PM) is a vital technique in the biopharmaceutical industry. The fingerprint obtained helps to qualitatively confirm host stability as well as verify primary structure, purity and integrity of the target protein. Yet, in-solution digestion followed by tandem mass spectrometry is not suitable as a routine quality control test. It is time consuming and requires sophisticated, expensive instruments and highly skilled operators. In an attempt to enhance the functionality of PM and extract multi-dimensional data about various critical quality attributes and comparability of biosimilars, coupling of PM generated using immobilized trypsin followed by HPLC-UV to principal component analysis (PCA) is proposed. Recombinant human growth hormone (rhGH) was selected as a model biopharmaceutical since it is available in the market from different manufacturers and its PM is a well-established pharmacopoeial test. Samples of different rhGH biosimilars as well as degraded samples (deamidated and oxidized) were subjected to trypsin digestion followed by RP-HPLC-UV analysis. PCA of the entire chromatograms of test and reference samples was then conducted. Comparison of the scores of samples and investigation of the loadings plots clearly indicated the applicability of PM-PCA for: i) identity testing, ii) biosimilarity assessment and iii) stability evaluation. Hotelling's T² and Q statistics were employed at 95% confidence level to measure the variation and to test the conformance of each sample to the PCA model, respectively. Coupling of PM to PCA provided a novel tool to identify peptide fragments responsible for variation between the test and reference samples as well as evaluation of the extent and relative significance of this variability. 
Transformation of conventional PM that is largely based on subjective visual comparison into an objective, statistically guided analysis framework should provide a simple and economic tool to help both manufacturers and regulatory authorities in quality and biosimilarity assessment of biopharmaceuticals. Copyright © 2017 Elsevier B.V. All rights reserved.
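
    As a rough illustration of the two statistics named in this abstract, the sketch below fits a PCA model to reference samples and computes Hotelling's T² (variation within the model) and Q (the squared residual, i.e. conformance to the model) for each sample. It is a sketch under stated assumptions, not the authors' workflow: the "chromatograms" are synthetic, the sizes, the number of components and the names fit_pca and t2_and_q are hypothetical, and the 95% control limits are omitted because they require additional distributional formulas.

```python
import numpy as np

def fit_pca(X_ref, n_components):
    """Fit a PCA model on reference samples (rows = samples)."""
    mean = X_ref.mean(axis=0)
    _, s, Vt = np.linalg.svd(X_ref - mean, full_matrices=False)
    P = Vt[:n_components].T                                  # loadings, features x k
    eigvals = s[:n_components] ** 2 / (X_ref.shape[0] - 1)   # variance of each score
    return mean, P, eigvals

def t2_and_q(X, mean, P, eigvals):
    """Hotelling's T^2 and Q statistic for each row of X."""
    Xc = X - mean
    T = Xc @ P                                # scores in the model subspace
    t2 = np.sum(T**2 / eigvals, axis=1)       # distance from the model center, variance-scaled
    resid = Xc - T @ P.T                      # part of the sample the model cannot reproduce
    q = np.sum(resid**2, axis=1)
    return t2, q

# Synthetic reference set: two latent factors plus small noise (all values hypothetical).
rng = np.random.default_rng(0)
n_ref, n_points, k = 30, 200, 2
profiles = rng.normal(size=(k, n_points))
X_ref = (10.0 + rng.normal(size=(n_ref, k)) @ profiles
         + 0.05 * rng.normal(size=(n_ref, n_points)))

mean, P, eigvals = fit_pca(X_ref, k)
t2_ref, q_ref = t2_and_q(X_ref, mean, P, eigvals)

# A "degraded" sample: a reference trace with an extra peak the model has never seen.
degraded = X_ref[0].copy()
degraded[50:60] += 3.0
t2_deg, q_deg = t2_and_q(degraded[None, :], mean, P, eigvals)

print("largest reference Q:", q_ref.max())
print("degraded sample Q:  ", q_deg[0])
```

    In this toy setting the extra peak lies almost entirely outside the model subspace, so it shows up mainly in Q rather than in T².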

  15. Multivariate analysis of molecular and morphological diversity in fig (Ficus carica L.)

    USDA-ARS's Scientific Manuscript database

    Genetic polymorphism across 15 microsatellite loci among 194 fig accessions including Common, Smyrna, San Pedro, and Caprifig were analyzed using a cluster analysis (CA) and the principal components analysis (PCA). The collection was moderately variable with observed number of alleles per locus rang...

  16. Observation of Nonthermal Emission from the Supernova Remnant IC443 with RXTE

    NASA Technical Reports Server (NTRS)

    Sturner, S. J.; Keohane, J. W.; Reimer, O.

    2002-01-01

    In this paper we present an analysis of X-ray spectra from the supernova remnant IC443 obtained using the PCA on RXTE. The spectra in the 3 - 20 keV band are well fit by a two-component model consisting of thermal and nonthermal components. We compare these results with recent results of other X-ray missions and discuss the need for a cut-off in the nonthermal spectrum. Recent Chandra and XMM-Newton observations suggest that much of the nonthermal emission from IC443 can be attributed to a pulsar wind nebula. We present the results of our search for periodic emission in the RXTE PCA data. We then discuss the origin of the nonthermal component and its possible association with the unidentified EGRET source.

  17. Multivariate analysis of historical data (2004-2013) in assessing the possible environmental impact of the Bellolampo landfill (Palermo).

    PubMed

    Indelicato, Serena; Bongiorno, David; Tuzzolino, Nicola; Mannino, Maria Rosaria; Muscarella, Rosalia; Fradella, Pasquale; Gargano, Maria Elena; Nicosia, Salvatore; Ceraulo, Leopoldo

    2018-03-14

    Multivariate analysis was performed on a large data set of groundwater and leachate samples collected during 9 years of operation of the Bellolampo municipal solid waste landfill (located above Palermo, Italy). The aim was to obtain the most likely correlations among the data. The analysis results are presented. Groundwater samples were collected in the period 2004-2013, whereas the leachate analysis refers to the period 2006-2013. For groundwater, statistical data evaluation revealed notable differences among the samples taken from the numerous wells located around the landfill. Characteristic parameters revealed by principal component analysis (PCA) were more deeply investigated, and corresponding thematic maps were drawn. The composition of the leachate was also thoroughly investigated. Several chemical macro-descriptors were calculated, and the results are presented. A comparison of PCA results for the leachate and groundwater data clearly reveals that the groundwater's main components substantially differ from those of the leachate. This outcome strongly suggests excluding leachate permeation through the multiple landfill lining.

  18. Chemometric and multivariate statistical analysis of time-of-flight secondary ion mass spectrometry spectra from complex Cu-Fe sulfides.

    PubMed

    Kalegowda, Yogesh; Harmer, Sarah L

    2012-03-20

    Time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra of mineral samples are complex, comprised of large mass ranges and many peaks. Consequently, characterization and classification analysis of these systems is challenging. In this study, different chemometric and statistical data evaluation methods, based on monolayer sensitive TOF-SIMS data, have been tested for the characterization and classification of copper-iron sulfide minerals (chalcopyrite, chalcocite, bornite, and pyrite) at different flotation pulp conditions (feed, conditioned feed, and Eh modified). The complex mass spectral data sets were analyzed using the following chemometric and statistical techniques: principal component analysis (PCA); principal component-discriminant functional analysis (PC-DFA); soft independent modeling of class analogy (SIMCA); and k-Nearest Neighbor (k-NN) classification. PCA was found to be an important first step in multivariate analysis, providing insight into both the relative grouping of samples and the elemental/molecular basis for those groupings. For samples exposed to oxidative conditions (at Eh ~430 mV), each technique (PCA, PC-DFA, SIMCA, and k-NN) was found to produce excellent classification. For samples at reductive conditions (at Eh ~ -200 mV SHE), k-NN and SIMCA produced the most accurate classification. Phase identification of particles that contain the same elements but a different crystal structure in a mixed multimetal mineral system has been achieved.

  19. Assets as a Socioeconomic Status Index: Categorical Principal Components Analysis vs. Latent Class Analysis.

    PubMed

    Sartipi, Majid; Nedjat, Saharnaz; Mansournia, Mohammad Ali; Baigi, Vali; Fotouhi, Akbar

    2016-11-01

    Some variables, like Socioeconomic Status (SES), cannot be directly measured; instead, so-called 'latent variables' are measured indirectly through calculating tangible items. There are different methods for measuring latent variables, such as data reduction methods, e.g. Principal Components Analysis (PCA), and Latent Class Analysis (LCA). The purpose of our study was to measure an assets index, as a representative of SES, through two methods, Non-Linear PCA (NLPCA) and LCA, and to compare them in order to choose the most appropriate model. This was a cross-sectional study in which 1995 respondents filled in questionnaires about their assets in Tehran. The data were analyzed by SPSS 19 (CATPCA command) and SAS 9.2 (PROC LCA command) to estimate their socioeconomic status. The results were compared based on the Intra-class Correlation Coefficient (ICC). The 6 classes derived from LCA based on BIC were highly consistent with the 6 classes from CATPCA (Categorical PCA) (ICC = 0.87, 95%CI: 0.86 - 0.88). There is no gold standard to measure SES. Therefore, it is not possible to definitely say that a specific method is better than another one. LCA is a complicated method that presents detailed information about latent variables and requires one assumption (local independence), while NLPCA is a simple method, which requires more assumptions. Generally, NLPCA seems to be an acceptable method of analysis because of its simplicity and high agreement with LCA.

  20. Relationships between NIR spectra and sensory attributes of Thai commercial fish sauces.

    PubMed

    Ritthiruangdej, Pitiporn; Suwonsichon, Thongchai

    2007-07-01

    Twenty Thai commercial fish sauces were characterized by sensory descriptive analysis and near-infrared (NIR) spectroscopy. The main objectives were i) to investigate the relationships between sensory attributes and NIR spectra of samples and ii) to characterize the sensory characteristics of fish sauces based on NIR data. A generic descriptive analysis with 12 trained panels was used to characterize the sensory attributes. These attributes consisted of 15 descriptors: brown color, 5 aromatics (sweet, caramelized, fermented, fishy, and musty), 4 tastes (sweet, salty, bitter, and umami), 3 aftertastes (sweet, salty and bitter) and 2 flavors (caramelized and fishy). The results showed that Thai fish sauce samples exhibited significant differences in all of sensory attribute values (p < 0.05). NIR transflectance spectra were obtained from 1100 to 2500 nm. Prior to investigation of the relationships between sensory attributes and NIR spectra, principal component analysis (PCA) was applied to reduce the dimensionality of the spectral data from 622 wavelengths to two uncorrelated components (NIR1 and NIR2) which explained 92 and 7% of the total variation, respectively. NIR1 was highly correlated with the wavelength regions of 1100 - 1544, 1774 - 2062, 2092 - 2308, and 2358 - 2440 nm, while NIR2 was highly correlated with the wavelength regions of 1742 - 1764, 2066 - 2088, and 2312 - 2354 nm. Subsequently, the relationships among these two components and all sensory attributes were also investigated by PCA. The results showed that the first three principal components (PCs) named as fishy flavor component (PC1), sweet component (PC2) and bitterness component (PC3), respectively, explained a total of 66.86% of the variation. NIR1 was mainly correlated to the sensory attributes of fishy aromatic, fishy flavor and sweet aftertaste on PC1. 
In addition, the PCA using only the factor loadings of NIR1 and NIR2 could be used to classify samples into three groups which showed high, medium and low degrees of fishy aromatic, fishy flavor and sweet aftertaste.

  1. Metabolic fingerprint of Brazilian maize landraces silk (stigma/styles) using NMR spectroscopy and chemometric methods.

    PubMed

    Kuhnen, Shirley; Bernardi Ogliari, Juliana; Dias, Paulo Fernando; da Silva Santos, Maiara; Ferreira, Antônio Gilberto; Bonham, Connie C; Wood, Karl Vernon; Maraschin, Marcelo

    2010-02-24

    Aqueous extract from maize silks is used by traditional medicine for the treatment of several ailments, mainly related to the urinary system. This work focuses on the application of NMR spectroscopy and chemometric analysis for the determination of metabolic fingerprint and pattern recognition of silk extracts from seven maize landraces cultivated in southern Brazil. Principal component analysis (PCA) of the (1)H NMR data set showed clear discrimination among the maize varieties by PC1 and PC2, pointing out three distinct metabolic profiles. Target compounds analysis showed significant differences (p < 0.05) in the contents of protocatechuic acid, gallic acid, t-cinnamic acid, and anthocyanins, corroborating the discrimination of the genotypes in this study as revealed by PCA analysis. Thus the combination of (1)H NMR and PCA is a useful tool for the discrimination of maize silks in respect to their chemical composition, including rapid authentication of the raw material of current pharmacological interest.

  2. Nondestructive determination of transgenic Bacillus thuringiensis rice seeds (Oryza sativa L.) using multispectral imaging and chemometric methods.

    PubMed

    Liu, Changhong; Liu, Wei; Lu, Xuzhong; Chen, Wei; Yang, Jianbo; Zheng, Lei

    2014-06-15

    Crop-to-crop transgene flow may affect the seed purity of non-transgenic rice varieties, resulting in unwanted biosafety consequences. The feasibility of a rapid and nondestructive determination of transgenic rice seeds from its non-transgenic counterparts was examined by using multispectral imaging system combined with chemometric data analysis. Principal component analysis (PCA), partial least squares discriminant analysis (PLSDA), least squares-support vector machines (LS-SVM), and PCA-back propagation neural network (PCA-BPNN) methods were applied to classify rice seeds according to their genetic origins. The results demonstrated that clear differences between non-transgenic and transgenic rice seeds could be easily visualized with the nondestructive determination method developed through this study and an excellent classification (up to 100% with LS-SVM model) can be achieved. It is concluded that multispectral imaging together with chemometric data analysis is a promising technique to identify transgenic rice seeds with high efficiency, providing bright prospects for future applications. Copyright © 2013 Elsevier Ltd. All rights reserved.

  3. Analysis of antique bronze coins by Laser Induced Breakdown Spectroscopy and multivariate analysis

    NASA Astrophysics Data System (ADS)

    Bachler, M. Orlić; Bišćan, M.; Kregar, Z.; Jelovica Badovinac, I.; Dobrinić, J.; Milošević, S.

    2016-09-01

    This work presents a feasibility study of applying Principal Component Analysis (PCA) to data obtained by Laser-Induced Breakdown Spectroscopy (LIBS) with the aim of determining correlations between different samples. The samples were antique bronze coins coated in silver (follis), dated to the Roman Empire period and struck under different rulers in different mints. While raw LIBS data revealed that in the period from the year 286 to 383 CE the silver content was constantly decreasing, the PCA showed that the samples can be somewhat grouped together based on their place of origin, which could be a useful hint when analysing unknown samples. It was also found that PCA can help in discriminating spectra corresponding to ablation from the surface and from the bulk. Furthermore, the Partial Least Squares (PLS) method was used to obtain, based on a set of samples with known composition, an estimate of the relative copper concentration in the studied ancient coins. This analysis showed that copper concentration in surface layers ranged from 83% to 90%.

  4. Multivariate Analysis of Electron Detachment Dissociation and Infrared Multiphoton Dissociation Mass Spectra of Heparan Sulfate Tetrasaccharides Differing Only in Hexuronic acid Stereochemistry

    NASA Astrophysics Data System (ADS)

    Oh, Han Bin; Leach, Franklin E.; Arungundram, Sailaja; Al-Mafraji, Kanar; Venot, Andre; Boons, Geert-Jan; Amster, I. Jonathan

    2011-03-01

    The structural characterization of glycosaminoglycan (GAG) carbohydrates by mass spectrometry has been a long-standing analytical challenge due to the inherent heterogeneity of these biomolecules, specifically polydispersity, variability in sulfation, and hexuronic acid stereochemistry. Recent advances in tandem mass spectrometry methods employing threshold and electron-based ion activation have resulted in the ability to determine the location of the labile sulfate modification as well as assign the stereochemistry of hexuronic acid residues. To facilitate the analysis of complex electron detachment dissociation (EDD) spectra, principal component analysis (PCA) is employed to differentiate the hexuronic acid stereochemistry of four synthetic GAG epimers whose EDD spectra are nearly identical upon visual inspection. For comparison, PCA is also applied to infrared multiphoton dissociation spectra (IRMPD) of the examined epimers. To assess the applicability of multivariate methods in GAG mixture analysis, PCA is utilized to identify the relative content of two epimers in a binary mixture.

  5. Performance evaluation of BPM system in SSRF using PCA method

    NASA Astrophysics Data System (ADS)

    Chen, Zhi-Chu; Leng, Yong-Bin; Yan, Ying-Bing; Yuan, Ren-Xian; Lai, Long-Wei

    2014-07-01

    The beam position monitor (BPM) system is of the utmost importance in a light source. The capability of the BPM depends on the resolution of the system. The traditional method of taking the standard deviation of the raw data gives merely an upper limit on the resolution. Principal component analysis (PCA) has been introduced into accelerator physics and can be used to remove the actual signals. Beam-related information was extracted before the evaluation of the BPM performance. A series of studies was carried out at the Shanghai Synchrotron Radiation Facility (SSRF), and PCA proved to be an effective and robust method for evaluating the performance of our BPM system.

  6. Optimization of critical medium components using response surface methodology for phenazine-1-carboxylic acid production by Pseudomonas sp. M-18Q.

    PubMed

    Yuan, Li-Li; Li, Ya-Qian; Wang, Yi; Zhang, Xue-Hong; Xu, Yu-Quan

    2008-03-01

    The optimal flask-shaking batch fermentation medium for phenazine-1-carboxylic acid (PCA) production by Pseudomonas sp. M-18Q, a qscR chromosomal inactivated mutant of the strain M18 was studied using statistical experimental design and analysis. The Plackett-Burman design (PBD) was used to evaluate the effects of eight medium components on the production of PCA, which showed that glucose and soytone were the most significant ingredients (P<0.05). The steepest ascent experiment was adopted to determine the optimal region of the medium composition. The optimum composition of the fermentation medium for maximum PCA yield, as determined on the basis of a five-level two-factor central composite design (CCD), was obtained by response surface methodology (RSM). The high correlation between the predicted and observed values indicated the validity of the model. A maximum PCA yield of 1240 mg/l was obtained at 17.81 g/l glucose and 11.47 g/l soytone, and the production was increased by 65.3% compared with that using the original medium, which was at 750 mg/l.

  7. Receptor modeling for source apportionment of polycyclic aromatic hydrocarbons in urban atmosphere.

    PubMed

    Singh, Kunwar P; Malik, Amrita; Kumar, Ranjan; Saxena, Puneet; Sinha, Sarita

    2008-01-01

    This study reports source apportionment of polycyclic aromatic hydrocarbons (PAHs) in particulate depositions on vegetation foliages near highway in the urban environment of Lucknow city (India) using the principal components analysis/absolute principal components scores (PCA/APCS) receptor modeling approach. The multivariate method enables identification of major PAHs sources along with their quantitative contributions with respect to individual PAH. The PCA identified three major sources of PAHs viz. combustion, vehicular emissions, and diesel based activities. The PCA/APCS receptor modeling approach revealed that the combustion sources (natural gas, wood, coal/coke, biomass) contributed 19-97% of various PAHs, vehicular emissions 0-70%, diesel based sources 0-81% and other miscellaneous sources 0-20% of different PAHs. The contributions of major pyrolytic and petrogenic sources to the total PAHs were 56 and 42%, respectively. Further, the combustion related sources contribute major fraction of the carcinogenic PAHs in the study area. High correlation coefficient (R2 > 0.75 for most PAHs) between the measured and predicted concentrations of PAHs suggests for the applicability of the PCA/APCS receptor modeling approach for estimation of source contribution to the PAHs in particulates.

  8. Analyzing brain networks with PCA and conditional Granger causality.

    PubMed

    Zhou, Zhenyu; Chen, Yonghong; Ding, Mingzhou; Wright, Paul; Lu, Zuhong; Liu, Yijun

    2009-07-01

    Identifying directional influences in anatomical and functional circuits presents one of the greatest challenges for understanding neural computations in the brain. Granger causality mapping (GCM) derived from vector autoregressive models of data has been employed for this purpose, revealing complex temporal and spatial dynamics underlying cognitive processes. However, the traditional GCM methods are computationally expensive, as signals from thousands of voxels within selected regions of interest (ROIs) are individually processed, and being based on pairwise Granger causality, they lack the ability to distinguish direct from indirect connectivity among brain regions. In this work a new algorithm called PCA based conditional GCM is proposed to overcome these problems. The algorithm implements the following two procedures: (i) dimensionality reduction in ROIs of interest with principal component analysis (PCA), and (ii) estimation of the direct causal influences in local brain networks, using conditional Granger causality. Our results show that the proposed method achieves greater accuracy in detecting network connectivity than the commonly used pairwise Granger causality method. Furthermore, the use of PCA components in conjunction with conditional GCM greatly reduces the computational cost relative to the use of individual voxel time series. Copyright 2009 Wiley-Liss, Inc

  9. Detection and Characterization of Ground Displacement Sources from Variational Bayesian Independent Component Analysis of GPS Time Series

    NASA Astrophysics Data System (ADS)

    Gualandi, A.; Serpelloni, E.; Belardinelli, M. E.

    2014-12-01

    A critical point in the analysis of ground displacement time series is the development of data-driven methods that make it possible to discern and characterize the different sources that generate the observed displacements. A widely used multivariate statistical technique is Principal Component Analysis (PCA), which reduces the dimensionality of the data space while retaining most of the explained variance of the dataset. It reproduces the original data using a limited number of Principal Components, but it also shows some deficiencies. Indeed, PCA does not perform well in finding the solution to the so-called Blind Source Separation (BSS) problem, i.e. in recovering and separating the original sources that generated the observed data. This is mainly due to the assumptions on which PCA relies: it looks for a new Euclidean space where the projected data are uncorrelated. Usually, the uncorrelation condition is not strong enough, and it has been proven that the BSS problem can be tackled by requiring the components to be independent. Independent Component Analysis (ICA) is, in fact, another popular technique adopted to approach this problem, and it can be used in all those fields where PCA is also applied. An ICA approach enables us to explain the time series while imposing fewer constraints on the model, and to reveal anomalies in the data such as transient signals. However, the independence condition is not easy to impose, and it is often necessary to introduce some approximations. To work around this problem, we use a variational Bayesian ICA (vbICA) method, which models the probability density function (pdf) of each source signal using a mixture of Gaussian distributions. This technique allows for more flexibility in the description of the pdf of the sources, giving a more reliable estimate of them. Here we present the application of the vbICA technique to GPS position time series. First, we use vbICA on synthetic data that simulate a seismic cycle (interseismic + coseismic + postseismic + seasonal + noise), and study the ability of the algorithm to recover the original (known) sources of deformation. Secondly, we apply vbICA to different tectonically active scenarios, such as earthquakes in central and northern Italy, as well as the study of slow slip events in Cascadia.
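
    The distinction this abstract draws between uncorrelated and independent components can be illustrated with a minimal sketch. It assumes NumPy, two synthetic uniform sources, and a hypothetical mixing matrix `A`; none of these come from the record, and no ICA (let alone vbICA) is run here. The sketch only shows that PCA scores are uncorrelated by construction while each one can still be a blend of both sources.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Two independent, non-Gaussian (uniform) sources; purely synthetic.
S = rng.uniform(-1.0, 1.0, size=(n, 2))

# Hypothetical non-orthogonal mixing matrix: each observed channel blends both sources.
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])
X = S @ A.T

# PCA via eigendecomposition of the sample covariance matrix.
Xc = X - X.mean(axis=0)
evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
evecs = evecs[:, ::-1]            # eigh returns ascending order; put PC1 first
pcs = Xc @ evecs

# The PCA scores are uncorrelated by construction ...
pc_corr = np.corrcoef(pcs, rowvar=False)[0, 1]

# ... yet each PC can still correlate with both sources, i.e. remain a mixture.
cross = np.array([[np.corrcoef(pcs[:, i], S[:, j])[0, 1] for j in range(2)]
                  for i in range(2)])

print("corr(PC1, PC2):", pc_corr)
print("|corr(PC_i, source_j)|:")
print(np.abs(cross).round(2))
```

    With this particular mixing matrix every principal component correlates about equally with both sources, so decorrelation alone does not separate them. That is the gap an independence criterion is meant to close.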

  10. Principal Component Analysis for Normal-Distribution-Valued Symbolic Data.

    PubMed

    Wang, Huiwen; Chen, Meiling; Shi, Xiaojun; Li, Nan

    2016-02-01

    This paper puts forward a new approach to principal component analysis (PCA) for normal-distribution-valued symbolic data, which has vast potential for applications in the economic and management fields. We derive a full set of numerical characteristics and the variance-covariance structure for such data, which forms the foundation for our analytical PCA approach. Our approach is able to use all of the variance information in the original data, unlike the prevailing representative-type approach in the literature, which only uses centers, vertices, etc. The paper also provides an accurate approach to constructing the observations in a PC space based on the linear additivity property of the normal distribution. The effectiveness of the proposed method is illustrated by simulated numerical experiments. Finally, our method is applied to explain the puzzle of the risk-return tradeoff in China's stock market.

  11. Morphological analysis of Trichomycterus areolatus Valenciennes, 1846 from southern Chilean rivers using a truss-based system (Siluriformes, Trichomycteridae)

    PubMed Central

    Colihueque, Nelson; Corrales, Olga; Yáñez, Miguel

    2017-01-01

    Trichomycterus areolatus Valenciennes, 1846 is a small endemic catfish inhabiting the Andean river basins of Chile. In this study, the morphological variability of three T. areolatus populations, collected in two river basins from southern Chile, was assessed with multivariate analyses, including principal component analysis (PCA) and discriminant function analysis (DFA). It is hypothesized that populations must segregate morphologically from each other based on the river basin that they were sampled from, since each basin presents relatively particular hydrological characteristics. Significant morphological differences among the three populations were found with PCA (ANOSIM test, r = 0.552, p < 0.0001) and DFA (Wilks’s λ = 0.036, p < 0.01). PCA accounted for a total variation of 56.16% by the first two principal components. The first Principal Component (PC1) and PC2 explained 34.72 and 21.44% of the total variation, respectively. The scatter-plot of the first two discriminant functions (DF1 on DF2) also validated the existence of three different populations. In group classification using DFA, 93.3% of the specimens were correctly classified into their original populations. Of the total of 22 transformed truss measurements, 17 exhibited highly significant (p < 0.01) differences among populations. The data support the existence of T. areolatus morphological variation across different rivers in southern Chile, likely reflecting the geographic isolation underlying population structure of the species. PMID:29134012

  12. Source Determination of Red Gel Pen Inks using Raman Spectroscopy and Attenuated Total Reflectance Fourier Transform Infrared Spectroscopy combined with Pearson's Product Moment Correlation Coefficients and Principal Component Analysis.

    PubMed

    Mohamad Asri, Muhammad Naeim; Mat Desa, Wan Nur Syuhaila; Ismail, Dzulkiflee

    2018-01-01

    The potential combination of two nondestructive techniques, that is, Raman spectroscopy (RS) and attenuated total reflectance-Fourier transform infrared (ATR-FTIR) spectroscopy with Pearson's product moment correlation (PPMC) coefficient (r) and principal component analysis (PCA) to determine the actual source of red gel pen ink used to write a simulated threatening note, was examined. Eighteen (18) red gel pens purchased from Japan and Malaysia from November to December 2014, one of which was used to write a simulated threatening note, were analyzed using RS and ATR-FTIR spectroscopy, respectively. The spectra of all the red gel pen inks including the ink deposited on the simulated threatening note gathered from the RS and ATR-FTIR analyses were subjected to PPMC coefficient (r) calculation and principal component analysis (PCA). The coefficients r = 0.9985 and r = 0.9912 for pairwise combination of RS and ATR-FTIR spectra respectively and similarities in terms of PC1 and PC2 scores of one of the inks to the ink deposited on the simulated threatening note substantiated the feasibility of combining RS and ATR-FTIR spectroscopy with PPMC coefficient (r) and PCA for successful source determination of red gel pen inks. The development of a pigment spectral library allowed the ink deposited on the threatening note to be identified as XSL Poppy Red (CI Pigment Red 112). © 2017 American Academy of Forensic Sciences.

  13. Once upon Multivariate Analyses: When They Tell Several Stories about Biological Evolution.

    PubMed

    Renaud, Sabrina; Dufour, Anne-Béatrice; Hardouin, Emilie A; Ledevin, Ronan; Auffray, Jean-Christophe

    2015-01-01

    Geometric morphometrics aims to characterize the geometry of complex traits. It is therefore by essence multivariate. The most popular methods to investigate patterns of differentiation in this context are (1) the Principal Component Analysis (PCA), which is an eigenvalue decomposition of the total variance-covariance matrix among all specimens; (2) the Canonical Variate Analysis (CVA, a.k.a. linear discriminant analysis (LDA) for more than two groups), which aims at separating the groups by maximizing the between-group to within-group variance ratio; (3) the between-group PCA (bgPCA) which investigates patterns of between-group variation, without standardizing by the within-group variance. Standardizing within-group variance, as performed in the CVA, distorts the relationships among groups, an effect that is particularly strong if the variance is similarly oriented in all groups. Such shared direction of main morphological variance may occur and have a biological meaning, for instance corresponding to the most frequent standing genetic variation in a population. Here we undertake a case study of the evolution of house mouse molar shape across various islands, based on the real dataset and simulations. We investigated how patterns of main variance influence the depiction of among-group differentiation according to the interpretation of the PCA, bgPCA and CVA. Without arguing about a method performing 'better' than another, it rather emerges that working on the total or between-group variance (PCA and bgPCA) will tend to put the focus on the role of direction of main variance as line of least resistance to evolution. Standardizing by the within-group variance (CVA), by dampening the expression of this line of least resistance, has the potential to reveal other relevant patterns of differentiation that may otherwise be blurred.
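
    The definitions of PCA and bgPCA given in this abstract can be sketched in a few lines. This is a hedged illustration, not the authors' procedure: it assumes NumPy, synthetic data, unweighted group means, and centering on the grand mean, and the helper names `pca_eig` and `bg_pca` are hypothetical.

```python
import numpy as np

def pca_eig(X):
    """PCA as an eigendecomposition of the total variance-covariance matrix."""
    Xc = X - X.mean(axis=0)                      # center each variable
    evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(evals)[::-1]              # sort by decreasing variance
    evals, evecs = evals[order], evecs[:, order]
    return evals, evecs, Xc @ evecs

def bg_pca(X, groups):
    """Between-group PCA: axes from the covariance of (unweighted) group means,
    with no standardization by within-group variance; all specimens are projected."""
    labels = np.unique(groups)
    means = np.array([X[groups == g].mean(axis=0) for g in labels])
    evals, evecs = np.linalg.eigh(np.cov(means, rowvar=False))
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    scores = (X - X.mean(axis=0)) @ evecs        # assumption: center on the grand mean
    return evals, evecs, scores

# Synthetic example: 3 groups of 30 specimens, 4 variables. Groups differ on
# variables 0 and 1; variable 2 carries large within-group variance.
rng = np.random.default_rng(0)
groups = np.repeat(np.array([0, 1, 2]), 30)
centers = np.array([[0, 0, 0, 0], [2, 0, 0, 0], [0, 2, 0, 0]], dtype=float)
X = centers[groups] + rng.normal(size=(90, 4)) * np.array([0.3, 0.3, 1.5, 0.3])

evals, evecs, scores = pca_eig(X)
bg_evals, bg_evecs, bg_scores = bg_pca(X, groups)
print("PCA explained variance ratios:", (evals / evals.sum()).round(3))
print("bgPCA eigenvalues:", bg_evals.round(3))
```

    With three groups, the covariance of the group means has at most two non-zero eigenvalues, so bgPCA yields at most two informative axes here. CVA is not shown.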

  14. Evaluation of Parallel Analysis Methods for Determining the Number of Factors

    ERIC Educational Resources Information Center

    Crawford, Aaron V.; Green, Samuel B.; Levy, Roy; Lo, Wen-Juo; Scott, Lietta; Svetina, Dubravka; Thompson, Marilyn S.

    2010-01-01

    Population and sample simulation approaches were used to compare the performance of parallel analysis using principal component analysis (PA-PCA) and parallel analysis using principal axis factoring (PA-PAF) to identify the number of underlying factors. Additionally, the accuracies of the mean eigenvalue and the 95th percentile eigenvalue criteria…
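
    A minimal sketch of parallel analysis with PCA eigenvalues may help make the two criteria named above concrete. It assumes NumPy, eigenvalues of the correlation matrix, normally distributed comparison data, and a stop-at-first-failure retention rule. The function name `parallel_analysis_pca` and the synthetic two-factor data are hypothetical, and the principal axis factoring variant (PA-PAF) is not covered.

```python
import numpy as np

def sorted_corr_eigenvalues(data):
    """Eigenvalues of the correlation matrix, largest first."""
    return np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

def parallel_analysis_pca(X, n_sims=200, percentile=95.0, seed=0):
    """Compare observed eigenvalues with those of random data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    observed = sorted_corr_eigenvalues(X)
    simulated = np.array([sorted_corr_eigenvalues(rng.standard_normal((n, p)))
                          for _ in range(n_sims)])

    def n_retained(threshold):
        # Retain leading components until the first one that fails the threshold.
        exceeds = observed > threshold
        return p if exceeds.all() else int(np.argmin(exceeds))

    return {
        "mean": n_retained(simulated.mean(axis=0)),
        "percentile": n_retained(np.percentile(simulated, percentile, axis=0)),
    }

# Synthetic data: 500 observations, 6 variables, 2 underlying factors.
rng = np.random.default_rng(1)
factors = rng.standard_normal((500, 2))
loadings = np.zeros((2, 6))
loadings[0, :3] = 0.8          # variables 0-2 load on factor 1
loadings[1, 3:] = 0.8          # variables 3-5 load on factor 2
X = factors @ loadings + 0.6 * rng.standard_normal((500, 6))

result = parallel_analysis_pca(X)
print(result)
```

    The dictionary holds the number of components retained under the mean-eigenvalue criterion and under the 95th-percentile criterion. On clean data such as this the two usually agree; they can diverge when factors are weak or samples are small.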

  15. SU-F-J-138: An Extension of PCA-Based Respiratory Deformation Modeling Via Multi-Linear Decomposition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Iliopoulos, AS; Sun, X; Pitsianis, N

    Purpose: To address and lift the limited degree of freedom (DoF) of globally bilinear motion components such as those based on principal components analysis (PCA), for encoding and modeling volumetric deformation motion. Methods: We provide a systematic approach to obtaining a multi-linear decomposition (MLD) and associated motion model from deformation vector field (DVF) data. We had previously introduced MLD for capturing multi-way relationships between DVF variables, without being restricted by the bilinear component format of PCA-based models. PCA-based modeling is commonly used for encoding patient-specific deformation as per planning 4D-CT images, and aiding on-board motion estimation during radiotherapy. However, the bilinear space-time decomposition inherently limits the DoF of such models by the small number of respiratory phases. While this limit is not reached in model studies using analytical or digital phantoms with low-rank motion, it compromises modeling power in the presence of relative motion, asymmetries and hysteresis, etc, which are often observed in patient data. Specifically, a low-DoF model will spuriously couple incoherent motion components, compromising its adaptability to on-board deformation changes. By the multi-linear format of extracted motion components, MLD-based models can encode higher-DoF deformation structure. Results: We conduct mathematical and experimental comparisons between PCA- and MLD-based models. A set of temporally-sampled analytical trajectories provides a synthetic, high-rank DVF; trajectories correspond to respiratory and cardiac motion factors, including different relative frequencies and spatial variations. Additionally, a digital XCAT phantom is used to simulate a lung lesion deforming incoherently with respect to the body, which adheres to a simple respiratory trend. In both cases, coupling of incoherent motion components due to a low model DoF is clearly demonstrated. Conclusion: Multi-linear decomposition can enable decoupling of distinct motion factors in high-rank DVF measurements. This may improve motion model expressiveness and adaptability to on-board deformation, aiding model-based image reconstruction for target verification. NIH Grant No. R01-184173.

  16. Comparative Analysis of a Principal Component Analysis-Based and an Artificial Neural Network-Based Method for Baseline Removal.

    PubMed

    Carvajal, Roberto C; Arias, Luis E; Garces, Hugo O; Sbarbaro, Daniel G

    2016-04-01

    This work presents a non-parametric method based on principal component analysis (PCA) and a parametric one based on artificial neural networks (ANN) to remove continuous baseline features from spectra. The non-parametric method estimates the baseline based on a set of sampled basis vectors obtained from PCA applied over a previously composed continuous spectra learning matrix. The parametric method, however, uses an ANN to filter out the baseline. Previous studies have demonstrated that this method is one of the most effective for baseline removal. The evaluation of both methods was carried out by using a synthetic database designed for benchmarking baseline removal algorithms, containing 100 synthetic composed spectra at different signal-to-baseline ratio (SBR), signal-to-noise ratio (SNR), and baseline slopes. In addition, to demonstrate the utility of the proposed methods and to compare them in a real application, a spectral data set measured from a flame radiation process was used. Several performance metrics such as correlation coefficient, chi-square value, and goodness-of-fit coefficient were calculated to quantify and compare both algorithms. Results demonstrate that the PCA-based method outperforms the one based on ANN both in terms of performance and simplicity. © The Author(s) 2016.

  17. Using Principal Component and Tidal Analysis as a Quality Metric for Detecting Systematic Heading Uncertainty in Long-Term Acoustic Doppler Current Profiler Data

    NASA Astrophysics Data System (ADS)

    Morley, M. G.; Mihaly, S. F.; Dewey, R. K.; Jeffries, M. A.

    2015-12-01

    Ocean Networks Canada (ONC) operates the NEPTUNE and VENUS cabled ocean observatories to collect data on physical, chemical, biological, and geological ocean conditions over multi-year time periods. Researchers can download real-time and historical data from a large variety of instruments to study complex earth and ocean processes from their home laboratories. Ensuring that the users are receiving the most accurate data is a high priority at ONC, requiring quality assurance and quality control (QAQC) procedures to be developed for all data types. While some data types have relatively straightforward QAQC tests, such as scalar data range limits that are based on expected observed values or measurement limits of the instrument, for other data types the QAQC tests are more comprehensive. Long time series of ocean currents from Acoustic Doppler Current Profilers (ADCP), stitched together from multiple deployments over many years is one such data type where systematic data biases are more difficult to identify and correct. Data specialists at ONC are working to quantify systematic compass heading uncertainty in long-term ADCP records at each of the major study sites using the internal compass, remotely operated vehicle bearings, and more analytical tools such as principal component analysis (PCA) to estimate the optimal instrument alignments. In addition to using PCA, some work has been done to estimate the main components of the current at each site using tidal harmonic analysis. This paper describes the key challenges and presents preliminary PCA and tidal analysis approaches used by ONC to improve long-term observatory current measurements.

  18. Predicting timing of foot strike during running, independent of striking technique, using principal component analysis of joint angles.

    PubMed

    Osis, Sean T; Hettinga, Blayne A; Leitch, Jessica; Ferber, Reed

    2014-08-22

    As 3-dimensional (3D) motion-capture for clinical gait analysis continues to evolve, new methods must be developed to improve the detection of gait cycle events based on kinematic data. Recently, the application of principal component analysis (PCA) to gait data has shown promise in detecting important biomechanical features. Therefore, the purpose of this study was to define a new foot strike detection method for a continuum of striking techniques, by applying PCA to joint angle waveforms. In accordance with Newtonian mechanics, it was hypothesized that transient features in the sagittal-plane accelerations of the lower extremity would be linked with the impulsive application of force to the foot at foot strike. Kinematic and kinetic data from treadmill running were selected for 154 subjects, from a database of gait biomechanics. Ankle, knee and hip sagittal plane angular acceleration kinematic curves were chained together to form a row input to a PCA matrix. A linear polynomial was calculated based on PCA scores, and a 10-fold cross-validation was performed to evaluate prediction accuracy against gold-standard foot strike as determined by a 10 N rise in the vertical ground reaction force. Results show 89-94% of all predicted foot strikes were within 4 frames (20 ms) of the gold standard with the largest error being 28 ms. It is concluded that this new foot strike detection is an improvement on existing methods and can be applied regardless of whether the runner exhibits a rearfoot, midfoot, or forefoot strike pattern. Copyright © 2014 Elsevier Ltd. All rights reserved.

  19. Sample-Poor Estimation of Order and Common Signal Subspace with Application to Fusion of Medical Imaging Data

    PubMed Central

    Levin-Schwartz, Yuri; Song, Yang; Schreier, Peter J.; Calhoun, Vince D.; Adalı, Tülay

    2016-01-01

    Due to their data-driven nature, multivariate methods such as canonical correlation analysis (CCA) have proven very useful for fusion of multimodal neurological data. However, being able to determine the degree of similarity between datasets and appropriate order selection are crucial to the success of such techniques. The standard methods for calculating the order of multimodal data focus only on sources with the greatest individual energy and ignore relations across datasets. Additionally, these techniques as well as the most widely-used methods for determining the degree of similarity between datasets assume sufficient sample support and are not effective in the sample-poor regime. In this paper, we propose to jointly estimate the degree of similarity between datasets and their order when few samples are present using principal component analysis and canonical correlation analysis (PCA-CCA). By considering these two problems simultaneously, we are able to minimize the assumptions placed on the data and achieve superior performance in the sample-poor regime compared to traditional techniques. We apply PCA-CCA to the pairwise combinations of functional magnetic resonance imaging (fMRI), structural magnetic resonance imaging (sMRI), and electroencephalogram (EEG) data drawn from patients with schizophrenia and healthy controls while performing an auditory oddball task. The PCA-CCA results indicate that the fMRI and sMRI datasets are the most similar, whereas the sMRI and EEG datasets share the least similarity. We also demonstrate that the degree of similarity obtained by PCA-CCA is highly predictive of the degree of significance found for components generated using CCA. PMID:27039696

  20. Classification and quantification analysis of peach kernel from different origins with near-infrared diffuse reflection spectroscopy

    PubMed Central

    Liu, Wei; Wang, Zhen-Zhong; Qing, Jian-Ping; Li, Hong-Juan; Xiao, Wei

    2014-01-01

    Background: Peach kernels, which contain various fatty acids, play an important role in the regulation of a variety of physiological and biological functions. Objective: To establish an innovative and rapid diffuse reflectance near-infrared spectroscopy (DR-NIR) analysis method along with chemometric techniques for the qualitative and quantitative determination of a peach kernel. Materials and Methods: Peach kernel samples from nine different origins were analyzed with high-performance liquid chromatography (HPLC) as a reference method. DR-NIR is in the spectral range 1100-2300 nm. Principal component analysis (PCA) and partial least squares regression (PLSR) algorithm were applied to obtain prediction models. The Savitzky-Golay derivative and first derivative were adopted for the spectral pre-processing, and PCA was applied to classify the varieties of those samples. For the quantitative calibration, the models of linoleic and oleinic acids were established with the PLSR algorithm and the optimal principal component (PC) numbers were selected with leave-one-out (LOO) cross-validation. The established models were evaluated with the root mean square error of deviation (RMSED) and corresponding correlation coefficients (R2). Results: The PCA results of DR-NIR spectra yield clear classification of the two varieties of peach kernel. PLSR had a better predictive ability. The correlation coefficients of the two calibration models were above 0.99, and the RMSED of linoleic and oleinic acids were 1.266% and 1.412%, respectively. Conclusion: The DR-NIR combined with PCA and PLSR algorithm could be used efficiently to identify and quantify peach kernels and also help to address the problem of variety identification. PMID:25422544

  1. Combination of 1H nuclear magnetic resonance spectroscopy and principal component analysis to evaluate the lipid fluidity of flutamide-encapsulated lipid nanoemulsions.

    PubMed

    Takegami, Shigehiko; Ueyama, Keita; Konishi, Atsuko; Kitade, Tatsuya

    2018-06-06

    The lipid fluidity of various lipid nanoemulsions (LNEs) without and with flutamide (FT) and containing one of two neutral lipids, one of four phosphatidylcholines as a surfactant, and sodium palmitate as a cosurfactant was investigated by the combination of 1H nuclear magnetic resonance (NMR) spectroscopy and principal component analysis (PCA). In the 1H NMR spectra, the peaks from the methylene groups of the neutral lipids and surfactants for all LNE preparations showed downfield shifts with increasing temperature from 20 to 60 °C. PCA was applied to the 1H NMR spectral data obtained for the LNEs. The PCA resulted in a model in which the first two principal components (PCs) extracted 88% of the total spectral variation; the first PC (PC-1) axis and second PC (PC-2) axis accounted for 73 and 15%, respectively, of the total spectral variation. The Score-1 values for PC-1 plotted against temperature revealed the existence of two clusters, which were defined by the neutral lipid of the LNE preparations. Meanwhile, the Score-2 values decreased with rising temperature and reflected the increase in lipid fluidity of each LNE preparation, consistent with fluorescence anisotropy measurements. In addition, the changes of Score-2 values with temperature for LNE preparations with FT were smaller than those for LNE preparations without FT. This indicates that FT encapsulated in LNE particles markedly suppressed the increase in lipid fluidity of LNE particles with rising temperature. Thus, PCA of 1H NMR spectra will become a powerful tool to analyze the lipid fluidity of lipid nanoparticles.

  2. The Relationship between Parent-Reported Coping, Stress, and Mental Health in a Preschool Population

    ERIC Educational Resources Information Center

    Kiernan, Neisha; Frydenberg, Erica; Deans, Jan; Liang, Rachel

    2017-01-01

    The present study explored the component structure of coping in preschoolers as measured by the Children's Coping Scale-Revised (CCS-R) through principal component analysis (PCA). The study also examined the relationship between different coping patterns and mental health (as measured by the Strengths and Difficulties Questionnaire; SDQ) in…

  3. Statistical analysis of aerosol species, trace gasses, and meteorology in Chicago.

    PubMed

    Binaku, Katrina; O'Brien, Timothy; Schmeling, Martina; Fosco, Tinamarie

    2013-09-01

    Both canonical correlation analysis (CCA) and principal component analysis (PCA) were applied to atmospheric aerosol and trace gas concentrations and meteorological data collected in Chicago during the summer months of 2002, 2003, and 2004. Concentrations of ammonium, calcium, nitrate, sulfate, and oxalate particulate matter, as well as, meteorological parameters temperature, wind speed, wind direction, and humidity were subjected to CCA and PCA. Ozone and nitrogen oxide mixing ratios were also included in the data set. The purpose of statistical analysis was to determine the extent of existing linear relationship(s), or lack thereof, between meteorological parameters and pollutant concentrations in addition to reducing dimensionality of the original data to determine sources of pollutants. In CCA, the first three canonical variate pairs derived were statistically significant at the 0.05 level. Canonical correlation between the first canonical variate pair was 0.821, while correlations of the second and third canonical variate pairs were 0.562 and 0.461, respectively. The first canonical variate pair indicated that increasing temperatures resulted in high ozone mixing ratios, while the second canonical variate pair showed wind speed and humidity's influence on local ammonium concentrations. No new information was uncovered in the third variate pair. Canonical loadings were also interpreted for information regarding relationships between data sets. Four principal components (PCs), expressing 77.0 % of original data variance, were derived in PCA. Interpretation of PCs suggested significant production and/or transport of secondary aerosols in the region (PC1). Furthermore, photochemical production of ozone and wind speed's influence on pollutants were expressed (PC2) along with overall measure of local meteorology (PC3). 
    In summary, CCA and PCA results combined were successful in uncovering linear relationships between meteorology and air pollutants in Chicago and aided in determining possible pollutant sources.

  4. Comparing Independent Component Analysis with Principle Component Analysis in Detecting Alterations of Porphyry Copper Deposit (case Study: Ardestan Area, Central Iran)

    NASA Astrophysics Data System (ADS)

    Mahmoudishadi, S.; Malian, A.; Hosseinali, F.

    2017-09-01

    Image processing techniques in the transform domain are employed as analysis tools for enhancing the detection of mineral deposits. The process of decomposing the image into important components increases the probability of mineral extraction. In this study, the performance of Principal Component Analysis (PCA) and Independent Component Analysis (ICA) has been evaluated for the visible and near-infrared (VNIR) and shortwave infrared (SWIR) subsystems of ASTER data. Ardestan is located in a part of the Central Iranian Volcanic Belt that hosts many well-known porphyry copper deposits. This research investigated the propylitic and argillic alteration zones and the outer mineralogy zone in part of the Ardestan region. The two approaches were applied to discriminate alteration zones from igneous bedrock using the major absorption features of indicator minerals from the alteration and mineralogy zones in the spectral range of the ASTER bands. Selected principal components (PC2, PC3 and PC6) were used to identify pyrite and the argillic and propylitic zones, which are distinguished from igneous bedrock in an RGB color composite image. According to the eigenvalues, components 2, 3 and 6 account for 4.26%, 0.9% and 0.09% of the total variance of the data for the Ardestan scene, respectively. For the purpose of discriminating the alteration and mineralogy zones of the porphyry copper deposit from bedrock, the corresponding ICA independent components IC2, IC3 and IC6 separate these zones more accurately than the noisy PCA bands. The results of the ICA method also conform to the location of the lithological units of the Ardestan region.

  5. INTEGRATED ENVIRONMENTAL ASSESSMENT OF THE MID-ATLANTIC REGION WITH ANALYTICAL NETWORK PROCESS

    EPA Science Inventory

    A decision analysis method for integrating environmental indicators was developed. This was a combination of Principal Component Analysis (PCA) and the Analytic Network Process (ANP). Being able to take into account interdependency among variables, the method was capable of ran...

  6. An Efficient Data Compression Model Based on Spatial Clustering and Principal Component Analysis in Wireless Sensor Networks.

    PubMed

    Yin, Yihang; Liu, Fengzheng; Zhou, Xiang; Li, Quanzhong

    2015-08-07

    Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.
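
    The compression step described above can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the function names, the synthetic sensor cluster, and the use of a retained-variance threshold (in place of the paper's error-bound guarantee, whose details the abstract does not give) are all assumptions made for illustration.

```python
import numpy as np

def pca_compress(X, min_variance=0.99):
    """Keep the fewest principal components that retain `min_variance` of the variance."""
    mean = X.mean(axis=0)
    centered = X - mean
    # Squared singular values of the centered data are proportional to component variances.
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    retained = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(retained, min_variance)) + 1, len(s))
    components = Vt[:k]
    return centered @ components.T, components, mean

def pca_decompress(scores, components, mean):
    """Rebuild an approximation of the original readings from the compressed scores."""
    return scores @ components + mean

# Hypothetical cluster: 8 sensors that all track 2 shared signals plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0] * 8, [1.0, -1.0] * 4])
readings = latent @ mixing + 0.01 * rng.normal(size=(200, 8))

scores, components, mean = pca_compress(readings)
restored = pca_decompress(scores, components, mean)
mse = float(np.mean((readings - restored) ** 2))
```

    In this toy setting the cluster head would transmit the score matrix (plus the small component matrix and mean) instead of all eight raw columns, and the mean square error of the restored readings stays at the level of the injected noise.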

  7. Classification and identification of Rhodobryum roseum Limpr. and its adulterants based on fourier-transform infrared spectroscopy (FTIR) and chemometrics.

    PubMed

    Cao, Zhen; Wang, Zhenjie; Shang, Zhonglin; Zhao, Jiancheng

    2017-01-01

    Fourier-transform infrared spectroscopy (FTIR) with the attenuated total reflectance technique was used to distinguish Rhodobryum roseum from its four adulterants. The FTIR spectra of six samples in the range from 4000 cm⁻¹ to 600 cm⁻¹ were obtained. The second-derivative transformation test was used to identify the small and nearby absorption peaks. A cluster analysis was performed to classify the spectra in a dendrogram based on the spectral similarity. Principal component analysis (PCA) was used to classify the species of six moss samples. A cluster analysis with PCA was used to identify different genera. However, some species of the same genus exhibited highly similar chemical components and FTIR spectra. Fourier self-deconvolution and discrete wavelet transform (DWT) were used to enhance the differences among the species with similar chemical components and FTIR spectra. Three scales were selected as the feature-extracting space in the DWT domain. The results show that FTIR spectroscopy with chemometrics is suitable for identifying Rhodobryum roseum and its adulterants.

  8. Characterization of Hatay honeys according to their multi-element analysis using ICP-OES combined with chemometrics.

    PubMed

    Yücel, Yasin; Sultanoğlu, Pınar

    2013-09-01

    Chemical characterisation has been carried out on 45 honey samples collected from the Hatay region of Turkey. The concentrations of 17 elements were determined by inductively coupled plasma optical emission spectrometry (ICP-OES). Ca, K, Mg and Na were the most abundant elements, with mean contents of 219.38, 446.93, 49.06 and 95.91 mg kg⁻¹ respectively. The trace element mean contents ranged between 0.03 and 15.07 mg kg⁻¹. Chemometric methods such as principal component analysis (PCA) and cluster analysis (CA) techniques were applied to classify honey according to mineral content. The first most important principal component (PC) was strongly associated with the value of Al, B, Cd and Co. CA showed eight clusters corresponding to the eight botanical origins of honey. PCA explained 75.69% of the variance with the first six PC variables. Chemometric analysis of the analytical data allowed the accurate classification of the honey samples according to origin. Copyright © 2013 Elsevier Ltd. All rights reserved.

  9. Analysis of Floral Volatile Components and Antioxidant Activity of Different Varieties of Chrysanthemum morifolium.

    PubMed

    Yang, Lu; Cheng, Ping; Wang, Jin-Hui; Li, Hong

    2017-10-23

    This study investigated the volatile flavor compounds and antioxidant properties of the essential oil of chrysanthemums that was extracted from the fresh flowers of 10 taxa of Chrysanthemum morifolium from three species; namely Dendranthema morifolium (Ramat.) Yellow, Dendranthema morifolium (Ramat.) Red, Dendranthema morifolium (Ramat.) Pink, Dendranthema morifolium (Ramat.) White, Pericallis hybrid Blue, Pericallis hybrid Pink, Pericallis hybrid Purple, Bellis perennis Pink, Bellis perennis Yellow, and Bellis perennis White. The antioxidant capacity of the essential oil was assayed by spectrophotometric analysis. The volatile flavor compounds from the fresh flowers were collected using dynamic headspace collection, analyzed using auto thermal desorber-gas chromatography/mass spectrometry, and identified with quantification using the external standard method. The antioxidant activities of Chrysanthemum morifolium were evaluated by DPPH and FRAP assays, and the results showed that the antioxidant activity of each sample was not the same. The different varieties of fresh Chrysanthemum morifolium flowers were distinguished and classified by fingerprint similarity evaluation, principal component analysis (PCA), and cluster analysis. The results showed that the floral volatile component profiles were significantly different among the different Chrysanthemum morifolium varieties. A total of 36 volatile flavor compounds were identified with eight functional groups: hydrocarbons, terpenoids, aromatic compounds, alcohols, ketones, ethers, aldehydes, and esters. Moreover, PCA was used to examine the variability among the Chrysanthemum morifolium varieties, and the first three principal components (PC1, PC2, and PC3) accounted for 96.509% of the total variance (55.802%, 30.599%, and 10.108%, respectively). PCA indicated that there were marked differences among Chrysanthemum morifolium varieties. The cluster analysis confirmed the results of the PCA.
    In conclusion, the results of this study provide a basis for breeding Chrysanthemum cultivars with desirable floral scents, and they further support the view that some plants are promising sources of natural antioxidants.
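
    As a quick arithmetic check on the variance figures quoted in this abstract, the per-component percentages can be accumulated as follows. The three values are taken from the abstract; the variable names are illustrative only.

```python
# Variance percentages for PC1, PC2 and PC3 as reported in the abstract.
explained = [55.802, 30.599, 10.108]

cumulative = []
running = 0.0
for share in explained:
    running += share
    cumulative.append(round(running, 3))

# Share of the total variance left to all remaining components.
remaining = round(100.0 - cumulative[-1], 3)
```

    The running total after the third component matches the 96.509% stated in the abstract, leaving roughly 3.5% of the variance to the remaining components.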

  10. Collective Dynamics of Periplasmic Glutamine Binding Protein upon Domain Closure

    PubMed Central

    Loeffler, Hannes H.; Kitao, Akio

    2009-01-01

    The glutamine binding protein is a vital component of the associated ATP binding cassette transport systems responsible for the uptake of glutamine into the cell. We have investigated the global movements of this protein by molecular dynamics simulations and principal component analysis (PCA). We confirm that the most dominant mode corresponds to the biological function of the protein, i.e., a hinge-type motion upon ligand binding. The closure itself was directly observed from two independent trajectories whereby PCA was used to elucidate the nature of this closing reaction. Two intermediary states are identified and described in detail. The ligand binding induces the structural change of the hinge regions from a discontinuous β-sheet to a continuous one, which also enhances softness of the hinge and modifies the direction of hinge motion to enable closing. We also investigated the convergence behavior of PCA modes, which were found to converge rather quickly when the associated magnitudes of the eigenvalues are well separated. PMID:19883597

  11. Prediction With Dimension Reduction of Multiple Molecular Data Sources for Patient Survival.

    PubMed

    Kaplan, Adam; Lock, Eric F

    2017-01-01

    Predictive modeling from high-dimensional genomic data is often preceded by a dimension reduction step, such as principal component analysis (PCA). However, the application of PCA is not straightforward for multisource data, wherein multiple sources of 'omics data measure different but related biological components. In this article, we use recent advances in the dimension reduction of multisource data for predictive modeling. In particular, we apply exploratory results from Joint and Individual Variation Explained (JIVE), an extension of PCA for multisource data, for prediction of differing response types. We conduct illustrative simulations to illustrate the practical advantages and interpretability of our approach. As an application example, we consider predicting survival for patients with glioblastoma multiforme from 3 data sources measuring messenger RNA expression, microRNA expression, and DNA methylation. We also introduce a method to estimate JIVE scores for new samples that were not used in the initial dimension reduction and study its theoretical properties; this method is implemented in the R package R.JIVE on CRAN, in the function jive.predict.

  12. Apo adenylate kinase encodes its holo form: a principal component and varimax analysis.

    PubMed

    Cukier, Robert I

    2009-02-12

    Adenylate kinase undergoes large-scale motions of its LID and AMP-binding (AMPbd) domains when its apo, open form closes over its substrates, AMP and Mg2+-ATP. It may be an example of an enzyme that provides an ensemble of conformations in its apo state from which its substrates can select and bind to produce catalytically competent conformations. In this work, the fluctuations of the enzyme apo Escherichia coli adenylate kinase (AKE) are obtained with molecular dynamics. The resulting trajectory is analyzed with principal component analysis (PCA) that decomposes the atom motions into orthogonal modes ordered by their decreasing contributions to the total protein fluctuation. In apo AKE, a small set of the PCA modes describes the bulk of the fluctuations. Identification of the atom motions that are important contributors to these modes is improved with the use of a varimax rotation method that rotates the PCA modes to a new mode set that concentrates the atom contributions to a smaller set of atoms in these new modes. In this way, the nature of the important motions of the LID and AMPbd domains are clarified. The dominant PCA modes are used to investigate if apo AKE can fluctuate to conformations that are holo-like, even though the apo trajectory is mainly confined to a region around the initial apo structure. This is accomplished by expressing the difference between the protein coordinates, obtained from the holo and apo crystal structures, using as a basis the PCA modes from the apo AKE trajectory. The coherent motion described by a small set of the apo PCA modes is shown to be able to produce protein conformations that are quite similar to the holo conformation of the protein. In this sense, apo AKE does encode in its fluctuations information about holo-like conformations.
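
    The idea of expressing a structural difference in the basis of PCA modes can be sketched as below. This is a toy illustration with synthetic coordinates, not the author's analysis: the function name, the single "hinge" direction, and the array sizes are assumptions, and the varimax rotation step is not reproduced.

```python
import numpy as np

def displacement_overlap(trajectory, start, target, n_modes):
    """Fraction of the squared displacement (target - start) captured by the top PCA modes."""
    centered = trajectory - trajectory.mean(axis=0)
    # Rows of Vt are orthonormal PCA modes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    displacement = target - start
    coefficients = Vt[:n_modes] @ displacement
    return float(np.sum(coefficients**2) / np.sum(displacement**2))

# Synthetic "apo" trajectory: 500 frames, 30 coordinates, fluctuating mostly along one direction.
rng = np.random.default_rng(1)
n_coords = 30
hinge = np.zeros(n_coords)
hinge[:6] = 1.0 / np.sqrt(6.0)
amplitude = rng.normal(size=(500, 1))
trajectory = amplitude * hinge + 0.05 * rng.normal(size=(500, n_coords))

apo_like = np.zeros(n_coords)      # stand-in for the apo crystal structure
holo_like = 4.0 * hinge            # stand-in for the holo structure, far along the hinge

overlap_one_mode = displacement_overlap(trajectory, apo_like, holo_like, 1)
overlap_all_modes = displacement_overlap(trajectory, apo_like, holo_like, n_coords)
```

    Because the modes are orthonormal, the squared coefficients add up to at most the squared length of the displacement, so the returned value reads directly as the fraction of the apo-to-holo change that a given number of modes can describe.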

  13. Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis.

    PubMed

    You, Zhu-Hong; Lei, Ying-Ke; Zhu, Lin; Xia, Junfeng; Wang, Bing

    2013-01-01

    Protein-protein interactions (PPIs) play crucial roles in the execution of various cellular processes and form the basis of biological mechanisms. Although a large amount of PPI data for different species has been generated by high-throughput experimental techniques, current PPI pairs obtained with experimental methods cover only a fraction of the complete PPI networks, and further, the experimental methods for identifying PPIs are both time-consuming and expensive. Hence, it is urgent and challenging to develop automated computational methods to efficiently and accurately predict PPIs. We present here a novel hierarchical PCA-EELM (principal component analysis-ensemble extreme learning machine) model to predict protein-protein interactions using only protein sequence information. In the proposed method, 11188 protein pairs retrieved from the DIP database were encoded into feature vectors by using four kinds of protein sequence information. Focusing on dimension reduction, PCA, an effective feature extraction method, was then employed to construct the most discriminative new feature set. Finally, multiple extreme learning machines were trained and then aggregated into a consensus classifier by majority voting. The ensembling of extreme learning machines removes the dependence of results on initial random weights and improves the prediction performance. When performed on the PPI data of Saccharomyces cerevisiae, the proposed method achieved 87.00% prediction accuracy with 86.15% sensitivity at the precision of 87.59%. Extensive experiments are performed to compare our method with the state-of-the-art Support Vector Machine (SVM) technique. Experimental results demonstrate that the proposed PCA-EELM outperforms the SVM method under 5-fold cross-validation. Besides, PCA-EELM runs faster than the PCA-SVM based method. Consequently, the proposed approach can be considered a promising and powerful new tool for predicting PPIs with excellent performance and less time.
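
    The aggregation step, in which several classifiers are combined into a consensus by majority voting, can be sketched as follows. The vote matrix is made up for illustration, the function name is hypothetical, and the tie-breaking rule (ties go to class 1) is an assumption that the abstract does not specify.

```python
import numpy as np

def majority_vote(predictions):
    """Combine binary predictions of shape (n_models, n_samples); ties go to class 1."""
    predictions = np.asarray(predictions)
    votes_for_one = predictions.sum(axis=0)
    return (2 * votes_for_one >= predictions.shape[0]).astype(int)

# Hypothetical outputs of three classifiers on four protein pairs (1 = interacting).
votes = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 1],
]
consensus = majority_vote(votes)
```

    With an odd number of models no ties can occur, which is one common reason to choose an odd ensemble size.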

  14. Quantitative thickness prediction of tectonically deformed coal using Extreme Learning Machine and Principal Component Analysis: a case study

    NASA Astrophysics Data System (ADS)

    Wang, Xin; Li, Yan; Chen, Tongjun; Yan, Qiuyan; Ma, Li

    2017-04-01

    The thickness of tectonically deformed coal (TDC) is positively correlated with gas outbursts. In order to predict the TDC thickness of coal beds, we propose a new quantitative prediction method using an extreme learning machine (ELM) algorithm, a principal component analysis (PCA) algorithm, and seismic attributes. At first, we build an ELM prediction model using the PCA attributes of a synthetic seismic section. The results suggest that the ELM model can produce a reliable and accurate prediction of the TDC thickness for synthetic data, with a Sigmoid activation function and 20 hidden nodes preferred. Then, we analyze the applicability of the ELM model on the thickness prediction of the TDC with real application data. Through the cross validation of near-well traces, the results suggest that the ELM model can produce a reliable and accurate prediction of the TDC. After that, we use 250 near-well traces from 10 wells to build an ELM prediction model and use the model to forecast the TDC thickness of the No. 15 coal in the study area using the PCA attributes as the inputs. Comparing the predicted results, it is noted that the trained ELM model with two selected PCA attributes yields better prediction results than those from the other combinations of the attributes. Finally, the ELM model trained with real seismic data has a different number of hidden nodes (10) than the ELM model trained with synthetic seismic data. In summary, it is feasible to use an ELM model to predict the TDC thickness using the calculated PCA attributes as the inputs. However, the input attributes, the activation function and the number of hidden nodes in the ELM model should be selected and tested carefully based on the individual application.

  15. A study of anatomy of distal femur pertaining to total knee replacement: an analysis, conclusions and recommendations.

    PubMed

    Kumar, K; Sharma, D

    2018-04-01

    Multiple landmarks including the transepicondylar axis (TEA), posterior condylar axis (PCA) and anterior trochlear line (TL) have been used to set up the femoral component rotation, but each is faced with its own practical obstacle that limits its usage. Also a common practice is to set the femoral component rotation at 3° external rotation to PCA and valgus resection angle at 5°-7° to anatomical axis of femur. For the reason that the anatomy of each knee is different, it may not be justified to practice such a set protocol in all cases. The aim of the study was to compare the anatomical landmarks used to set up the femoral component rotation and to study the variability in the different anatomical relationships relevant to total knee replacement. The study had 52 patients (94 knees) with grade IV osteoarthritis. Full-length lower limb scanogram and 1 mm cross-sectional cuts of distal femur were taken. aTEA, sTEA, PCL, TL, CTA, PCA, TLA and valgus angles were taken for all knees. aTEA is identifiable in all cases but sTEA in only 59 knees (62.77%). Correspondingly, CTA is calculable in all knees and PCA in 62.77% cases. Mean CTA and mean PCA were 5.4° ± 1.88° SD and 0.71° ± 1.95° SD, respectively. Mean angle between aTEA and sTEA was 4.88. TL is a line difficult to draw because of high incidence of anterior osteophytes, making CTA a more reliable parameter than TLA. Mean TLA was 10.31° ± 3.52° SD. Mean valgus resection angle was 4.86° ± 2.53° SD. Gender- or side-based differences in any of these values were not statistically different. Using aTEA or sTEA can make a big difference in femoral component rotation; therefore, whether aTEA or sTEA should be used needs to be further investigated. CTA, PCA and valgus resection angle need to be individually calculated for each knee. Use of TLA is not recommended.

  16. Diurnal global variability of the Earth's magnetic field during geomagnetically quiet conditions

    NASA Astrophysics Data System (ADS)

    Klausner, V.

    2012-12-01

    This work proposes a methodology (or treatment) to establish a representative signal of the global magnetic diurnal variation. It is based on a spatial distribution in both longitude and latitude of a set of magnetic stations as well as their magnetic behavior on a time basis. We apply the Principal Component Analysis (PCA) technique using gapped wavelet transform and wavelet correlation. This new approach was used to describe the characteristics of the magnetic variations at Vassouras (Brazil) and 12 other magnetic stations spread around the terrestrial globe. Using magnetograms from 2007, we have investigated the global dominant pattern of the Sq variation as a function of low solar activity. This year was divided into two seasons for seasonal variation analysis: solstices (June and December) and equinoxes (March and September). We aim to reconstruct the original geomagnetic data series of the H component taking into account only the diurnal variations with periods of 24 hours on geomagnetically quiet days. We advance a proposal to reconstruct the Sq baseline using only the PCA first mode. The first interpretation of the results suggests that the PCA/wavelet method could be used for the reconstruction of the Sq baseline.
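
    Reconstructing a multi-station signal from the first PCA mode alone can be sketched with a plain SVD, as below. This omits the gapped wavelet transform and wavelet correlation used in the paper; the synthetic 24-hour sinusoid, the station gains, and the noise level are assumptions chosen only to make the example self-contained.

```python
import numpy as np

def first_mode_reconstruction(data):
    """Return the part of `data` (n_times, n_stations) explained by the first PCA mode."""
    mean = data.mean(axis=0)
    U, s, Vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean + s[0] * np.outer(U[:, 0], Vt[0])

# Synthetic example: 13 stations share one 24-hour variation with different amplitudes.
rng = np.random.default_rng(2)
hours = np.arange(24 * 10)                       # ten days of hourly samples
diurnal = np.sin(2 * np.pi * hours / 24)
station_gain = np.linspace(0.5, 2.0, 13)
clean = np.outer(diurnal, station_gain)
observed = clean + 0.2 * rng.normal(size=clean.shape)

baseline = first_mode_reconstruction(observed)
error_observed = float(np.mean((observed - clean) ** 2))
error_baseline = float(np.mean((baseline - clean) ** 2))
```

    In this toy case the first mode captures the shared diurnal pattern, so the rank-one reconstruction lies closer to the noise-free signal than the raw observations do.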

  17. Reconstructing the free-energy landscape of Met-enkephalin using dihedral principal component analysis and well-tempered metadynamics

    NASA Astrophysics Data System (ADS)

    Sicard, François; Senet, Patrick

    2013-06-01

    Well-Tempered Metadynamics (WTmetaD) is an efficient method to enhance the reconstruction of the free-energy surface of proteins. WTmetaD guarantees a faster convergence in the long time limit in comparison with the standard metadynamics. It still suffers, however, from the same limitation, i.e., the non-trivial choice of pertinent collective variables (CVs). To circumvent this problem, we couple WTmetaD with a set of CVs generated from a dihedral Principal Component Analysis (dPCA) on the Ramachandran dihedral angles describing the backbone structure of the protein. The dPCA provides a generic method to extract relevant CVs built from internal coordinates, and does not depend on the alignment to an arbitrarily chosen reference structure as usual in Cartesian PCA. We illustrate the robustness of this method in the case of a reference model protein, the small and very diffusive Met-enkephalin pentapeptide. We propose a justification a posteriori of the considered number of CVs necessary to bias the metadynamics simulation in terms of the one-dimensional free-energy profiles associated with Ramachandran dihedral angles along the amino-acid sequence.
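
    A common way to carry out PCA on dihedral angles is to map each angle to its cosine and sine before diagonalizing the covariance matrix, which removes the artificial jump at ±π. The sketch below follows that convention as an assumption; the abstract does not spell out the transformation, and the function name, the random angles, and the number of dihedrals are illustrative only.

```python
import numpy as np

def dihedral_pca(angles):
    """PCA on dihedral angles (n_frames, n_dihedrals), given in radians."""
    # Map each angle to (cos, sin) so that -pi and +pi become the same point.
    features = np.concatenate([np.cos(angles), np.sin(angles)], axis=1)
    centered = features - features.mean(axis=0)
    covariance = centered.T @ centered / (len(angles) - 1)
    eigvals, eigvecs = np.linalg.eigh(covariance)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]               # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals, eigvecs, centered @ eigvecs

# Hypothetical input: 400 frames of 8 backbone dihedral angles.
rng = np.random.default_rng(3)
angles = rng.uniform(-np.pi, np.pi, size=(400, 8))

eigvals, eigvecs, projections = dihedral_pca(angles)
eigvals_shifted, _, _ = dihedral_pca(angles + 2 * np.pi)
```

    Because only internal coordinates enter, no structural alignment is needed, and shifting every angle by a full turn leaves the spectrum unchanged. The leading columns of the projections would then serve as candidate collective variables.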

  18. Reconstructing the free-energy landscape of Met-enkephalin using dihedral principal component analysis and well-tempered metadynamics.

    PubMed

    Sicard, François; Senet, Patrick

    2013-06-21

    Well-Tempered Metadynamics (WTmetaD) is an efficient method to enhance the reconstruction of the free-energy surface of proteins. WTmetaD guarantees a faster convergence in the long time limit in comparison with the standard metadynamics. It still suffers, however, from the same limitation, i.e., the non-trivial choice of pertinent collective variables (CVs). To circumvent this problem, we couple WTmetaD with a set of CVs generated from a dihedral Principal Component Analysis (dPCA) on the Ramachandran dihedral angles describing the backbone structure of the protein. The dPCA provides a generic method to extract relevant CVs built from internal coordinates, and does not depend on the alignment to an arbitrarily chosen reference structure as usual in Cartesian PCA. We illustrate the robustness of this method in the case of a reference model protein, the small and very diffusive Met-enkephalin pentapeptide. We propose a justification a posteriori of the considered number of CVs necessary to bias the metadynamics simulation in terms of the one-dimensional free-energy profiles associated with Ramachandran dihedral angles along the amino-acid sequence.

  19. Medical diagnosis of atherosclerosis from Carotid Artery Doppler Signals using principal component analysis (PCA), k-NN based weighting pre-processing and Artificial Immune Recognition System (AIRS).

    PubMed

    Latifoğlu, Fatma; Polat, Kemal; Kara, Sadik; Güneş, Salih

    2008-02-01

    In this study, we proposed a new medical diagnosis system based on principal component analysis (PCA), k-NN based weighting pre-processing, and Artificial Immune Recognition System (AIRS) for diagnosis of atherosclerosis from Carotid Artery Doppler Signals. The suggested system consists of four stages. First, in the feature extraction stage, we have obtained the features related to atherosclerosis disease using Fast Fourier Transformation (FFT) modeling and by calculating the maximum frequency envelope of sonograms. Second, in the dimensionality reduction stage, the 61 features of atherosclerosis disease have been reduced to 4 features using PCA. Third, in the pre-processing stage, we have weighted these 4 features using different values of k in a new weighting scheme based on k-NN based weighting pre-processing. Finally, in the classification stage, the AIRS classifier has been used to classify subjects as healthy or having atherosclerosis. A classification accuracy of 100% has been obtained by the proposed system using 10-fold cross validation. This success shows that the proposed system is a robust and effective system in diagnosis of atherosclerosis disease.

  20. PCA feature extraction for change detection in multidimensional unlabeled data.

    PubMed

    Kuncheva, Ludmila I; Faithfull, William J

    2014-01-01

    When classifiers are deployed in real-world applications, it is assumed that the distribution of the incoming data matches the distribution of the data used to train the classifier. This assumption is often incorrect, which necessitates some form of change detection or adaptive classification. While there has been a lot of work on change detection based on the classification error monitored over the course of the operation of the classifier, finding changes in multidimensional unlabeled data is still a challenge. Here, we propose to apply principal component analysis (PCA) for feature extraction prior to the change detection. Supported by a theoretical example, we argue that the components with the lowest variance should be retained as the extracted features because they are more likely to be affected by a change. We chose a recently proposed semiparametric log-likelihood change detection criterion that is sensitive to changes in both mean and variance of the multidimensional distribution. An experiment with 35 datasets and an illustration with a simple video segmentation demonstrate the advantage of using extracted features compared to raw data. Further analysis shows that feature extraction through PCA is beneficial, specifically for data with multiple balanced classes.
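
    The argument for keeping the lowest-variance components can be illustrated with a small synthetic example. This is a sketch under assumptions, not the authors' experiment: the function name, the two-dimensional data, and the size of the shift are invented, and the semiparametric log-likelihood criterion from the paper is replaced by a simple shift measured in standard deviations.

```python
import numpy as np

def fit_low_variance_extractor(train, keep):
    """Return a function projecting data onto the `keep` lowest-variance principal directions."""
    mean = train.mean(axis=0)
    covariance = np.cov(train, rowvar=False)
    _, eigvecs = np.linalg.eigh(covariance)   # eigenvalues ascending: first columns = lowest variance
    basis = eigvecs[:, :keep]

    def extract(X):
        return (X - mean) @ basis

    return extract

# Synthetic data: one high-variance axis, one low-variance axis, then a small shift on both.
rng = np.random.default_rng(5)
scale = np.array([3.0, 0.1])
train = rng.normal(size=(1000, 2)) * scale
shifted = rng.normal(size=(1000, 2)) * scale + 0.5

extract = fit_low_variance_extractor(train, keep=1)
shift_in_stds = float(abs(extract(shifted).mean()) / extract(train).std())
```

    The same shift of 0.5 is a small fraction of a standard deviation along the high-variance axis but several standard deviations along the low-variance one, which is why the low-variance feature exposes the change more clearly here.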

  1. Free-energy landscape, principal component analysis, and structural clustering to identify representative conformations from molecular dynamics simulations: the myoglobin case.

    PubMed

    Papaleo, Elena; Mereghetti, Paolo; Fantucci, Piercarlo; Grandori, Rita; De Gioia, Luca

    2009-01-01

    Several molecular dynamics (MD) simulations were used to sample conformations in the neighborhood of the native structure of holo-myoglobin (holo-Mb), collecting trajectories spanning 0.22 μs at 300 K. Principal component analysis (PCA) and free-energy landscape (FEL) analysis, integrated by cluster analysis, which was performed considering the position and structures of the individual helices of the globin fold, were carried out. The coherence between the different structural clusters and the basins of the FEL, together with the convergence of parameters derived by PCA, indicates that an accurate description of the Mb conformational space around the native state was achieved by multiple MD trajectories spanning at least 0.14 μs. The integration of FEL, PCA, and structural clustering was shown to be a very useful approach to gain an overall view of the conformational landscape accessible to a protein and to identify representative protein substates. This method could also be used to investigate the conformational and dynamical properties of Mb apo-, mutant, or delete versions, in which greater conformational variability is expected and, therefore, identification of representative substates from the simulations is relevant to disclose structure-function relationship.

  2. Prediction of BP reactivity to talking using hybrid soft computing approaches.

    PubMed

    Kaur, Gurmanik; Arora, Ajat Shatru; Jain, Vijender Kumar

    2014-01-01

    High blood pressure (BP) is associated with an increased risk of cardiovascular diseases. Therefore, optimal precision in measurement of BP is appropriate in clinical and research studies. In this work, anthropometric characteristics including age, height, weight, body mass index (BMI), and arm circumference (AC) were used as independent predictor variables for the prediction of BP reactivity to talking. Principal component analysis (PCA) was fused with artificial neural network (ANN), adaptive neurofuzzy inference system (ANFIS), and least square-support vector machine (LS-SVM) models to remove the multicollinearity effect among anthropometric predictor variables. The statistical tests in terms of coefficient of determination (R²), root mean square error (RMSE), and mean absolute percentage error (MAPE) revealed that the PCA based LS-SVM (PCA-LS-SVM) model produced a more efficient prediction of BP reactivity as compared to other models. This assessment presents the importance and advantages posed by PCA fused prediction models for prediction of biological variables.
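
    How PCA removes multicollinearity among predictors can be seen in a short sketch: correlated inputs are replaced by principal component scores, which are mutually uncorrelated by construction. The synthetic height, weight, and BMI values and the function name are assumptions for illustration; the ANN, ANFIS, and LS-SVM models from the paper are not reproduced.

```python
import numpy as np

def pca_scores(X):
    """Standardize the columns of X and return their principal component scores."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt.T

# Synthetic anthropometric predictors; BMI is computed from the other two, so it is collinear.
rng = np.random.default_rng(4)
height = rng.normal(1.70, 0.08, size=300)   # metres
weight = rng.normal(70.0, 10.0, size=300)   # kilograms
bmi = weight / height**2
X = np.column_stack([height, weight, bmi])

raw_corr = np.corrcoef(X, rowvar=False)
score_corr = np.corrcoef(pca_scores(X), rowvar=False)
max_offdiag = float(np.max(np.abs(score_corr - np.eye(3))))
```

    The raw predictors show a strong weight-BMI correlation, while the off-diagonal correlations of the scores vanish up to rounding error. A downstream model fitted on the scores therefore does not see collinear inputs.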

  3. Finessing filter scarcity problem in face recognition via multi-fold filter convolution

    NASA Astrophysics Data System (ADS)

    Low, Cheng-Yaw; Teoh, Andrew Beng-Jin

    2017-06-01

    The deep convolutional neural networks for face recognition, from DeepFace to the recent FaceNet, demand a sufficiently large volume of filters for feature extraction, in addition to being deep. The shallow filter-bank approaches, e.g., principal component analysis network (PCANet), binarized statistical image features (BSIF), and other analogous variants, suffer from the filter scarcity problem: not all of the available PCA and ICA filters are discriminative enough to abstract noise-free features. This paper extends our previous work on multi-fold filter convolution (ℳ-FFC), where the pre-learned PCA and ICA filter sets are exponentially diversified by ℳ folds to instantiate PCA, ICA, and PCA-ICA offspring. The experimental results show that the 2-FFC operation resolves the filter scarcity problem. The 2-FFC descriptors are also shown to be superior to those of PCANet, BSIF, and other face descriptors, in terms of rank-1 identification rate (%).

  4. Low-Dimensional Feature Representation for Instrument Identification

    NASA Astrophysics Data System (ADS)

    Ihara, Mizuki; Maeda, Shin-Ichi; Ikeda, Kazushi; Ishii, Shin

    For monophonic music instrument identification, various feature extraction and selection methods have been proposed. One of the issues in instrument identification is that the same spectrum is not always observed even for the same instrument, owing to differences in recording conditions. Therefore, it is important to find non-redundant instrument-specific features that maintain information essential for high-quality instrument identification to apply them to various instrumental music analyses. For such a dimensionality reduction method, the authors propose the utilization of linear projection methods: local Fisher discriminant analysis (LFDA) and LFDA combined with principal component analysis (PCA). After experimentally clarifying that raw power spectra are actually good for instrument classification, the authors reduced the feature dimensionality by LFDA or by PCA followed by LFDA (PCA-LFDA). The reduced features achieved reasonably high identification performance that was comparable to or higher than that of the power spectra and that achieved by other existing studies. These results demonstrated that our LFDA and PCA-LFDA can successfully extract low-dimensional instrument features that maintain the characteristic information of the instruments.

  5. Delusions in first-episode psychosis: Principal component analysis of twelve types of delusions and demographic and clinical correlates of resulting domains.

    PubMed

    Paolini, Enrico; Moretti, Patrizia; Compton, Michael T

    2016-09-30

    Although delusions represent one of the core symptoms of psychotic disorders, it is remarkable that few studies have investigated distinct delusional themes. We analyzed data from a large sample of first-episode psychosis patients (n=245) to understand relations between delusion types and demographic and clinical correlates. First, we conducted a principal component analysis (PCA) of the 12 delusion items within the Scale for the Assessment of Positive Symptoms (SAPS). Then, using the domains derived via PCA, we tested a priori hypotheses and answered exploratory research questions related to delusional content. PCA revealed five distinct components: Delusions of Influence, Grandiose/Religious Delusions, Paranoid Delusions, Negative Affect Delusions (jealousy, and sin or guilt), and Somatic Delusions. The most prevalent type of delusion was Paranoid Delusions, and such delusions were more common at older ages at onset of psychosis. The level of Delusions of Influence was correlated with the severity of hallucinations and negative symptoms. We ascertained a general relationship between different childhood adversities and delusional themes, and a specific relationship between Somatic Delusions and childhood neglect. Moreover, we found higher scores on Delusions of Influence and Negative Affect Delusions among cannabis and stimulant users. Our results support considering delusions as varied experiences with varying prevalences and correlates. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  6. The development of summary components for the Disablement in the Physically Active scale in collegiate athletes.

    PubMed

    Houston, Megan N; Hoch, Johanna M; Van Lunen, Bonnie L; Hoch, Matthew C

    2015-11-01

    The Disablement in the Physically Active scale (DPA) is a generic patient-reported outcome designed to evaluate constructs of disability in physically active populations. The purpose of this study was to analyze the DPA scale structure for summary components. Four hundred and fifty-six collegiate athletes completed a demographic form and the DPA. A principal component analysis (PCA) was conducted with oblique rotation. Factors with eigenvalues >1 that explained >5% of the variance were retained. The PCA revealed a two-factor structure consistent with paradigms used to develop the original DPA. Items 1-12 loaded on Factor 1 and Items 13-16 loaded on Factor 2. Items 1-12 pertain to impairment, activity limitations, and participation restrictions. Items 13-16 address psychosocial and emotional well-being. Consideration of item content suggested Factor 1 concerned physical function, while Factor 2 concerned mental well-being. Thus, items clustered around Factors 1 and 2 were identified as physical (DPA-PSC) and mental (DPA-MSC) summary components, respectively. Together, the factors accounted for 65.1% of the variance. The PCA revealed a two-factor structure for the DPA that resulted in DPA-PSC and DPA-MSC. Analyzing the DPA as separate constructs may provide distinct information that could help to prescribe treatment and rehabilitation strategies.
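
    The retention rule quoted in this abstract (eigenvalues >1 and >5% of the variance) can be illustrated with a minimal Python sketch. It is not the authors' analysis: the data are synthetic, the function name `retained_components` is hypothetical, the PCA is run on the item correlation matrix, and the oblique rotation used in the study is omitted.

```python
import numpy as np

def retained_components(X, min_eigenvalue=1.0, min_variance_share=0.05):
    """Indices of components with eigenvalue > 1 and variance share > 5 %."""
    # Correlation matrix of the items (columns of X).
    R = np.corrcoef(X, rowvar=False)
    # eigvalsh returns eigenvalues in ascending order; reverse to descending.
    eigenvalues = np.linalg.eigvalsh(R)[::-1]
    share = eigenvalues / eigenvalues.sum()
    keep = (eigenvalues > min_eigenvalue) & (share > min_variance_share)
    return np.flatnonzero(keep), eigenvalues, share

# Synthetic stand-in for a 16-item questionnaire (assumption, not study data):
# 12 items driven by one latent factor, 4 items driven by another.
rng = np.random.default_rng(0)
n = 456
f1 = rng.normal(size=(n, 1))
f2 = rng.normal(size=(n, 1))
X = np.hstack([f1 + 0.5 * rng.normal(size=(n, 12)),
               f2 + 0.5 * rng.normal(size=(n, 4))])

kept, eigenvalues, share = retained_components(X)
print("components retained:", len(kept))
print("variance explained by retained components:", share[kept].sum())
```

    With two well-separated latent factors, as simulated here, the rule retains two components; on real questionnaire data the outcome depends on the item correlations.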

  7. Performance evaluation of PCA-based spike sorting algorithms.

    PubMed

    Adamos, Dimitrios A; Kosmidis, Efstratios K; Theophilidis, George

    2008-09-01

    Deciphering the electrical activity of individual neurons from multi-unit noisy recordings is critical for understanding complex neural systems. A widely used spike sorting algorithm is being evaluated for single-electrode nerve trunk recordings. The algorithm is based on principal component analysis (PCA) for spike feature extraction. In the neuroscience literature it is generally assumed that the use of the first two or most commonly three principal components is sufficient. We estimate the optimum PCA-based feature space by evaluating the algorithm's performance on simulated series of action potentials. A number of modifications are made to the open source nev2lkit software to enable systematic investigation of the parameter space. We introduce a new metric to define clustering error considering over-clustering more favorable than under-clustering as proposed by experimentalists for our data. Both the program patch and the metric are available online. Correlated and white Gaussian noise processes are superimposed to account for biological and artificial jitter in the recordings. We report that the employment of more than three principal components is in general beneficial for all noise cases considered. Finally, we apply our results to experimental data and verify that the sorting process with four principal components is in agreement with a panel of electrophysiology experts.

  8. Quality Evaluation of Potentilla fruticosa L. by High Performance Liquid Chromatography Fingerprinting Associated with Chemometric Methods.

    PubMed

    Liu, Wei; Wang, Dongmei; Liu, Jianjun; Li, Dengwu; Yin, Dongxue

    2016-01-01

    The present study was performed to assess the quality of Potentilla fruticosa L. sampled from distinct regions of China using high performance liquid chromatography (HPLC) fingerprinting coupled with a suite of chemometric methods. For this quantitative analysis, the main active phytochemical compositions and the antioxidant activity in P. fruticosa were also investigated. Considering the high percentages and antioxidant activities of phytochemicals, P. fruticosa samples from Kangding, Sichuan were selected as the most valuable raw materials. Similarity analysis (SA) of HPLC fingerprints, hierarchical cluster analysis (HCA), principal component analysis (PCA), and discriminant analysis (DA) were further employed to provide accurate classification and quality estimates of P. fruticosa. Two principal components (PCs) were collected by PCA. PC1 separated samples from Kangding, Sichuan, capturing 57.64% of the variance, whereas PC2 contributed to further separation, capturing 18.97% of the variance. Two kinds of discriminant functions with a 100% discrimination ratio were constructed. The results strongly supported the conclusion that the eight samples from different regions were clustered into three major groups, corresponding with their morphological classification, for which HPLC analysis confirmed the considerable variation in phytochemical compositions and that P. fruticosa samples from Kangding, Sichuan were of high quality. The results of SA, HCA, PCA, and DA were in agreement and performed well for the quality assessment of P. fruticosa. Consequently, HPLC fingerprinting coupled with chemometric techniques provides a highly flexible and reliable method for the quality evaluation of traditional Chinese medicines.

  9. Quality Evaluation of Potentilla fruticosa L. by High Performance Liquid Chromatography Fingerprinting Associated with Chemometric Methods

    PubMed Central

    Liu, Wei; Wang, Dongmei; Liu, Jianjun; Li, Dengwu; Yin, Dongxue

    2016-01-01

    The present study was performed to assess the quality of Potentilla fruticosa L. sampled from distinct regions of China using high performance liquid chromatography (HPLC) fingerprinting coupled with a suite of chemometric methods. For this quantitative analysis, the main active phytochemical compositions and the antioxidant activity in P. fruticosa were also investigated. Considering the high percentages and antioxidant activities of phytochemicals, P. fruticosa samples from Kangding, Sichuan were selected as the most valuable raw materials. Similarity analysis (SA) of HPLC fingerprints, hierarchical cluster analysis (HCA), principal component analysis (PCA), and discriminant analysis (DA) were further employed to provide accurate classification and quality estimates of P. fruticosa. Two principal components (PCs) were collected by PCA. PC1 separated samples from Kangding, Sichuan, capturing 57.64% of the variance, whereas PC2 contributed to further separation, capturing 18.97% of the variance. Two kinds of discriminant functions with a 100% discrimination ratio were constructed. The results strongly supported the conclusion that the eight samples from different regions were clustered into three major groups, corresponding with their morphological classification, for which HPLC analysis confirmed the considerable variation in phytochemical compositions and that P. fruticosa samples from Kangding, Sichuan were of high quality. The results of SA, HCA, PCA, and DA were in agreement and performed well for the quality assessment of P. fruticosa. Consequently, HPLC fingerprinting coupled with chemometric techniques provides a highly flexible and reliable method for the quality evaluation of traditional Chinese medicines. PMID:26890416

  10. Quantitative Ultrasound Using Texture Analysis of Myofascial Pain Syndrome in the Trapezius.

    PubMed

    Kumbhare, Dinesh A; Ahmed, Sara; Behr, Michael G; Noseworthy, Michael D

    2018-01-01

    Objective: The objective of this study is to assess the discriminative ability of textural analyses to assist in the differentiation of the myofascial trigger point (MTrP) region from normal regions of skeletal muscle, and to measure the ability to reliably differentiate between three clinically relevant groups: healthy asymptomatic, latent MTrP, and active MTrP. Methods: Eighteen and 19 patients were identified as having active and latent MTrPs in the trapezius muscle, respectively. We included 24 healthy volunteers. Images were obtained by research personnel, who were blinded with respect to the clinical status of the study participant. Histograms provided first-order parameters associated with image grayscale. Haralick, Galloway, and histogram-related features were used in texture analysis. Blob analysis was conducted on the regions of interest (ROIs). Principal component analysis (PCA) was performed followed by multivariate analysis of variance (MANOVA) to determine the statistical significance of the features. Results: Ninety-two texture features were analyzed for factorability using Bartlett's test of sphericity, which was significant. The Kaiser-Meyer-Olkin measure of sampling adequacy was 0.94. PCA demonstrated that the rotated eigenvalues of the first eight components (each comprising multiple texture features) explained 94.92% of the cumulative variance in the ultrasound image characteristics. The 24 features identified by PCA were included in the MANOVA as dependent variables, and the presence of a latent or active MTrP or healthy muscle was the independent variable. Conclusion: Texture analysis techniques can discriminate between the three clinically relevant groups.

  11. The Raman spectrum character of skin tumor induced by UVB

    NASA Astrophysics Data System (ADS)

    Wu, Shulian; Hu, Liangjun; Wang, Yunxia; Li, Yongzeng

    2016-03-01

    In our study, the skin canceration processes induced by UVB were analyzed from the perspective of tissue spectra. A home-made Raman spectral system with a millimeter-order excitation laser spot size, combined with multivariate statistical analysis, was used to monitor skin changes induced by UVB irradiation, and its discrimination performance was evaluated. Raman scattering signals of SCC and normal skin were acquired, and spectral differences between them were revealed. Linear discriminant analysis (LDA) based on principal component analysis (PCA) was employed to generate diagnostic algorithms for classifying skin SCC and normal skin. The results indicated that Raman spectroscopy combined with PCA-LDA demonstrated good potential for improving the diagnosis of skin cancers.

  12. A Filtering of Incomplete GNSS Position Time Series with Probabilistic Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Gruszczynski, Maciej; Klos, Anna; Bogusz, Janusz

    2018-04-01

    For the first time, we introduced the probabilistic principal component analysis (pPCA) for the spatio-temporal filtering of Global Navigation Satellite System (GNSS) position time series to estimate and remove Common Mode Error (CME) without the interpolation of missing values. We used data from the International GNSS Service (IGS) stations which contributed to the latest International Terrestrial Reference Frame (ITRF2014). The efficiency of the proposed algorithm was tested on simulated incomplete time series, then CME was estimated for a set of 25 stations located in Central Europe. The newly applied pPCA was compared with previously used algorithms, which showed that this method is capable of resolving the problem of proper spatio-temporal filtering of GNSS time series characterized by different observation time spans. We showed that filtering can be carried out with the pPCA method even when the dataset contains two time series with fewer than 100 common epochs of observations. The 1st Principal Component (PC) explained more than 36% of the total variance represented by the time series residuals (series with the deterministic model removed), which, compared to the variances of the other PCs (less than 8%), means that common signals are significant in GNSS residuals. A clear improvement in the spectral indices of the power-law noise was noticed for the Up component, which is reflected by an average shift towards white noise from -0.98 to -0.67 (30%). We observed a significant average reduction in the accuracy of stations' velocity estimated for filtered residuals by 35, 28 and 69% for the North, East, and Up components, respectively. CME series were also analyzed in the context of the influence of environmental mass loading on the filtering results. Subtraction of the environmental loading models from GNSS residuals reduces the estimated CME variance by 20 and 65% for horizontal and vertical components, respectively.

  13. Speciation of Energy Critical Elements in Marine Ferromanganese Crusts and Nodules by Principal Component Analysis and Least-squares fits to XAFS Spectra

    NASA Astrophysics Data System (ADS)

    Foster, A. L.; Klofas, J. M.; Hein, J. R.; Koschinsky, A.; Bargar, J.; Dunham, R. E.; Conrad, T. A.

    2011-12-01

    Marine ferromanganese crusts and nodules ("Fe-Mn crusts") are considered a potential mineral resource due to their accumulation of several economically-important elements at concentrations above mean crustal abundances. They are typically composed of intergrown Fe oxyhydroxide and Mn oxide; thicker (older) crusts can also contain carbonate fluorapatite. We used X-ray absorption fine-structure (XAFS) spectroscopy, a molecular-scale structure probe, to determine the speciation of several elements (Te, Bi, Mo, Zr, Pt) in Fe-Mn crusts. As a first step in analysis of this dataset, we have conducted principal component analysis (PCA) of Te K-edge and Mo K-edge, k3-weighted XAFS spectra. The sample set consisted of 12 homogenized, ground Fe-Mn crust samples from 8 locations in the global ocean. One sample was subjected to a chemical leach to selectively remove Mn oxides and the elements associated with it. The samples in the study set contain 50-205 mg/kg Te (average = 88) and 97-802 mg/kg Mo (average = 567). PCAs of background-subtracted, normalized Te K-edge and Mo K-edge XAFS spectra were performed on a data matrix of 12 rows x 122 columns (rows = samples; columns = Te or Mo fluorescence value at each energy step) and results were visualized without rotation. The number of significant components was assessed by the Malinowski indicator function and ability of the components to reconstruct the features (minus noise) of all sample spectra. Two components were significant by these criteria for both Te and Mo PCAs and described a total of 74 and 75% of the total variance, respectively. Reconstruction of potential model compounds by the principal components derived from PCAs on the sample set ("target transformation") provides a means of ranking models in terms of their utility for subsequent linear-combination, least-squares (LCLS) fits (the next step of data analysis). 
    Synthetic end-member models of Te4+, Te6+, and Mo adsorbed to Fe(III) oxyhydroxide and Mn oxide were tested. Te6+ sorbed to Fe oxyhydroxide and Mo sorbed to Fe oxyhydroxide were identified as the best models for Te and Mo PCAs, respectively. However, in the case of Mo, least-squares fits contradicted these results, indicating that about 80% of Mo in crust samples was associated with Mn oxides. Ultimately it was discovered that the sample from which Mn oxide had been leached was skewing the results in the Mo PCA but not in the Te PCA. When the leached sample was removed and the Mo PCA repeated (n = 11), target transformation indicated that Mo sorbed to Mn oxide was indeed the best model for the set. Our results indicate that Te and Mo are strongly partitioned into different phases in these Fe-Mn crusts, and emphasize the importance of evaluating outliers and their effects on PCA.

  14. Identification and visualization of dominant patterns and anomalies in remotely sensed vegetation phenology using a parallel tool for principal components analysis

    Treesearch

    Richard Tran Mills; Jitendra Kumar; Forrest M. Hoffman; William W. Hargrove; Joseph P. Spruce; Steven P. Norman

    2013-01-01

    We investigated the use of principal components analysis (PCA) to visualize dominant patterns and identify anomalies in a multi-year land surface phenology data set (231 m × 231 m normalized difference vegetation index (NDVI) values derived from the Moderate Resolution Imaging Spectroradiometer (MODIS)) used for detecting threats to forest health in the conterminous...

  15. Patterns in longitudinal growth of refraction in Southern Chinese children: cluster and principal component analysis.

    PubMed

    Chen, Yanxian; Chang, Billy Heung Wing; Ding, Xiaohu; He, Mingguang

    2016-11-22

    In the present study we attempt to use hypothesis-independent analysis in investigating the patterns in refraction growth in Chinese children, and to explore the possible risk factors affecting the different components of progression, as defined by Principal Component Analysis (PCA). A total of 637 first-born twins in Guangzhou Twin Eye Study with 6-year annual visits (baseline age 7-15 years) were available in the analysis. Cluster 1 to 3 were classified after a partitioning clustering, representing stable, slow and fast progressing groups of refraction respectively. Baseline age and refraction, paternal refraction, maternal refraction and proportion of two myopic parents showed significant differences across the three groups. Three major components of progression were extracted using PCA: "Average refraction", "Acceleration" and the combination of "Myopia stabilization" and "Late onset of refraction progress". In regression models, younger children with more severe myopia were associated with larger "Acceleration". The risk factors of "Acceleration" included change of height and weight, near work, and parental myopia, while female gender, change of height and weight were associated with "Stabilization", and increased outdoor time was related to "Late onset of refraction progress". We therefore concluded that genetic and environmental risk factors have different impacts on patterns of refraction progression.

  16. Patterns in longitudinal growth of refraction in Southern Chinese children: cluster and principal component analysis

    PubMed Central

    Chen, Yanxian; Chang, Billy Heung Wing; Ding, Xiaohu; He, Mingguang

    2016-01-01

    In the present study we attempt to use hypothesis-independent analysis in investigating the patterns in refraction growth in Chinese children, and to explore the possible risk factors affecting the different components of progression, as defined by Principal Component Analysis (PCA). A total of 637 first-born twins in Guangzhou Twin Eye Study with 6-year annual visits (baseline age 7–15 years) were available in the analysis. Cluster 1 to 3 were classified after a partitioning clustering, representing stable, slow and fast progressing groups of refraction respectively. Baseline age and refraction, paternal refraction, maternal refraction and proportion of two myopic parents showed significant differences across the three groups. Three major components of progression were extracted using PCA: “Average refraction”, “Acceleration” and the combination of “Myopia stabilization” and “Late onset of refraction progress”. In regression models, younger children with more severe myopia were associated with larger “Acceleration”. The risk factors of “Acceleration” included change of height and weight, near work, and parental myopia, while female gender, change of height and weight were associated with “Stabilization”, and increased outdoor time was related to “Late onset of refraction progress”. We therefore concluded that genetic and environmental risk factors have different impacts on patterns of refraction progression. PMID:27874105

  17. Fault Detection of Bearing Systems through EEMD and Optimization Algorithm

    PubMed Central

    Lee, Dong-Han; Ahn, Jong-Hyo; Koh, Bong-Hwan

    2017-01-01

    This study proposes a fault detection and diagnosis method for bearing systems using ensemble empirical mode decomposition (EEMD) based feature extraction, in conjunction with particle swarm optimization (PSO), principal component analysis (PCA), and Isomap. First, a mathematical model is assumed to generate vibration signals from damaged bearing components, such as the inner-race, outer-race, and rolling elements. The process of decomposing vibration signals into intrinsic mode functions (IMFs) and extracting statistical features is introduced to develop a damage-sensitive parameter vector. Finally, PCA and Isomap algorithm are used to classify and visualize this parameter vector, to separate damage characteristics from healthy bearing components. Moreover, the PSO-based optimization algorithm improves the classification performance by selecting proper weightings for the parameter vector, to maximize the visualization effect of separating and grouping of parameter vectors in three-dimensional space. PMID:29143772

  18. The variance needed to accurately describe jump height from vertical ground reaction force data.

    PubMed

    Richter, Chris; McGuinness, Kevin; O'Connor, Noel E; Moran, Kieran

    2014-12-01

    In functional principal component analysis (fPCA) a threshold is chosen to define the number of retained principal components, which corresponds to the amount of preserved information. A variety of thresholds have been used in previous studies and the chosen threshold is often not evaluated. The aim of this study is to identify the optimal threshold that preserves the information needed to describe a jump height accurately utilizing vertical ground reaction force (vGRF) curves. To find an optimal threshold, a neural network was used to predict jump height from vGRF curve measures generated using different fPCA thresholds. The findings indicate that a threshold from 99% to 99.9% (6-11 principal components) is optimal for describing jump height, as these thresholds generated significantly lower jump height prediction errors than other thresholds.
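
    The link between a variance threshold and the number of retained components, as discussed in this abstract, can be sketched in Python. This is only an illustration under stated assumptions: it applies ordinary PCA to synthetic sampled curves rather than the functional PCA and vGRF data of the study, and the function name `n_components_for_threshold` is hypothetical.

```python
import numpy as np

def n_components_for_threshold(X, threshold):
    """Smallest number of PCs whose cumulative explained variance >= threshold."""
    Xc = X - X.mean(axis=0)                      # center each time point
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    cumulative = np.cumsum(s**2) / np.sum(s**2)  # cumulative variance share
    # First index at which the cumulative share reaches the threshold.
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative))               # guard against round-off at 1.0

# Synthetic curves (assumption): 60 curves sampled at 100 time points, built
# from 8 sine shapes with geometrically decaying amplitude plus small noise.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 100)
basis = np.array([np.sin((k + 1) * np.pi * t) for k in range(8)])
weights = rng.normal(size=(60, 8)) * (0.5 ** np.arange(8))
curves = weights @ basis + 0.01 * rng.normal(size=(60, 100))

for threshold in (0.90, 0.99, 0.999):
    print(threshold, n_components_for_threshold(curves, threshold))
```

    Raising the threshold can only keep the same number of components or add more, which is why the choice of threshold determines how much fine detail of each curve is passed on to a downstream predictor.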

  19. [Analyzing and modeling methods of near infrared spectroscopy for in-situ prediction of oil yield from oil shale].

    PubMed

    Liu, Jie; Zhang, Fu-Dong; Teng, Fei; Li, Jun; Wang, Zhi-Hong

    2014-10-01

    To detect the oil yield of oil shale in situ with portable near infrared spectroscopy, modeling and analysis methods for in-situ detection were studied using 66 rock core samples from the No. 2 well drilled at the Fuyu oil shale base in Jilin. With the developed portable spectrometer, spectra were acquired in 3 data formats (reflectance, absorbance and K-M function). Modeling and analysis experiments were performed to determine the optimum analysis model and method, using the same data pre-processing throughout and combining 4 modeling data optimization methods with 2 modeling methods. The optimization methods were principal component-Mahalanobis distance (PCA-MD) for eliminating abnormal samples, uninformative variables elimination (UVE) for wavelength selection, and their combinations PCA-MD + UVE and UVE + PCA-MD; the modeling methods were partial least squares (PLS) and back propagation artificial neural network (BPANN). The results show that the data format, the modeling data optimization method and the modeling method all affect the analysis precision of the model. Whether or not an optimization method is used, reflectance or K-M function is the proper spectrum format of the modeling database for both modeling methods. With the two modeling methods and four data optimization methods, the model precisions obtained from the same modeling database differ. For the PLS modeling method, the PCA-MD and UVE + PCA-MD data optimization methods can improve the modeling precision of the database using the K-M function spectrum data format. For the BPANN modeling method, the UVE, UVE + PCA-MD and PCA-MD + UVE data optimization methods can improve the modeling precision of the database using any of the 3 spectrum data formats. Apart from the case using the reflectance spectra and the PCA-MD data optimization method, the modeling precision of the BPANN method is better than that of the PLS method. Modeling with reflectance spectra, the UVE optimization method and the BPANN modeling method gives the highest analysis precision, with a correlation coefficient (Rp) of 0.92 and a standard error of prediction (SEP) of 0.69%.
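
    The PCA-MD step named in this abstract (principal component scores followed by a Mahalanobis distance to eliminate abnormal samples) might look roughly like the Python sketch below. The abstract does not give the number of components or the rejection rule, so both are assumptions here: three components and a hypothetical cutoff of mean + 3 standard deviations of the distances. The spectra are synthetic.

```python
import numpy as np

def pca_md(X, n_components=3):
    """Mahalanobis distance of each sample in the space of the leading PC scores."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    # PC scores have zero mean and are mutually uncorrelated, so their
    # covariance is diagonal and the distance reduces to per-component scaling.
    std = scores.std(axis=0, ddof=1)
    return np.sqrt(((scores / std) ** 2).sum(axis=1))

# Synthetic "spectra" (assumption): 66 samples, 50 wavelengths, 3 latent factors.
rng = np.random.default_rng(3)
n_samples, n_wavelengths = 66, 50
latent = rng.normal(size=(n_samples, 3))
loadings = rng.normal(size=(3, n_wavelengths))
spectra = latent @ loadings + 0.05 * rng.normal(size=(n_samples, n_wavelengths))
spectra[10] += 25.0  # one deliberately abnormal sample (large baseline offset)

md = pca_md(spectra)
cutoff = md.mean() + 3.0 * md.std(ddof=1)  # hypothetical rejection rule
flagged = np.flatnonzero(md > cutoff)
print("flagged sample indices:", flagged)
```

    Samples flagged this way would be dropped before fitting the calibration model; a different cutoff or number of components would change which samples are treated as abnormal.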

  20. Fast, Exact Bootstrap Principal Component Analysis for p > 1 million

    PubMed Central

    Fisher, Aaron; Caffo, Brian; Schwartz, Brian; Zipunnikov, Vadim

    2015-01-01

    Many have suggested a bootstrap procedure for estimating the sampling variability of principal component analysis (PCA) results. However, when the number of measurements per subject (p) is much larger than the number of subjects (n), calculating and storing the leading principal components from each bootstrap sample can be computationally infeasible. To address this, we outline methods for fast, exact calculation of bootstrap principal components, eigenvalues, and scores. Our methods leverage the fact that all bootstrap samples occupy the same n-dimensional subspace as the original sample. As a result, all bootstrap principal components are limited to the same n-dimensional subspace and can be efficiently represented by their low dimensional coordinates in that subspace. Several uncertainty metrics can be computed solely based on the bootstrap distribution of these low dimensional coordinates, without calculating or storing the p-dimensional bootstrap components. Fast bootstrap PCA is applied to a dataset of sleep electroencephalogram recordings (p = 900, n = 392), and to a dataset of brain magnetic resonance images (MRIs) (p ≈ 3 million, n = 352). For the MRI dataset, our method allows for standard errors for the first 3 principal components based on 1000 bootstrap samples to be calculated on a standard laptop in 47 minutes, as opposed to approximately 4 days with standard methods. PMID:27616801
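
    The key fact used in this abstract, that every bootstrap sample lies in the same n-dimensional subspace as the original sample, can be checked numerically with a short Python sketch. It is an illustration under assumptions (synthetic data, small p, a single bootstrap draw, variable names chosen here), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 20, 500  # n subjects, p measurements per subject (p >> n)

# Synthetic data (assumption): one strong component plus noise.
Y = 5.0 * np.outer(rng.normal(size=n), rng.normal(size=p)) + rng.normal(size=(n, p))

# One SVD of the centered original sample: Yc = A @ Vt with A of size n x n.
Yc = Y - Y.mean(axis=0)
U, d, Vt = np.linalg.svd(Yc, full_matrices=False)  # Vt has shape (n, p)
A = U * d                                          # low-dimensional coordinates

# Bootstrap: resample subjects (rows) with replacement.
idx = rng.integers(0, n, size=n)

# Fast route: work only with the n x n coordinates of the resampled rows.
A_b = A[idx] - A[idx].mean(axis=0)
_, d_fast, Rt_b = np.linalg.svd(A_b, full_matrices=False)
pc1_fast = Rt_b[0] @ Vt  # map low-dimensional coordinates back to p dimensions

# Direct route: PCA of the full n x p bootstrap sample, for comparison.
Y_b = Y[idx] - Y[idx].mean(axis=0)
_, d_direct, Vt_direct = np.linalg.svd(Y_b, full_matrices=False)

print("max singular value difference:", np.abs(d_fast - d_direct).max())
print("|cosine| between first PCs:", abs(pc1_fast @ Vt_direct[0]))
```

    Only the n x n matrix `Rt_b` needs to be stored per bootstrap sample; the p-dimensional components are recovered on demand by multiplying with `Vt`, which is what makes the approach practical when p is in the millions.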

  1. Blind source separation problem in GPS time series

    NASA Astrophysics Data System (ADS)

    Gualandi, A.; Serpelloni, E.; Belardinelli, M. E.

    2016-04-01

    A critical point in the analysis of ground displacement time series, such as those recorded by space geodetic techniques, is the development of data-driven methods that allow the different sources of deformation to be discerned and characterized in the space and time domains. Multivariate statistics includes several approaches that can be considered as a part of data-driven methods. A widely used technique is the principal component analysis (PCA), which allows us to reduce the dimensionality of the data space while maintaining most of the variance of the dataset explained. However, PCA does not perform well in finding the solution to the so-called blind source separation (BSS) problem, i.e., in recovering and separating the original sources that generate the observed data. This is mainly due to the fact that PCA minimizes the misfit calculated using an L2 norm (χ²), looking for a new Euclidean space where the projected data are uncorrelated. The independent component analysis (ICA) is a popular technique adopted to approach the BSS problem. However, the independence condition is not easy to impose, and it is often necessary to introduce some approximations. To work around this problem, we test the use of a modified variational Bayesian ICA (vbICA) method to recover the multiple sources of ground deformation even in the presence of missing data. The vbICA method models the probability density function (pdf) of each source signal using a mix of Gaussian distributions, allowing for more flexibility in the description of the pdf of the sources with respect to standard ICA, and giving a more reliable estimate of them. Here we present its application to synthetic global positioning system (GPS) position time series, generated by simulating deformation near an active fault, including inter-seismic, co-seismic, and post-seismic signals, plus seasonal signals and noise, and an additional time-dependent volcanic source. We evaluate the ability of the PCA and ICA decomposition techniques in explaining the data and in recovering the original (known) sources. Using the same number of components, we find that the vbICA method fits the data almost as well as a PCA method, since the χ² increase is less than 10% of the value calculated using a PCA decomposition. Unlike PCA, the vbICA algorithm is found to correctly separate the sources if the correlation of the dataset is low (<0.67) and the geodetic network is sufficiently dense (ten continuous GPS stations within a box of side equal to two times the locking depth of a fault where an earthquake of Mw >6 occurred). We also provide a cookbook for the use of the vbICA algorithm in analyses of position time series for tectonic and non-tectonic applications.

  2. [Principal component analysis and cluster analysis of inorganic elements in sea cucumber Apostichopus japonicus].

    PubMed

    Liu, Xiao-Fang; Xue, Chang-Hu; Wang, Yu-Ming; Li, Zhao-Jie; Xue, Yong; Xu, Jie

    2011-11-01

    The present study investigates the feasibility of multi-element analysis for determining the geographical origin of the sea cucumber Apostichopus japonicus, and selects effective tracers for assessing that origin. The contents of elements such as Al, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Mo, Cd, Hg and Pb in Apostichopus japonicus samples from seven places of geographical origin were determined by ICP-MS. The results were used to develop an element database. Cluster analysis (CA) and principal component analysis (PCA) were applied to differentiate the geographical origin of the samples. Three principal components, which accounted for over 89% of the total variance, were extracted from the standardized data. The results of Q-type cluster analysis showed that the 26 samples could be clustered reasonably into five groups, and the classification results were significantly associated with the marine distribution of the samples. CA and PCA were effective methods for element analysis of sea cucumber Apostichopus japonicus samples. The contents of the mineral elements in the samples were good chemical descriptors for differentiating their geographical origins.

  3. Impact of Measurement Uncertainties on Receptor Modeling of Speciated Atmospheric Mercury.

    PubMed

    Cheng, I; Zhang, L; Xu, X

    2016-02-09

    Gaseous oxidized mercury (GOM) and particle-bound mercury (PBM) measurement uncertainties could potentially affect the analysis and modeling of atmospheric mercury. This study investigated the impact of GOM measurement uncertainties on Principal Components Analysis (PCA), Absolute Principal Component Scores (APCS), and Concentration-Weighted Trajectory (CWT) receptor modeling results. The atmospheric mercury data input into these receptor models were modified by combining GOM and PBM into a single reactive mercury (RM) parameter and excluding low GOM measurements to improve the data quality. PCA and APCS results derived from RM or excluding low GOM measurements were similar to those in previous studies, except for a non-unique component and an additional component extracted from the RM dataset. The percent variance explained by the major components from a previous study differed slightly compared to RM and excluding low GOM measurements. CWT results were more sensitive to the input of RM than GOM excluding low measurements. Larger discrepancies were found between RM and GOM source regions than those between RM and PBM. Depending on the season, CWT source regions of RM differed by 40-61% compared to GOM from a previous study. No improvement in correlations between CWT results and anthropogenic mercury emissions were found.

  4. Impact of Measurement Uncertainties on Receptor Modeling of Speciated Atmospheric Mercury

    PubMed Central

    Cheng, I.; Zhang, L.; Xu, X.

    2016-01-01

    Gaseous oxidized mercury (GOM) and particle-bound mercury (PBM) measurement uncertainties could potentially affect the analysis and modeling of atmospheric mercury. This study investigated the impact of GOM measurement uncertainties on Principal Components Analysis (PCA), Absolute Principal Component Scores (APCS), and Concentration-Weighted Trajectory (CWT) receptor modeling results. The atmospheric mercury data input into these receptor models were modified by combining GOM and PBM into a single reactive mercury (RM) parameter and excluding low GOM measurements to improve the data quality. PCA and APCS results derived from RM or excluding low GOM measurements were similar to those in previous studies, except for a non-unique component and an additional component extracted from the RM dataset. The percent variance explained by the major components from a previous study differed slightly compared to RM and excluding low GOM measurements. CWT results were more sensitive to the input of RM than GOM excluding low measurements. Larger discrepancies were found between RM and GOM source regions than those between RM and PBM. Depending on the season, CWT source regions of RM differed by 40–61% compared to GOM from a previous study. No improvement in correlations between CWT results and anthropogenic mercury emissions were found. PMID:26857835

  5. Principle component analysis (PCA) for investigation of relationship between population dynamics of microbial pathogenesis, chemical and sensory characteristics in beef slices containing Tarragon essential oil.

    PubMed

    Alizadeh Behbahani, Behrooz; Tabatabaei Yazdi, Farideh; Shahidi, Fakhri; Mortazavi, Seyed Ali; Mohebbi, Mohebbat

    2017-04-01

    Principal component analysis (PCA) was employed to examine the effect of the applied treatments on beef shelf life and to discover the correlations between the studied responses. Considering the variability of the dimensions of the responses, correlation coefficients were used to form the matrix and extract the eigenvalues. Antimicrobial effect was evaluated on 10 pathogenic microorganisms using the hole-plate diffusion, disk diffusion and pour plate methods, minimum inhibitory concentration, and minimum bactericidal/fungicidal concentration. Antioxidant potential and total phenolic content were examined using the 2,2-diphenyl-1-picrylhydrazyl (DPPH) and Folin-Ciocalteu methods, respectively. The components were identified through gas chromatography and gas chromatography/mass spectrometry. Barhang seed mucilage (BSM) based edible coatings containing 0, 0.5, 1 and 1.5% (w/w) Tarragon (T) essential oil were applied to beef slices to control the growth of pathogenic microorganisms. Microbiological (total viable count, psychrotrophic count, Escherichia coli, Staphylococcus aureus and fungi), chemical (thiobarbituric acid, peroxide value and pH) and sensory (odor, color and overall acceptability) measurements were made periodically during storage. The PCA showed that the properties of the uncoated meat samples on the 9th, 12th, 15th and 18th days of storage changed continuously, independently of the treatments applied to the other samples. This reveals the effect of the applied treatments on the samples. Copyright © 2017 Elsevier Ltd. All rights reserved.

  6. Principal component analysis for the early detection of mastitis and lameness in dairy cows.

    PubMed

    Miekley, Bettina; Traulsen, Imke; Krieter, Joachim

    2013-08-01

    This investigation analysed the applicability of principal component analysis (PCA), a latent variable method, for the early detection of mastitis and lameness. Data used were recorded on the Karkendamm dairy research farm between August 2008 and December 2010. For mastitis and lameness detection, data of 338 and 315 cows in their first 200 d in milk were analysed, respectively. Mastitis as well as lameness were specified according to veterinary treatments. Diseases were defined as disease blocks. The different definitions used (two for mastitis, three for lameness) varied solely in the sequence length of the blocks. Only the days before the treatment were included in the blocks. Milk electrical conductivity, milk yield and feeding patterns (feed intake, number of feeding visits and time at the trough) were used for recognition of mastitis. Pedometer activity and feeding patterns were utilised for lameness detection. To develop and verify the PCA model, the mastitis and the lameness datasets were divided into training and test datasets. PCA extracted uncorrelated principal components (PCs) by linear transformations of the raw data so that the first few PCs captured most of the variation in the original dataset. For process monitoring and disease detection, these resulting PCs were applied to the Hotelling's T² chart and to the residual control chart. The results show that block sensitivity of mastitis detection ranged from 77·4 to 83·3%, whilst specificity was around 76·7%. The error rates were around 98·9%. For lameness detection, the block sensitivity ranged from 73·8 to 87·8% while the obtained specificities were between 54·8 and 61·9%. The error rates varied from 87·8 to 89·2%. In conclusion, PCA does not yet seem transferable into practical usage. Results could probably be improved if different traits and more informative sensor data are included in the analysis.
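
    The two monitoring statistics named above can be sketched as follows: Hotelling's T² measures distance within the retained components, and the residual statistic (often called SPE or Q) measures what the components fail to reconstruct. This is only an illustration under stated assumptions: the data are synthetic, the function names are hypothetical, and the empirical 99th-percentile control limits are a simplification that the abstract does not describe.

```python
import numpy as np

def fit_pca_monitor(X_train, n_components):
    """Fit a PCA monitoring model on training data (rows = observations)."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    # Rows of Vt are the loading vectors; s holds the singular values.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T                     # (n_features, n_components)
    variances = s[:n_components] ** 2 / (len(Z) - 1)   # variance of each score
    return mean, std, loadings, variances

def monitor(X, model):
    """Return Hotelling's T^2 and the squared prediction error for each row of X."""
    mean, std, loadings, variances = model
    Z = (X - mean) / std
    scores = Z @ loadings
    t2 = np.sum(scores ** 2 / variances, axis=1)
    residual = Z - scores @ loadings.T
    spe = np.sum(residual ** 2, axis=1)
    return t2, spe

# Synthetic training data: five correlated "sensor traits" driven by two latent factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
X_train = latent @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(200, 5))

model = fit_pca_monitor(X_train, n_components=2)
t2_train, spe_train = monitor(X_train, model)

# Assumed control limits: empirical 99th percentiles of the training statistics.
t2_limit = np.percentile(t2_train, 99)
spe_limit = np.percentile(spe_train, 99)

# Shift one trait in a few observations and flag rows that exceed either limit.
X_new = X_train[:5] + np.array([0.0, 0.0, 3.0, 0.0, 0.0])
t2_new, spe_new = monitor(X_new, model)
alarms = (t2_new > t2_limit) | (spe_new > spe_limit)
print(alarms)
```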

  7. LARVAL FISH HABITAT QUALITY : THE EFFECTS OF FRESHWATER FLOW

    EPA Science Inventory

    We sampled larval fish in Suisun Marsh, in the San Francisco Bay estuary from February to June 1994-1999. We used principal components analysis (PCA) and canonical correspondence analysis (CCA) on 13 taxonomic groups making up 99.7% of the catch and several environmental variable...

  8. Habitat characteristic of macrozoobenthos in Naborsahan River of Toba Lake, North Sumatra, Indonesia

    NASA Astrophysics Data System (ADS)

    Basyuni, M.; Lubis, M. S.; Suryanti, A.

    2018-02-01

    This research described the relative abundance, dominance index, and equitability index of macrozoobenthos in the Naborsahan River of Toba Lake, North Sumatra, Indonesia. Purposive random sampling at three stations was used to characterize the biological, chemical, and physical parameters of macrozoobenthos. The highest relative abundance of macrozoobenthos was found at station 2 (99.96%). By contrast, the highest dominance index was at station 3 (0.31), and the maximum equitability index was found at station 1 (0.94). The present results showed diversity parameters among the stations. A principal component analysis (PCA) was used to determine the habitat characteristics of macrozoobenthos. PCA showed that the six parameters studied, brightness, turbidity, depth, temperature, dissolved oxygen (DO) and biochemical oxygen demand (BOD5), play a significant role in the relative abundance, dominance index, and equitability index. PCA suggested that station 3 was a suitable habitat for macrozoobenthos, as indicated by the negative axis. The present study demonstrated that the six parameters should be conserved to support the survival of macrozoobenthos.

  9. Spatial Mapping of Pyocyanin in Pseudomonas aeruginosa Bacterial Communities by Surface Enhanced Raman Scattering

    PubMed Central

    Polisetti, Sneha; Baig, Nameera F.; Morales-Soto, Nydia; Shrout, Joshua D.; Bohn, Paul W.

    2017-01-01

    Surface Enhanced Raman Spectroscopy (SERS) imaging was used in conjunction with Principal Component Analysis (PCA) for the in situ spatiotemporal mapping of the virulence factor pyocyanin, in communities of the pathogenic bacterium Pseudomonas aeruginosa. The combination of SERS imaging and PCA analysis provides a robust method for characterization of heterogeneous biological systems while circumventing issues associated with interference from sample autofluorescence and low reproducibility of SERS signals. The production of pyocyanin is found to depend both on the growth carbon source and on the specific strain of P. aeruginosa studied. A cystic fibrosis lung isolate strain of P. aeruginosa synthesizes and secretes pyocyanin when grown with glucose and glutamate, while the laboratory strain exhibits detectable production of pyocyanin only when grown with glutamate as the source of carbon. Pyocyanin production in the laboratory strain grown with glucose was below the limit of detection of SERS. In addition, the combination of SERS imaging and PCA can elucidate subtle differences in the molecular composition of biofilms. PCA loading plots from the clinical isolate exhibit features corresponding to vibrational bands of carbohydrates, which represent the mucoid biofilm matrix specific to that isolate, features that are not seen in the PCA loading plots of the laboratory strain. PMID:27354400

  10. Multivariate analysis of fatty acid and biochemical constitutes of seaweeds to characterize their potential as bioresource for biofuel and fine chemicals.

    PubMed

    Verma, Priyanka; Kumar, Manoj; Mishra, Girish; Sahoo, Dinabandhu

    2017-02-01

    In the present study bio prospecting of thirty seaweeds from Indian coasts was analyzed for their biochemical components including pigments, fatty acid and ash content. Multivariate analysis of biochemical components and fatty acids was done using Principal Component Analysis (PCA) and Agglomerative hierarchical clustering (AHC) to manifest chemotaxonomic relationship among various seaweeds. The overall analysis suggests that these seaweeds have multi-functional properties and can be utilized as promising bioresource for proteins, lipids, pigments and carbohydrates for the food/feed and biofuel industry. Copyright © 2016. Published by Elsevier Ltd.

  11. Selecting predictors for discriminant analysis of species performance: an example from an amphibious softwater plant.

    PubMed

    Vanderhaeghe, F; Smolders, A J P; Roelofs, J G M; Hoffmann, M

    2012-03-01

    Selecting an appropriate variable subset in linear multivariate methods is an important methodological issue for ecologists. Interest often exists in obtaining general predictive capacity or in finding causal inferences from predictor variables. Because of a lack of solid knowledge on a studied phenomenon, scientists explore predictor variables in order to find the most meaningful (i.e. discriminating) ones. As an example, we modelled the response of the amphibious softwater plant Eleocharis multicaulis using canonical discriminant function analysis. We asked how variables can be selected through comparison of several methods: univariate Pearson chi-square screening, principal components analysis (PCA) and step-wise analysis, as well as combinations of some methods. We expected PCA to perform best. The selected methods were evaluated through fit and stability of the resulting discriminant functions and through correlations between these functions and the predictor variables. The chi-square subset, at P < 0.05, followed by a step-wise sub-selection, gave the best results. In contrast to expectations, PCA performed poorly, as did step-wise analysis. The different chi-square subset methods all yielded ecologically meaningful variables, while probable noise variables were also selected by PCA and step-wise analysis. We advise against the simple use of PCA or step-wise discriminant analysis to obtain an ecologically meaningful variable subset; the former because it does not take into account the response variable, the latter because noise variables are likely to be selected. We suggest that univariate screening techniques are a worthwhile alternative for variable selection in ecology. © 2011 German Botanical Society and The Royal Botanical Society of the Netherlands.

  12. Cluster analysis of commercial samples of Bauhinia spp. using HPLC-UV/PDA and MCR-ALS/PCA without peak alignment procedure.

    PubMed

    Ardila, Jorge Armando; Funari, Cristiano Soleo; Andrade, André Marques; Cavalheiro, Alberto José; Carneiro, Renato Lajarim

    2015-01-01

    Bauhinia forficata Link. is recognised by the Brazilian Health Ministry as a treatment of hypoglycemia and diabetes. Analytical methods are useful to assess the plant identity due to the similarities found in plants from Bauhinia spp. HPLC-UV/PDA in combination with chemometric tools is a widely used alternative suitable for authentication of plant material; however, shifts in retention times for similar compounds in different samples are a problem. The objectives were to compare the authentic medicinal plant (Bauhinia forficata Link.) with samples commercially available in drugstores claiming to be "Bauhinia spp. to treat diabetes" and to evaluate the performance of multivariate curve resolution - alternating least squares (MCR-ALS) associated with principal component analysis (PCA) when compared to pure PCA. HPLC-UV/PDA data obtained from extracts of leaves were evaluated employing a combination of MCR-ALS and PCA, which allowed the use of the full chromatographic and spectrometric information without the need for peak alignment procedures. The use of MCR-ALS/PCA showed better results than the conventional PCA using only one wavelength. Only two of nine commercial samples presented characteristics similar to the authentic Bauhinia forficata spp., considering the full HPLC-UV/PDA data. The combination of MCR-ALS and PCA is very useful when applied to a group of samples where a general alignment procedure could not be applied due to the different chromatographic profiles. This work also demonstrates the need for stricter control by the health authorities regarding herbal products available on the market. Copyright © 2015 John Wiley & Sons, Ltd.

  13. Sample-space-based feature extraction and class preserving projection for gene expression data.

    PubMed

    Wang, Wenjun

    2013-01-01

    To overcome the problems of high computational complexity and serious matrix singularity when Principal Component Analysis (PCA) and Fisher's Linear Discriminant Analysis (LDA) are used for feature extraction in high-dimensional data, sample-space-based feature extraction is presented. It transforms the computation of feature extraction from gene space to sample space by representing the optimal transformation vector as a weighted sum of samples. The technique is used to implement PCA, LDA, and Class Preserving Projection (CPP), a newly proposed method for discriminant feature extraction, and experimental results on gene expression data demonstrate the effectiveness of the method.
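
    For the PCA case, the idea of moving the computation from gene space to sample space can be sketched with the standard Gram-matrix identity: when there are far fewer samples (n) than genes (p), the eigenvectors of the small n x n matrix of sample inner products yield the principal axes as weighted sums of the centered samples. This is a generic illustration on synthetic data, not the paper's implementation, and the function name `sample_space_pca` is an assumption.

```python
import numpy as np

def sample_space_pca(X, n_components):
    """PCA for n << p via the n x n Gram matrix instead of the p x p covariance."""
    Xc = X - X.mean(axis=0)
    gram = Xc @ Xc.T                          # (n, n), cheap when n is small
    eigvals, eigvecs = np.linalg.eigh(gram)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = eigvals[order]
    U = eigvecs[:, order]
    # Each principal axis is a weighted sum of centered samples, scaled to unit length.
    components = (Xc.T @ U) / np.sqrt(lam)    # (p, n_components)
    explained_variance = lam / (X.shape[0] - 1)
    return components, explained_variance

# Synthetic "expression matrix": 20 samples, 500 genes (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 500))
components, explained_variance = sample_space_pca(X, n_components=3)

# Cross-check against a direct SVD of the centered data.
Xc = X - X.mean(axis=0)
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
overlap = np.abs(np.sum(components.T * Vt[:3], axis=1))
print(np.round(overlap, 6))
```

    The sample-space route only ever decomposes a 20 x 20 matrix here, which is what avoids the cost and singularity of the 500 x 500 covariance matrix.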

  14. Fuji apple storage time rapid determination method using Vis/NIR spectroscopy.

    PubMed

    Liu, Fuqi; Tang, Xuxiang

    2015-01-01

    A rapid method for determining Fuji apple storage time using visible/near-infrared (Vis/NIR) spectroscopy was studied in this paper. Vis/NIR diffuse reflection spectroscopy responses to samples were measured for 6 days. Spectroscopy data were processed by stochastic resonance (SR). Principal component analysis (PCA) was used to analyze the original spectroscopy data and the SNR eigenvalue. Results demonstrated that PCA could not fully discriminate Fuji apples using the original spectroscopy data. The signal-to-noise ratio (SNR) spectrum clearly classified all apple samples. PCA using the SNR spectrum successfully discriminated apple samples. Therefore, Vis/NIR spectroscopy was effective for rapid discrimination of Fuji apple storage time. The proposed method is also promising in condition safety control and management for food and environmental laboratories.

  15. A Study of Wind Turbine Comprehensive Operational Assessment Model Based on EM-PCA Algorithm

    NASA Astrophysics Data System (ADS)

    Zhou, Minqiang; Xu, Bin; Zhan, Yangyan; Ren, Danyuan; Liu, Dexing

    2018-01-01

    To assess wind turbine performance accurately and provide a theoretical basis for wind farm management, a hybrid assessment model based on the Entropy Method and Principal Component Analysis (EM-PCA) was established, which takes most factors of operational performance into consideration and reaches a comprehensive result. To verify the model, six wind turbines were chosen as the research objects. The ranking obtained by the proposed method was 4#>6#>1#>5#>2#>3#, which is completely in conformity with the theoretical ranking and indicates that the reliability and effectiveness of the EM-PCA method are high. The method could give guidance for comparing the states of different units and for wind farm operational assessment.

  16. Power line identification of millimeter wave radar based on PCA-GS-SVM

    NASA Astrophysics Data System (ADS)

    Fang, Fang; Zhang, Guifeng; Cheng, Yansheng

    2017-12-01

    Existing detection methods cannot effectively address the safety risk that power lines pose to ultra-low-altitude UAV flight. To address this problem, a power line recognition method based on grid search (GS), principal component analysis and a support vector machine (PCA-SVM) is proposed. First, the dimensionality of the candidate lines from the Hough transform is reduced by PCA, and the main features of the candidate lines are extracted. Then, the support vector machine (SVM) is optimized by the grid search method (GS). Finally, the SVM classifier with the optimized parameters is used to classify the candidate lines. MATLAB simulation results show that this method can effectively distinguish power lines from noise, and has high recognition accuracy and algorithm efficiency.

  17. RG-inspired machine learning for lattice field theory

    NASA Astrophysics Data System (ADS)

    Foreman, Sam; Giedt, Joel; Meurice, Yannick; Unmuth-Yockey, Judah

    2018-03-01

    Machine learning has been a fast growing field of research in several areas dealing with large datasets. We report recent attempts to use renormalization group (RG) ideas in the context of machine learning. We examine coarse graining procedures for perceptron models designed to identify the digits of the MNIST data. We discuss the correspondence between principal components analysis (PCA) and RG flows across the transition for worm configurations of the 2D Ising model. Preliminary results regarding the logarithmic divergence of the leading PCA eigenvalue were presented at the conference. More generally, we discuss the relationship between PCA and observables in Monte Carlo simulations and the possibility of reducing the number of learning parameters in supervised learning based on RG inspired hierarchical ansatzes.

  18. Fuji apple storage time rapid determination method using Vis/NIR spectroscopy

    PubMed Central

    Liu, Fuqi; Tang, Xuxiang

    2015-01-01

    A rapid method for determining Fuji apple storage time using visible/near-infrared (Vis/NIR) spectroscopy was studied in this paper. Vis/NIR diffuse reflection spectroscopy responses to samples were measured for 6 days. Spectroscopy data were processed by stochastic resonance (SR). Principal component analysis (PCA) was used to analyze the original spectroscopy data and the SNR eigenvalue. Results demonstrated that PCA could not fully discriminate Fuji apples using the original spectroscopy data. The signal-to-noise ratio (SNR) spectrum clearly classified all apple samples. PCA using the SNR spectrum successfully discriminated apple samples. Therefore, Vis/NIR spectroscopy was effective for rapid discrimination of Fuji apple storage time. The proposed method is also promising in condition safety control and management for food and environmental laboratories. PMID:25874818

  19. Geographical classification of Epimedium based on HPLC fingerprint analysis combined with multi-ingredients quantitative analysis.

    PubMed

    Xu, Ning; Zhou, Guofu; Li, Xiaojuan; Lu, Heng; Meng, Fanyun; Zhai, Huaqiang

    2017-05-01

    A reliable and comprehensive method for identifying the origin and assessing the quality of Epimedium has been developed. The method is based on analysis of HPLC fingerprints, combined with similarity analysis, hierarchical cluster analysis (HCA), principal component analysis (PCA) and multi-ingredient quantitative analysis. Nineteen batches of Epimedium, collected from different areas in the western regions of China, were used to establish the fingerprints and 18 peaks were selected for the analysis. Similarity analysis, HCA and PCA all classified the 19 areas into three groups. Simultaneous quantification of the five major bioactive ingredients in the Epimedium samples was also carried out to confirm the consistency of the quality tests. These methods were successfully used to identify the geographical origin of the Epimedium samples and to evaluate their quality. Copyright © 2016 John Wiley & Sons, Ltd.

  20. Survey to Identify Substandard and Falsified Tablets in Several Asian Countries with Pharmacopeial Quality Control Tests and Principal Component Analysis of Handheld Raman Spectroscopy.

    PubMed

    Kakio, Tomoko; Nagase, Hitomi; Takaoka, Takashi; Yoshida, Naoko; Hirakawa, Junichi; Macha, Susan; Hiroshima, Takashi; Ikeda, Yukihiro; Tsuboi, Hirohito; Kimura, Kazuko

    2018-06-01

    The World Health Organization has warned that substandard and falsified medical products (SFs) can harm patients and fail to treat the diseases for which they were intended, and they affect every region of the world, leading to loss of confidence in medicines, health-care providers, and health systems. Therefore, development of analytical procedures to detect SFs is extremely important. In this study, we investigated the quality of pharmaceutical tablets containing the antihypertensive candesartan cilexetil, collected in China, Indonesia, Japan, and Myanmar, using the Japanese pharmacopeial analytical procedures for quality control, together with principal component analysis (PCA) of Raman spectra obtained with a handheld Raman spectrometer. Some samples showed delayed dissolution and failed to meet the pharmacopeial specification, whereas others failed the assay test. These products appeared to be substandard. Principal component analysis showed that all Raman spectra could be explained in terms of two components: the amount of the active pharmaceutical ingredient and the kinds of excipients. The principal component analysis score plot indicated that one substandard tablet and the falsified tablets have similar principal components in their Raman spectra, in contrast to authentic products. The locations of samples within the PCA score plot varied according to the source country, suggesting that manufacturers in different countries use different excipients. Our results indicate that the handheld Raman device will be useful for detection of SFs in the field. Principal component analysis of the Raman data clarifies the difference in chemical properties between good quality products and SFs that circulate in the Asian market.

  1. The Application of Principal Component Analysis Using Fixed Eigenvectors to the Infrared Thermographic Inspection of the Space Shuttle Thermal Protection System

    NASA Technical Reports Server (NTRS)

    Cramer, K. Elliott; Winfree, William P.

    2006-01-01

    The Nondestructive Evaluation Sciences Branch at NASA's Langley Research Center has been actively involved in the development of thermographic inspection techniques for more than 15 years. Since the Space Shuttle Columbia accident, NASA has focused on the improvement of advanced NDE techniques for the Reinforced Carbon-Carbon (RCC) panels that comprise the orbiter's wing leading edge. Various nondestructive inspection techniques have been used in the examination of the RCC, but thermography has emerged as an effective inspection alternative to more traditional methods. Thermography is a non-contact inspection method as compared to ultrasonic techniques which typically require the use of a coupling medium between the transducer and material. Like radiographic techniques, thermography can be used to inspect large areas, but has the advantage of minimal safety concerns and the ability for single-sided measurements. Principal Component Analysis (PCA) has been shown effective for reducing thermographic NDE data. In a typical implementation of PCA, the eigenvectors are generated from the data set being analyzed. Although it is a powerful tool for enhancing the visibility of defects in thermal data, PCA can be computationally intense and time consuming when applied to the large data sets typical in thermography. Additionally, PCA can experience problems when very large defects are present (defects that dominate the field-of-view), since the calculation of the eigenvectors is now governed by the presence of the defect, not the good material. To increase the processing speed and to minimize the negative effects of large defects, an alternative method of PCA is being pursued in which a fixed set of eigenvectors is used to process the thermal data from the RCC materials. These eigenvectors can be generated either from an analytic model of the thermal response of the material under examination, or from a large cross section of experimental data. This paper will provide the details of the analytic model; an overview of the PCA process; as well as a quantitative signal-to-noise comparison of the results of performing both embodiments of PCA on thermographic data from various RCC specimens. Details of a system that has been developed to allow in situ inspection of a majority of shuttle RCC components will be presented along with the acceptance test results for this system. Additionally, the results of applying this technology to the Space Shuttle Discovery after its return from flight will be presented.
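
    The fixed-eigenvector variant described above can be sketched as two steps: derive temporal eigenvectors once from a reference library of pixel time histories, then reduce each new image sequence with a single projection and no new decomposition. Everything below is an assumption made for illustration: the function names, the array layout (frames x height x width), the 1/sqrt(t) cooling curve, and the synthetic "defect" signature are not taken from the paper.

```python
import numpy as np

def fixed_basis(reference, n_components):
    """Derive temporal eigenvectors once from reference time histories.

    reference: array of shape (n_pixels, n_frames).
    """
    mean = reference.mean(axis=0)
    _, _, Vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, Vt[:n_components]            # eigvecs: (n_components, n_frames)

def project_cube(cube, mean, eigvecs):
    """Project a (n_frames, height, width) sequence onto the fixed eigenvectors."""
    n_frames, h, w = cube.shape
    pixels = cube.reshape(n_frames, h * w).T  # one time history per row
    scores = (pixels - mean) @ eigvecs.T      # matrix product only, no new SVD
    return scores.T.reshape(len(eigvecs), h, w)

rng = np.random.default_rng(2)
t = np.linspace(0.1, 1.0, 30)

# Hypothetical reference library of sound-material cooling curves.
reference = rng.uniform(0.8, 1.2, size=(500, 1)) / np.sqrt(t)
reference = reference + 0.01 * rng.normal(size=(500, 30))
mean, eigvecs = fixed_basis(reference, n_components=2)

# Hypothetical inspection sequence with a small region that cools differently.
cube = (1.0 / np.sqrt(t))[:, None, None] * np.ones((30, 8, 8))
cube[:, 2:4, 2:4] += 0.2 * t[:, None, None]
images = project_cube(cube, mean, eigvecs)
print(images.shape)
```

    Because the basis never depends on the sequence being inspected, a defect that fills most of the field of view cannot reshape the eigenvectors, and the per-sequence cost drops to one matrix multiplication.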

  2. Functional Data Analysis in NTCP Modeling: A New Method to Explore the Radiation Dose-Volume Effects

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Benadjaoud, Mohamed Amine, E-mail: mohamedamine.benadjaoud@gustaveroussy.fr; Université Paris sud, Le Kremlin-Bicêtre; Institut Gustave Roussy, Villejuif

    2014-11-01

    Purpose/Objective(s): To describe a novel method to explore radiation dose-volume effects. Functional data analysis is used to investigate the information contained in differential dose-volume histograms. The method is applied to the normal tissue complication probability modeling of rectal bleeding (RB) for patients irradiated in the prostatic bed by 3-dimensional conformal radiation therapy. Methods and Materials: Kernel density estimation was used to estimate the individual probability density functions from each of the 141 rectum differential dose-volume histograms. Functional principal component analysis was performed on the estimated probability density functions to explore the variation modes in the dose distribution. The functional principal components were then tested for association with RB using logistic regression adapted to functional covariates (FLR). For comparison, 3 other normal tissue complication probability models were considered: the Lyman-Kutcher-Burman model, logistic model based on standard dosimetric parameters (LM), and logistic model based on multivariate principal component analysis (PCA). Results: The incidence rate of grade ≥2 RB was 14%. V65Gy was the most predictive factor for the LM (P=.058). The best fit for the Lyman-Kutcher-Burman model was obtained with n=0.12, m = 0.17, and TD50 = 72.6 Gy. In PCA and FLR, the components that describe the interdependence between the relative volumes exposed at intermediate and high doses were the most correlated to the complication. The FLR parameter function leads to a better understanding of the volume effect by including the treatment specificity in the delivered mechanistic information. For RB grade ≥2, patients with advanced age are significantly at risk (odds ratio, 1.123; 95% confidence interval, 1.03-1.22), and the fits of the LM, PCA, and functional principal component analysis models are significantly improved by including this clinical factor. Conclusion: Functional data analysis provides an attractive method for flexibly estimating the dose-volume effect for normal tissues in external radiation therapy.

  3. Principal component analysis of PiB distribution in Parkinson and Alzheimer diseases

    PubMed Central

    Markham, Joanne; Flores, Hubert; Hartlein, Johanna M.; Goate, Alison M.; Cairns, Nigel J.; Videen, Tom O.; Perlmutter, Joel S.

    2013-01-01

    Objective: To use principal component analyses (PCA) of Pittsburgh compound B (PiB) PET imaging to determine whether the pattern of in vivo β-amyloid (Aβ) in Parkinson disease (PD) with cognitive impairment is similar to the pattern found in symptomatic Alzheimer disease (AD). Methods: PiB PET scans were obtained from participants with PD with cognitive impairment (n = 53), participants with symptomatic AD (n = 35), and age-matched controls (n = 67). All were assessed using the Clinical Dementia Rating and APOE genotype was determined in 137 participants. PCA was used to 1) determine the PiB binding pattern in AD, 2) determine a possible unique PD pattern, and 3) directly compare the PiB binding patterns in PD and AD groups. Results: The first 2 principal components (PC1 and PC2) significantly separated the AD and control participants (p < 0.001). Participants with PD with cognitive impairment also were significantly different from participants with symptomatic AD on both components (p < 0.001). However, there was no difference between PD and controls on either component. Even those participants with PD with elevated mean cortical binding potentials were significantly different from participants with AD on both components. Conclusion: Using PCA, we demonstrated that participants with PD with cognitive impairment do not exhibit the same PiB binding pattern as participants with AD. These data suggest that Aβ deposition may play a different pathophysiologic role in the cognitive impairment of PD compared to that in AD. PMID:23825179

  4. Complex numbers in chemometrics: examples from multivariate impedance measurements on lipid monolayers.

    PubMed

    Geladi, Paul; Nelson, Andrew; Lindholm-Sethson, Britta

    2007-07-09

    Electrical impedance gives multivariate complex number data as results. Two examples of multivariate electrical impedance data measured on lipid monolayers in different solutions give rise to matrices (16x50 and 38x50) of complex numbers. Multivariate data analysis by principal component analysis (PCA) or singular value decomposition (SVD) can be used for complex data and the necessary equations are given. The scores and loadings obtained are vectors of complex numbers. It is shown that the complex number PCA and SVD are better at concentrating information in a few components than the naïve juxtaposition method and that Argand diagrams can replace score and loading plots. Different concentrations of Magainin and Gramicidin A give different responses and also the role of the electrolyte medium can be studied. An interaction of Gramicidin A in the solution with the monolayer over time can be observed.
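
    The complex-valued decomposition described in this record can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function name `complex_pca`, the column-centring step, and the random 16x50 matrix standing in for impedance data are all assumptions made for illustration. The sketch shows only that an SVD of a complex matrix yields complex scores and loadings, while the singular values (and hence the variance fractions) stay real.

```python
import numpy as np

def complex_pca(Z, n_components):
    """PCA of a complex data matrix via SVD (rows = samples, columns = variables)."""
    Zc = Z - Z.mean(axis=0)                           # column-centre (an assumed preprocessing choice)
    U, s, Vh = np.linalg.svd(Zc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # complex scores
    loadings = Vh[:n_components].conj().T             # complex loadings, one column per component
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]  # real-valued variance fractions
    return scores, loadings, explained, Zc

rng = np.random.default_rng(0)
# Synthetic stand-in for a 16x50 complex impedance matrix (not the published data).
Z = rng.normal(size=(16, 50)) + 1j * rng.normal(size=(16, 50))

scores, loadings, explained, Zc = complex_pca(Z, n_components=2)

# Real and imaginary parts of the scores are what an Argand diagram would plot.
argand_xy = np.column_stack([scores[:, 0].real, scores[:, 0].imag])
```

    Keeping all components reproduces the centred matrix as `scores @ loadings.conj().T`; note the conjugate transpose, which replaces the plain transpose used for real data.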

  5. Discrimination of selected species of pathogenic bacteria using near-infrared Raman spectroscopy and principal components analysis

    NASA Astrophysics Data System (ADS)

    de Siqueira e Oliveira, Fernanda S.; Giana, Hector E.; Silveira, Landulfo, Jr.

    2012-03-01

    A method based on Raman spectroscopy has been proposed for identifying the microorganisms involved in bacterial urinary tract infections. Spectra were collected from different bacterial colonies (Gram negative: E. coli, K. pneumoniae, P. mirabilis, P. aeruginosa, E. cloacae and Gram positive: S. aureus and Enterococcus sp.), grown in culture medium (agar), using a Raman spectrometer with a fiber Raman probe (830 nm). Colonies were scraped from the agar surface and placed on aluminum foil for Raman measurements. After pre-processing, spectra were submitted to a Principal Component Analysis and Mahalanobis distance (PCA/MD) discrimination algorithm. The mean Raman spectra of the different bacterial species showed similar bands, with S. aureus well characterized by strong bands related to carotenoids. PCA/MD could discriminate Gram positive bacteria with sensitivity and specificity of 100% and Gram negative bacteria with good sensitivity and high specificity.

  6. Migration of styrene and ethylbenzene from virgin and recycled expanded polystyrene containers and discrimination of these two kinds of polystyrene by principal component analysis.

    PubMed

    Lin, Qin-Bao; Song, Xue-Chao; Fang, Hong; Wu, Yu-Mei; Wang, Zhi-Wei

    2017-01-01

    The migration of styrene and ethylbenzene from virgin and recycled expanded polystyrene (EPS) containers into isooctane was investigated using gas chromatography-mass spectrometry (GC-MS). EPS containers were in two-sided contact with isooctane at temperatures of 25 and 40°C. It was shown that recycled EPS gave greater migration ratios compared with virgin EPS, which indicated that styrene and ethylbenzene migrated more easily from recycled EPS. In addition, an analytical method to distinguish between virgin and recycled EPS containers was established by GC-MS followed by principal component analysis (PCA). The relative peak area of the identified compounds was used as input data for PCA. Distinct separation between virgin and recycled EPS was achieved on a score plot. Extension of this method to other plastics may be of great interest for recycled plastics identification.

  7. Face-space architectures: evidence for the use of independent color-based features.

    PubMed

    Nestor, Adrian; Plaut, David C; Behrmann, Marlene

    2013-07-01

    The concept of psychological face space lies at the core of many theories of face recognition and representation. To date, much of the understanding of face space has been based on principal component analysis (PCA); the structure of the psychological space is thought to reflect some important aspects of a physical face space characterized by PCA applications to face images. In the present experiments, we investigated alternative accounts of face space and found that independent component analysis provided the best fit to human judgments of face similarity and identification. Thus, our results challenge an influential approach to the study of human face space and provide evidence for the role of statistically independent features in face encoding. In addition, our findings support the use of color information in the representation of facial identity, and we thus argue for the inclusion of such information in theoretical and computational constructs of face space.

  8. Classification of narcotics in solid mixtures using principal component analysis and Raman spectroscopy.

    PubMed

    Ryder, Alan G

    2002-03-01

    Eighty-five solid samples consisting of illegal narcotics diluted with several different materials were analyzed by near-infrared (785 nm excitation) Raman spectroscopy. Principal Component Analysis (PCA) was employed to classify the samples according to narcotic type. The best sample discrimination was obtained by using the first derivative of the Raman spectra. Furthermore, restricting the spectral variables for PCA to 2 or 3% of the original spectral data according to the most intense peaks in the Raman spectrum of the pure narcotic resulted in a rapid discrimination method for classifying samples according to narcotic type. This method allows for the easy discrimination between cocaine, heroin, and MDMA mixtures even when the Raman spectra are complex or very similar. This approach of restricting the spectral variables also decreases the computational time by a factor of 30 (compared to the complete spectrum), making the methodology attractive for rapid automatic classification and identification of suspect materials.

  9. An improved principal component analysis based region matching method for fringe direction estimation

    NASA Astrophysics Data System (ADS)

    He, A.; Quan, C.

    2018-04-01

    The principal component analysis (PCA) and region matching combined method is effective for fringe direction estimation. However, its mask construction algorithm for region matching fails in some circumstances, and the algorithm for conversion of orientation to direction in mask areas is computationally-heavy and non-optimized. We propose an improved PCA based region matching method for the fringe direction estimation, which includes an improved and robust mask construction scheme, and a fast and optimized orientation-direction conversion algorithm for the mask areas. Along with the estimated fringe direction map, filtered fringe pattern by automatic selective reconstruction modification and enhanced fast empirical mode decomposition (ASRm-EFEMD) is used for Hilbert spiral transform (HST) to demodulate the phase. Subsequently, windowed Fourier ridge (WFR) method is used for the refinement of the phase. The robustness and effectiveness of proposed method are demonstrated by both simulated and experimental fringe patterns.

  10. Determination of butter adulteration with margarine using Raman spectroscopy.

    PubMed

    Uysal, Reyhan Selin; Boyaci, Ismail Hakki; Genis, Hüseyin Efe; Tamer, Ugur

    2013-12-15

    In this study, adulteration of butter with margarine was analysed using Raman spectroscopy combined with chemometric methods (principal component analysis (PCA), principal component regression (PCR), partial least squares (PLS)) and artificial neural networks (ANNs). Different butter and margarine samples were mixed at various concentrations ranging from 0% to 100% w/w. PCA was applied for the classification of butters, margarines and mixtures. PCR, PLS and ANN were used for the detection of adulteration ratios of butter. Models were created using a calibration data set and developed models were evaluated using a validation data set. The coefficient of determination (R²) values between actual and predicted values obtained for PCR, PLS and ANN for the validation data set were 0.968, 0.987 and 0.978, respectively. In conclusion, a combination of Raman spectroscopy with chemometrics and ANN methods can be applied for testing butter adulteration. Copyright © 2013 Elsevier Ltd. All rights reserved.

  11. LARVAL FISH DIVERSITY IN SUISUN MARSH, CALIFORNIA: ARE INTERMEDIATE FLOWS THE BEST?

    EPA Science Inventory

    We sampled larval fish in Suisun Marsh, in the San Francisco Bay estuary from February to June 1995-1999. We used principal components analysis (PCA) and canonical correspondence analysis (CCA) on 13 taxonomic groups making up 99.7% of the catch and several environmental variable...

  12. Evaluation of Meteorite Amino Acid Analysis Data Using Multivariate Techniques

    NASA Technical Reports Server (NTRS)

    McDonald, G.; Storrie-Lombardi, M.; Nealson, K.

    1999-01-01

    The amino acid distributions in the Murchison carbonaceous chondrite, Mars meteorite ALH84001, and ice from the Allan Hills region of Antarctica are shown, using a multivariate technique known as Principal Component Analysis (PCA), to be statistically distinct from the average amino acid composition of 101 terrestrial protein superfamilies.

  13. An improved PCA method with application to boiler leak detection.

    PubMed

    Sun, Xi; Marquez, Horacio J; Chen, Tongwen; Riaz, Muhammad

    2005-07-01

    Principal component analysis (PCA) is a popular fault detection technique. It has been widely used in process industries, especially in the chemical industry. In industrial applications, achieving a sensitive system capable of detecting incipient faults while keeping the false alarm rate to a minimum is a crucial issue. Although a lot of research has been focused on these issues for PCA-based fault detection and diagnosis methods, sensitivity of the fault detection scheme versus false alarm rate continues to be an important issue. In this paper, an improved PCA method is proposed to address this problem. In this method, a new data preprocessing scheme and a new fault detection scheme designed for Hotelling's T² as well as the squared prediction error are developed. A dynamic PCA model is also developed for boiler leak detection. This new method is applied to boiler water/steam leak detection with real data from Syncrude Canada's utility plant in Fort McMurray, Canada. Our results demonstrate that the proposed method can effectively reduce false alarm rate, provide effective and correct leak alarms, and give early warning to operators.
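
    For readers unfamiliar with the two monitoring statistics named in this record, the sketch below computes the textbook Hotelling's T² and squared prediction error (SPE) from a PCA model. It is a generic illustration and not the improved scheme proposed in the paper; the function names, the autoscaling step, the choice of two retained components, and the synthetic six-variable process data are all assumptions.

```python
import numpy as np

def fit_pca_monitor(X_train, k):
    """Fit a k-component PCA monitoring model on normal-operation data."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Xs = (X_train - mean) / std                  # autoscale each variable
    _, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    P = Vt[:k].T                                 # loadings, shape (d, k)
    lam = s[:k] ** 2 / (len(Xs) - 1)             # variance of each retained score
    return mean, std, P, lam

def t2_and_spe(X, mean, std, P, lam):
    """Return Hotelling's T^2 and SPE for each row of X."""
    Xs = (X - mean) / std
    T = Xs @ P                                   # scores in the retained subspace
    t2 = np.sum(T ** 2 / lam, axis=1)            # variation inside the model
    spe = np.sum((Xs - T @ P.T) ** 2, axis=1)    # squared residual outside the model
    return t2, spe

# Synthetic "normal operation" data: six sensors driven by two latent factors.
rng = np.random.default_rng(1)
n, k = 200, 2
angles = np.arange(6) * np.pi / 6
W = np.vstack([np.cos(angles), np.sin(angles)])  # mixing matrix, shape (2, 6)
X_train = rng.normal(size=(n, 2)) @ W + 0.05 * rng.normal(size=(n, 6))

model = fit_pca_monitor(X_train, k)
t2_train, spe_train = t2_and_spe(X_train, *model)

# Hypothetical fault: one sensor reading drifts away from its usual correlation.
x_fault = X_train[:1].copy()
x_fault[0, 3] += 5.0
t2_fault, spe_fault = t2_and_spe(x_fault, *model)
```

    In this toy setting the faulty sample breaks the correlation structure, so its SPE is far larger than any SPE seen in training. In practice an alarm threshold would be set from the training statistics or from their theoretical distributions, and the trade-off between sensitivity and false alarms depends on that choice.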

  14. Weak acid extractable metals in Bramble Bay, Queensland, Australia: temporal behaviour, enrichment and source apportionment.

    PubMed

    Brady, James P; Ayoko, Godwin A; Martens, Wayde N; Goonetilleke, Ashantha

    2015-02-15

    Sediment samples were taken from six sampling sites in Bramble Bay, Queensland, Australia between February and November in 2012. They were analysed for a range of heavy metals including Al, Fe, Mn, Ti, Ce, Th, U, V, Cr, Co, Ni, Cu, Zn, As, Cd, Sb, Te, Hg, Tl and Pb. Fraction analysis, Enrichment Factors and Principal Component Analysis-Absolute Principal Component Scores (PCA-APCS) were carried out in order to assess metal pollution, potential bioavailability and source apportionment. Cr and Ni exceeded the Australian Interim Sediment Quality Guidelines at some sampling sites, while Hg was found to be the most enriched metal. Fraction analysis identified increased weak acid soluble Hg and Cd during the sampling period. Source apportionment via PCA-APCS found four sources of metals pollution, namely, marine sediments, shipping, antifouling coatings and a mixed source. These sources need to be considered in any metal pollution control measure within Bramble Bay. Copyright © 2014 Elsevier Ltd. All rights reserved.

  15. Performance analysis of a Principal Component Analysis ensemble classifier for Emotiv headset P300 spellers.

    PubMed

    Elsawy, Amr S; Eldawlatly, Seif; Taher, Mohamed; Aly, Gamal M

    2014-01-01

    The current trend to use Brain-Computer Interfaces (BCIs) with mobile devices mandates the development of efficient EEG data processing methods. In this paper, we demonstrate the performance of a Principal Component Analysis (PCA) ensemble classifier for P300-based spellers. We recorded EEG data from multiple subjects using the Emotiv neuroheadset in the context of a classical oddball P300 speller paradigm. We compare the performance of the proposed ensemble classifier to the performance of traditional feature extraction and classifier methods. Our results demonstrate the capability of the PCA ensemble classifier to classify P300 data recorded using the Emotiv neuroheadset with an average accuracy of 86.29% on cross-validation data. In addition, offline testing of the recorded data reveals an average classification accuracy of 73.3% that is significantly higher than that achieved using traditional methods. Finally, we demonstrate the effect of the parameters of the P300 speller paradigm on the performance of the method.

  16. PCA meets RG

    NASA Astrophysics Data System (ADS)

    Bradde, Serena; Bialek, William

    A system with many degrees of freedom can be characterized by a covariance matrix; principal components analysis (PCA) focuses on the eigenvalues of this matrix, hoping to find a lower dimensional description. But when the spectrum is nearly continuous, any distinction between components that we keep and those that we ignore becomes arbitrary; it then is natural to ask what happens as we vary this arbitrary cutoff. We argue that this problem is analogous to the momentum shell renormalization group (RG). Following this analogy, we can define relevant and irrelevant operators, where the role of dimensionality is played by properties of the eigenvalue density. These results also suggest an approach to the analysis of real data. As an example, we study neural activity in the vertebrate retina as it responds to naturalistic movies, and find evidence of behavior controlled by a nontrivial fixed point. Applied to financial data, our analysis separates modes dominated by sampling noise from a smaller but still macroscopic number of modes described by a non-Gaussian distribution.

  17. Tool Wear Prediction in Ti-6Al-4V Machining through Multiple Sensor Monitoring and PCA Features Pattern Recognition.

    PubMed

    Caggiano, Alessandra

    2018-03-09

    Machining of titanium alloys is characterised by extremely rapid tool wear due to the high cutting temperature and the strong adhesion at the tool-chip and tool-workpiece interface, caused by the low thermal conductivity and high chemical reactivity of Ti alloys. With the aim of monitoring the tool conditions during dry turning of Ti-6Al-4V alloy, a machine learning procedure based on the acquisition and processing of cutting force, acoustic emission and vibration sensor signals during turning is implemented. A number of sensorial features are extracted from the acquired sensor signals in order to feed machine learning paradigms based on artificial neural networks. To reduce the large dimensionality of the sensorial features, an advanced feature extraction methodology based on Principal Component Analysis (PCA) is proposed. PCA made it possible to identify a smaller number of features (k = 2 features), the principal component scores, obtained through linear projection of the original d features into a new space with reduced dimensionality k = 2, sufficient to describe the variance of the data. By feeding artificial neural networks with the PCA features, an accurate diagnosis of tool flank wear (VBmax) was achieved, with predicted values very close to the measured tool wear values.
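
    The linear projection of d original features onto k = 2 principal component scores can be sketched as follows. This is a generic illustration under stated assumptions, not the pipeline used in the study: the function name `pca_scores`, the eigendecomposition route, and the synthetic eight-feature matrix standing in for sensor features are hypothetical.

```python
import numpy as np

def pca_scores(X, k=2):
    """Project the rows of X (n samples x d features) onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                      # centre each feature
    cov = np.cov(Xc, rowvar=False)               # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]        # indices of the k largest
    W = eigvecs[:, order]                        # projection matrix, shape (d, k)
    scores = Xc @ W                              # reduced features, shape (n, k)
    explained = eigvals[order].sum() / eigvals.sum()
    return scores, explained

# Synthetic stand-in for d = 8 sensor features (not the study's data).
rng = np.random.default_rng(3)
latent = rng.normal(size=(100, 2))
X = latent @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(100, 8))

scores, explained = pca_scores(X, k=2)
```

    The two score columns are uncorrelated by construction and could then serve as inputs to a downstream model such as a neural network; `explained` reports the fraction of total variance they retain.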

  18. Tool Wear Prediction in Ti-6Al-4V Machining through Multiple Sensor Monitoring and PCA Features Pattern Recognition

    PubMed Central

    2018-01-01

    Machining of titanium alloys is characterised by extremely rapid tool wear due to the high cutting temperature and the strong adhesion at the tool-chip and tool-workpiece interface, caused by the low thermal conductivity and high chemical reactivity of Ti alloys. With the aim of monitoring the tool conditions during dry turning of Ti-6Al-4V alloy, a machine learning procedure based on the acquisition and processing of cutting force, acoustic emission and vibration sensor signals during turning is implemented. A number of sensorial features are extracted from the acquired sensor signals in order to feed machine learning paradigms based on artificial neural networks. To reduce the large dimensionality of the sensorial features, an advanced feature extraction methodology based on Principal Component Analysis (PCA) is proposed. PCA made it possible to identify a smaller number of features (k = 2 features), the principal component scores, obtained through linear projection of the original d features into a new space with reduced dimensionality k = 2, sufficient to describe the variance of the data. By feeding artificial neural networks with the PCA features, an accurate diagnosis of tool flank wear (VBmax) was achieved, with predicted values very close to the measured tool wear values. PMID:29522443

  19. [Research on fast classification based on LIBS technology and principal component analyses].

    PubMed

    Yu, Qi; Ma, Xiao-Hong; Wang, Rui; Zhao, Hua-Feng

    2014-11-01

    Laser-induced breakdown spectroscopy (LIBS) and principal component analysis (PCA) were combined to study aluminum alloy classification. Classification experiments were carried out on thirteen standard aluminum alloy samples belonging to 4 different types, and the results suggested that the LIBS-PCA method can be used for fast classification of aluminum alloys. PCA was used to analyze the spectral data from the LIBS experiments: the three principal components that contribute the most were identified, the principal component scores of the spectra were calculated, and the scores were plotted in three-dimensional coordinates. The spectral sample points clustered clearly according to the type of aluminum alloy they belong to. This result established the three principal components and a preliminary zoning of aluminum alloy types. To verify its accuracy, 20 different aluminum alloy samples were subjected to the same experiments. The spectral sample points all fell within the area corresponding to their aluminum alloy type, which confirmed the correctness of the zoning derived from the standard samples. On this basis, aluminum alloys of unknown type can be identified. All the experimental results showed that the accuracy of the principal component analysis method based on laser-induced breakdown spectroscopy is more than 97.14%, and that it can classify the different types effectively. Compared to commonly used chemical methods, laser-induced breakdown spectroscopy can analyze the sample in situ and quickly with little sample preparation; therefore, combining LIBS and PCA in areas such as quality testing and on-line industrial control can save considerable time and cost and greatly improve detection efficiency.

  20. Automated Classification and Analysis of Non-metallic Inclusion Data Sets

    NASA Astrophysics Data System (ADS)

    Abdulsalam, Mohammad; Zhang, Tongsheng; Tan, Jia; Webler, Bryan A.

    2018-05-01

    The aim of this study is to utilize principal component analysis (PCA), clustering methods, and correlation analysis to condense and examine large, multivariate data sets produced from automated analysis of non-metallic inclusions. Non-metallic inclusions play a major role in defining the properties of steel and their examination has been greatly aided by automated analysis in scanning electron microscopes equipped with energy dispersive X-ray spectroscopy. The methods were applied to analyze inclusions on two sets of samples: two laboratory-scale samples and four industrial samples from near-finished 4140 alloy steel components with varying machinability. The laboratory samples had well-defined inclusion chemistries, composed of MgO-Al2O3-CaO, spinel (MgO-Al2O3), and calcium aluminate inclusions. The industrial samples contained MnS inclusions as well as (Ca,Mn)S + calcium aluminate oxide inclusions. PCA could be used to reduce inclusion chemistry variables to a 2D plot, which revealed inclusion chemistry groupings in the samples. Clustering methods were used to automatically classify inclusion chemistry measurements into groups, i.e., no user-defined rules were required.

  1. SU-E-I-58: Objective Models of Breast Shape Undergoing Mammography and Tomosynthesis Using Principal Component Analysis.

    PubMed

    Feng, Ssj; Sechopoulos, I

    2012-06-01

    To develop an objective model of the shape of the compressed breast undergoing mammographic or tomosynthesis acquisition. Automated thresholding and edge detection were performed on 984 anonymized digital mammograms (492 craniocaudal (CC) view mammograms and 492 medial lateral oblique (MLO) view mammograms), to extract the edge of each breast. Principal Component Analysis (PCA) was performed on these edge vectors to identify a limited set of parameters and eigenvectors. These parameters and eigenvectors comprise a model that can be used to describe the breast shapes present in acquired mammograms and to generate realistic models of breasts undergoing acquisition. Sample breast shapes were then generated from this model and evaluated. The mammograms in the database were previously acquired for a separate study and authorized for use in further research. The PCA successfully identified two principal components and their corresponding eigenvectors, forming the basis for the breast shape model. The simulated breast shapes generated from the model are reasonable approximations of clinically acquired mammograms. Using PCA, we have obtained models of the compressed breast undergoing mammographic or tomosynthesis acquisition based on objective analysis of a large image database. Up to now, the breast in the CC view has been approximated as a semi-circular tube, while there has been no objectively-obtained model for the MLO view breast shape. Such models can be used for various breast imaging research applications, such as x-ray scatter estimation and correction, dosimetry estimates, and computer-aided detection and diagnosis. © 2012 American Association of Physicists in Medicine.

  2. A multifaceted independent performance analysis of facial subspace recognition algorithms.

    PubMed

    Bajwa, Usama Ijaz; Taj, Imtiaz Ahmad; Anwar, Muhammad Waqas; Wang, Xuan

    2013-01-01

    Face recognition has emerged as the fastest growing biometric technology and has expanded a lot in the last few years. Many new algorithms and commercial systems have been proposed and developed. Most of them use Principal Component Analysis (PCA) as a base for their techniques. Different and even conflicting results have been reported by researchers comparing these algorithms. The purpose of this study is to provide an independent comparative analysis, considering both performance and computational complexity, of six appearance-based face recognition algorithms, namely PCA, 2DPCA, A2DPCA, (2D)²PCA, LPP and 2DLPP, under equal working conditions. This study was motivated by the lack of an unbiased, comprehensive comparative analysis of some recent subspace methods with diverse distance metric combinations. For comparison with other studies, the FERET, ORL and YALE databases have been used with the evaluation criteria of the FERET evaluations, which closely simulate real life scenarios. A comparison of results with previous studies is performed and anomalies are reported. An important contribution of this study is that it presents the suitable performance conditions for each of the algorithms under consideration.

  3. Sparse PCA corrects for cell type heterogeneity in epigenome-wide association studies.

    PubMed

    Rahmani, Elior; Zaitlen, Noah; Baran, Yael; Eng, Celeste; Hu, Donglei; Galanter, Joshua; Oh, Sam; Burchard, Esteban G; Eskin, Eleazar; Zou, James; Halperin, Eran

    2016-05-01

    In epigenome-wide association studies (EWAS), different methylation profiles of distinct cell types may lead to false discoveries. We introduce ReFACTor, a method based on principal component analysis (PCA) and designed for the correction of cell type heterogeneity in EWAS. ReFACTor does not require knowledge of cell counts, and it provides improved estimates of cell type composition, resulting in improved power and control for false positives in EWAS. Corresponding software is available at http://www.cs.tau.ac.il/~heran/cozygene/software/refactor.html.

  4. Evaluating motion processing algorithms for use with functional near-infrared spectroscopy data from young children.

    PubMed

    Delgado Reyes, Lourdes M; Bohache, Kevin; Wijeakumar, Sobanawartiny; Spencer, John P

    2018-04-01

    Motion artifacts are often a significant component of the measured signal in functional near-infrared spectroscopy (fNIRS) experiments. A variety of methods have been proposed to address this issue, including principal components analysis (PCA), correlation-based signal improvement (CBSI), wavelet filtering, and spline interpolation. The efficacy of these techniques has been compared using simulated data; however, our understanding of how these techniques fare when dealing with task-based cognitive data is limited. Brigadoi et al. compared motion correction techniques in a sample of adult data measured during a simple cognitive task. Wavelet filtering showed the most promise as an optimal technique for motion correction. Given that fNIRS is often used with infants and young children, it is critical to evaluate the effectiveness of motion correction techniques directly with data from these age groups. This study addresses that problem by evaluating motion correction algorithms implemented in HomER2. The efficacy of each technique was compared quantitatively using objective metrics related to the physiological properties of the hemodynamic response. Results showed that targeted PCA (tPCA), spline, and CBSI retained a higher number of trials. These techniques also performed well in direct head-to-head comparisons with the other approaches using quantitative metrics. The CBSI method corrected many of the artifacts present in our data; however, this approach sometimes produced unstable HRFs. The targeted PCA and spline methods proved to be the most robust, performing well across all comparison metrics. When compared head to head, tPCA consistently outperformed spline. We conclude, therefore, that tPCA is an effective technique for correcting motion artifacts in fNIRS data from young children.

  5. Principal component analysis of socioeconomic factors and their association with malaria and arbovirus risk in Tanzania: a sensitivity analysis.

    PubMed

    Homenauth, Esha; Kajeguka, Debora; Kulkarni, Manisha A

    2017-11-01

    Principal component analysis (PCA) is frequently adopted for creating socioeconomic proxies in order to investigate the independent effects of wealth on disease status. The guidelines and methods for the creation of these proxies are well described and validated. The Demographic and Health Survey, World Health Survey and the Living Standards Measurement Survey are examples of large data sets that use PCA to create wealth indices particularly in low and middle-income countries (LMIC), where quantifying wealth-disease associations is problematic due to the unavailability of reliable income and expenditure data. However, the application of this method to smaller survey data sets, especially in rural LMIC settings, is less rigorously studied. In this paper, we aimed to highlight some of these issues by investigating the association of derived wealth indices using PCA on risk of vector-borne disease infection in Tanzania focusing on malaria and key arboviruses (ie, dengue and chikungunya). We demonstrated that indices consisting of subsets of socioeconomic indicators provided the least methodologically flawed representations of household wealth compared with an index that combined all socioeconomic variables. These results suggest that the choice of the socioeconomic indicators included in a wealth proxy can influence the relative position of households in the overall wealth hierarchy, and subsequently the strength of disease associations. This can, therefore, influence future resource planning activities and should be considered among investigators who use a PCA-derived wealth index based on community-level survey data to influence programme or policy decisions in rural LMIC settings. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
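
    A PCA-derived wealth index of the kind discussed in this record is commonly built by taking the first principal component of standardized household indicators and ranking households into quintiles. The sketch below illustrates that general idea only; the function names, the six binary asset indicators, and the simulated households are hypothetical and are not the study's variables or data.

```python
import numpy as np

def wealth_index(assets):
    """First-principal-component score of standardized household indicators."""
    A = np.asarray(assets, dtype=float)
    Z = (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)   # standardize each indicator
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    index = Z @ Vt[0]                                  # projection on the first component
    # The sign of a principal component is arbitrary; orient the index so that
    # owning more assets corresponds to a higher score.
    if np.corrcoef(index, A.sum(axis=1))[0, 1] < 0:
        index = -index
    return index

def quintiles(index):
    """Assign each household to a wealth quintile from 1 (poorest) to 5 (richest)."""
    cuts = np.quantile(index, [0.2, 0.4, 0.6, 0.8])
    return np.searchsorted(cuts, index, side="right") + 1

# Simulated survey: 300 households, 6 yes/no asset indicators (illustrative only).
rng = np.random.default_rng(2)
latent = rng.normal(size=300)
offsets = np.linspace(-1.0, 1.0, 6)
prob = 1.0 / (1.0 + np.exp(-(latent[:, None] + offsets)))
assets = (rng.random((300, 6)) < prob).astype(int)

index = wealth_index(assets)
q = quintiles(index)
```

    Re-running `wealth_index` on different subsets of the indicator columns and comparing the resulting quintile assignments is one simple way to see how the choice of indicators can shift a household's relative position.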

  6. Psychometric properties of the children's sleep habits questionnaire in children with autism spectrum disorder.

    PubMed

    Johnson, Cynthia R; DeMand, Alexandra; Lecavalier, Luc; Smith, Tristram; Aman, Michael; Foldes, Emily; Scahill, Lawrence

    2016-04-01

    Sleep disturbances in autism spectrum disorder (ASD) are very common. Psychometrically sound instruments are essential to assess these disturbances. The Children's Sleep Habits Questionnaire (CSHQ) is a widely used measure in ASD. The purpose of this study was to explore the psychometric properties of the CSHQ in a sample of children with ASD. Parents/caregivers of 310 children (mean age: 4.7) with ASD completed the CSHQ at study enrollment. Correlations between intelligence quotient (IQ) scores and the original CSHQ scales were calculated. Item endorsement frequencies and percentages were also calculated. A principal component analysis (PCA) was performed, and internal consistency was assessed for the newly extracted components. Correlations between IQ scores and CSHQ subscales and total scores ranged from .015 to .001 suggesting a weak, if any, association. Item endorsement frequencies were high for bedtime resistance items, but lower for parasomnia and sleep-disordered breathing items. A PCA suggested that a five-component solution best fits the data. Internal consistency of the newly extracted five components ranged from α = .50 to .87. Item endorsement frequencies were highest for bedtime resistance items. A PCA suggested a five-component solution. Three of the five components (Sleep Routine Problems, Insufficient Sleep, and Sleep-onset Association Problems) were types of sleep disturbances commonly reported in ASD, but the other two components (Parasomnia/Sleep-disordered Breathing and Sleep Anxiety) were less clear. Internal consistencies ranged from mediocre to good. Further development of this measure for use in children with ASD is encouraged. Copyright © 2016 Elsevier B.V. All rights reserved.

  7. Integrative sparse principal component analysis of gene expression data.

    PubMed

    Liu, Mengque; Fan, Xinyan; Fang, Kuangnan; Zhang, Qingzhao; Ma, Shuangge

    2017-12-01

    In the analysis of gene expression data, dimension reduction techniques have been extensively adopted. The most popular one is perhaps the PCA (principal component analysis). To generate more reliable and more interpretable results, the SPCA (sparse PCA) technique has been developed. With the "small sample size, high dimensionality" characteristic of gene expression data, the analysis results generated from a single dataset are often unsatisfactory. Under contexts other than dimension reduction, integrative analysis techniques, which jointly analyze the raw data of multiple independent datasets, have been developed and shown to outperform "classic" meta-analysis and other multidatasets techniques and single-dataset analysis. In this study, we conduct integrative analysis by developing the iSPCA (integrative SPCA) method. iSPCA achieves the selection and estimation of sparse loadings using a group penalty. To take advantage of the similarity across datasets and generate more accurate results, we further impose contrasted penalties. Different penalties are proposed to accommodate different data conditions. Extensive simulations show that iSPCA outperforms the alternatives under a wide spectrum of settings. The analysis of breast cancer and pancreatic cancer data further shows iSPCA's satisfactory performance. © 2017 WILEY PERIODICALS, INC.

  8. [Analysis on component difference in Citrus reticulata before and after being processed with salt by UPLC-Q-TOF/MS].

    PubMed

    Zeng, Rui; Fu, Juan; Wu, La-Bin; Huang, Lin-Fang

    2013-07-01

    To analyze the components of Citrus reticulata and salt-processed C. reticulata by ultra-performance liquid chromatography coupled with quadrupole-time-of-flight mass spectrometry (UPLC-Q-TOF/MS), and to compare the changes in components before and after processing with salt. Principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) were adopted to analyze the difference in fingerprint between crude and processed C. reticulata, showing increased content of eriocitrin, limonin, nomilin and obacunone in salt-processed C. reticulata. Potential chemical markers were identified as limonin, obacunone and nomilin, which could be used as index components for distinguishing crude and processed C. reticulata.

  9. Comparison of receptor models for source apportionment of the PM10 in Zaragoza (Spain).

    PubMed

    Callén, M S; de la Cruz, M T; López, J M; Navarro, M V; Mastral, A M

    2009-08-01

    Receptor models are useful to understand the chemical and physical characteristics of air pollutants by identifying their sources and by estimating contributions of each source to receptor concentrations. In this work, three receptor models based on principal component analysis with absolute principal component scores (PCA-APCS), Unmix and positive matrix factorization (PMF) were applied to study for the first time the apportionment of airborne particulate matter less than or equal to 10 µm (PM10) in Zaragoza, Spain, during a 1-year sampling campaign (2003-2004). The PM10 samples were characterized regarding their concentrations of inorganic components (trace elements and ions) and organic components (polycyclic aromatic hydrocarbons, PAH), not only in the solid phase but also in the gas phase. A comparison of the three receptor models was carried out in order to obtain a more robust characterization of the PM10. The three models predicted that the major sources of PM10 in Zaragoza were related to natural sources (60%, 75% and 47%, respectively, for PCA-APCS, Unmix and PMF) although anthropogenic sources also contributed to PM10 (28%, 25% and 39%). With regard to the anthropogenic sources, while PCA and PMF allowed high discrimination in the identification of sources associated with different combustion sources such as traffic and industry, fossil fuel, biomass and fuel-oil combustion, heavy traffic and evaporative emissions, the Unmix model only allowed the identification of industry and traffic emissions, evaporative emissions and heavy-duty vehicles. The three models provided good correlations between the experimental and modelled PM10 concentrations with major precision and the closest agreement between the PMF and PCA models.
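
The PCA-APCS idea named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the data are synthetic, the names `profiles`, `contrib`, `conc`, `pm10` and `apcs` are hypothetical, and no factor rotation is applied (published applications commonly rotate the components first). Scores are made "absolute" by subtracting the score of an artificial sample with zero concentrations, and the measured mass is then regressed on them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): two sources, five chemical species, 200 samples.
profiles = np.array([[0.5, 0.3, 0.1, 0.1, 0.0],    # composition of source A
                     [0.0, 0.1, 0.2, 0.3, 0.4]])   # composition of source B
contrib = rng.gamma(shape=2.0, scale=10.0, size=(200, 2))   # source strengths
conc = contrib @ profiles + rng.normal(0.0, 0.05, size=(200, 5))
pm10 = contrib.sum(axis=1)   # total mass (each profile row sums to 1)


def apcs(conc, n_components):
    """Absolute principal component scores (unrotated sketch)."""
    mean, std = conc.mean(axis=0), conc.std(axis=0)
    z = (conc - mean) / std
    # PCA of the standardized data via the correlation matrix.
    eigval, eigvec = np.linalg.eigh(np.corrcoef(z, rowvar=False))
    order = np.argsort(eigval)[::-1][:n_components]
    loadings = eigvec[:, order]
    scores = z @ loadings
    # Score of an artificial sample in which every concentration is zero.
    score0 = ((0.0 - mean) / std) @ loadings
    return scores - score0


absolute_scores = apcs(conc, 2)

# Regress the measured mass on the absolute scores (with an intercept).
design = np.column_stack([np.ones(len(pm10)), absolute_scores])
coef, *_ = np.linalg.lstsq(design, pm10, rcond=None)
fitted = design @ coef
r2 = 1.0 - ((pm10 - fitted) ** 2).sum() / ((pm10 - pm10.mean()) ** 2).sum()
```

In this sketch the product `coef[k + 1] * absolute_scores[:, k]` would be read as the mass attributed to component `k`; without rotation the components need not correspond to physical sources.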

  10. Chemometrics-based Approach in Analysis of Arnicae flos

    PubMed Central

    Zheleva-Dimitrova, Dimitrina Zh.; Balabanova, Vessela; Gevrenova, Reneta; Doichinova, Irini; Vitkova, Antonina

    2015-01-01

    Introduction: Arnica montana flowers have a long history as herbal medicines for external use on injuries and rheumatic complaints. Objective: To investigate Arnicae flos of cultivated accessions from Bulgaria, Poland, Germany, Finland, and Pharmacy store for phenolic derivatives and sesquiterpene lactones (STLs). Materials and Methods: Samples of Arnica from nine origins were prepared by ultrasound-assisted extraction with 80% methanol for phenolic compounds analysis. Subsequent reverse-phase high-performance liquid chromatography (HPLC) separation of the analytes was performed using gradient elution and ultraviolet detection at 280 and 310 nm (phenolic acids), and 360 nm (flavonoids). Total STLs were determined in chloroform extracts by solid-phase extraction-HPLC at 225 nm. The HPLC generated chromatographic data were analyzed using principal component analysis (PCA) and hierarchical clustering (HC). Results: The highest total amount of phenolic acids was found in the sample from Botanical Garden at Joensuu University, Finland (2.36 mg/g dw). Astragalin, isoquercitrin, and isorhamnetin 3-glucoside were the main flavonol glycosides being present up to 3.37 mg/g (astragalin). Three well-defined clusters were distinguished by PCA and HC. Cluster C1 comprised of the German and Finnish accessions characterized by the highest content of flavonols. Cluster C2 included the Bulgarian and Polish samples presenting a low content of flavonoids. Cluster C3 consisted only of one sample from a pharmacy store. Conclusion: A validated HPLC method for simultaneous determination of phenolic acids, flavonoid glycosides, and aglycones in A. montana flowers was developed. The PCA loading plot showed that quercetin, kaempferol, and isorhamnetin can be used to distinguish different Arnica accessions. 
    SUMMARY: A principal component analysis (PCA) on 13 phenolic compounds and total amount of sesquiterpene lactones in Arnicae flos collection tended to cluster the studied 9 accessions into three main groups. The profiles obtained demonstrated that the samples from Germany and Finland are characterized by greater amounts of phenolic derivatives than the Bulgarian and Polish ones. The PCA loading plot showed that quercetin, kaempferol and isorhamnetin can be used to distinguish different arnica accessions. PMID:27013791

  11. Coarse-to-fine markerless gait analysis based on PCA and Gauss-Laguerre decomposition

    NASA Astrophysics Data System (ADS)

    Goffredo, Michela; Schmid, Maurizio; Conforto, Silvia; Carli, Marco; Neri, Alessandro; D'Alessio, Tommaso

    2005-04-01

    Human movement analysis is generally performed through the utilization of marker-based systems, which allow reconstructing, with high levels of accuracy, the trajectories of markers allocated on specific points of the human body. Marker based systems, however, show some drawbacks that can be overcome by the use of video systems applying markerless techniques. In this paper, a specifically designed computer vision technique for the detection and tracking of relevant body points is presented. It is based on the Gauss-Laguerre Decomposition, and a Principal Component Analysis Technique (PCA) is used to circumscribe the region of interest. Results obtained on both synthetic and experimental tests provide significant reduction of the computational costs, with no significant reduction of the tracking accuracy.

  12. Real-time myoelectric control of a multi-fingered hand prosthesis using principal components analysis.

    PubMed

    Matrone, Giulia C; Cipriani, Christian; Carrozza, Maria Chiara; Magenes, Giovanni

    2012-06-15

    In spite of the advances made in the design of dexterous anthropomorphic hand prostheses, these sophisticated devices still lack adequate control interfaces which could allow amputees to operate them in an intuitive and close-to-natural way. In this study, an anthropomorphic five-fingered robotic hand, actuated by six motors, was used as a prosthetic hand emulator to assess the feasibility of a control approach based on Principal Components Analysis (PCA), specifically conceived to address this problem. Since it was demonstrated elsewhere that the first two principal components (PCs) can describe the whole hand configuration space sufficiently well, the controller here employed reverted the PCA algorithm and allowed to drive a multi-DoF hand by combining a two-differential channels EMG input with these two PCs. Hence, the novelty of this approach stood in the PCA application for solving the challenging problem of best mapping the EMG inputs into the degrees of freedom (DoFs) of the prosthesis. A clinically viable two DoFs myoelectric controller, exploiting two differential channels, was developed and twelve able-bodied participants, divided in two groups, volunteered to control the hand in simple grasp trials, using forearm myoelectric signals. Task completion rates and times were measured. The first objective (assessed through one group of subjects) was to understand the effectiveness of the approach; i.e., whether it is possible to drive the hand in real-time, with reasonable performance, in different grasps, also taking advantage of the direct visual feedback of the moving hand. The second objective (assessed through a different group) was to investigate the intuitiveness, and therefore to assess statistical differences in the performance throughout three consecutive days. Subjects performed several grasp, transport and release trials with differently shaped objects, by operating the hand with the myoelectric PCA-based controller. 
    Experimental trials showed that the simultaneous use of the two differential channels paradigm was successful. This work demonstrates that the proposed two-DoF myoelectric controller based on PCA allows a prosthetic hand emulator to be driven in real time into different prehensile patterns with excellent performance. These results open up promising possibilities for the development of intuitive, effective myoelectric hand controllers.

  13. Epileptic seizure detection in EEG signal with GModPCA and support vector machine.

    PubMed

    Jaiswal, Abeg Kumar; Banka, Haider

    2017-01-01

    Epilepsy is one of the most common neurological disorders caused by recurrent seizures. Electroencephalograms (EEGs) record neural activity and can detect epilepsy. Visual inspection of an EEG signal for epileptic seizure detection is a time-consuming process and may lead to human error; therefore, recently, a number of automated seizure detection frameworks were proposed to replace these traditional methods. Feature extraction and classification are two important steps in these procedures. Feature extraction focuses on finding the informative features that could be used for classification and correct decision-making. Therefore, proposing effective feature extraction techniques for seizure detection is of great significance. Principal Component Analysis (PCA) is a dimensionality reduction technique used in different fields of pattern recognition including EEG signal classification. Global modular PCA (GModPCA) is a variation of PCA. In this paper, an effective framework with GModPCA and Support Vector Machine (SVM) is presented for epileptic seizure detection in EEG signals. The feature extraction is performed with GModPCA, whereas an SVM trained with a radial basis function kernel performed the classification between seizure and nonseizure EEG signals. Seven different experimental cases were conducted on the benchmark epilepsy EEG dataset. The system performance was evaluated using 10-fold cross-validation. In addition, we prove analytically that GModPCA has lower time and space complexities than PCA. The experimental results show that EEG signals have strong inter-sub-pattern correlations. GModPCA and SVM have been able to achieve 100% accuracy for the classification between normal and epileptic signals. Along with this, seven different experimental cases were tested. The classification results of the proposed approach were better than those of some of the existing methods proposed in the literature. It is also found that the time and space complexities of GModPCA are lower than those of PCA. This study suggests that GModPCA and SVM could be used for automated epileptic seizure detection in EEG signals.

  14. Gabor-based kernel PCA with fractional power polynomial models for face recognition.

    PubMed

    Liu, Chengjun

    2004-05-01

    This paper presents a novel Gabor-based kernel Principal Component Analysis (PCA) method by integrating the Gabor wavelet representation of face images and the kernel PCA method for face recognition. Gabor wavelets first derive desirable facial features characterized by spatial frequency, spatial locality, and orientation selectivity to cope with the variations due to illumination and facial expression changes. The kernel PCA method is then extended to include fractional power polynomial models for enhanced face recognition performance. A fractional power polynomial, however, does not necessarily define a kernel function, as it might not define a positive semidefinite Gram matrix. Note that the sigmoid kernels, one of the three classes of widely used kernel functions (polynomial kernels, Gaussian kernels, and sigmoid kernels), do not actually define a positive semidefinite Gram matrix either. Nevertheless, the sigmoid kernels have been successfully used in practice, such as in building support vector machines. In order to derive real kernel PCA features, we apply only those kernel PCA eigenvectors that are associated with positive eigenvalues. The feasibility of the Gabor-based kernel PCA method with fractional power polynomial models has been successfully tested on both frontal and pose-angled face recognition, using two data sets from the FERET database and the CMU PIE database, respectively. The FERET data set contains 600 frontal face images of 200 subjects, while the PIE data set consists of 680 images across five poses (left and right profiles, left and right half profiles, and frontal view) with two different facial expressions (neutral and smiling) of 68 subjects. 
    The effectiveness of the Gabor-based kernel PCA method with fractional power polynomial models is shown in terms of both absolute performance indices and comparative performance against the PCA method, the kernel PCA method with polynomial kernels, the kernel PCA method with fractional power polynomial models, the Gabor wavelet-based PCA method, and the Gabor wavelet-based kernel PCA method with polynomial kernels.
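
The step of keeping only kernel PCA eigenvectors with positive eigenvalues can be sketched in NumPy. This is an illustration under assumptions rather than the paper's implementation: the fractional power form sign(x·y)|x·y|^d, the exponent `d=0.8`, the relative tolerance, and the random input standing in for Gabor features are all hypothetical choices made for the example.

```python
import numpy as np


def fractional_power_gram(x, d=0.8):
    """Gram matrix of sign(x.y) * |x.y|**d (assumed form); for 0 < d < 1
    this matrix is not guaranteed to be positive semidefinite."""
    dots = x @ x.T
    return np.sign(dots) * np.abs(dots) ** d


def kernel_pca_positive(gram, tol=1e-10):
    """Kernel PCA features using only eigenvectors with positive eigenvalues."""
    n = gram.shape[0]
    ones = np.full((n, n), 1.0 / n)
    # Center the Gram matrix in feature space.
    centered = gram - ones @ gram - gram @ ones + ones @ gram @ ones
    eigval, eigvec = np.linalg.eigh(centered)          # ascending order
    keep = eigval > tol * np.abs(eigval).max()         # drop zero/negative ones
    eigval = eigval[keep][::-1]                        # descending order
    eigvec = eigvec[:, keep][:, ::-1]
    alphas = eigvec / np.sqrt(eigval)                  # unit-norm feature-space axes
    features = centered @ alphas                       # projections of training data
    return features, eigval


rng = np.random.default_rng(1)
x = rng.normal(size=(30, 5))        # stand-in for Gabor feature vectors
features, eigval = kernel_pca_positive(fractional_power_gram(x))
```

Discarding non-positive eigenvalues keeps the square root real, so the resulting features are real-valued even when the Gram matrix is indefinite.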

  15. New approach for rapid assessment of trophic status of Yellow Sea and East China Sea using easy-to-measure parameters

    NASA Astrophysics Data System (ADS)

    Kong, Xianyu; Liu, Yanfang; Jian, Huimin; Su, Rongguo; Yao, Qingzhen; Shi, Xiaoyong

    2017-10-01

    To realize potential cost savings in coastal monitoring programs and provide timely advice for marine management, there is an urgent need for efficient evaluation tools based on easily measured variables for the rapid and timely assessment of estuarine and offshore eutrophication. In this study, using parallel factor analysis (PARAFAC), principal component analysis (PCA), and discriminant function analysis (DFA) with the trophic index (TRIX) for reference, we developed an approach for rapidly assessing the eutrophication status of coastal waters using easy-to-measure parameters, including chromophoric dissolved organic matter (CDOM), fluorescence excitation-emission matrices, CDOM UV-Vis absorbance, and other water-quality parameters (turbidity, chlorophyll a, and dissolved oxygen). First, we decomposed CDOM excitation-emission matrices (EEMs) by PARAFAC to identify three components. Then, we applied PCA to simplify the complexity of the relationships between the water-quality parameters. Finally, we used the PCA score values as independent variables in DFA to develop a eutrophication assessment model. The developed model yielded classification accuracy rates of 97.1%, 80.5%, 90.3%, and 89.1% for good, moderate, and poor water qualities, and for the overall data sets, respectively. Our results suggest that these easy-to-measure parameters could be used to develop a simple approach for rapid in-situ assessment and monitoring of the eutrophication of estuarine and offshore areas.

  16. Examination of a Social-Networking Site Activities Scale (SNSAS) Using Rasch Analysis

    ERIC Educational Resources Information Center

    Alhaythami, Hassan; Karpinski, Aryn; Kirschner, Paul; Bolden, Edward

    2017-01-01

    This study examined the psychometric properties of a social-networking site (SNS) activities scale (SNSAS) using Rasch Analysis. Items were also examined with Rasch Principal Components Analysis (PCA) and Differential Item Functioning (DIF) across groups of university students (i.e., males and females from the United States [US] and Europe; N =…

  17. Rapid differentiation of Chinese hop varieties (Humulus lupulus) using volatile fingerprinting by HS-SPME-GC-MS combined with multivariate statistical analysis.

    PubMed

    Liu, Zechang; Wang, Liping; Liu, Yumei

    2018-01-18

    Hops impart flavor to beer, with the volatile components characterizing the various hop varieties and qualities. Fingerprinting, especially flavor fingerprinting, is often used to identify 'flavor products' because inconsistencies in the description of flavor may lead to an incorrect definition of beer quality. Compared to flavor fingerprinting, volatile fingerprinting is simpler and easier. We performed volatile fingerprinting using head space-solid phase micro-extraction gas chromatography-mass spectrometry combined with similarity analysis and principal component analysis (PCA) for evaluating and distinguishing between three major Chinese hops. Eighty-four volatiles were identified, which were classified into seven categories. Volatile fingerprinting based on similarity analysis did not yield any obvious result. By contrast, hop varieties and qualities were identified using volatile fingerprinting based on PCA. The potential variables explained the variance in the three hop varieties. In addition, the dendrogram and principal component score plot described the differences and classifications of hops. Volatile fingerprinting plus multivariate statistical analysis can rapidly differentiate between the different varieties and qualities of the three major Chinese hops. Furthermore, this method can be used as a reference in other fields. © 2018 Society of Chemical Industry.

  18. Accurate Structural Correlations from Maximum Likelihood Superpositions

    PubMed Central

    Theobald, Douglas L; Wuttke, Deborah S

    2008-01-01

    The cores of globular proteins are densely packed, resulting in complicated networks of structural interactions. These interactions in turn give rise to dynamic structural correlations over a wide range of time scales. Accurate analysis of these complex correlations is crucial for understanding biomolecular mechanisms and for relating structure to function. Here we report a highly accurate technique for inferring the major modes of structural correlation in macromolecules using likelihood-based statistical analysis of sets of structures. This method is generally applicable to any ensemble of related molecules, including families of nuclear magnetic resonance (NMR) models, different crystal forms of a protein, and structural alignments of homologous proteins, as well as molecular dynamics trajectories. Dominant modes of structural correlation are determined using principal components analysis (PCA) of the maximum likelihood estimate of the correlation matrix. The correlations we identify are inherently independent of the statistical uncertainty and dynamic heterogeneity associated with the structural coordinates. We additionally present an easily interpretable method (“PCA plots”) for displaying these positional correlations by color-coding them onto a macromolecular structure. Maximum likelihood PCA of structural superpositions, and the structural PCA plots that illustrate the results, will facilitate the accurate determination of dynamic structural correlations analyzed in diverse fields of structural biology. PMID:18282091

  19. Forensic analysis of Salvia divinorum using multivariate statistical procedures. Part I: discrimination from related Salvia species.

    PubMed

    Willard, Melissa A Bodnar; McGuffin, Victoria L; Smith, Ruth Waddell

    2012-01-01

    Salvia divinorum is a hallucinogenic herb that is internationally regulated. In this study, salvinorin A, the active compound in S. divinorum, was extracted from S. divinorum plant leaves using a 5-min extraction with dichloromethane. Four additional Salvia species (Salvia officinalis, Salvia guaranitica, Salvia splendens, and Salvia nemorosa) were extracted using this procedure, and all extracts were analyzed by gas chromatography-mass spectrometry. Differentiation of S. divinorum from other Salvia species was successful based on visual assessment of the resulting chromatograms. To provide a more objective comparison, the total ion chromatograms (TICs) were subjected to principal components analysis (PCA). Prior to PCA, the TICs were subjected to a series of data pretreatment procedures to minimize non-chemical sources of variance in the data set. Successful discrimination of S. divinorum from the other four Salvia species was possible based on visual assessment of the PCA scores plot. To provide a numerical assessment of the discrimination, a series of statistical procedures such as Euclidean distance measurement, hierarchical cluster analysis, Student's t tests, Wilcoxon rank-sum tests, and Pearson product moment correlation were also applied to the PCA scores. The statistical procedures were then compared to determine the advantages and disadvantages for forensic applications.

  20. Progress Towards Improved Analysis of TES X-ray Data Using Principal Component Analysis

    NASA Technical Reports Server (NTRS)

    Busch, S. E.; Adams, J. S.; Bandler, S. R.; Chervenak, J. A.; Eckart, M. E.; Finkbeiner, F. M.; Fixsen, D. J.; Kelley, R. L.; Kilbourne, C. A.; Lee, S.-J.

    2015-01-01

    The traditional method of applying a digital optimal filter to measure X-ray pulses from transition-edge sensor (TES) devices does not achieve the best energy resolution when the signals have a highly non-linear response to energy, or the noise is non-stationary during the pulse. We present an implementation of a method to analyze X-ray data from TESs, which is based upon principal component analysis (PCA). Our method separates the X-ray signal pulse into orthogonal components that have the largest variance. We typically recover pulse height, arrival time, differences in pulse shape, and the variation of pulse height with detector temperature. These components can then be combined to form a representation of pulse energy. An added value of this method is that by reporting information on more descriptive parameters (as opposed to a single number representing energy), we generate a much more complete picture of the pulse received. Here we report on progress in developing this technique for future implementation on X-ray telescopes. We used an 55Fe source to characterize Mo/Au TESs. On the same dataset, the PCA method recovers a spectral resolution that is better by a factor of two than achievable with digital optimal filters.
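
The decomposition of pulse records into orthogonal components of largest variance can be illustrated with a singular value decomposition. The sketch below uses a made-up two-exponential pulse model and hypothetical noise levels; it is not the authors' pipeline, only a picture of the linear algebra involved.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 256)


def pulse(height, fall):
    # Hypothetical pulse shape: fast rise followed by an exponential decay.
    return height * (np.exp(-t / fall) - np.exp(-t / 0.01))


# 500 synthetic records with small variations in height and decay time.
heights = rng.normal(1.0, 0.05, size=500)
falls = rng.normal(0.10, 0.005, size=500)
records = np.array([pulse(h, f) for h, f in zip(heights, falls)])
records += rng.normal(0.0, 0.002, size=records.shape)

# PCA via SVD of the mean-subtracted records: rows of vt are the orthonormal
# components, ordered by the variance they capture.
centered = records - records.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / np.sum(s**2)

# Amplitudes of the two leading components for every pulse.
scores = centered @ vt[:2].T
```

The per-pulse `scores` are the kind of descriptive parameters that could then be combined into an energy estimate; how that combination is done is not specified here.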

  1. Combination of multivariate curve resolution and multivariate classification techniques for comprehensive high-performance liquid chromatography-diode array absorbance detection fingerprints analysis of Salvia reuterana extracts.

    PubMed

    Hakimzadeh, Neda; Parastar, Hadi; Fattahi, Mohammad

    2014-01-24

    In this study, multivariate curve resolution (MCR) and multivariate classification methods are proposed to develop a new chemometric strategy for comprehensive analysis of high-performance liquid chromatography-diode array absorbance detection (HPLC-DAD) fingerprints of sixty Salvia reuterana samples from five different geographical regions. Different chromatographic problems occurred during HPLC-DAD analysis of S. reuterana samples, such as baseline/background contribution and noise, low signal-to-noise ratio (S/N), asymmetric peaks, elution time shifts, and peak overlap are handled using the proposed strategy. In this way, chromatographic fingerprints of sixty samples are properly segmented to ten common chromatographic regions using local rank analysis and then, the corresponding segments are column-wise augmented for subsequent MCR analysis. Extended multivariate curve resolution-alternating least squares (MCR-ALS) is used to obtain pure component profiles in each segment. In general, thirty-one chemical components were resolved using MCR-ALS in sixty S. reuterana samples and the lack of fit (LOF) values of MCR-ALS models were below 10.0% in all cases. Pure spectral profiles are considered for identification of chemical components by comparing their resolved spectra with the standard ones and twenty-four components out of thirty-one components were identified. Additionally, pure elution profiles are used to obtain relative concentrations of chemical components in different samples for multivariate classification analysis by principal component analysis (PCA) and k-nearest neighbors (kNN). Inspection of the PCA score plot (explaining 76.1% of variance accounted for three PCs) showed that S. reuterana samples belong to four clusters. The degree of class separation (DCS) which quantifies the distance separating clusters in relation to the scatter within each cluster is calculated for four clusters and it was in the range of 1.6-5.8. 
    These results are then confirmed by kNN. In addition, according to the PCA loading plot and kNN dendrogram of thirty-one variables, five chemical constituents of luteolin-7-o-glucoside, salvianolic acid D, rosmarinic acid, lithospermic acid and trijuganone A are identified as the most important variables (i.e., chemical markers) for clusters discrimination. Finally, the effect of different chemical markers on samples differentiation is investigated using counter-propagation artificial neural network (CP-ANN) method. It is concluded that the proposed strategy can be successfully applied for comprehensive analysis of chromatographic fingerprints of complex natural samples. Copyright © 2013 Elsevier B.V. All rights reserved.

  2. Retest of a Principal Components Analysis of Two Household Environmental Risk Instruments.

    PubMed

    Oneal, Gail A; Postma, Julie; Odom-Maryon, Tamara; Butterfield, Patricia

    2016-08-01

    Household Risk Perception (HRP) and Self-Efficacy in Environmental Risk Reduction (SEERR) instruments were developed for a public health nurse-delivered intervention designed to reduce home-based, environmental health risks among rural, low-income families. The purpose of this study was to test both instruments in a second low-income population that differed geographically and economically from the original sample. Participants (N = 199) were recruited from the Women, Infants, and Children (WIC) program. Paper and pencil surveys were collected at WIC sites by research-trained student nurses. Exploratory principal components analysis (PCA) was conducted, and comparisons were made to the original PCA for the purpose of data reduction. Instruments showed satisfactory Cronbach alpha values for all components. HRP components were reduced from five to four, which explained 70% of variance. The components were labeled sensed risks, unseen risks, severity of risks, and knowledge. In contrast to the original testing, environmental tobacco smoke (ETS) items were not a separate component of the HRP. The SEERR analysis demonstrated four components explaining 71% of variance, with similar patterns of items as in the first study, including a component on ETS, but some differences in item location. Although low-income populations constituted both samples, differences in demographics and risk exposures may have played a role in component and item locations. Findings provided justification for changing or reducing items, and for tailoring the instruments to population-level risks and behaviors. Although analytic refinement will continue, both instruments advance the measurement of environmental health risk perception and self-efficacy. © 2016 Wiley Periodicals, Inc.

  3. Multivariate analysis of mixed contaminants (PAHs and heavy metals) at manufactured gas plant site soils.

    PubMed

    Thavamani, Palanisami; Megharaj, Mallavarapu; Naidu, Ravi

    2012-06-01

    Principal component analysis (PCA) was used to provide an overview of the distribution pattern of polycyclic aromatic hydrocarbons (PAHs) and heavy metals in former manufactured gas plant (MGP) site soils. PCA is a powerful multivariate method for identifying patterns in data and expressing their similarities and differences. Ten PAHs (naphthalene, acenaphthylene, acenaphthene, fluorene, phenanthrene, anthracene, fluoranthene, pyrene, chrysene, benzo[a]pyrene) and four toxic heavy metals - lead (Pb), cadmium (Cd), chromium (Cr) and zinc (Zn) - were detected in the site soils. PAH contamination was contributed equally by both low and high molecular weight PAHs. PCA was performed using the varimax rotation method in SPSS 17.0. Two principal components accounting for 91.7% of the total variance were retained using the scree test. Principal component 1 (PC1) substantially explained the dominance of PAH contamination in the MGP site soils. All PAHs, except anthracene, were positively correlated in PC1. There was a common thread in high molecular weight PAH loadings, where the loadings were inversely proportional to the hydrophobicity and molecular weight of individual PAHs. Anthracene, which was less correlated with other individual PAHs, deviated well from the origin, which can be ascribed to its lower toxicity and different origin than its isomer phenanthrene. Among the four major heavy metals studied in MGP sites, Pb, Cd and Cr were negatively correlated in PC1 but showed strong positive correlation in principal component 2 (PC2). Although metals may not have originated directly from gaswork processes, the correlation between PAHs and metals suggests that the materials used in these sites may have contributed to high concentrations of Pb, Cd, Cr and Zn. Thus, multivariate analysis helped to identify the sources of PAHs, heavy metals and their association in the MGP site, and thereby better characterise the site risk, which would not be possible if one used chemical analysis alone.

  4. Characterization of Ground Displacement Sources from Variational Bayesian Independent Component Analysis of Space Geodetic Time Series

    NASA Astrophysics Data System (ADS)

    Gualandi, Adriano; Serpelloni, Enrico; Elina Belardinelli, Maria; Bonafede, Maurizio; Pezzo, Giuseppe; Tolomei, Cristiano

    2015-04-01

    A critical point in the analysis of ground displacement time series, such as those measured by modern space geodetic techniques (primarily continuous GPS/GNSS and InSAR), is the development of data-driven methods that make it possible to discern and characterize the different sources that generate the observed displacements. A widely used multivariate statistical technique is Principal Component Analysis (PCA), which reduces the dimensionality of the data space while retaining most of the variance of the dataset. It reproduces the original data using a limited number of Principal Components, but it also shows some deficiencies, since PCA does not perform well in finding the solution to the so-called Blind Source Separation (BSS) problem. Recovering and separating the different sources that generate the observed ground deformation is a fundamental task if a physical meaning is to be assigned to each of them. PCA fails in the BSS problem since it looks for a new Euclidean space where the projected data are uncorrelated. Usually, uncorrelatedness is not a strong enough condition, and it has been shown that the BSS problem can be tackled by requiring the components to be independent. Independent Component Analysis (ICA) is, in fact, another popular technique adopted to approach this problem, and it can be used in all those fields where PCA is also applied. An ICA approach enables us to explain the displacement time series while imposing fewer constraints on the model, and to reveal anomalies in the data such as transient deformation signals. However, the independence condition is not easy to impose, and it is often necessary to introduce some approximations. To work around this problem, we use a variational Bayesian ICA (vbICA) method, which models the probability density function (pdf) of each source signal using a mixture of Gaussian distributions. This technique allows for more flexibility in the description of the pdf of the sources, giving a more reliable estimate of them. Here we introduce the vbICA technique and present its application on synthetic data that simulate a GPS network recording ground deformation in a tectonically active region, with synthetic time series containing interseismic, coseismic, and postseismic deformation, plus seasonal deformation, and white and coloured noise. We study the ability of the algorithm to recover the original (known) sources of deformation, and then apply it to a real scenario: the Emilia seismic sequence (2012, northern Italy), which is an example of a seismic sequence that occurred in a slowly converging tectonic setting, characterized by several local to regional anthropogenic or natural sources of deformation, mainly subsidence due to fluid withdrawal and sediment compaction. We apply both PCA and vbICA to displacement time series recorded by continuous GPS and InSAR (Pezzo et al., EGU2015-8950).
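
    The point that uncorrelated components need not be independent can be illustrated with a minimal NumPy sketch. It is not the vbICA method of the record above: the uniform sources, the mixing matrix and the sample size are all hypothetical choices made only for illustration. The principal components of the mixed signals have zero linear correlation, yet their squares remain correlated, which would not happen if PCA had recovered the independent sources.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20000

# Two independent, non-Gaussian (uniform) sources with unit variance.
# These are synthetic stand-ins, not real displacement signals.
sources = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(n, 2))

# Hypothetical mixing matrix: each "station" records a blend of both sources.
mixing = np.array([[1.0, 0.5],
                   [0.5, 1.0]])
observed = sources @ mixing.T

# PCA: project the centred data onto the eigenvectors of its covariance.
centred = observed - observed.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(centred, rowvar=False))
pcs = centred @ eigvecs


def corr(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


# Second-order statistics: the principal components are uncorrelated.
linear_corr = corr(pcs[:, 0], pcs[:, 1])

# Higher-order statistics: the squared components are still correlated,
# so the components are not independent.
square_corr = corr(pcs[:, 0] ** 2, pcs[:, 1] ** 2)

# For comparison, the true sources show no such dependence.
source_square_corr = corr(sources[:, 0] ** 2, sources[:, 1] ** 2)

print(f"corr(PC1, PC2)       = {linear_corr:.2e}")
print(f"corr(PC1^2, PC2^2)   = {square_corr:.3f}")
print(f"corr(s1^2, s2^2)     = {source_square_corr:.3f}")
```

    ICA-type methods close this gap by additionally using higher-order statistics to choose the rotation that PCA leaves undetermined.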

  5. Geological and Structural Patterns on Titan Enhanced Through Cassini's SAR PCA and High-Resolution Radiometry

    NASA Astrophysics Data System (ADS)

    Paganelli, F.; Schubert, G.; Lopes, R. M. C.; Malaska, M.; Le Gall, A. A.; Kirk, R. L.

    2016-12-01

    The current SAR data coverage on Titan encompasses several areas in which multiple radar passes are present and overlapping, providing additional information to aid the interpretation of geological and structural features. We exploit the different combinations of look direction and variable incidence angle to examine Cassini Synthetic Aperture RADAR (SAR) data using the Principal Component Analysis (PCA) technique and high-resolution radiometry, as a tool to aid in the interpretation of geological and structural features. Look direction and variable incidence angle are of particular importance in the analysis of variance in the images, which aids in the perception and identification of geological and structural features, as extensively demonstrated in Earth and planetary examples. The PCA enhancement technique uses projected non-ortho-rectified SAR imagery in order to maintain the inherent differences in scattering and geometric properties due to the different look directions, while enhancing the geometry of surface features. The PC2 component provides a stereo view of the areas in which complex surface features and structural patterns can be enhanced and outlined. We focus on several areas of interest, in older and recently acquired flybys, in which evidence of geological and structural features can be enhanced and outlined in the PC1 and PC2 components. Results of this technique provide enhanced geometry and insights into the interpretation of the observed geological and structural features, thus allowing a better understanding of the geology and tectonics of Titan.

  6. A Functional Monomer Is Not Enough: Principal Component Analysis of the Influence of Template Complexation in Pre-Polymerization Mixtures on Imprinted Polymer Recognition and Morphology

    PubMed Central

    Golker, Kerstin; Karlsson, Björn C. G.; Rosengren, Annika M.; Nicholls, Ian A.

    2014-01-01

    In this report, principal component analysis (PCA) has been used to explore the influence of template complexation in the pre-polymerization phase on template molecularly imprinted polymer (MIP) recognition and polymer morphology. A series of 16 bupivacaine MIPs were studied. The ethylene glycol dimethacrylate (EGDMA)-crosslinked polymers had either methacrylic acid (MAA) or methyl methacrylate (MMA) as the functional monomer, and the stoichiometry between template, functional monomer and crosslinker was varied. The polymers were characterized using radioligand equilibrium binding experiments, gas sorption measurements, swelling studies and data extracted from molecular dynamics (MD) simulations of all-component pre-polymerization mixtures. The molar fraction of the functional monomer in the MAA-polymers contributed to describing both the binding, surface area and pore volume. Interestingly, weak positive correlations between the swelling behavior and the rebinding characteristics of the MAA-MIPs were exposed. Polymers prepared with MMA as a functional monomer and a polymer prepared with only EGDMA were found to share the same characteristics, such as poor rebinding capacities, as well as similar surface area and pore volume, independent of the molar fraction MMA used in synthesis. The use of PCA for interpreting relationships between MD-derived descriptions of events in the pre-polymerization mixture, recognition properties and morphologies of the corresponding polymers illustrates the potential of PCA as a tool for better understanding these complex materials and for their rational design. PMID:25391043

  7. A functional monomer is not enough: principal component analysis of the influence of template complexation in pre-polymerization mixtures on imprinted polymer recognition and morphology.

    PubMed

    Golker, Kerstin; Karlsson, Björn C G; Rosengren, Annika M; Nicholls, Ian A

    2014-11-10

    In this report, principal component analysis (PCA) has been used to explore the influence of template complexation in the pre-polymerization phase on template molecularly imprinted polymer (MIP) recognition and polymer morphology. A series of 16 bupivacaine MIPs were studied. The ethylene glycol dimethacrylate (EGDMA)-crosslinked polymers had either methacrylic acid (MAA) or methyl methacrylate (MMA) as the functional monomer, and the stoichiometry between template, functional monomer and crosslinker was varied. The polymers were characterized using radioligand equilibrium binding experiments, gas sorption measurements, swelling studies and data extracted from molecular dynamics (MD) simulations of all-component pre-polymerization mixtures. The molar fraction of the functional monomer in the MAA-polymers contributed to describing both the binding, surface area and pore volume. Interestingly, weak positive correlations between the swelling behavior and the rebinding characteristics of the MAA-MIPs were exposed. Polymers prepared with MMA as a functional monomer and a polymer prepared with only EGDMA were found to share the same characteristics, such as poor rebinding capacities, as well as similar surface area and pore volume, independent of the molar fraction MMA used in synthesis. The use of PCA for interpreting relationships between MD-derived descriptions of events in the pre-polymerization mixture, recognition properties and morphologies of the corresponding polymers illustrates the potential of PCA as a tool for better understanding these complex materials and for their rational design.

  8. Resolution of co-dominant phytoplankton species in a eutrophic lake using synchrotron-based Fourier transform infrared spectroscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dean, A.P.; Martin, Michael C.; Sigee, D.C.

    2006-10-09

    Synchrotron-based Fourier-transform infrared (FTIR) microspectroscopy was used to distinguish micropopulations of the codominant algae Microcystis aeruginosa (Cyanophyceae) and Ceratium hirundinella (Dinophyceae) in mixed phytoplankton samples taken from the water column of a stratified eutrophic lake (Rostherne Mere, UK). FTIR spectra of the two algae showed a closely similar sequence of 10 bands over the wave-number range 4000-900 cm-1. These were assigned to a range of vibrationally active chemical groups using published band assignments and on the basis of correlation and factor analysis. In both algae, intracellular concentrations of macromolecular components (determined as band intensity) varied considerably within the same population, indicating substantial intraspecific heterogeneity. Interspecific differences were separately analysed in relation to discrete bands and by multivariate analysis of the entire spectral region 1750-900 cm-1. In terms of discrete bands, comparison of individual intensities (normalised to amide 1) demonstrated significant (99 percent probability level) differences in relation to six bands between the two algal species. Key interspecific differences were also noted in relation to the positions of bands 2, 10 (carbohydrate) and 7 (protein) and in the 3-D plots derived by principal component analysis (PCA) of the sequence of band intensities. PCA of entire spectral regions showed clear resolution of species in the PCA plot, with indication of separation on the basis of protein (region 1700-1500 cm-1) and carbohydrate (region 1150-900 cm-1) composition in the loading plot. Hierarchical cluster analysis (Ward algorithm) of entire spectral regions also showed clear discrimination of the two species within the resulting dendrogram.

  9. Extending the 2 x 2 Achievement Goal Framework: Development of a Measure of Scientific Achievement Goals

    ERIC Educational Resources Information Center

    Deemer, Eric D.; Carter, Alice P.; Lobrano, Michael T.

    2010-01-01

    The current research sought to extend the 2 x 2 achievement goal framework by developing and testing the Achievement Goals for Research Scale (AGRS). Participants (N = 317) consisted of graduate students in the life, physical, and behavioral sciences. A principal components analysis (PCA) extracted five components accounting for 72.59% of the…

  10. Principal component analysis of Raman spectra for TiO2 nanoparticle characterization

    NASA Astrophysics Data System (ADS)

    Ilie, Alina Georgiana; Scarisoareanu, Monica; Morjan, Ion; Dutu, Elena; Badiceanu, Maria; Mihailescu, Ion

    2017-09-01

    The Raman spectra of anatase/rutile mixed phases of Sn-doped TiO2 nanoparticles and undoped TiO2 nanoparticles, synthesised by laser pyrolysis, with nanocrystallite dimensions varying from 8 to 28 nm, were simultaneously processed with self-written software that applies Principal Component Analysis (PCA) to the measured spectra to verify the possibility of objective auto-characterization of nanoparticles from their vibrational modes. The photo-excited process of Raman scattering is very sensitive to the material characteristics, especially in the case of nanomaterials, where more properties become relevant for the vibrational behaviour. We used PCA, a statistical procedure that performs eigenvalue decomposition of the covariance of the descriptive data, to automatically analyse the sample's measured Raman spectrum, and to infer the correlation between nanoparticle dimensions, tin and carbon concentration, and their Principal Component values (PCs). This type of application can allow an approximation of the crystallite size, or tin concentration, only by measuring the Raman spectrum of the sample. The study of loadings of the principal components provides information on the way the vibrational modes are affected by the nanoparticle features and the spectral area relevant for the classification.
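
    The description of PCA as an eigenvalue decomposition of the data covariance can be sketched in a few lines of NumPy. This is a generic illustration, not the authors' self-written software: the function name pca_eig and the synthetic two-band "spectra" are assumptions introduced here, and no real Raman data are involved.

```python
import numpy as np


def pca_eig(X, n_components):
    """PCA via eigendecomposition of the sample covariance matrix.

    X has shape (n_samples, n_features), e.g. one spectrum per row.
    Returns (scores, loadings, explained_variance).
    """
    Xc = X - X.mean(axis=0)                 # centre each feature (channel)
    cov = np.cov(Xc, rowvar=False)          # (n_features, n_features)
    eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # largest variance first
    loadings = eigvecs[:, order]            # columns are principal directions
    scores = Xc @ loadings                  # PC values of each sample
    return scores, loadings, eigvals[order]


# Hypothetical synthetic "spectra": 50 samples, 200 channels, built from
# two Gaussian bands with random weights plus a little noise.
rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 200)
band_a = np.exp(-((axis - 0.3) / 0.05) ** 2)
band_b = np.exp(-((axis - 0.7) / 0.05) ** 2)
weights = rng.uniform(0.0, 1.0, size=(50, 2))
spectra = weights @ np.vstack([band_a, band_b]) + 0.01 * rng.normal(size=(50, 200))

scores, loadings, variances = pca_eig(spectra, n_components=3)
print("explained variance of first 3 PCs:", variances)
```

    In such a setting the scores could then be compared against known sample properties, while the loadings indicate which spectral regions drive each component.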

  11. Incorporating principal component analysis into air quality ...

    EPA Pesticide Factsheets

    The efficacy of standard air quality model evaluation techniques is becoming compromised as the simulation periods continue to lengthen in response to ever increasing computing capacity. Accordingly, the purpose of this paper is to demonstrate a statistical approach called Principal Component Analysis (PCA) with the intent of motivating its use by the evaluation community. One of the main objectives of PCA is to identify, through data reduction, the recurring and independent modes of variations (or signals) within a very large dataset, thereby summarizing the essential information of that dataset so that meaningful and descriptive conclusions can be made. In this demonstration, PCA is applied to a simple evaluation metric – the model bias associated with EPA's Community Multi-scale Air Quality (CMAQ) model when compared to weekly observations of sulfate (SO42−) and ammonium (NH4+) ambient air concentrations measured by the Clean Air Status and Trends Network (CASTNet). The advantages of using this technique are demonstrated as it identifies strong and systematic patterns of CMAQ model bias across a myriad of spatial and temporal scales that are neither constrained to geopolitical boundaries nor monthly/seasonal time periods (a limitation of many current studies). The technique also identifies locations (station–grid cell pairs) that are used as indicators for a more thorough diagnostic evaluation thereby hastening and facilitating understanding of the prob

  12. Spectral data compression using weighted principal component analysis with consideration of human visual system and light sources

    NASA Astrophysics Data System (ADS)

    Cao, Qian; Wan, Xiaoxia; Li, Junfeng; Liu, Qiang; Liang, Jingxing; Li, Chan

    2016-10-01

    This paper proposes two weight functions based on principal component analysis (PCA) to preserve more colorimetric information in the spectral data compression process. One weight function consists of the CIE XYZ color-matching functions, representing the characteristics of the human visual system, while the other combines the CIE XYZ color-matching functions with the relative spectral power distribution of the CIE standard illuminant D65. The two proposed methods were tested by compressing and reconstructing the reflectance spectra of 1600 glossy Munsell color chips and 1950 Natural Color System color chips as well as six multispectral images. Performance was evaluated by the mean color difference under the CIE 1931 standard colorimetric observer and the CIE standard illuminants D65 and A. The mean root mean square errors between the original and reconstructed spectra were also calculated. The experimental results show that the two proposed methods significantly outperform standard PCA and two other weighted PCA methods in colorimetric reconstruction accuracy, with only very slight degradation in spectral reconstruction accuracy. In addition, weight functions with the CIE standard illuminant D65 can improve the colorimetric reconstruction accuracy compared to weight functions without it.

  13. Simultaneous and Continuous Estimation of Shoulder and Elbow Kinematics from Surface EMG Signals

    PubMed Central

    Zhang, Qin; Liu, Runfeng; Chen, Wenbin; Xiong, Caihua

    2017-01-01

    In this paper, we present a simultaneous and continuous kinematics estimation method for multiple DoFs across the shoulder and elbow joints. Although simultaneous and continuous kinematics estimation from surface electromyography (EMG) is a feasible way to achieve natural and intuitive human-machine interaction, few works have investigated multi-DoF estimation across the significant joints of the upper limb, the shoulder and elbow joints. This paper evaluates the feasibility of estimating 4-DoF kinematics at the shoulder and elbow during coordinated arm movements. Considering the potential applications of this method in exoskeletons, prosthetics and other arm rehabilitation techniques, the estimation performance is presented with different muscle activity decomposition and learning strategies. Principal component analysis (PCA) and independent component analysis (ICA) are respectively employed for EMG mode decomposition, with an artificial neural network (ANN) for learning the electromechanical association. Four joint angles across the shoulder and elbow are simultaneously and continuously estimated from EMG in four coordinated arm movements. By using ICA (PCA) and a single ANN, an average estimation accuracy of 91.12% (90.23%) is obtained in 70-s intra-cross validation and 87.00% (86.30%) in 2-min inter-cross validation. This result suggests it is feasible and effective to use ICA (PCA) with a single ANN for multi-joint kinematics estimation in variant application conditions. PMID:28611573

  14. Representation of Probability Density Functions from Orbit Determination using the Particle Filter

    NASA Technical Reports Server (NTRS)

    Mashiku, Alinda K.; Garrison, James; Carpenter, J. Russell

    2012-01-01

    Statistical orbit determination enables us to obtain estimates of the state and the statistical information of its region of uncertainty. In order to obtain an accurate representation of the probability density function (PDF) that incorporates higher order statistical information, we propose the use of nonlinear estimation methods such as the Particle Filter. The Particle Filter (PF) is capable of providing a PDF representation of the state estimates whose accuracy is dependent on the number of particles or samples used. For this method to be applicable to real case scenarios, we need a way of accurately representing the PDF in a compressed manner with little information loss. Hence we propose using Independent Component Analysis (ICA) as a non-Gaussian dimensional reduction method that is capable of maintaining higher order statistical information obtained using the PF. Methods such as Principal Component Analysis (PCA) are based on utilizing up to second order statistics, and hence will not suffice in maintaining maximum information content. Both PCA and ICA are applied to two scenarios that involve a highly eccentric orbit with a lower a priori uncertainty covariance and a less eccentric orbit with a higher a priori uncertainty covariance, to illustrate the capability of ICA in relation to PCA.

  15. Identification of fungal phytopathogens using Fourier transform infrared-attenuated total reflection spectroscopy and advanced statistical methods

    NASA Astrophysics Data System (ADS)

    Salman, Ahmad; Lapidot, Itshak; Pomerantz, Ami; Tsror, Leah; Shufan, Elad; Moreh, Raymond; Mordechai, Shaul; Huleihel, Mahmoud

    2012-01-01

    The early diagnosis of phytopathogens is of great importance; it could prevent large economic losses due to crops damaged by fungal diseases, and prevent unnecessary soil fumigation or the use of fungicides and bactericides and thus prevent considerable environmental pollution. In this study, 18 isolates of three different fungal genera were investigated: six isolates of Colletotrichum coccodes, six isolates of Verticillium dahliae and six isolates of Fusarium oxysporum. Our main goal was to differentiate these fungal samples at the level of isolates, based on their infrared absorption spectra obtained using the Fourier transform infrared-attenuated total reflection (FTIR-ATR) sampling technique. Advanced statistical and mathematical methods, namely principal component analysis (PCA), linear discriminant analysis (LDA), and k-means, were applied to the spectra after manipulation. Our results showed significant spectral differences between the various fungal genera examined. The use of k-means enabled classification between the genera with 94.5% accuracy, whereas the use of PCA [3 principal components (PCs)] and LDA achieved a 99.7% success rate. However, at the level of isolates, the best differentiation results were obtained using PCA (9 PCs) and LDA for the lower wavenumber region (800-1775 cm-1), with identification success rates of 87%, 85.5%, and 94.5% for Colletotrichum, Fusarium, and Verticillium strains, respectively.

  16. Sources of hydrocarbons in urban road dust: Identification, quantification and prediction.

    PubMed

    Mummullage, Sandya; Egodawatta, Prasanna; Ayoko, Godwin A; Goonetilleke, Ashantha

    2016-09-01

    Among urban stormwater pollutants, hydrocarbons are a significant environmental concern due to their toxicity and relatively stable chemical structure. This study focused on the identification of hydrocarbon contributing sources to urban road dust and approaches for the quantification of pollutant loads to enhance the design of source control measures. The study confirmed the validity of the use of mathematical techniques of principal component analysis (PCA) and hierarchical cluster analysis (HCA) for source identification and principal component analysis/absolute principal component scores (PCA/APCS) receptor model for pollutant load quantification. Study outcomes identified non-combusted lubrication oils, non-combusted diesel fuels and tyre and asphalt wear as the three most critical urban hydrocarbon sources. The site specific variabilities of contributions from sources were replicated using three mathematical models. The models employed predictor variables of daily traffic volume (DTV), road surface texture depth (TD), slope of the road section (SLP), effective population (EPOP) and effective impervious fraction (EIF), which can be considered as the five governing parameters of pollutant generation, deposition and redistribution. Models were developed such that they can be applicable in determining hydrocarbon contributions from urban sites enabling effective design of source control measures. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. Finger crease pattern recognition using Legendre moments and principal component analysis

    NASA Astrophysics Data System (ADS)

    Luo, Rongfang; Lin, Tusheng

    2007-03-01

    Finger joint lines, defined as finger creases, and their distribution can identify a person. In this paper, we propose a new finger crease pattern recognition method based on Legendre moments and principal component analysis (PCA). After obtaining the region of interest (ROI) for each finger image in the pre-processing stage, Legendre moments under Radon transform are applied to construct a moment feature matrix from the ROI, which greatly decreases the dimensionality of the ROI and can represent the principal components of the finger creases quite well. Then, an approach to finger crease pattern recognition is designed based on the Karhunen-Loeve (K-L) transform. The method applies PCA to a moment feature matrix rather than the original image matrix to obtain the feature vector. The proposed method has been tested on a database of 824 images from 103 individuals using the nearest neighbor classifier. An accuracy of up to 98.584% was obtained when using 4 samples per class for training. The experimental results demonstrate that our proposed approach is feasible and effective in biometrics.

  18. Extracting grid cell characteristics from place cell inputs using non-negative principal component analysis

    PubMed Central

    Dordek, Yedidyah; Soudry, Daniel; Meir, Ron; Derdikman, Dori

    2016-01-01

    Many recent models study the downstream projection from grid cells to place cells, while recent data have pointed out the importance of the feedback projection. We thus asked how grid cells are affected by the nature of the input from the place cells. We propose a single-layer neural network with feedforward weights connecting place-like input cells to grid cell outputs. Place-to-grid weights are learned via a generalized Hebbian rule. The architecture of this network highly resembles neural networks used to perform Principal Component Analysis (PCA). Both numerical results and analytic considerations indicate that if the components of the feedforward neural network are non-negative, the output converges to a hexagonal lattice. Without the non-negativity constraint, the output converges to a square lattice. Consistent with experiments, the grid spacing ratio between the first two consecutive modules is ∼1.4. Our results express a possible linkage between place cell to grid cell interactions and PCA. DOI: http://dx.doi.org/10.7554/eLife.10094.001 PMID:26952211

  19. Measuring the Indonesian provinces competitiveness by using PCA technique

    NASA Astrophysics Data System (ADS)

    Runita, Ditha; Fajriyah, Rohmatul

    2017-12-01

    Indonesia is a country with a vast territory comprising 34 provinces. Building local competitiveness is critical to enhancing long-term national competitiveness, especially for a country as diverse as Indonesia. A competitive local government can attract and maintain successful firms and increase living standards for its inhabitants, because investment and skilled workers gravitate from uncompetitive regions to more competitive ones. Although there are other methods for measuring competitiveness, here we demonstrate a simple method using principal component analysis (PCA), which can be applied directly to correlated, multivariate data. The analysis of Indonesian provinces yields 3 clusters based on the competitiveness measurement: Bad, Good and Best performing provinces.

  20. Study on elemental fingerprint of traditional marine Chinese medicine oysters from Jiaozhou Bay, China

    NASA Astrophysics Data System (ADS)

    Zheng, Yongjun; Zheng, Kang; Li, Yantuan

    2012-09-01

    In order to investigate the relationship between the trace elements and the characteristics of the oysters, we analyzed the trace elements present in the germplasm of oysters from different producing areas in Jiaozhou Bay. The element fingerprints were established to reflect the elemental characteristics of the oysters. Concentration patterns of the elements were deciphered by principal component analysis (PCA) and hierarchical cluster analysis (HCA). The six regions were accurately discriminated using HCA and PCA based on the concentration of 16 trace elements. The elements were viewed as characteristic elements of the oysters and the fingerprints of these elements could be used to distinguish the quality of the oysters.

  1. NIR monitoring of in-service wood structures

    Treesearch

    Michela Zanetti; Timothy G. Rials; Douglas Rammer

    2005-01-01

    Near infrared spectroscopy (NIRS) was used to study a set of Southern Yellow Pine boards exposed to natural weathering for different periods of exposure time. This non-destructive spectroscopic technique is a very powerful tool to predict the weathering of wood when used in combination with multivariate analysis (Principal Component Analysis, PCA, and Projection to...

  2. Inquiring the Most Critical Teacher's Technology Education Competences in the Highest Efficient Technology Education Learning Organization

    ERIC Educational Resources Information Center

    Yung-Kuan, Chan; Hsieh, Ming-Yuan; Lee, Chin-Feng; Huang, Chih-Cheng; Ho, Li-Chih

    2017-01-01

    Under the hyper-dynamic education situation, this research, in order to comprehensively explore the interplays between Teacher Competence Demands (TCD) and Learning Organization Requests (LOR), cross-employs the data refined method of Descriptive Statistics (DS) method and Analysis of Variance (ANOVA) and Principal Components Analysis (PCA)…

  3. A comparison of PCA/ICA for data preprocessing in remote sensing imagery classification

    NASA Astrophysics Data System (ADS)

    He, Hui; Yu, Xianchuan

    2005-10-01

    In this paper a performance comparison of a variety of data preprocessing algorithms in remote sensing image classification is presented. The selected algorithms are principal component analysis (PCA) and three different independent component analysis (ICA) algorithms: Fast-ICA (Aapo Hyvarinen, 1999), Kernel-ICA (KCCA and KGV; Bach & Jordan, 2002) and EFFICA (Aiyou Chen & Peter Bickel, 2003). These algorithms were applied to a remote sensing image (1600×1197) obtained from Shunyi, Beijing. For classification, an MLC method is used on the raw and preprocessed data. The results show that classification with the preprocessed data gives more confident results than with the raw data and that, among the preprocessing algorithms, the ICA algorithms improve on PCA and EFFICA performs better than the others. The convergence of these ICA algorithms (for more than a million data points) is also studied; the results show that EFFICA converges much faster than the others. Furthermore, because EFFICA is a one-step maximum likelihood estimate (MLE) that reaches asymptotic Fisher efficiency (EFFICA), its computational cost is quite small and its memory demand is greatly reduced, which resolves the "out of memory" problem that occurred with the other algorithms.

  4. Problematic mobile phone use in adolescents: derivation of a short scale MPPUS-10.

    PubMed

    Foerster, Milena; Roser, Katharina; Schoeni, Anna; Röösli, Martin

    2015-02-01

    Our aim was to derive a short version of the Mobile Phone Problem Use Scale (MPPUS) using data from 412 adolescents of the Swiss HERMES (Health Effects Related to Mobile phonE use in adolescentS) cohort. A German version of the original MPPUS consisting of 27 items was shortened by principal component analysis (PCA) using baseline data collected in 2012. For confirmation, the PCA was carried out again with follow-up data 1 year later. PCA revealed four factors related to symptoms of addiction (Loss of Control, Withdrawal, Negative Life Consequences and Craving) and a fifth factor reflecting the social component of mobile phone use (Peer Dependence). The shortened scale (MPPUS-10) closely reflects the original MPPUS (Kendall's tau: 0.80 with 90% concordant pairs). Internal consistency of the MPPUS-10 was good (Cronbach's alpha: 0.85). The results were confirmed using the follow-up data. The MPPUS-10 is a suitable instrument for research in adolescents. It will help to further clarify the definition of problematic mobile phone use in adolescents and explore similarities and differences to other technological addictions.

  5. Research on Air Quality Evaluation based on Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Wang, Xing; Wang, Zilin; Guo, Min; Chen, Wei; Zhang, Huan

    2018-01-01

    Economic growth has led to a decline in environmental capacity and the deterioration of air quality. Air quality evaluation, as a foundation of environmental monitoring and air pollution control, has become increasingly important. Based on principal component analysis (PCA), this paper evaluates the air quality of a large city in the Beijing-Tianjin-Hebei area over the past 10 years and identifies influencing factors, in order to provide a reference for air quality management and air pollution control.

  6. Understanding the pattern of the BSE Sensex

    NASA Astrophysics Data System (ADS)

    Mukherjee, I.; Chatterjee, Soumya; Giri, A.; Barat, P.

    2017-09-01

    An attempt is made to understand the pattern of behaviour of the BSE Sensex by analysing the tick-by-tick Sensex data for the years 2006 to 2012 on yearly as well as cumulative basis using Principal Component Analysis (PCA) and its nonlinear variant Kernel Principal Component Analysis (KPCA). The latter technique ensures that the nonlinear character of the interactions present in the system gets captured in the analysis. The analysis is carried out by constructing vector spaces of varying dimensions. The size of the data set ranges from a minimum of 360,000 for one year to a maximum of 2,520,000 for seven years. In all cases the prices appear to be highly correlated and restricted to a very low dimensional subspace of the original vector space. An external perturbation is added to the system in the form of noise. It is observed that while standard PCA is unable to distinguish the behaviour of the noise-mixed data from that of the original, KPCA clearly identifies the effect of the noise. The exercise is extended in case of daily data of other stock markets and similar results are obtained.
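
    The contrast between standard PCA and its kernel variant can be made concrete with a minimal NumPy sketch of kernel PCA. The record above does not state which kernel was used, so the Gaussian (RBF) kernel, the gamma value, the function name kernel_pca and the random input data are all assumptions made here for illustration only.

```python
import numpy as np


def kernel_pca(X, n_components, gamma):
    """Kernel PCA with an RBF kernel (an assumed kernel choice).

    X has shape (n_samples, n_features).
    Returns (scores, eigenvalues) for the training samples.
    """
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    K = np.exp(-gamma * np.maximum(d2, 0.0))

    # Centre the kernel matrix, i.e. centre the data in feature space.
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one

    # Eigendecomposition of the centred kernel matrix.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = eigvals[order]
    alpha = eigvecs[:, order]

    # Projections of the training points onto the kernel principal axes.
    scores = alpha * np.sqrt(np.maximum(lam, 0.0))
    return scores, lam


# Hypothetical data: 100 vectors of dimension 3 (not real Sensex data).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
scores, lam = kernel_pca(X, n_components=2, gamma=0.5)
print("leading kernel eigenvalues:", lam)
```

    With a linear kernel this procedure reduces to ordinary PCA; a nonlinear kernel lets the decomposition respond to structure that the covariance matrix alone does not capture.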

  7. Source attribution of poly- and perfluoroalkyl substances (PFASs) in surface waters from Rhode Island and the New York Metropolitan Area

    PubMed Central

    Zhang, Xianming; Lohmann, Rainer; Dassuncao, Clifton; Hu, Xindi C.; Weber, Andrea K.; Vecitis, Chad D.; Sunderland, Elsie M.

    2017-01-01

    Exposure to poly- and perfluoroalkyl substances (PFASs) has been associated with adverse health effects in humans and wildlife. Understanding pollution sources is essential for environmental regulation but source attribution for PFASs has been confounded by limited information on industrial releases and rapid changes in chemical production. Here we use principal component analysis (PCA), hierarchical clustering, and geospatial analysis to understand source contributions to 14 PFASs measured across 37 sites in the Northeastern United States in 2014. PFASs are significantly elevated in urban areas compared to rural sites except for perfluorobutane sulfonate (PFBS), N-methyl perfluorooctanesulfonamidoacetic acid (N-MeFOSAA), perfluoroundecanate (PFUnDA) and perfluorododecanate (PFDoDA). The highest PFAS concentrations across sites were for perfluorooctanate (PFOA, 56 ng L−1) and perfluorohexane sulfonate (PFOS, 43 ng L−1), and PFOS levels are lower than earlier measurements of U.S. surface waters. PCA and cluster analysis indicate three main statistical groupings of PFASs. Geospatial analysis of watersheds reveals that the first component/cluster originates from a mixture of contemporary point sources such as airports and textile mills. Atmospheric sources from the waste sector are consistent with the second component, and the metal smelting industry plausibly explains the third component. We find this source-attribution technique is effective for better understanding PFAS sources in urban areas. PMID:28217711

  8. Decomposition-Based Failure Mode Identification Method for Risk-Free Design of Large Systems

    NASA Technical Reports Server (NTRS)

    Tumer, Irem Y.; Stone, Robert B.; Roberts, Rory A.; Clancy, Daniel (Technical Monitor)

    2002-01-01

    When designing products, it is crucial to assure failure and risk-free operation in the intended operating environment. Failures are typically studied and eliminated as much as possible during the early stages of design. The few failures that go undetected result in unacceptable damage and losses in high-risk applications where public safety is of concern. Published NASA and NTSB accident reports point to a variety of components identified as sources of failures in the reported cases. In previous work, data from these reports were processed and placed in matrix form for all the system components and failure modes encountered, and then manipulated using matrix methods to determine similarities between the different components and failure modes. In this paper, these matrices are represented in the form of a linear combination of failure modes, mathematically formed using Principal Components Analysis (PCA) decomposition. The PCA decomposition results in a low-dimensionality representation of all failure modes and components of interest, represented in a transformed coordinate system. Such a representation opens the way for efficient pattern analysis and prediction of failure modes with highest potential risks on the final product, rather than making decisions based on the large space of component and failure mode data. The mathematics of the proposed method are explained first using a simple example problem. The method is then applied to component failure data gathered from helicopter accident reports to demonstrate its potential.

  9. [In vitro transdermal delivery of the active fraction of xiangfusiwu decoction based on principal component analysis].

    PubMed

    Li, Zhen-Hao; Liu, Pei; Qian, Da-Wei; Li, Wei; Shang, Er-Xin; Duan, Jin-Ao

    2013-06-01

    The objective of the present study was to establish a method based on principal component analysis (PCA) for the study of transdermal delivery of multiple components in Chinese medicine, and to choose the best penetration enhancers for the active fraction of Xiangfusiwu decoction (BW) with this method. Improved Franz diffusion cells with isolated rat abdominal skin were used to study the transdermal delivery of six active components: ferulic acid, paeoniflorin, albiflorin, protopine, tetrahydropalmatine and tetrahydrocolumbamine. The concentrations of these components were determined by LC-MS/MS. The total factor scores of the concentrations at different times were then calculated using PCA and used in place of the concentrations to compute the cumulative amounts and steady fluxes, the latter of which served as the index for optimizing penetration enhancers. The results showed that, compared to the control group, the steady fluxes of the other groups increased significantly, and that 4% azone with 1% propylene glycol had the best effect. The six components could penetrate the skin well under the action of penetration enhancers. The method established in this study proved suitable for the study of transdermal delivery of multiple components; it provides a scientific basis for preparation research on Xiangfusiwu decoction and could serve as a reference for Chinese medicine research.

  10. Generation of Boundary Manikin Anthropometry

    NASA Technical Reports Server (NTRS)

    Young, Karen S.; Margerum, Sarah; Barr, Abbe; Ferrer, Mike A.; Rajulu, Sudhakar

    2008-01-01

    The purpose of this study was to develop 3D digital boundary manikins that are representative of the anthropometry of a unique population. These digital manikins can be used by designers to verify and validate that the components of the spacesuit design satisfy the requirements specified in the Human Systems Integration Requirements (HSIR) document. Currently, the HSIR requires the suit to accommodate the 1st percentile American female to the 99th percentile American male. The manikin anthropometry was derived using two methods: Principal Component Analysis (PCA) and Whole Body Posture Based Analysis (WBPBA). PCA is a statistical method for reducing a multidimensional data set by using eigenvectors and eigenvalues. The goal is to create a reduced data set that encapsulates the majority of the variation in the population. WBPBA is a multivariate analytical approach that was developed by the Anthropometry and Biomechanics Facility (ABF) to identify the extremes of the population for a given body posture. WBPBA is a simulation-based method that finds extremes in a population based on anthropometry and posture, whereas PCA is based solely on anthropometry. Both methods yield a list of subjects and their anthropometry from the target population; PCA resulted in 20 female and 22 male subjects' anthropometry and WBPBA resulted in 7 subjects' anthropometry representing the extreme subjects in the target population. The subjects' anthropometry is then used to 'morph' a baseline digital scan of a person with the same body type to create a 3D digital model that can be used as a tool for designers, the details of which will be discussed in subsequent papers.
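
    As a minimal illustration of the eigenvector and eigenvalue reduction described above (not the authors' implementation), the Python sketch below runs PCA on synthetic data. The function name `pca_eig`, the 90% variance target, and the simulated measurements are all assumptions made for the example.

```python
import numpy as np

def pca_eig(X, var_target=0.9):
    """PCA via eigendecomposition of the sample covariance matrix.

    Keeps the smallest number of components whose cumulative
    explained variance reaches var_target.
    """
    Xc = X - X.mean(axis=0)                  # center each variable
    cov = np.cov(Xc, rowvar=False)           # variables are columns
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(ratio, var_target) + 1)
    scores = Xc @ eigvecs[:, :k]             # reduced data set
    return scores, eigvecs[:, :k], eigvals, k

# Hypothetical data: 200 subjects, 5 measurements driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

scores, components, eigvals, k = pca_eig(X, var_target=0.9)
print(k, scores.shape)
```

    Because the synthetic measurements are driven by two latent factors, at most two components are needed here; with real anthropometric data the number retained would depend on the chosen variance target.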

  11. Physical fitness in persons with hemiparetic stroke: its structure and longitudinal changes during an inpatient rehabilitation programme.

    PubMed

    Tsuji, Tetsuya; Liu, Meigen; Hase, Kimitaka; Masakado, Yoshihisa; Takahashi, Hidetoshi; Hara, Yukihiro; Chino, Naoichi

    2004-06-01

    To test the hypothesis that the structure of fitness in patients with hemiparetic stroke can be categorized into impairment/disability, cardiopulmonary, muscular and metabolic domains, and to study longitudinal changes in their fitness during an inpatient rehabilitation programme. Structure analysis of multiple fitness parameters with principal component analysis (PCA), and a before and after trial. Tertiary rehabilitation centre in Japan. One hundred and seven consecutive inpatients with hemiparetic stroke. A conventional stroke rehabilitation programme consisting of 80 minutes of physical therapy and occupational therapy sessions five days a week, and daily rehabilitation nursing for a median duration of 105.5 days. Principal component scores extracted from measurement of paresis/daily living (the Stroke Impairment Assessment Set (SIAS) and the Functional Independence Measure (FIM)); muscular (grip strength (GS), knee extensor torque, and cross-sectional areas of thigh muscles); metabolic (body mass index (BMI) and fat accumulation on CT); cardiopulmonary (heart rate oxygen coefficient (HR-O2-Coeff) obtained with a graded bridging activity and a 12-minute propulsion distance). PCA categorized the original 15 variables into four factors corresponding to paresis/activities of daily living, muscular, metabolic and cardiopulmonary domains, and explained 78.1% of the total variance at admission and 69.6% at discharge. PCA scores for all domains except the metabolic domain improved significantly at discharge (paired t-test, p < 0.05). The hypothetical structure of fitness was confirmed, and the PCA scores were useful in following longitudinal changes of fitness during inpatient rehabilitation.

  12. An improved geographically weighted regression model for PM2.5 concentration estimation in large areas

    NASA Astrophysics Data System (ADS)

    Zhai, Liang; Li, Shuang; Zou, Bin; Sang, Huiyong; Fang, Xin; Xu, Shan

    2018-05-01

    Considering the spatially non-stationary contributions of environmental variables to PM2.5 variations, the geographically weighted regression (GWR) modeling method has been widely used to estimate PM2.5 concentrations. However, most of the GWR models in reported studies so far were established based on predictors screened through pretreatment correlation analysis, and this process might omit factors that actually drive PM2.5 variations. This study therefore developed a best subsets regression (BSR) enhanced principal component analysis-GWR (PCA-GWR) modeling approach to estimate PM2.5 concentration by fully considering all the potential variables' contributions simultaneously. The performance comparison experiment between PCA-GWR and regular GWR was conducted in the Beijing-Tianjin-Hebei (BTH) region over a one-year period. Results indicated that the PCA-GWR modeling outperforms the regular GWR modeling, with clearly higher model-fitting and cross-validation adjusted R2 and lower RMSE. Meanwhile, the distribution map of PM2.5 concentration from PCA-GWR modeling also clearly depicts more spatial variation details than the one from regular GWR modeling. It can be concluded that the BSR enhanced PCA-GWR modeling could be a reliable way for effective air pollution concentration estimation in the future by involving all the potential predictor variables' contributions to PM2.5 variations.

  13. On a PCA-based lung motion model

    NASA Astrophysics Data System (ADS)

    Li, Ruijiang; Lewis, John H.; Jia, Xun; Zhao, Tianyu; Liu, Weifeng; Wuenschel, Sara; Lamb, James; Yang, Deshan; Low, Daniel A.; Jiang, Steve B.

    2011-09-01

    Respiration-induced organ motion is one of the major uncertainties in lung cancer radiotherapy, and it is crucial to be able to model the lung motion accurately. Most work so far has focused on the study of the motion of a single point (usually the tumor center of mass), and much less work has been done to model the motion of the entire lung. Inspired by the work of Zhang et al (2007 Med. Phys. 34 4772-81), we believe that the spatiotemporal relationship of the entire lung motion can be accurately modeled based on principal component analysis (PCA) and then a sparse subset of the entire lung, such as an implanted marker, can be used to drive the motion of the entire lung (including the tumor). The goal of this work is twofold. First, we aim to understand the underlying reason why PCA is effective for modeling lung motion and find the optimal number of PCA coefficients for accurate lung motion modeling. We attempt to address the above important problems both in a theoretical framework and in the context of real clinical data. Second, we propose a new method to derive the entire lung motion using a single internal marker based on the PCA model. The main results of this work are as follows. We derived an important property which reveals the implicit regularization imposed by the PCA model. We then studied the model using two mathematical respiratory phantoms and 11 clinical 4DCT scans for eight lung cancer patients. For the mathematical phantoms with cosine and an even power (2n) of cosine motion, we proved that 2 and 2n PCA coefficients and eigenvectors will completely represent the lung motion, respectively. Moreover, for the cosine phantom, we derived the equivalence conditions for the PCA motion model and the physiological 5D lung motion model (Low et al 2005 Int. J. Radiat. Oncol. Biol. Phys. 63 921-9). For the clinical 4DCT data, we demonstrated the modeling power and generalization performance of the PCA model. The average 3D modeling error using PCA was within 1 mm (0.7 ± 0.1 mm). When a single artificial internal marker was used to derive the lung motion, the average 3D error was found to be within 2 mm (1.8 ± 0.3 mm) through comprehensive statistical analysis. The optimal number of PCA coefficients needs to be determined on a patient-by-patient basis and two PCA coefficients seem to be sufficient for accurate modeling of the lung motion for most patients. In conclusion, we have presented thorough theoretical analysis and clinical validation of the PCA lung motion model. The feasibility of deriving the entire lung motion using a single marker has also been demonstrated on clinical data using a simulation approach.
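
    The statement about the cosine phantoms can be checked numerically on a toy case. The sketch below is an assumption-laden simplification, not the phantoms used in the paper: each of 50 hypothetical points moves along one axis with its own amplitude and phase lag, sampled over one full breathing cycle. The helper `n_pca_modes` is a made-up name that counts the PCA modes carrying non-negligible variance.

```python
import numpy as np

def n_pca_modes(motion, rel_tol=1e-9):
    """Count PCA modes with non-negligible variance.

    motion has one row per time sample and one column per point.
    """
    centered = motion - motion.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)  # one breathing cycle
amp = rng.uniform(1.0, 10.0, size=50)                  # hypothetical amplitudes
lag = rng.uniform(0.0, np.pi / 2, size=50)             # hypothetical phase lags
theta = t[:, None] + lag                               # shape (40, 50)

modes_cos = n_pca_modes(amp * np.cos(theta))           # cosine motion
modes_cos2 = n_pca_modes(amp * np.cos(theta) ** 2)     # even power 2n, n = 1
modes_cos4 = n_pca_modes(amp * np.cos(theta) ** 4)     # even power 2n, n = 2
print(modes_cos, modes_cos2, modes_cos4)
```

    In this toy setting the counts come out as 2 for cosine motion and 2n for the even powers, which agrees with the result quoted in the abstract. The reason is that cos(t + lag) expands into a cos(t) and a sin(t) term, while an even power 2n expands into harmonics up to order 2n, each contributing a cosine and a sine term after the mean is removed.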

  14. Comparison of five Lonicera flowers by simultaneous determination of multi-components with single reference standard method and principal component analysis.

    PubMed

    Gao, Wen; Wang, Rui; Li, Dan; Liu, Ke; Chen, Jun; Li, Hui-Jun; Xu, Xiaojun; Li, Ping; Yang, Hua

    2016-01-05

    The flowers of Lonicera japonica Thunb. were extensively used to treat many diseases. As the demands for L. japonica increased, some related Lonicera plants were often confused or misused. Caffeoylquinic acids were always regarded as chemical markers in the quality control of L. japonica, but they could be found in all Lonicera species. Thus, a simple and reliable method for the evaluation of different Lonicera flowers needs to be established. In this work, a method based on single standard to determine multi-components (SSDMC) combined with principal component analysis (PCA) has been developed for the quality control and discrimination of Lonicera species flowers. Six components including three caffeoylquinic acids and three iridoid glycosides were assayed simultaneously using chlorogenic acid as the reference standard. The credibility and feasibility of the SSDMC method were carefully validated and the results demonstrated that there were no remarkable differences compared with the external standard method. Finally, a total of fifty-one batches covering five Lonicera species were analyzed and PCA was successfully applied to distinguish the Lonicera species. This strategy simplifies the quality control of multi-component herbal medicines and is well suited to improving the quality control of herbs belonging to closely related species. Copyright © 2015 Elsevier B.V. All rights reserved.

  15. Geochemical differentiation processes for arc magma of the Sengan volcanic cluster, Northeastern Japan, constrained from principal component analysis

    NASA Astrophysics Data System (ADS)

    Ueki, Kenta; Iwamori, Hikaru

    2017-10-01

    In this study, with a view of understanding the structure of high-dimensional geochemical data and discussing the chemical processes at work in the evolution of arc magmas, we employed principal component analysis (PCA) to evaluate the compositional variations of volcanic rocks from the Sengan volcanic cluster of the Northeastern Japan Arc. We analyzed the trace element compositions of various arc volcanic rocks, sampled from 17 different volcanoes in a volcanic cluster. The PCA results demonstrated that the first three principal components accounted for 86% of the geochemical variation in the magma of the Sengan region. Based on the relationships between the principal components and the major elements, the mass-balance relationships with respect to the contributions of minerals, the composition of plagioclase phenocrysts, geothermal gradient, and seismic velocity structure in the crust, the first, the second, and the third principal components appear to represent magma mixing, crystallizations of olivine/pyroxene, and crystallizations of plagioclase, respectively. These represented 59%, 20%, and 6%, respectively, of the variance in the entire compositional range, indicating that magma mixing accounted for the largest variance in the geochemical variation of the arc magma. Our result indicated that crustal processes dominate the geochemical variation of magma in the Sengan volcanic cluster.

  16. Near-infrared confocal micro-Raman spectroscopy combined with PCA-LDA multivariate analysis for detection of esophageal cancer

    NASA Astrophysics Data System (ADS)

    Chen, Long; Wang, Yue; Liu, Nenrong; Lin, Duo; Weng, Cuncheng; Zhang, Jixue; Zhu, Lihuan; Chen, Weisheng; Chen, Rong; Feng, Shangyuan

    2013-06-01

    The diagnostic capability of using tissue intrinsic micro-Raman signals to obtain biochemical information from human esophageal tissue is presented in this paper. Near-infrared micro-Raman spectroscopy combined with multivariate analysis was applied for discrimination of esophageal cancer tissue from normal tissue samples. Micro-Raman spectroscopy measurements were performed on 54 esophageal cancer tissues and 55 normal tissues in the 400-1750 cm-1 range. The mean Raman spectra showed significant differences between the two groups. Tentative assignments of the Raman bands in the measured tissue spectra suggested some changes in protein structure, a decrease in the relative amount of lactose, and increases in the percentages of tryptophan, collagen and phenylalanine content in esophageal cancer tissue as compared to those of a normal subject. The diagnostic algorithms based on principal component analysis (PCA) and linear discriminant analysis (LDA) achieved a diagnostic sensitivity of 87.0% and specificity of 70.9% for separating cancer from normal esophageal tissue samples. The result demonstrated that near-infrared micro-Raman spectroscopy combined with PCA-LDA analysis could be an effective and sensitive tool for identification of esophageal cancer.

  17. Systems genetic analysis of multivariate response to iron deficiency in mice

    PubMed Central

    Yin, Lina; Unger, Erica L.; Jellen, Leslie C.; Earley, Christopher J.; Allen, Richard P.; Tomaszewicz, Ann; Fleet, James C.

    2012-01-01

    The aim of this study was to identify genes that influence iron regulation under varying dietary iron availability. Male and female mice from 20+ BXD recombinant inbred strains were fed iron-poor or iron-adequate diets from weaning until 4 mo of age. At death, the spleen, liver, and blood were harvested for the measurement of hemoglobin, hematocrit, total iron binding capacity, transferrin saturation, and liver, spleen and plasma iron concentration. For each measure and diet, we found large, strain-related variability. A principal-components analysis (PCA) was performed on the strain means for the seven parameters under each dietary condition for each sex, followed by quantitative trait loci (QTL) analysis on the factors. Compared with the iron-adequate diet, iron deficiency altered the factor structure of the principal components. QTL analysis, combined with PosMed (a candidate gene searching system), published gene expression data, and literature citations, identified seven candidate genes, Ptprd, Mdm1, Picalm, lip1, Tcerg1, Skp2, and Frzb, based on PCA factor, diet, and sex. Expression of each of these is cis-regulated, significantly correlated with the corresponding PCA factor, and previously reported to regulate iron, directly or indirectly. We propose that polymorphisms in multiple genes underlie individual differences in iron regulation, especially in response to dietary iron challenge. This research shows that iron management is a highly complex trait, influenced by multiple genes. Systems genetics analysis of iron homeostasis holds promise for developing new methods for prevention and treatment of iron deficiency anemia and related diseases. PMID:22461179

  18. Early Improper Motion Detection in Golf Swings Using Wearable Motion Sensors: The First Approach

    PubMed Central

    Stančin, Sara; Tomažič, Sašo

    2013-01-01

    This paper presents an analysis of a golf swing to detect improper motion in the early phase of the swing. Led by the desire to achieve a consistent shot outcome, a particular golfer would (in multiple trials) prefer to perform completely identical golf swings. In reality, some deviations from the desired motion are always present due to the comprehensive nature of the swing motion. Swing motion deviations that are not detrimental to performance are acceptable. This analysis is conducted using a golfer's leading arm kinematic data, which are obtained from a golfer wearing a motion sensor that is comprised of gyroscopes and accelerometers. Applying the principal component analysis (PCA) to the reference observations of properly performed swings, the PCA components of acceptable swing motion deviations are established. Using these components, the motion deviations in the observations of other swings are examined. Any unacceptable deviations that are detected indicate an improper swing motion. Arbitrarily long observations of an individual player's swing sequences can be included in the analysis. The results obtained for the considered example show an improper swing motion in the early phase of the swing, i.e., the first part of the backswing. An early detection method for improper swing motions that is conducted on an individual basis provides assistance for performance improvement. PMID:23752563
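
    One common way to turn this idea into code is to treat the leading PCA components of the reference swings as the subspace of acceptable deviations and to flag a swing whose deviation leaves that subspace. The Python sketch below follows that reading; it is not necessarily the authors' procedure. The signal model, the helper names `fit_reference` and `residual_norm`, the choice of two components, and the 1.5x threshold are all assumptions.

```python
import numpy as np

def fit_reference(ref, n_components):
    """Mean swing and leading PCA components of the reference swings."""
    mean = ref.mean(axis=0)
    _, _, vt = np.linalg.svd(ref - mean, full_matrices=False)
    return mean, vt[:n_components]

def residual_norm(swing, mean, comps):
    """Size of the deviation not explained by the acceptable components."""
    d = swing - mean
    return float(np.linalg.norm(d - comps.T @ (comps @ d)))

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 100)         # normalized swing time
base = np.sin(2.0 * np.pi * t)         # hypothetical sensor trace

def acceptable_swing(scale, drift):
    # Acceptable deviations (assumed): amplitude scaling and a slow drift.
    return (1.0 + scale) * base + drift * t + 0.01 * rng.normal(size=t.size)

ref = np.array([acceptable_swing(0.1 * rng.normal(), 0.1 * rng.normal())
                for _ in range(30)])
mean, comps = fit_reference(ref, n_components=2)

# Hypothetical threshold: 1.5 times the largest residual among references.
threshold = 1.5 * max(residual_norm(r, mean, comps) for r in ref)

good = acceptable_swing(0.05, 0.05)
bad = good.copy()
bad[10:25] += 0.5                      # disturbance early in the swing

good_res = residual_norm(good, mean, comps)
bad_res = residual_norm(bad, mean, comps)
print(good_res <= threshold, bad_res > threshold)
```

    Localizing the disturbance in time, as the paper does for the first part of the backswing, would additionally require inspecting the residual sample by sample instead of taking its overall norm.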

  19. Early improper motion detection in golf swings using wearable motion sensors: the first approach.

    PubMed

    Stančin, Sara; Tomažič, Sašo

    2013-06-10

    This paper presents an analysis of a golf swing to detect improper motion in the early phase of the swing. Led by the desire to achieve a consistent shot outcome, a particular golfer would (in multiple trials) prefer to perform completely identical golf swings. In reality, some deviations from the desired motion are always present due to the comprehensive nature of the swing motion. Swing motion deviations that are not detrimental to performance are acceptable. This analysis is conducted using a golfer's leading arm kinematic data, which are obtained from a golfer wearing a motion sensor that is comprised of gyroscopes and accelerometers. Applying the principal component analysis (PCA) to the reference observations of properly performed swings, the PCA components of acceptable swing motion deviations are established. Using these components, the motion deviations in the observations of other swings are examined. Any unacceptable deviations that are detected indicate an improper swing motion. Arbitrarily long observations of an individual player's swing sequences can be included in the analysis. The results obtained for the considered example show an improper swing motion in the early phase of the swing, i.e., the first part of the backswing. An early detection method for improper swing motions that is conducted on an individual basis provides assistance for performance improvement.

  20. Dimensionality reduction for the quantitative evaluation of a smartphone-based Timed Up and Go test.

    PubMed

    Palmerini, Luca; Mellone, Sabato; Rocchi, Laura; Chiari, Lorenzo

    2011-01-01

    The Timed Up and Go is a clinical test to assess mobility in the elderly and in Parkinson's disease. Lately instrumented versions of the test are being considered, where inertial sensors assess motion. To improve the pervasiveness, ease of use, and cost, we consider a smartphone's accelerometer as the measurement system. Several parameters (usually highly correlated) can be computed from the signals recorded during the test. To avoid redundancy and obtain the features that are most sensitive to the locomotor performance, a dimensionality reduction was performed through principal component analysis (PCA). Forty-nine healthy subjects of different ages were tested. PCA was performed to extract new features (principal components) which are not redundant combinations of the original parameters and account for most of the data variability. They can be useful for exploratory analysis and outlier detection. Then, a reduced set of the original parameters was selected through correlation analysis with the principal components. This set could be recommended for studies based on healthy adults. The proposed procedure could be used as a first-level feature selection in classification studies (i.e. healthy-Parkinson's disease, fallers-non fallers) and could allow, in the future, a complete system for movement analysis to be incorporated in a smartphone.
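
    The two-step procedure in this abstract (PCA first, then choosing original parameters by their correlation with the principal components) can be sketched as follows. This is an illustrative reading on synthetic data, not the authors' code: the function name `select_by_pc_correlation`, the rule of keeping one parameter per component, and the five simulated parameters are assumptions.

```python
import numpy as np

def select_by_pc_correlation(X, n_pcs):
    """For each leading PC, pick the original parameter most correlated with it."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize parameters
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ vt[:n_pcs].T                          # principal component scores
    selected = []
    for j in range(n_pcs):
        corr = [abs(np.corrcoef(Z[:, i], scores[:, j])[0, 1])
                for i in range(Z.shape[1])]
        selected.append(int(np.argmax(corr)))
    return selected

# Hypothetical data: 5 correlated parameters driven by 2 independent factors.
rng = np.random.default_rng(4)
n = 200
f1, f2 = rng.normal(size=(2, n))
X = np.column_stack([
    f1 + 0.1 * rng.normal(size=n),   # parameters 0-2 follow factor 1
    f1 + 0.3 * rng.normal(size=n),
    f1 + 0.3 * rng.normal(size=n),
    f2 + 0.1 * rng.normal(size=n),   # parameters 3-4 follow factor 2
    f2 + 0.3 * rng.normal(size=n),
])

selected = select_by_pc_correlation(X, n_pcs=2)
print(selected)
```

    The selected parameters come from different correlated groups, so the reduced set avoids the redundancy of the full parameter list while remaining interpretable in the original units.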

  1. A SPECTRAL GRAPH APPROACH TO DISCOVERING GENETIC ANCESTRY

    PubMed Central

    Lee, Ann B.; Luca, Diana; Roeder, Kathryn

    2010-01-01

    Mapping human genetic variation is fundamentally interesting in fields such as anthropology and forensic inference. At the same time, patterns of genetic diversity confound efforts to determine the genetic basis of complex disease. Due to technological advances, it is now possible to measure hundreds of thousands of genetic variants per individual across the genome. Principal component analysis (PCA) is routinely used to summarize the genetic similarity between subjects. The eigenvectors are interpreted as dimensions of ancestry. We build on this idea using a spectral graph approach. In the process we draw on connections between multidimensional scaling and spectral kernel methods. Our approach, based on a spectral embedding derived from the normalized Laplacian of a graph, can produce more meaningful delineation of ancestry than by using PCA. The method is stable to outliers and can more easily incorporate different similarity measures of genetic data than PCA. We illustrate a new algorithm for genetic clustering and association analysis on a large, genetically heterogeneous sample. PMID:20689656
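
    For readers unfamiliar with spectral embeddings, the sketch below shows the generic construction the abstract refers to: build a similarity graph, form its normalized Laplacian, and use the eigenvectors with the smallest non-zero eigenvalues as coordinates. It is a textbook-style illustration on synthetic data, not the authors' algorithm; the Gaussian similarity, its bandwidth, and the two simulated groups are assumptions.

```python
import numpy as np

def spectral_embedding(W, n_dims):
    """Embedding from the symmetric normalized Laplacian I - D^(-1/2) W D^(-1/2)."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)       # ascending eigenvalues
    # Skip the trivial eigenvector that belongs to eigenvalue 0.
    return eigvals, eigvecs[:, 1:n_dims + 1]

# Hypothetical data: two groups of 20 individuals with 10 features each.
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0.0, 1.0, size=(20, 10)),
               rng.normal(4.0, 1.0, size=(20, 10))])

# Gaussian similarity between individuals; the bandwidth is an assumption.
sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
W = np.exp(-sq_dist / (2.0 * 10.0))

eigvals, emb = spectral_embedding(W, n_dims=1)
labels = emb[:, 0] > 0                         # sign of the first coordinate
print(labels.astype(int))
```

    In this toy example the sign of the first embedding coordinate separates the two simulated groups. Swapping in a different similarity measure only changes how `W` is built, which is the flexibility the abstract highlights relative to PCA.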

  2. Rapid identification of pork for halal authentication using the electronic nose and gas chromatography mass spectrometer with headspace analyzer.

    PubMed

    Nurjuliana, M; Che Man, Y B; Mat Hashim, D; Mohamed, A K S

    2011-08-01

    The volatile compounds of pork, other meats and meat products were studied using an electronic nose and gas chromatography mass spectrometer with headspace analyzer (GCMS-HS) for halal verification. The zNose™ was successfully employed for identification and differentiation of pork and pork sausages from beef, mutton and chicken meats and sausages, which was achieved using a visual odor pattern called VaporPrint™, derived from the frequency of the surface acoustic wave (SAW) detector of the electronic nose. GCMS-HS was employed to separate and analyze the headspace gases from samples into peaks corresponding to individual compounds for the purpose of identification. Principal component analysis (PCA) was applied for data interpretation. Analysis by PCA was able to cluster and discriminate pork from other types of meats and sausages. It was shown that PCA could provide a good separation of the samples with 67% of the total variance accounted for by PC1. Copyright © 2011 Elsevier Ltd. All rights reserved.

  3. A Dimensionally Reduced Clustering Methodology for Heterogeneous Occupational Medicine Data Mining.

    PubMed

    Saâdaoui, Foued; Bertrand, Pierre R; Boudet, Gil; Rouffiac, Karine; Dutheil, Frédéric; Chamoux, Alain

    2015-10-01

    Clustering is a set of statistical learning techniques aimed at partitioning heterogeneous data into homogeneous groups called clusters. Clustering has been successfully applied in several fields, such as medicine, biology, finance, economics, etc. In this paper, we introduce the notion of clustering in multifactorial data analysis problems. A case study is conducted for an occupational medicine problem with the purpose of analyzing patterns in a population of 813 individuals. To reduce the data set dimensionality, we base our approach on Principal Component Analysis (PCA), which is the statistical tool most commonly used in factorial analysis. However, real-world problems, especially in medicine, are often based on heterogeneous qualitative-quantitative measurements, whereas PCA only processes quantitative ones. Besides, qualitative data are originally unobservable quantitative responses that are usually binary-coded. Hence, we propose a new set of strategies that allow quantitative and qualitative data to be handled simultaneously. The principle of this approach is to perform a projection of the qualitative variables on the subspaces spanned by quantitative ones. Subsequently, an optimal model is allocated to the resulting PCA-regressed subspaces.

  4. Settlement behavior of municipal solid waste due to internal and external environmental factors in a lysimeter.

    PubMed

    Melo, Márcio C; Caribé, Rômulo M; Ribeiro, Libânia S; Sousa, Raul B A; Monteiro, Veruschka E D; de Paiva, William

    2016-12-05

    Long-term settlement magnitude is influenced by changes in external and internal factors that control the microbiological activity in the landfill waste body. To improve the understanding of settlement phenomena, it is instructive to study lysimeters filled with MSW. This paper aims to understand the settlement behavior of MSW by correlating internal and external factors that influence waste biodegradation in a lysimeter. Thus, a lysimeter was built, instrumented and filled with MSW from the city of Campina Grande, the state of Paraíba, Brazil. Physicochemical analysis of the waste (from three levels of depth of the lysimeter) was carried out along with MSW settlement measurements. Statistical tools such as descriptive analysis and principal component analysis (PCA) were also performed. The settlement/compression, coefficient of variation and PCA results indicated the most intense rate of biodegradation in the top layer. The PCA results of intermediate and bottom levels presented fewer physicochemical and meteorological variables correlated with compression data in contrast with the top layer. It is possible to conclude that environmental conditions may influence internal indicators of MSW biodegradation, such as the settlement.

  5. Detection of l-Cysteine in wheat flour by Raman microspectroscopy combined chemometrics of HCA and PCA.

    PubMed

    Cebi, Nur; Dogan, Canan Ekinci; Develioglu, Ayşen; Yayla, Mediha Esra Altuntop; Sagdic, Osman

    2017-08-01

    l-Cysteine is deliberately added to various flour types since it enables favorable baking conditions such as low viscosity, increased elasticity and rise during baking. In Turkey, usage of l-Cysteine as a food additive is not allowed in wheat flour according to the Turkish Food Codex Regulation on food additives. There is an urgent need for effective methods to detect l-Cysteine in wheat flour. In this study, for the first time, a new, rapid, effective, non-destructive and cost-effective method was developed for detection of l-Cysteine in wheat flour using Raman microscopy. Detection of l-Cysteine in wheat flour was accomplished successfully using Raman microscopy combined with the chemometric methods PCA (Principal Component Analysis) and HCA (Hierarchical Cluster Analysis). In this work, the 500-2000 cm-1 spectral range (fingerprint region) was selected for PCA and HCA. l-Cysteine and l-Cystine were determined with a detection limit of 0.125% (w/w) in different wheat flour samples. Copyright © 2017 Elsevier Ltd. All rights reserved.

  6. Characterization of Chinese liquor aroma components during aging process and liquor age discrimination using gas chromatography combined with multivariable statistics

    NASA Astrophysics Data System (ADS)

    Xu, M. L.; Yu, Y.; Ramaswamy, H. S.; Zhu, S. M.

    2017-01-01

    Chinese liquor aroma components were characterized during the aging process using gas chromatography (GC). Principal component and cluster analysis (PCA, CA) were used to discriminate the Chinese liquor age, which has great economic value. Of a total of 21 major aroma components identified and quantified, 13 components, which included several acids, alcohols, esters, aldehydes and furans, decreased significantly in the first year of aging, maintained the same levels (p > 0.05) for the next three years and decreased again (p < 0.05) in the fifth year. On the contrary, a significant increase was observed in propionic acid, furfural and phenylethanol. Ethyl lactate was found to be the most stable aroma component during the aging process. Results of PCA and CA demonstrated that young liquor (fresh) and aged liquors were well separated from each other, which is consistent with the evolution of aroma components along with the aging process. These findings provide a quantitative basis for discriminating the Chinese liquor age, a scientific basis for further research on elucidating the liquor aging process, and a possible tool to guard against counterfeit and defective products.

  7. Spectral discrimination of bleached and healthy submerged corals based on principal components analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Holden, H.; LeDrew, E.

    1997-06-01

    Remote discrimination of substrate types in relatively shallow coastal waters has been limited by the spatial and spectral resolution of available sensors. An additional limiting factor is the strong attenuating influence of the water column over the substrate. As a result, there have been limited attempts to map submerged ecosystems such as coral reefs based on spectral characteristics. Both healthy and bleached corals were measured at depth with a hand-held spectroradiometer, and their spectra compared. Two separate principal components analyses (PCA) were performed on two sets of spectral data. The PCA revealed that there is indeed a spectral difference based on health. In the first data set, the first component (healthy coral) explains 46.82%, while the second component (bleached coral) explains 46.35% of the variance. In the second data set, the first component (bleached coral) explained 46.99%; the second component (healthy coral) explained 36.55%; and the third component (healthy coral) explained 15.44% of the total variance in the original data. These results are encouraging with respect to using an airborne spectroradiometer to identify areas of bleached corals thus enabling accurate monitoring over time.
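
    The percentages quoted in this record are proportions of total variance, that is, each component's eigenvalue divided by the sum of all eigenvalues of the covariance matrix. A minimal sketch of that calculation follows. The function names and the toy data are illustrative assumptions, not taken from the study, and plain power iteration with deflation stands in for whatever eigen-solver the authors used.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix; rows are observations, columns are variables."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def top_eigenpairs(cov, k, iters=500):
    """Leading k eigenpairs of a symmetric positive semi-definite matrix.

    Uses power iteration with deflation. The fixed start vector is an
    assumption of this sketch: it fails if it happens to be orthogonal
    to the eigenvector being sought.
    """
    p = len(cov)
    a = [row[:] for row in cov]  # working copy that gets deflated
    pairs = []
    for _ in range(k):
        v = [float(i + 1) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # nothing left to extract
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for unit vector v.
        lam = sum(v[i] * sum(a[i][j] * v[j] for j in range(p)) for i in range(p))
        pairs.append((lam, v))
        # Deflation: remove the component just found.
        for i in range(p):
            for j in range(p):
                a[i][j] -= lam * v[i] * v[j]
    return pairs

def explained_variance_ratio(rows, k):
    """Fraction of total variance carried by each of the first k components."""
    cov = covariance_matrix(rows)
    total = sum(cov[i][i] for i in range(len(cov)))  # trace = sum of eigenvalues
    return [lam / total for lam, _ in top_eigenpairs(cov, k)]

# Made-up two-variable data with most of the spread along the first axis.
toy = [(2.0, 0.0), (-2.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
ratios = explained_variance_ratio(toy, 2)
print([round(r, 3) for r in ratios])
```

    For the toy data the two ratios sum to 1, as they must when all components are retained; with real spectra only the first few ratios are usually reported.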

  8. Chemometric Data Analysis for Deconvolution of Overlapped Ion Mobility Profiles

    NASA Astrophysics Data System (ADS)

    Zekavat, Behrooz; Solouki, Touradj

    2012-11-01

    We present the details of a data analysis approach for deconvolution of ion mobility (IM) overlapped or unresolved species. This approach takes advantage of the ion fragmentation variations as a function of the IM arrival time. The data analysis involves the use of an in-house developed data preprocessing platform for the conversion of the original post-IM/collision-induced dissociation mass spectrometry (post-IM/CID MS) data to a Matlab compatible format for chemometric analysis. We show that principal component analysis (PCA) can be used to examine the post-IM/CID MS profiles for the presence of mobility-overlapped species. Subsequently, using an interactive self-modeling mixture analysis technique, we show how to calculate the total IM spectrum (TIMS) and CID mass spectrum for each component of the IM overlapped mixtures. Moreover, we show that PCA and IM deconvolution techniques provide complementary results to evaluate the validity of the calculated TIMS profiles. We use two binary mixtures with overlapping IM profiles, including (1) a mixture of two non-isobaric peptides (neurotensin (RRPYIL) and a hexapeptide (WHWLQL)), and (2) an isobaric sugar isomer mixture of raffinose and maltotriose, to demonstrate the applicability of the IM deconvolution.

  9. Discrimination among Panax species using spectral fingerprinting

    USDA-ARS's Scientific Manuscript database

    Spectral fingerprints of samples of three Panax species (P. quinquefolius L., P. ginseng, and P. notoginseng) were acquired using UV, NIR, and MS spectrometry. With principal components analysis (PCA), all three methods allowed visual discrimination between all three species. All three methods wer...

  10. Differentiating organic and conventional sage by chromatographic and mass spectrometry flow-injection fingerprints

    USDA-ARS's Scientific Manuscript database

    High performance liquid chromatography (UPLC) and flow injection electrospray ionization with ion trap mass spectrometry (FIMS) fingerprints combined with the principal component analysis (PCA) were examined for their potential in differentiating commercial organic and conventional sage samples. The...

  11. Origin Discrimination of Osmanthus fragrans var. thunbergii Flowers using GC-MS and UPLC-PDA Combined with Multivariable Analysis Methods.

    PubMed

    Zhou, Fei; Zhao, Yajing; Peng, Jiyu; Jiang, Yirong; Li, Maiquan; Jiang, Yuan; Lu, Baiyi

    2017-07-01

    Osmanthus fragrans flowers are used as folk medicine and additives for teas, beverages and foods. The metabolites of O. fragrans flowers from different geographical origins are inconsistent to some extent. Chromatography and mass spectrometry combined with multivariable analysis methods provide an approach for discriminating the origin of O. fragrans flowers. The objective was to discriminate Osmanthus fragrans var. thunbergii flowers from different origins using the identified metabolites. GC-MS and UPLC-PDA were conducted to analyse the metabolites in O. fragrans var. thunbergii flowers (in total 150 samples). Principal component analysis (PCA), soft independent modelling of class analogy analysis (SIMCA) and random forest (RF) analysis were applied to group the GC-MS and UPLC-PDA data. GC-MS identified 32 compounds common to all samples while UPLC-PDA/QTOF-MS identified 16 common compounds. PCA of the UPLC-PDA data generated a better clustering than PCA of the GC-MS data. Ten metabolites (six from GC-MS and four from UPLC-PDA) were selected as effective compounds for discrimination by PCA loadings. SIMCA and RF analysis were used to build classification models, and the RF model, based on the four effective compounds (caffeic acid derivative, acteoside, ligustroside and compound 15), yielded better results with the classification rate of 100% in the calibration set and 97.8% in the prediction set. GC-MS and UPLC-PDA combined with multivariable analysis methods can discriminate the origin of Osmanthus fragrans var. thunbergii flowers. Copyright © 2017 John Wiley & Sons, Ltd.

  12. Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis

    PubMed Central

    2013-01-01

    Background Protein-protein interactions (PPIs) play crucial roles in the execution of various cellular processes and form the basis of biological mechanisms. Although large amount of PPIs data for different species has been generated by high-throughput experimental techniques, current PPI pairs obtained with experimental methods cover only a fraction of the complete PPI networks, and further, the experimental methods for identifying PPIs are both time-consuming and expensive. Hence, it is urgent and challenging to develop automated computational methods to efficiently and accurately predict PPIs. Results We present here a novel hierarchical PCA-EELM (principal component analysis-ensemble extreme learning machine) model to predict protein-protein interactions only using the information of protein sequences. In the proposed method, 11188 protein pairs retrieved from the DIP database were encoded into feature vectors by using four kinds of protein sequences information. Focusing on dimension reduction, an effective feature extraction method PCA was then employed to construct the most discriminative new feature set. Finally, multiple extreme learning machines were trained and then aggregated into a consensus classifier by majority voting. The ensembling of extreme learning machine removes the dependence of results on initial random weights and improves the prediction performance. Conclusions When performed on the PPI data of Saccharomyces cerevisiae, the proposed method achieved 87.00% prediction accuracy with 86.15% sensitivity at the precision of 87.59%. Extensive experiments are performed to compare our method with state-of-the-art techniques Support Vector Machine (SVM). Experimental results demonstrate that proposed PCA-EELM outperforms the SVM method by 5-fold cross-validation. Besides, PCA-EELM performs faster than PCA-SVM based method. 
Consequently, the proposed approach can be considered a promising and powerful new tool for predicting PPIs with excellent performance and less time. PMID:23815620
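
    The final aggregation step described in this record, combining several trained classifiers into a consensus by majority voting, can be sketched as follows. This is only an illustration of the voting rule: the function name and the prediction lists are hypothetical, and the extreme learning machines and PCA features of the study are not reproduced here.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one consensus label list.

    predictions: list of lists, one inner list per classifier, all of the
    same length (one label per sample). Ties are broken in favour of the
    label that was encountered first among the tied ones, so an odd number
    of voters is advisable for binary problems.
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all classifiers must label the same samples")
    consensus = []
    for votes in zip(*predictions):  # votes for a single sample
        consensus.append(Counter(votes).most_common(1)[0][0])
    return consensus

# Hypothetical outputs of three classifiers on four protein pairs
# (1 = interacting, 0 = non-interacting).
votes_a = [1, 0, 1, 1]
votes_b = [1, 1, 0, 1]
votes_c = [0, 0, 1, 1]
print(majority_vote([votes_a, votes_b, votes_c]))  # [1, 0, 1, 1]
```

    Because each voter starts from different random weights, averaging their decisions in this way is what reduces the dependence of the result on any single initialization.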

  13. Contribution to the understanding of how principal component analysis-derived dietary patterns emerge from habitual data on food consumption.

    PubMed

    Schwedhelm, Carolina; Iqbal, Khalid; Knüppel, Sven; Schwingshackl, Lukas; Boeing, Heiner

    2018-02-01

    Principal component analysis (PCA) is a widely used exploratory method in epidemiology to derive dietary patterns from habitual diet. Such dietary patterns seem to originate from intakes on multiple days and eating occasions. Therefore, analyzing food intake of study populations with different levels of food consumption can provide additional insights as to how habitual dietary patterns are formed. We analyzed the food intake data of German adults in terms of the relations among food groups from three 24-h dietary recalls (24hDRs) on the habitual, single-day, and main-meal levels, and investigated the contribution of each level to the formation of PCA-derived habitual dietary patterns. Three 24hDRs were collected in 2010-2012 from 816 adults for an European Prospective Investigation into Cancer and Nutrition (EPIC)-Potsdam subcohort study. We identified PCA-derived habitual dietary patterns and compared cross-sectional food consumption data in terms of correlation (Spearman), consistency (intraclass correlation coefficient), and frequency of consumption across all days and main meals. Contribution to the formation of the dietary patterns was obtained through Spearman correlation of the dietary pattern scores. Among the meals, breakfast appeared to be the most consistent eating occasion within individuals. Dinner showed the strongest correlations with "Prudent" (Spearman correlation = 0.60), "Western" (Spearman correlation = 0.59), and "Traditional" (Spearman correlation = 0.60) dietary patterns identified on the habitual level, and lunch showed the strongest correlations with the "Cereals and legumes" (Spearman correlation = 0.60) habitual dietary pattern. Higher meal consistency was related to lower contributions to the formation of PCA-derived habitual dietary patterns. 
Absolute amounts of food consumption did not strongly conform to the habitual dietary patterns by meals, suggesting that these patterns are formed by complex combinations of variable food consumption across meals. Dinner showed the highest contribution to the formation of habitual dietary patterns. This study provided information about how PCA-derived dietary patterns are formed and how they could be influenced.
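
    The Spearman correlations reported in this record are Pearson correlations computed on ranks. A minimal sketch is given below; the function names and the two score lists are made up for illustration and are not the study's data.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical pattern scores for five participants: one list for a
# single meal and one for the habitual level.
meal_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
habitual_scores = [1.0, 3.0, 2.0, 5.0, 4.0]
print(round(spearman(meal_scores, habitual_scores), 3))  # 0.8
```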

  14. A perspective on two chemometrics tools: PCA and MCR, and introduction of a new one: Pattern recognition entropy (PRE), as applied to XPS and ToF-SIMS depth profiles of organic and inorganic materials

    NASA Astrophysics Data System (ADS)

    Chatterjee, Shiladitya; Singh, Bhupinder; Diwan, Anubhav; Lee, Zheng Rong; Engelhard, Mark H.; Terry, Jeff; Tolley, H. Dennis; Gallagher, Neal B.; Linford, Matthew R.

    2018-03-01

    X-ray photoelectron spectroscopy (XPS) and time-of-flight secondary ion mass spectrometry (ToF-SIMS) are much used analytical techniques that provide information about the outermost atomic and molecular layers of materials. In this work, we discuss the application of multivariate spectral techniques, including principal component analysis (PCA) and multivariate curve resolution (MCR), to the analysis of XPS and ToF-SIMS depth profiles. Multivariate analyses often provide insight into data sets that is not easily obtained in a univariate fashion. Pattern recognition entropy (PRE), which has its roots in Shannon's information theory, is also introduced. This approach is not the same as the mutual information/entropy approaches sometimes used in data processing. A discussion of the theory of each technique is presented. PCA, MCR, and PRE are applied to four different data sets obtained from: a ToF-SIMS depth profile through ca. 100 nm of plasma polymerized C3F6 on Si, a ToF-SIMS depth profile through ca. 100 nm of plasma polymerized PNIPAM (poly (N-isopropylacrylamide)) on Si, an XPS depth profile through a film of SiO2 on Si, and an XPS depth profile through a film of Ta2O5 on Ta. PCA, MCR, and PRE reveal the presence of interfaces in the films, and often indicate that the first few scans in the depth profiles are different from those that follow. PRE and backward difference PRE provide this information in a straightforward fashion. Rises in the PRE signals at interfaces suggest greater complexity to the corresponding spectra. Results from PCA, especially for the higher principal components, were sometimes difficult to understand. MCR analyses were generally more interpretable.

  15. Cause Resolving of Typhoon Precipitation Using Principle Component Analysis under Complex Interactive Effect of Terrain, Monsoon and Typhoon Vortex

    NASA Astrophysics Data System (ADS)

    Huang, C. L.; Hsu, N. S.

    2015-12-01

    This study develops a novel methodology to resolve the causes of typhoon-induced precipitation using principal component analysis (PCA) and to develop a long lead-time precipitation prediction model. The discovered spatial and temporal features of rainfall are utilized to develop a state-of-the-art descriptive statistical model which can be used to predict long lead-time precipitation during typhoons. The time series of 12-hour precipitation from different types of typhoon moving tracks are each passed through the signal analysis process to identify the causes of rainfall and to quantify the degree of influence of each cause. The causes include: (1) interaction between typhoon rain band and terrain; (2) co-movement effect induced by typhoon wind field with monsoon; (3) pressure gradient; (4) wind velocity; (5) temperature environment; (6) characteristic distance between typhoon center and surface target station; (7) distance between grade 7 storm radius and surface target station; and (8) relative humidity. The results obtained from PCA reveal the hidden pattern of the eight causes in space and time and help in understanding future trends and changes of precipitation. This study applies the developed methodology to Taiwan Island, which has complex and diverse terrain and elevation. Results show that: (1) for the typhoon moving toward the direction of 245° to 330°, Causes (1), (2) and (6) are the primary ones to generate rainfall; and (2) for the direction of 330° to 380°, Causes (1), (4) and (6) are the primary ones. Besides, the precipitation prediction model developed using PCA with the distributed moving track approach (PCA-DMT) is 32% more accurate than that of PCA without the distributed moving track approach, and the former model can effectively achieve long lead-time precipitation prediction with an average predicted error of 13% within an average of 48 hours of forecasted lead-time.

  16. Unsupervised analysis of small animal dynamic Cerenkov luminescence imaging

    NASA Astrophysics Data System (ADS)

    Spinelli, Antonello E.; Boschi, Federico

    2011-12-01

    Clustering analysis (CA) and principal component analysis (PCA) were applied to dynamic Cerenkov luminescence images (dCLI). In order to investigate the performances of the proposed approaches, two distinct dynamic data sets obtained by injecting mice with 32P-ATP and 18F-FDG were acquired using the IVIS 200 optical imager. The k-means clustering algorithm has been applied to dCLI and was implemented using interactive data language 8.1. We show that cluster analysis allows us to obtain good agreement between the clustered and the corresponding emission regions like the bladder, the liver, and the tumor. We also show a good correspondence between the time activity curves of the different regions obtained by using CA and manual region of interest analysis on dCLIT and PCA images. We conclude that CA provides an automatic unsupervised method for the analysis of preclinical dynamic Cerenkov luminescence image data.

  17. Removal of BCG artefact from concurrent fMRI-EEG recordings based on EMD and PCA.

    PubMed

    Javed, Ehtasham; Faye, Ibrahima; Malik, Aamir Saeed; Abdullah, Jafri Malin

    2017-11-01

    Simultaneous electroencephalography (EEG) and functional magnetic resonance image (fMRI) acquisitions provide better insight into brain dynamics. Some artefacts due to simultaneous acquisition pose a threat to the quality of the data. One such problematic artefact is the ballistocardiogram (BCG) artefact. We developed a hybrid algorithm that combines features of empirical mode decomposition (EMD) with principal component analysis (PCA) to reduce the BCG artefact. The algorithm does not require extra electrocardiogram (ECG) or electrooculogram (EOG) recordings to extract the BCG artefact. The method was tested with both simulated and real EEG data of 11 participants. From the simulated data, the similarity index between the extracted BCG and the simulated BCG showed the effectiveness of the proposed method in BCG removal. On the other hand, real data were recorded with two conditions, i.e. resting state (eyes closed dataset) and task influenced (event-related potentials (ERPs) dataset). Using qualitative (visual inspection) and quantitative (similarity index, improved normalized power spectrum (INPS) ratio, power spectrum, sample entropy (SE)) evaluation parameters, the assessment results showed that the proposed method can efficiently reduce the BCG artefact while preserving the neuronal signals. Compared with conventional methods, namely, average artefact subtraction (AAS), optimal basis set (OBS) and combined independent component analysis and principal component analysis (ICA-PCA), the statistical analyses of the results showed that the proposed method has better performance, and the differences were significant for all quantitative parameters except for the power and sample entropy. The proposed method does not require any reference signal, prior information or assumption to extract the BCG artefact. It will be very useful in circumstances where the reference signal is not available. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Potential of cancer screening with serum surface-enhanced Raman spectroscopy and a support vector machine

    NASA Astrophysics Data System (ADS)

    Li, S. X.; Zhang, Y. J.; Zeng, Q. Y.; Li, L. F.; Guo, Z. Y.; Liu, Z. M.; Xiong, H. L.; Liu, S. H.

    2014-06-01

    Cancer is the most common disease to threaten human health. The ability to screen individuals with malignant tumours with only a blood sample would be greatly advantageous to early diagnosis and intervention. This study explores the possibility of discriminating between cancer patients and normal subjects with serum surface-enhanced Raman spectroscopy (SERS) and a support vector machine (SVM) through a peripheral blood sample. A total of 130 blood samples were obtained from patients with liver cancer, colonic cancer, esophageal cancer, nasopharyngeal cancer, gastric cancer, as well as 113 blood samples from normal volunteers. Several diagnostic models were built with the serum SERS spectra using SVM and principal component analysis (PCA) techniques. The results show that a diagnostic accuracy of 85.5% is acquired with a PCA algorithm, while a diagnostic accuracy of 95.8% is obtained using radial basis function (RBF), PCA-SVM methods. The results prove that a RBF kernel PCA-SVM technique is superior to PCA and conventional SVM (C-SVM) algorithms in classification serum SERS spectra. The study demonstrates that serum SERS, in combination with SVM techniques, has great potential for screening cancerous patients with any solid malignant tumour through a peripheral blood sample.

  19. Fast discrimination of traditional Chinese medicine according to geographical origins with FTIR spectroscopy and advanced pattern recognition techniques

    NASA Astrophysics Data System (ADS)

    Li, Ning; Wang, Yan; Xu, Kexin

    2006-08-01

    Combined with Fourier transform infrared (FTIR) spectroscopy and three kinds of pattern recognition techniques, 53 traditional Chinese medicine danshen samples were rapidly discriminated according to geographical origins. The results showed that it was feasible to discriminate using FTIR spectroscopy ascertained by principal component analysis (PCA). An effective model was built by employing the Soft Independent Modeling of Class Analogy (SIMCA) and PCA, and 82% of the samples were discriminated correctly. Through use of the artificial neural network (ANN)-based back propagation (BP) network, the origins of danshen were completely classified.

  20. Chemometric Methods to Quantify 1D and 2D NMR Spectral Differences Among Similar Protein Therapeutics.

    PubMed

    Chen, Kang; Park, Junyong; Li, Feng; Patil, Sharadrao M; Keire, David A

    2018-04-01

    NMR spectroscopy is an emerging analytical tool for measuring complex drug product qualities, e.g., protein higher order structure (HOS) or heparin chemical composition. Most drug NMR spectra have been visually analyzed; however, NMR spectra are inherently quantitative and multivariate and thus suitable for chemometric analysis. Therefore, quantitative measurements derived from chemometric comparisons between spectra could be a key step in establishing acceptance criteria for a new generic drug or a new batch after manufacture change. To measure the capability of chemometric methods to differentiate comparator NMR spectra, we calculated inter-spectra difference metrics on 1D/2D spectra of two insulin drugs, Humulin R® and Novolin R®, from different manufacturers. Both insulin drugs have an identical drug substance but differ in formulation. Chemometric methods (i.e., principal component analysis (PCA), 3-way Tucker3 or graph invariant (GI)) were performed to calculate Mahalanobis distance (D_M) between the two brands (inter-brand) and distance ratio (D_R) among the different lots (intra-brand). The PCA on 1D inter-brand spectral comparison yielded a D_M value of 213. In comparing 2D spectra, the Tucker3 analysis yielded the highest differentiability value (D_M = 305) in the comparisons made followed by PCA (D_M = 255) then the GI method (D_M = 40). In conclusion, drug quality comparisons among different lots might benefit from PCA on 1D spectra for rapidly comparing many samples, while higher resolution but more time-consuming 2D-NMR-data-based comparisons using Tucker3 analysis or PCA provide a greater level of assurance for drug structural similarity evaluation between drug brands.
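
    A Mahalanobis distance between two groups of samples can be computed from their scores on two principal components. The sketch below is a generic illustration under stated assumptions: it measures the distance between group means using a pooled within-group covariance, which may differ from the exact definition used in the study, and all names and score values are hypothetical.

```python
import math

def mean_vector(points):
    """Mean of a list of 2D points."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(2)]

def scatter(points, mean):
    """2x2 sum of outer products of deviations from the mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mean[0], p[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def mahalanobis_between_groups(a, b):
    """Mahalanobis distance between the means of two groups of 2D scores.

    Assumption of this sketch: both groups share a covariance, estimated
    by pooling the within-group scatter matrices.
    """
    ma, mb = mean_vector(a), mean_vector(b)
    sa, sb = scatter(a, ma), scatter(b, mb)
    dof = len(a) + len(b) - 2
    pooled = [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    if det == 0.0:
        raise ValueError("pooled covariance is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]
    d = [ma[0] - mb[0], ma[1] - mb[1]]
    q = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    return math.sqrt(q)

# Hypothetical (PC1, PC2) scores for lots of two brands.
brand_a = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
brand_b = [(3.0, 4.0), (5.0, 4.0), (3.0, 6.0), (5.0, 6.0)]
print(round(mahalanobis_between_groups(brand_a, brand_b), 3))
```

    A larger value means the group means are further apart relative to the within-group spread, which is the sense in which the record uses D_M as a differentiability measure.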

  1. Characterization and source identification of pollutants in runoff from a mixed land use watershed using ordination analyses.

    PubMed

    Lee, Dong Hoon; Kim, Jin Hwi; Mendoza, Joseph A; Lee, Chang Hee; Kang, Joo-Hyon

    2016-05-01

    While identification of critical pollutant sources is the key initial step for cost-effective runoff management, it is challenging due to the highly uncertain nature of runoff pollution, especially during a storm event. To identify critical sources and their quantitative contributions to runoff pollution (especially focusing on phosphorous), two ordination methods were used in this study: principal component analysis (PCA) and positive matrix factorization (PMF). For the ordination analyses, we used runoff quality data for 14 storm events, including data for phosphorus, 11 heavy metal species, and eight ionic species measured at the outlets of subcatchments with different land use compositions in a mixed land use watershed. Five factors as sources of runoff pollutants were identified by PCA: agrochemicals, groundwater, native soils, domestic sewage, and urban sources (building materials and automotive activities). PMF identified similar factors to those identified by PCA, with more detailed source mechanisms for groundwater (i.e., nitrate leaching and cation exchange) and urban sources (vehicle components/motor oils/building materials and vehicle exhausts), confirming the sources identified by PCA. PMF was further used to quantify contributions of the identified sources to the water quality. Based on the results, agrochemicals and automotive activities were the two dominant and ubiquitous phosphorus sources (39-61 and 16-47 %, respectively) in the study area, regardless of land use types.

  2. Dynamic competitive probabilistic principal components analysis.

    PubMed

    López-Rubio, Ezequiel; Ortiz-DE-Lazcano-Lobato, Juan Miguel

    2009-04-01

    We present a new neural model which extends the classical competitive learning (CL) by performing a Probabilistic Principal Components Analysis (PPCA) at each neuron. The model also has the ability to learn the number of basis vectors required to represent the principal directions of each cluster, so it overcomes a drawback of most local PCA models, where the dimensionality of a cluster must be fixed a priori. Experimental results are presented to show the performance of the network with multispectral image data.

  3. Principal components of wrist circumduction from electromagnetic surgical tracking.

    PubMed

    Rasquinha, Brian J; Rainbow, Michael J; Zec, Michelle L; Pichora, David R; Ellis, Randy E

    2017-02-01

    An electromagnetic (EM) surgical tracking system was used for a functionally calibrated kinematic analysis of wrist motion. Circumduction motions were tested for differences in subject gender and for differences in the sense of the circumduction as clockwise or counter-clockwise motion. Twenty subjects were instrumented for EM tracking. Flexion-extension motion was used to identify the functional axis. Subjects performed unconstrained wrist circumduction in a clockwise and counter-clockwise sense. Data were decomposed into orthogonal flexion-extension motions and radial-ulnar deviation motions. PCA was used to concisely represent motions. Nonparametric Wilcoxon tests were used to distinguish the groups. Flexion-extension motions were projected onto a direction axis with a root-mean-square error of [Formula: see text]. Using the first three principal components, there was no statistically significant difference in gender (all [Formula: see text]). For motion sense, radial-ulnar deviation distinguished the sense of circumduction in the first principal component ([Formula: see text]) and in the third principal component ([Formula: see text]); flexion-extension distinguished the sense in the second principal component ([Formula: see text]). The clockwise sense of circumduction could be distinguished by a multifactorial combination of components; there were no gender differences in this small population. These data constitute a baseline for normal wrist circumduction. The multifactorial PCA findings suggest that a higher-dimensional method, such as manifold analysis, may be a more concise way of representing circumduction in human joints.

  4. [Content determination of twelve major components in Tibetan medicine Zuozhu Daxi by UPLC].

    PubMed

    Qu, Yan; Li, Jin-hua; Zhang, Chen; Li, Chun-xue; Dong, Hong-jiao; Wang, Chang-sheng; Zeng, Rui; Chen, Xiao-hu

    2015-05-01

    A quantitative analytical method based on ultra-high performance liquid chromatography (UPLC) was developed for simultaneously determining twelve components in the Tibetan medicine Zuozhu Daxi. SIMPCA 12.0 software was used for principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) of the twelve components in 10 batches from four pharmaceutical factories. An Acquity UPLC BEH C15 column (2.1 mm x 100 mm, 1.7 µm) was used at a column temperature of 35 °C and eluted with acetonitrile (A) and 0.05% phosphoric acid solution (B) as the mobile phase at a flow rate of 0.3 mL·min⁻¹. The injection volume was 1 µL. The detection wavelengths were set at 210 nm for alantolactone, isoalantolactone and oleanolic; 260 nm for strychnine and brucine; 288 nm for protopine; 306 nm for protopine, resveratrol and piperine; 370 nm for quercetin and isorhamnetin. The results showed a good separation among index components, with a good linear relationship (R² = 0.9996) within the selected concentration range. The average sample recovery rates ranged from 99.44% to 101.8%, with RSD between 0.37% and 1.7%, indicating the method is rapid and accurate with good repeatability and stability. The PCA and PLS-DA analysis of the sample determination results revealed a great difference among samples from different pharmaceutical factories. The twelve components included in this study contributed significantly to the quantitative determination of the intrinsic quality of Zuozhu Daxi. The UPLC method established for the quantitative determination of the twelve components can provide a scientific basis for the comprehensive quality evaluation of Zuozhu Daxi.

  5. The application of near infrared (NIR) spectroscopy to inorganic preservative-treated wood

    Treesearch

    Chi-Leung So; Stan T. Lebow; Leslie H. Groom; Timothy G. Rials

    2004-01-01

    There is a growing need to find a rapid, inexpensive, and reliable method to distinguish between treated and untreated waste wood. This paper evaluates the ability of near infrared (NIR) spectroscopy with multivariate analysis (MVA) to distinguish preservative types and retentions. It is demonstrated that principal component analysis (PCA) can differentiate lumber...

  6. Differentiation of the two major species of Echinacea (E. augustifolia and E. purpurea) using a flow injection mass spectrometric (FIMS) fingerprinting method and chemometric analysis

    USDA-ARS's Scientific Manuscript database

    A rapid, simple, and reliable flow-injection mass spectrometric (FIMS) method was developed to discriminate two major Echinacea species (E. purpurea and E. angustifolia) samples. Fifty-eight Echinacea samples collected from United States were analyzed using FIMS. Principal component analysis (PCA) a...

  7. [Discrimination of varieties of borneol using terahertz spectra based on principal component analysis and support vector machine].

    PubMed

    Li, Wu; Hu, Bing; Wang, Ming-wei

    2014-12-01

    In the present paper, the terahertz time-domain spectroscopy (THz-TDS) identification model of borneol based on principal component analysis (PCA) and support vector machine (SVM) was established. As one Chinese common agent, borneol needs a rapid, simple and accurate detection and identification method for its different source and being easily confused in the pharmaceutical and trade links. In order to assure the quality of borneol product and guard the consumer's right, quickly, efficiently and correctly identifying borneol has significant meaning to the production and transaction of borneol. Terahertz time-domain spectroscopy is a new spectroscopy approach to characterize material using terahertz pulse. The absorption terahertz spectra of blumea camphor, borneol camphor and synthetic borneol were measured in the range of 0.2 to 2 THz with the transmission THz-TDS. The PCA scores of 2D plots (PC1 X PC2) and 3D plots (PC1 X PC2 X PC3) of three kinds of borneol samples were obtained through PCA analysis, and both of them have good clustering effect on the 3 different kinds of borneol. The value matrix of the first 10 principal components (PCs) was used to replace the original spectrum data, and the 60 samples of the three kinds of borneol were trained and then the unknown 60 samples were identified. Four kinds of support vector machine model of different kernel functions were set up in this way. Results show that the accuracy of identification and classification of SVM RBF kernel function for three kinds of borneol is 100%, and we selected the SVM with the radial basis kernel function to establish the borneol identification model, in addition, in the noisy case, the classification accuracy rates of four SVM kernel function are above 85%, and this indicates that SVM has strong generalization ability. 
This study shows that PCA with SVM method of borneol terahertz spectroscopy has good classification and identification effects, and provides a new method for species identification of borneol in Chinese medicine.

  8. A PCA-Based method for determining craniofacial relationship and sexual dimorphism of facial shapes.

    PubMed

    Shui, Wuyang; Zhou, Mingquan; Maddock, Steve; He, Taiping; Wang, Xingce; Deng, Qingqiong

    2017-11-01

    Previous studies have used principal component analysis (PCA) to investigate the craniofacial relationship, as well as sex determination using facial factors. However, few studies have investigated the extent to which the choice of principal components (PCs) affects the analysis of craniofacial relationship and sexual dimorphism. In this paper, we propose a PCA-based method for visual and quantitative analysis, using 140 samples of 3D heads (70 male and 70 female), produced from computed tomography (CT) images. There are two parts to the method. First, skull and facial landmarks are manually marked to guide the model's registration so that dense corresponding vertices occupy the same relative position in every sample. Statistical shape spaces of the skull and face in dense corresponding vertices are constructed using PCA. Variations in these vertices, captured in every principal component (PC), are visualized to observe shape variability. The correlations of skull- and face-based PC scores are analysed, and linear regression is used to fit the craniofacial relationship. We compute the PC coefficients of a face based on this craniofacial relationship and the PC scores of a skull, and apply the coefficients to estimate a 3D face for the skull. To evaluate the accuracy of the computed craniofacial relationship, the mean and standard deviation of every vertex between the two models are computed, where these models are reconstructed using real PC scores and coefficients. Second, each PC in facial space is analysed for sex determination, for which support vector machines (SVMs) are used. We examined the correlation between PCs and sex, and explored the extent to which the choice of PCs affects the expression of sexual dimorphism. Our results suggest that skull- and face-based PCs can be used to describe the craniofacial relationship and that the accuracy of the method can be improved by using an increased number of face-based PCs. The results show that the accuracy of the sex classification is related to the choice of PCs. The highest sex classification rate is 91.43% using our method. Copyright © 2017 Elsevier Ltd. All rights reserved.

  9. SU-F-R-41: Regularized PCA Can Model Treatment-Related Changes in Head and Neck Patients Using Daily CBCTs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chetvertkov, M; Henry Ford Health System, Detroit, MI; Siddiqui, F

    2016-06-15

    Purpose: To use daily cone beam CTs (CBCTs) to develop regularized principal component analysis (PCA) models of anatomical changes in head and neck (H&N) patients, to guide replanning decisions in adaptive radiation therapy (ART). Methods: Known deformations were applied to planning CT (pCT) images of 10 H&N patients to model several different systematic anatomical changes. A Pinnacle plugin was used to interpolate systematic changes over 35 fractions, generating a set of 35 synthetic CTs for each patient. Deformation vector fields (DVFs) were acquired between the pCT and synthetic CTs and random fraction-to-fraction changes were superimposed on the DVFs. Standard non-regularized and regularized patient-specific PCA models were built using the DVFs. The ability of PCA to extract the known deformations was quantified. PCA models were also generated from clinical CBCTs, for which the deformations and DVFs were not known. It was hypothesized that resulting eigenvectors/eigenfunctions with largest eigenvalues represent the major anatomical deformations during the course of treatment. Results: As demonstrated with quantitative results in the supporting document, regularized PCA is more successful than standard PCA at capturing systematic changes early in the treatment. Regularized PCA is able to detect smaller systematic changes against the background of random fraction-to-fraction changes. To be successful at guiding ART, regularized PCA should be coupled with models of when anatomical changes occur: early, late or throughout the treatment course. Conclusion: The leading eigenvector/eigenfunction from both PCA approaches can tentatively be identified as a major systematic change during the radiotherapy course when systematic changes are large enough with respect to random fraction-to-fraction changes. In all cases the regularized PCA approach appears to be more reliable at capturing systematic changes, enabling dosimetric consequences to be projected once trends are established early in the treatment course. This work is supported in part by a grant from Varian Medical Systems, Palo Alto, CA.

  10. National economic and development indicators and international variation in prostate cancer incidence and mortality: an ecological analysis.

    PubMed

    Neupane, Subas; Bray, Freddie; Auvinen, Anssi

    2017-06-01

    Macroeconomic indicators are likely associated with prostate cancer (PCa) incidence and mortality globally, but have rarely been assessed. Data on PCa incidence in 2003-2007 for 49 countries with either nationwide cancer registry or at least two regional registries were obtained from Cancer Incidence in Five Continents Vol X and national PCa mortality for 2012 from GLOBOCAN 2012. We compared PCa incidence and mortality rates with various population-level indicators of health, economy and development in 2000. Poisson and linear regression methods were used to quantify the associations. PCa incidence varied more than 15-fold, being highest in high-income countries. PCa mortality exhibited less variation, with higher rates in many low- and middle-income countries. Healthcare expenditure (rate ratio, RR 1.46, 95 % CI 1.45-1.47) and population growth (RR 1.15, 95 % CI 1.14-1.16), as well as computer and mobile phone density, were associated with a higher PCa incidence, while gross domestic product, GDP (RR 0.94, 95 % CI 0.93-0.95) and overall mortality (RR 0.72, 95 % CI 0.71-0.73) were associated with a low incidence. GDP (RR 0.55, 95 % CI 0.46-0.66) was also associated with a low PCa mortality, while life expectancy (RR 3.93, 95 % CI 3.22-4.79) and healthcare expenditure (RR 1.20, 95 % CI 1.09-1.32) were associated with an elevated mortality. Our results show that healthcare expenditure and, thus, the availability of medical resources are an important contributor to the patterns of international variation in PCa incidence. This suggests that there is an iatrogenic component in the current global epidemic of PCa. On the other hand, higher healthcare expenditure is associated with lower PCa death rates.

  11. Procrustean rotation in concert with principal component analysis of molecular dynamics trajectories: Quantifying global and local differences between conformational samples.

    PubMed

    Oblinsky, Daniel G; Vanschouwen, Bryan M B; Gordon, Heather L; Rothstein, Stuart M

    2009-12-14

    Given the principal component analysis (PCA) of a molecular dynamics (MD) conformational trajectory for a model protein, we perform orthogonal Procrustean rotation to "best fit" the PCA squared-loading matrix to that of a target matrix computed for a related but different molecular system. The sum of squared deviations of the elements of the rotated matrix from those of the target, known as the error of fit (EOF), provides a quantitative measure of the dissimilarity between the two conformational samples. To estimate precision of the EOF, we perform bootstrap resampling of the molecular conformations within the trajectories, generating a distribution of EOF values for the system and target. The average EOF per variable is determined and visualized to ascertain where, locally, system and target sample properties differ. We illustrate this approach by analyzing MD trajectories for the wild-type and four selected mutants of the beta1 domain of protein G.
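
    The procedure described here (rotate one squared-loading matrix onto a target, sum the squared deviations, bootstrap the frames) can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: the "trajectories" are random matrices, all function names are hypothetical, and only the system trajectory is resampled, whereas the abstract mentions a distribution for both system and target.

```python
import numpy as np

def squared_loadings(X, n_components):
    """Squared PCA loadings (variables x components) of a frames x variables matrix."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:n_components].T ** 2

def procrustes_rotation(A, target):
    """Orthogonal R minimizing the Frobenius norm of A @ R - target."""
    U, _, Vt = np.linalg.svd(A.T @ target)
    return U @ Vt

def error_of_fit(A, target):
    """Return the total EOF and its per-variable (row-wise) contributions."""
    R = procrustes_rotation(A, target)
    per_variable = np.sum((A @ R - target) ** 2, axis=1)
    return float(per_variable.sum()), per_variable

def bootstrap_eof(X_system, X_target, n_components, n_boot, rng):
    """Resample system frames with replacement and collect the EOF each time."""
    target = squared_loadings(X_target, n_components)
    eofs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(X_system), size=len(X_system))
        resampled = squared_loadings(X_system[idx], n_components)
        eofs[b], _ = error_of_fit(resampled, target)
    return eofs

rng = np.random.default_rng(1)
# Assumed toy "trajectories": 300 frames x 12 coordinates each.
X_target = rng.standard_normal((300, 12)) * np.linspace(3.0, 0.5, 12)
X_system = X_target + 0.5 * rng.standard_normal((300, 12))

eof_values = bootstrap_eof(X_system, X_target, n_components=3, n_boot=50, rng=rng)
eof_mean, eof_std = eof_values.mean(), eof_values.std(ddof=1)
```

    The row-wise contributions returned by `error_of_fit` correspond to the "EOF per variable" idea: large rows point at the variables where the two samples differ most.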

  12. Procrustean rotation in concert with principal component analysis of molecular dynamics trajectories: Quantifying global and local differences between conformational samples

    NASA Astrophysics Data System (ADS)

    Oblinsky, Daniel G.; VanSchouwen, Bryan M. B.; Gordon, Heather L.; Rothstein, Stuart M.

    2009-12-01

    Given the principal component analysis (PCA) of a molecular dynamics (MD) conformational trajectory for a model protein, we perform orthogonal Procrustean rotation to "best fit" the PCA squared-loading matrix to that of a target matrix computed for a related but different molecular system. The sum of squared deviations of the elements of the rotated matrix from those of the target, known as the error of fit (EOF), provides a quantitative measure of the dissimilarity between the two conformational samples. To estimate precision of the EOF, we perform bootstrap resampling of the molecular conformations within the trajectories, generating a distribution of EOF values for the system and target. The average EOF per variable is determined and visualized to ascertain where, locally, system and target sample properties differ. We illustrate this approach by analyzing MD trajectories for the wild-type and four selected mutants of the β1 domain of protein G.

  13. Aroma profile and sensory characteristics of a sulfur dioxide-free mulberry (Morus nigra) wine subjected to non-thermal accelerating aging techniques.

    PubMed

    Tchabo, William; Ma, Yongkun; Kwaw, Emmanuel; Zhang, Haining; Xiao, Lulu; Tahir, Haroon Elrasheid

    2017-10-01

    The present study was undertaken to assess accelerating aging effects of high pressure, ultrasound and manosonication on the aromatic profile and sensorial attributes of aged mulberry wines (AMW). A total of 166 volatile compounds were found amongst the AMW. The outcomes of the investigation were presented by means of geometric mean (GM), cluster analysis (CA), principal component analysis (PCA), partial least squares regressions (PLSR) and principal component regression (PCR). GM highlighted 24 organoleptic attributes responsible for the sensorial profile of the AMW. Moreover, CA revealed that the volatile composition of the non-thermal accelerated aged wines differs from that of the conventional aged wines. Besides, PCA discriminated the AMW on the basis of their main sensorial characteristics. Furthermore, PLSR identified 75 aroma compounds which were mainly responsible for the olfactory notes of the AMW. Finally, the overall quality of the AMW was noted to be better predicted by PLSR than PCR. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Discrimination of selected species of pathogenic bacteria using near-infrared Raman spectroscopy and principal components analysis

    NASA Astrophysics Data System (ADS)

    de Siqueira e Oliveira, Fernanda SantAna; Giana, Hector Enrique; Silveira, Landulfo

    2012-10-01

    A method, based on Raman spectroscopy, for identification of different microorganisms involved in bacterial urinary tract infections has been proposed. Spectra were collected from different bacterial colonies (Gram-negative: Escherichia coli, Klebsiella pneumoniae, Proteus mirabilis, Pseudomonas aeruginosa and Enterobacter cloacae, and Gram-positive: Staphylococcus aureus and Enterococcus spp.), grown on culture medium (agar), using a Raman spectrometer with a fiber Raman probe (830 nm). Colonies were scraped from the agar surface and placed on an aluminum foil for Raman measurements. After preprocessing, spectra were submitted to a principal component analysis and Mahalanobis distance (PCA/MD) discrimination algorithm. We found that the mean Raman spectra of different bacterial species show similar bands, and S. aureus was well characterized by strong bands related to carotenoids. PCA/MD could discriminate Gram-positive bacteria with sensitivity and specificity of 100% and Gram-negative bacteria with sensitivity ranging from 58 to 88% and specificity ranging from 87% to 99%.
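
    A PCA/Mahalanobis distance discrimination step of the kind mentioned above might look like the following sketch. The spectra are synthetic, the two classes and every function name are assumptions made for illustration, and the preprocessing used in the study is not reproduced.

```python
import numpy as np

def fit_pca(X, n_components):
    """Mean and top principal axes (rows) of the training spectra."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def fit_class_models(scores, labels):
    """Per-class mean and inverse covariance of the PC scores."""
    models = {}
    for k in np.unique(labels):
        Z = scores[labels == k]
        models[int(k)] = (Z.mean(axis=0), np.linalg.inv(np.cov(Z, rowvar=False)))
    return models

def mahalanobis_classify(scores, models):
    """Assign each sample to the class with the smallest squared Mahalanobis distance."""
    keys = sorted(models)
    d2 = []
    for k in keys:
        mu, inv_cov = models[k]
        diff = scores - mu
        d2.append(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))
    return np.array(keys)[np.argmin(np.stack(d2, axis=1), axis=1)]

def make_spectra(rng, n_per_class, n_bands=50):
    """Two synthetic classes that differ in a few bands (assumed data)."""
    base = np.sin(np.linspace(0, 3, n_bands))
    delta = np.zeros(n_bands)
    delta[10:15] = 1.0
    X = np.vstack([base + k * delta + 0.2 * rng.standard_normal((n_per_class, n_bands))
                   for k in (0, 1)])
    return X, np.repeat([0, 1], n_per_class)

rng = np.random.default_rng(5)
X_train, y_train = make_spectra(rng, 40)
X_test, y_test = make_spectra(rng, 40)

mean, axes = fit_pca(X_train, n_components=3)
models = fit_class_models((X_train - mean) @ axes.T, y_train)
y_pred = mahalanobis_classify((X_test - mean) @ axes.T, models)

# Sensitivity and specificity with class 1 treated as "positive".
tp = np.sum((y_pred == 1) & (y_test == 1))
fn = np.sum((y_pred == 0) & (y_test == 1))
tn = np.sum((y_pred == 0) & (y_test == 0))
fp = np.sum((y_pred == 1) & (y_test == 0))
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
```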

  15. WFIRST: Principal Components Analysis of H4RG-10 Near-IR Detector Data Cubes

    NASA Astrophysics Data System (ADS)

    Rauscher, Bernard

    2018-01-01

    The Wide Field Infrared Survey Telescope’s (WFIRST) Wide Field Instrument (WFI) incorporates an array of eighteen Teledyne H4RG-10 near-IR detector arrays. Because WFIRST’s science investigations require controlling systematic uncertainties to state-of-the-art levels, we conducted principal components analysis (PCA) of some H4RG-10 test data obtained in the NASA Goddard Space Flight Center Detector Characterization Laboratory (DCL). The PCA indicates that the Legendre polynomials provide a nearly orthogonal representation of up-the-ramp sampled illuminated data cubes, and suggests other representations that may provide an even more compact representation of the data in some circumstances. We hypothesize that by using orthogonal representations, such as those described here, it may be possible to control systematic errors better than has been achieved before for NASA missions. We believe that these findings are probably applicable to other H4RG, H2RG, and H1RG based systems.

  16. A Parallel Product-Convolution approach for representing the depth varying Point Spread Functions in 3D widefield microscopy based on principal component analysis.

    PubMed

    Arigovindan, Muthuvel; Shaevitz, Joshua; McGowan, John; Sedat, John W; Agard, David A

    2010-03-29

    We address the problem of computational representation of image formation in 3D widefield fluorescence microscopy with depth varying spherical aberrations. We first represent 3D depth-dependent point spread functions (PSFs) as a weighted sum of basis functions that are obtained by principal component analysis (PCA) of experimental data. This representation is then used to derive an approximating structure that compactly expresses the depth variant response as a sum of few depth invariant convolutions pre-multiplied by a set of 1D depth functions, where the convolving functions are the PCA-derived basis functions. The model offers an efficient and convenient trade-off between complexity and accuracy. For a given number of approximating PSFs, the proposed method results in a much better accuracy than the strata based approximation scheme that is currently used in the literature. In addition to yielding better accuracy, the proposed methods automatically eliminate the noise in the measured PSFs.
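
    The structure described in this abstract can be written out as a short derivation. The notation below is assumed for illustration and is not taken from the paper: $f$ is the object, $g$ the image, $h(\mathbf{r}; z')$ the PSF for a source at depth $z'$, $B_k$ the PCA-derived basis functions, and $c_k(z')$ the 1D depth-dependent weights.

```latex
% Depth-variant image formation (assumed notation):
g(\mathbf{r}) = \int h(\mathbf{r} - \mathbf{r}';\, z')\, f(\mathbf{r}')\, d\mathbf{r}'

% PCA representation of the depth-dependent PSF with K basis functions:
h(\mathbf{r};\, z') \approx \sum_{k=1}^{K} c_k(z')\, B_k(\mathbf{r})

% Substituting gives a sum of K depth-invariant convolutions, each applied
% to the object pre-multiplied by a 1D depth function:
g(\mathbf{r}) \approx \sum_{k=1}^{K} \int B_k(\mathbf{r} - \mathbf{r}')\, c_k(z')\, f(\mathbf{r}')\, d\mathbf{r}'
             = \sum_{k=1}^{K} \bigl[ B_k * (c_k\, f) \bigr](\mathbf{r})
```

    Under these assumptions the cost is $K$ ordinary convolutions rather than one convolution per depth plane, which is where the trade-off between complexity and accuracy comes from.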

  17. High Accuracy Passive Magnetic Field-Based Localization for Feedback Control Using Principal Component Analysis.

    PubMed

    Foong, Shaohui; Sun, Zhenglong

    2016-08-12

    In this paper, a novel magnetic field-based sensing system employing statistically optimized concurrent multiple sensor outputs for precise field-position association and localization is presented. This method capitalizes on the independence between simultaneous spatial field measurements at multiple locations to induce unique correspondences between field and position. This single-source-multi-sensor configuration is able to achieve accurate and precise localization and tracking of translational motion without contact over large travel distances for feedback control. Principal component analysis (PCA) is used as a pseudo-linear filter to optimally reduce the dimensions of the multi-sensor output space for computationally efficient field-position mapping with artificial neural networks (ANNs). Numerical simulations are employed to investigate the effects of geometric parameters and Gaussian noise corruption on PCA assisted ANN mapping performance. Using a 9-sensor network, the sensing accuracy and closed-loop tracking performance of the proposed optimal field-based sensing system is experimentally evaluated on a linear actuator with a significantly more expensive optical encoder as a comparison.

  18. Sensor Failure Detection of FASSIP System using Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Sudarno; Juarsa, Mulya; Santosa, Kussigit; Deswandri; Sunaryo, Geni Rina

    2018-02-01

    In the Fukushima Daiichi nuclear reactor accident in Japan, the damage to the core and pressure vessel was caused by the failure of the active cooling system (the diesel generator was inundated by the tsunami). Research on passive cooling systems for nuclear power plants is therefore being performed to improve the safety of nuclear reactors. The FASSIP system (Passive System Simulation Facility) is an installation used to study the characteristics of passive cooling systems at nuclear power plants. The accuracy of sensor measurements in the FASSIP system is essential because they are the basis for determining the characteristics of a passive cooling system. In this research, a sensor failure detection method for the FASSIP system is developed so that indications of sensor failure can be detected early. The method uses Principal Component Analysis (PCA) to reduce the dimension of the sensor data, with the Squared Prediction Error (SPE) and Hotelling's T² statistic as criteria for detecting indications of sensor failure. The results show that the PCA method is capable of detecting the occurrence of a failure at any sensor.
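
    A PCA monitor with SPE and Hotelling T² criteria could be sketched as follows. The six "sensors", the mixing matrix `W`, the injected bias, and the use of empirical 99th-percentile limits are all assumptions for illustration; the abstract does not say how the limits were set or how many components were retained.

```python
import numpy as np

def fit_monitor(X_normal, n_components):
    """Fit a PCA model on fault-free data (rows are samples, columns are sensors)."""
    mean = X_normal.mean(axis=0)
    std = X_normal.std(axis=0, ddof=1)
    Z = (X_normal - mean) / std
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                      # loadings, sensors x components
    lam = s[:n_components] ** 2 / (len(Z) - 1)   # variances of the retained scores
    return {"mean": mean, "std": std, "P": P, "lam": lam}

def spe_and_t2(X, model):
    """Squared prediction error and Hotelling T^2 for each sample."""
    Z = (X - model["mean"]) / model["std"]
    T = Z @ model["P"]                           # scores in the PCA subspace
    residual = Z - T @ model["P"].T              # part not explained by the model
    spe = np.sum(residual ** 2, axis=1)
    t2 = np.sum(T ** 2 / model["lam"], axis=1)
    return spe, t2

rng = np.random.default_rng(2)
# Assumed process: six correlated sensors driven by two latent factors.
W = np.array([[1.0, 0.8, 0.6, 0.2, -0.5, 0.9],
              [0.1, -0.6, 0.7, 1.0, 0.8, -0.3]])

def simulate(n):
    return rng.standard_normal((n, 2)) @ W + 0.05 * rng.standard_normal((n, 6))

X_normal = simulate(500)
model = fit_monitor(X_normal, n_components=2)
spe_ref, t2_ref = spe_and_t2(X_normal, model)
# Empirical control limits from the fault-free data (an assumption, see lead-in).
spe_limit, t2_limit = np.percentile(spe_ref, 99), np.percentile(t2_ref, 99)

X_new = simulate(200)
X_new[100:, 2] += 1.0                            # bias fault on sensor index 2
spe_new, t2_new = spe_and_t2(X_new, model)
alarms = (spe_new > spe_limit) | (t2_new > t2_limit)
```

    In this toy setup a bias on one sensor breaks the correlation structure learned from normal data, so it shows up mainly in the SPE, while T² responds to unusually large excursions within the modeled subspace.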

  19. Design of experiments and principal component analysis as approaches for enhancing performance of gas-diffusional air-breathing bilirubin oxidase cathode

    NASA Astrophysics Data System (ADS)

    Babanova, Sofia; Artyushkova, Kateryna; Ulyanova, Yevgenia; Singhal, Sameer; Atanassov, Plamen

    2014-01-01

    Two statistical methods, design of experiments (DOE) and principal component analysis (PCA) are employed to investigate and improve performance of air-breathing gas-diffusional enzymatic electrodes. DOE is utilized as a tool for systematic organization and evaluation of various factors affecting the performance of the composite system. Based on the results from the DOE, an improved cathode is constructed. The current density generated utilizing the improved cathode (755 ± 39 μA cm⁻² at 0.3 V vs. Ag/AgCl) is 2-5 times higher than the highest current density previously achieved. Three major factors contributing to the cathode performance are identified: the amount of enzyme, the volume of phosphate buffer used to immobilize the enzyme, and the thickness of the gas-diffusion layer (GDL). PCA is applied as an independent confirmation tool to support conclusions made by DOE and to visualize the contribution of factors in individual cathode configurations.

  20. Principal component analysis (PCA) of volatile terpene compounds dataset emitted by genetically modified sweet orange fruits and juices in which a D-limonene synthase was either up- or down-regulated vs. empty vector controls.

    PubMed

    Rodríguez, Ana; Peris, Josep E; Redondo, Ana; Shimada, Takehiko; Peña, Leandro

    2016-12-01

    We have categorized the dataset from content and emission of terpene volatiles of peel and juice in both Navelina and Pineapple sweet orange cultivars in which D-limonene was either up- (S), down-regulated (AS) or non-altered (EV; control) ("Impact of D-limonene synthase up- or down-regulation on sweet orange fruit and juice odor perception"(A. Rodríguez, J.E. Peris, A. Redondo, T. Shimada, E. Costell, I. Carbonell, C. Rojas, L. Peña, (2016)) [1]). Data from volatile identification and quantification by HS-SPME and GC-MS were classified by Principal Component Analysis (PCA) individually or as chemical groups. AS juice was characterized by the higher influence of the oxygen fraction, and S juice by the major influence of ethyl esters. S juices emitted less linalool compared to AS and EV juices.

  1. The employment of FTIR spectroscopy in combination with chemometrics for analysis of rat meat in meatball formulation.

    PubMed

    Rahmania, Halida; Sudjadi; Rohman, Abdul

    2015-02-01

    For the Indonesian community, meatballs are one of the favorite meat food products. Because rat meat and beef differ in price, beef may be substituted with rat meat for economic gain. In the present research, FTIR spectroscopy in combination with partial least squares (PLS) multivariate calibration was used for the quantitative analysis of rat meat in binary mixtures with beef in meatball formulation. Meanwhile, principal component analysis (PCA) was used for the classification of rat meat and beef meatballs. Several frequency regions in the mid-infrared region were optimized, and finally the frequency region of 750-1000 cm⁻¹ was selected for PLS and PCA modeling. For quantitative analysis, the relationship between actual values (x-axis) and FTIR predicted values (y-axis) of rat meat is described by the equation y = 0.9417x + 2.8410, with a coefficient of determination (R²) of 0.993 and a root mean square error of calibration (RMSEC) of 1.79%. Furthermore, PCA was successfully used for the classification of rat meat meatballs and beef meatballs.

  2. Prebiotic Low Sugar Chocolate Dairy Desserts: Physical and Optical Characteristics and Performance of PARAFAC and PCA Preference Map.

    PubMed

    Morais, E C; Esmerino, E A; Monteiro, R A; Pinheiro, C M; Nunes, C A; Cruz, A G; Bolini, Helena M A

    2016-01-01

    The addition of prebiotics and sweeteners to chocolate dairy desserts opens up new opportunities to develop dairy desserts that, besides providing a lower calorie intake, still have functional properties. In this study, prebiotic low sugar dairy desserts were evaluated by 120 consumers using a 9-point hedonic scale, in relation to the attributes of appearance, aroma, flavor, texture, and overall liking. Internal preference mapping using parallel factor analysis (PARAFAC) and principal component analysis (PCA) was performed on the consumer data. In addition, physical (texture profile) and optical (instrumental color) analyses were also performed. Prebiotic dairy desserts containing sucrose and sucralose were equally liked by the consumers. These samples were characterized by firmness and gumminess, which can be considered drivers of liking by the consumers. Optimization of the prebiotic low sugar dessert formulation should take into account the choice of ingredients that contribute positively to these parameters. PARAFAC allowed the extraction of more relevant information than PCA, demonstrating that consumer acceptance can be evaluated by simultaneously considering several attributes. Multiple factor analysis reported an Rv value of 0.964, suggesting excellent concordance between the two methods. © 2015 Institute of Food Technologists®

  3. Water quality analysis of the Rapur area, Andhra Pradesh, South India using multivariate techniques

    NASA Astrophysics Data System (ADS)

    Nagaraju, A.; Sreedhar, Y.; Thejaswi, A.; Sayadi, Mohammad Hossein

    2017-10-01

    Groundwater samples from the Rapur area were collected from different sites to evaluate the major ion chemistry. A large volume of data can lead to difficulties in the integration, interpretation, and representation of the results. Two multivariate statistical methods, hierarchical cluster analysis (HCA) and factor analysis (FA), were applied to evaluate their usefulness in classifying and identifying the geochemical processes controlling groundwater geochemistry. Four statistically significant clusters were obtained from 30 sampling stations. This resulted in two important clusters, viz., cluster 1 (pH, Si, CO3, Mg, SO4, Ca, K, HCO3, alkalinity, Na, Na + K, Cl, and hardness) and cluster 2 (EC and TDS), which are released to the study area from different sources. The application of multivariate statistical techniques, such as principal component analysis (PCA), assists in the interpretation of complex data matrices for a better understanding of the water quality of a study area. From PCA, it is clear that the first factor (factor 1), which accounted for 36.2% of the total variance, had high positive loadings on EC, Mg, Cl, TDS, and hardness. Based on the PCA scores, four significant cluster groups of sampling locations were detected on the basis of similarity of their water quality.

  4. Fast detection of Piscirickettsia salmonis in Salmo salar serum through MALDI-TOF-MS profiling.

    PubMed

    Olate, Verónica R; Nachtigall, Fabiane M; Santos, Leonardo S; Soto, Alex; Araya, Macarena; Oyanedel, Sandra; Díaz, Verónica; Marchant, Vanessa; Rios-Momberg, Mauricio

    2016-03-01

    Piscirickettsia salmonis is a pathogenic bacterium known as the aetiological agent of salmonid rickettsial syndrome, and it causes high mortality in farmed salmonid fishes. Detection of P. salmonis in farmed fishes is based mainly on molecular biology and immunohistochemistry techniques. These techniques are in most cases expensive and time consuming. In the search for new alternatives to detect the presence of P. salmonis in salmonid fishes, this work proposed the use of MALDI-TOF-MS to compare serum protein profiles from Salmo salar fish, including experimentally infected and non-infected fishes, using principal component analysis (PCA). Samples were obtained from a controlled bioassay where S. salar was challenged with P. salmonis in a cohabitation model and classified according to the presence or absence of the bacteria by real time PCR analysis. MALDI spectra of the fish serum samples showed differences in their serum protein composition. These differences were corroborated with PCA. The results demonstrated that the combined use of MALDI-TOF-MS and PCA represents a useful tool to discriminate the fish status through the analysis of salmonid serum samples. Copyright © 2016 John Wiley & Sons, Ltd.

  5. Use of principal components analysis and protein microarray to explore the association of HIV-1-specific IgG responses with disease progression.

    PubMed

    Gerns Storey, Helen L; Richardson, Barbra A; Singa, Benson; Naulikha, Jackie; Prindle, Vivian C; Diaz-Ochoa, Vladimir E; Felgner, Phil L; Camerini, David; Horton, Helen; John-Stewart, Grace; Walson, Judd L

    2014-01-01

    The role of HIV-1-specific antibody responses in HIV disease progression is complex and would benefit from analysis techniques that examine clusterings of responses. Protein microarray platforms facilitate the simultaneous evaluation of numerous protein-specific antibody responses, though excessive data are cumbersome in analyses. Principal components analysis (PCA) reduces data dimensionality by generating fewer composite variables that maximally account for variance in a dataset. To identify clusters of antibody responses involved in disease control, we investigated the association of HIV-1-specific antibody responses by protein microarray, and assessed their association with disease progression using PCA in a nested cohort design. Associations observed among collections of antibody responses paralleled protein-specific responses. At baseline, greater antibody responses to the transmembrane glycoprotein (TM) and reverse transcriptase (RT) were associated with higher viral loads, while responses to the surface glycoprotein (SU), capsid (CA), matrix (MA), and integrase (IN) proteins were associated with lower viral loads. Over 12 months greater antibody responses were associated with smaller decreases in CD4 count (CA, MA, IN), and reduced likelihood of disease progression (CA, IN). PCA and protein microarray analyses highlighted a collection of HIV-specific antibody responses that together were associated with reduced disease progression, and may not have been identified by examining individual antibody responses. This technique may be useful to explore multifaceted host-disease interactions, such as HIV coinfections.
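
    The statement that PCA "reduces data dimensionality by generating fewer composite variables that maximally account for variance" can be illustrated with a small sketch. The data are simulated (100 hypothetical subjects, 40 hypothetical antibody responses driven by three latent factors), the 80% cutoff is an arbitrary choice, and the study's association analysis with disease progression is not reproduced.

```python
import numpy as np

def pca_composites(X):
    """Standardize X, then return PC scores and the variance fraction per component."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    variances = s ** 2 / (len(Z) - 1)
    return Z @ Vt.T, variances / variances.sum()

rng = np.random.default_rng(4)
n_subjects, n_antigens = 100, 40
latent = rng.standard_normal((n_subjects, 3))
loadings = rng.standard_normal((3, n_antigens))
responses = latent @ loadings + 0.3 * rng.standard_normal((n_subjects, n_antigens))

scores, explained = pca_composites(responses)
# Smallest number of components whose cumulative variance reaches 80%.
n_keep = int(np.searchsorted(np.cumsum(explained), 0.80) + 1)
composites = scores[:, :n_keep]   # composite variables for downstream analysis
```

    The retained columns of `composites` would then take the place of the original response variables in whatever regression or cohort analysis follows.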

  6. Lithological mapping of Kanjamalai hill using hyperspectral remote sensing tools in Salem district, Tamil Nadu, India

    NASA Astrophysics Data System (ADS)

    Arulbalaji, Palanisamy; Balasubramanian, Gurugnanam

    2017-07-01

    This study uses advanced spaceborne thermal emission and reflection radiometer (ASTER) hyperspectral remote sensing techniques to discriminate rock types composing Kanjamalai hill located in the Salem district of Tamil Nadu, India. Kanjamalai hill is of particular interest because it contains economically viable iron ore deposits. ASTER hyperspectral data were subjected to principal component analysis (PCA), independent component analysis (ICA), and minimum noise fraction (MNF) to improve identification of lithologies remotely and to compare these digital data results with published geologic maps. Hyperspectral remote sensing analysis indicates that PCA (R∶G∶B=2∶1∶3), MNF (R∶G∶B=3∶2∶1), and ICA (R∶G∶B=1∶3∶2) provide the best band combination for effective discrimination of lithological rock types composing Kanjamalai hill. The remote sensing-derived lithological map compares favorably with a published geological map from Geological Survey of India and has been verified with ground truth field investigations. Therefore, ASTER data-based lithological mapping provides fast, cost-effective, and accurate geologic data useful for lithological discrimination and identification of ore deposits.

  7. Discrimination of honeys using colorimetric sensor arrays, sensory analysis and gas chromatography techniques.

    PubMed

    Tahir, Haroon Elrasheid; Xiaobo, Zou; Xiaowei, Huang; Jiyong, Shi; Mariod, Abdalbasit Adam

    2016-09-01

    Aroma profiles of six honey varieties of different botanical origins were investigated using colorimetric sensor array, gas chromatography-mass spectrometry (GC-MS) and descriptive sensory analysis. Fifty-eight aroma compounds were identified, including 2 norisoprenoids, 5 hydrocarbons, 4 terpenes, 6 phenols, 7 ketones, 9 acids, 12 aldehydes and 13 alcohols. Twenty abundant or active compounds were chosen as key compounds to characterize honey aroma. Discrimination of the honeys was subsequently implemented using multivariate analysis, including hierarchical clustering analysis (HCA) and principal component analysis (PCA). Honeys of the same botanical origin were grouped together in the PCA score plot and HCA dendrogram. SPME-GC/MS and colorimetric sensor array were able to discriminate the honeys effectively with the advantages of being rapid, simple and low-cost. Moreover, partial least squares regression (PLSR) was applied to indicate the relationship between sensory descriptors and aroma compounds. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. NIR studies of cholesterol-dependent structural modification of the model lipid bilayer doped with inhalation anesthetics

    NASA Astrophysics Data System (ADS)

    Kuć, Marta; Cieślik-Boczula, Katarzyna; Rospenk, Maria

    2018-06-01

    The influence of cholesterol on the structure of the model lipid bilayers treated with inhalation anesthetics (enflurane, isoflurane, sevoflurane and halothane) was investigated employing near-infrared (NIR) spectroscopy combined with the Principal Component Analysis (PCA). The conformational changes occurring in the hydrophobic area of the lipid bilayers were analyzed using the first overtones of symmetric (2νs) and antisymmetric (2νas) stretching vibrations of the CH2 groups of lipid aliphatic chains. The temperature values of chain-melting phase transition (Tm) of anesthetic-mixed dipalmitoylphosphatidylcholine (DPPC)/cholesterol and dipalmitoylphosphatidylglycerol (DPPG)/cholesterol membranes, which were obtained from the PCA analysis, were compared with cholesterol-free DPPC and DPPG bilayers mixed with inhalation anesthetics.

  9. Locally linear embedding: dimension reduction of massive protostellar spectra

    NASA Astrophysics Data System (ADS)

    Ward, J. L.; Lumsden, S. L.

    2016-09-01

    We present the results of the application of locally linear embedding (LLE) to reduce the dimensionality of dereddened and continuum subtracted near-infrared spectra using a combination of models and real spectra of massive protostars selected from the Red MSX Source survey data base. A brief comparison is also made, using the same set of spectra, with two other dimension reduction techniques, principal component analysis (PCA) and Isomap, as well as with a more advanced form of LLE, Hessian locally linear embedding. We find that whilst LLE certainly has its limitations, it significantly outperforms both PCA and Isomap in classification of spectra based on the presence/absence of emission lines and provides a valuable tool for classification and analysis of large spectral data sets.

  10. Nonlinear multivariate and time series analysis by neural network methods

    NASA Astrophysics Data System (ADS)

    Hsieh, William W.

    2004-03-01

    Methods in multivariate statistical analysis are essential for working with large amounts of geophysical data, data from observational arrays, from satellites, or from numerical model output. In classical multivariate statistical analysis, there is a hierarchy of methods, starting with linear regression at the base, followed by principal component analysis (PCA) and finally canonical correlation analysis (CCA). A multivariate time series method, the singular spectrum analysis (SSA), has been a fruitful extension of the PCA technique. The common drawback of these classical methods is that only linear structures can be correctly extracted from the data. Since the late 1980s, neural network methods have become popular for performing nonlinear regression and classification. More recently, neural network methods have been extended to perform nonlinear PCA (NLPCA), nonlinear CCA (NLCCA), and nonlinear SSA (NLSSA). This paper presents a unified view of the NLPCA, NLCCA, and NLSSA techniques and their applications to various data sets of the atmosphere and the ocean (especially for the El Niño-Southern Oscillation and the stratospheric quasi-biennial oscillation). These data sets reveal that the linear methods are often too simplistic to describe real-world systems, with a tendency to scatter a single oscillatory phenomenon into numerous unphysical modes or higher harmonics, which can be largely alleviated in the new nonlinear paradigm.

  11. A principal component analysis of the dynamics of subdomains and binding sites in human serum albumin.

    PubMed

    Paris, Guillaume; Ramseyer, Christophe; Enescu, Mironel

    2014-05-01

    The conformational dynamics of human serum albumin (HSA) was investigated by principal component analysis (PCA) applied to three molecular dynamics trajectories of 200 ns each. The overlap of the essential subspaces spanned by the first 10 principal components (PCs) of different trajectories was about 0.3, showing that the PCA based on a trajectory length of 200 ns is not completely convergent for this protein. The contributions of the relative motion of subdomains and of the internal distortion of the subdomains to the first 10 PCs were found to be comparable. Based on the distribution of the first 3 PCs, 10 protein conformers are identified showing relative root mean square deviations (RMSD) between 2.3 and 4.6 Å. The main PCs are found to be delocalized over the whole protein structure, indicating that the motions of different protein subdomains are coupled. This coupling is considered as being related to the allosteric effects observed upon ligand binding to HSA. On the other hand, the first PC of one of the three trajectories describes a conformational transition of the protein domain I that is close to that experimentally observed upon myristate binding. This is a theoretical support for the older hypothesis stating that changes of the protein conformation favorable to binding can precede the ligand complexation. A detailed all-atom PCA performed on the primary Sites 1 and 2 confirms the multiconformational character of the HSA binding sites as well as the significant coupling of their motions. Copyright © 2013 Wiley Periodicals, Inc.
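
    The subspace overlap mentioned in this abstract can be illustrated with a short sketch. The abstract does not state which overlap formula was used, so the code below assumes one common choice, the mean squared projection of one set of principal axes onto the other (some authors report its square root instead). The function names and the random "trajectories" are hypothetical and stand in for real molecular dynamics coordinates.

```python
import numpy as np

def top_components(X, k):
    """Return the first k principal axes (as rows) of X (samples x features)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are orthonormal principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k]

def subspace_overlap(A, B):
    """Mean squared projection of the rows of A onto the span of the rows of B.

    Assumed definition: 1.0 for identical subspaces, 0.0 for orthogonal ones.
    """
    return float(np.sum((A @ B.T) ** 2) / A.shape[0])

rng = np.random.default_rng(0)
# Hypothetical stand-ins for two trajectories: 500 frames x 30 coordinates.
traj_a = rng.normal(size=(500, 30))
traj_b = rng.normal(size=(500, 30))

pcs_a = top_components(traj_a, 10)
pcs_b = top_components(traj_b, 10)

print(subspace_overlap(pcs_a, pcs_a))  # same subspace compared with itself
print(subspace_overlap(pcs_a, pcs_b))  # two unrelated random data sets
```

    Under this definition, a value well below 1 between independent runs of the same system suggests that the leading components have not yet converged, which is how the abstract interprets its value of about 0.3.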

  12. An electrophysiological index of changes in risk decision-making strategies.

    PubMed

    Zhang, Dandan; Gu, Ruolei; Wu, Tingting; Broster, Lucas S; Luo, Yi; Jiang, Yang; Luo, Yue-jia

    2013-07-01

    Human decision-making is significantly modulated by previously experienced outcomes. Using event-related potentials (ERPs), we examined whether ERP components evoked by outcome feedback could serve as biomarkers to signal the influence of current outcome evaluation on subsequent decision-making. In this study, 18 adult volunteers participated in a simple monetary gambling task, in which they were asked to choose between two options that differed in risk. Their decisions were immediately followed by outcome presentation. Temporospatial principal component analysis (PCA) was applied to the outcome-onset locked ERPs in the 200-1000 ms time window. The PCA factors that approximated classical ERP components (P2, feedback-related negativity, P3a, and P3b) in terms of time course and scalp distribution were tested for their association with subsequent decision-making strategies. Our results revealed that a fronto-central PCA factor approximating the classical P3a was related to changes of decision-making strategies on subsequent trials. The decision to switch between high- and low-risk options resulted in a larger P3a relative to the decision to retain the same choice. According to the results, we suggest that the amplitude of the fronto-central P3a is an electrophysiological index of the influence of current outcome on subsequent risk decision-making. Furthermore, the ERP source analysis indicated that the activations of the frontopolar cortex and sensorimotor cortex were involved in subsequent changes of strategies, which enriches our understanding of the neural mechanisms of adjusting decision-making strategies based on previous experience. Copyright © 2013 Elsevier Ltd. All rights reserved.

  13. An electrophysiological index of changes in risk decision-making strategies

    PubMed Central

    Zhang, Dandan; Gu, Ruolei; Wu, Tingting; Broster, Lucas S.; Luo, Yi; Jiang, Yang; Luo, Yue-jia

    2014-01-01

    Human decision-making is significantly modulated by previously experienced outcomes. Using event-related potentials (ERPs), we examined whether ERP components evoked by outcome feedback could serve as biomarkers to signal the influence of current outcome evaluation on subsequent decision-making. In this study, eighteen adult volunteers participated in a simple monetary gambling task, in which they were asked to choose between two options that differed in risk. Their decisions were immediately followed by outcome presentation. Temporospatial principal component analysis (PCA) was applied to the outcome-onset locked ERPs in the -200 – 1000 ms time window. The PCA factors that approximated classical ERP components (P2, feedback-related negativity, P3a, and P3b) in terms of time course and scalp distribution were tested for their association with subsequent decision-making strategies. Our results revealed that a fronto-central PCA factor approximating the classical P3a was related to changes of decision-making strategies on subsequent trials. The decision to switch between high- and low-risk options resulted in a larger P3a relative to the decision to retain the same choice. According to the results, we suggest that the amplitude of the fronto-central P3a is an electrophysiological index of the influence of current outcome on subsequent risk decision-making. Furthermore, the ERP source analysis indicated that the activations of the frontopolar cortex and sensorimotor cortex were involved in subsequent changes of strategies, which enriches our understanding of the neural mechanisms of adjusting decision-making strategies based on previous experience. PMID:23643796

  14. Differences in chewing sounds of dry-crisp snacks by multivariate data analysis

    NASA Astrophysics Data System (ADS)

    De Belie, N.; Sivertsvik, M.; De Baerdemaeker, J.

    2003-09-01

    Chewing sounds of different types of dry-crisp snacks (two types of potato chips, prawn crackers, cornflakes and low calorie snacks from extruded starch) were analysed to assess differences in sound emission patterns. The emitted sounds were recorded by a microphone placed over the ear canal. The first bite and the first subsequent chew were selected from the time signal and a fast Fourier transformation provided the power spectra. Different multivariate analysis techniques were used for classification of the snack groups. This included principal component analysis (PCA) and unfold partial least-squares (PLS) algorithms, as well as multi-way techniques such as three-way PLS, three-way PCA (Tucker3), and parallel factor analysis (PARAFAC) on the first bite and subsequent chew. The models were evaluated by calculating the classification errors and the root mean square error of prediction (RMSEP) for independent validation sets. It appeared that the logarithm of the power spectra obtained from the chewing sounds could be used successfully to distinguish the different snack groups. When different chewers were used, recalibration of the models was necessary. Multi-way models distinguished better between chewing sounds of different snack groups than PCA on bite or chew separately and than unfold PLS. From all three-way models applied, N-PLS with three components showed the best classification capabilities, resulting in classification errors of 14-18%. The major amount of incorrect classifications was due to one type of potato chips that had a very irregular shape, resulting in a wide variation of the emitted sounds.
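
    The preprocessing chain described in this abstract (time signal, fast Fourier transform, logarithm of the power spectrum, then PCA) can be sketched as follows. This is only an illustration on synthetic tones plus noise: the sampling rate, signal length, tone frequencies and function names are assumptions, and the multi-way methods from the study (Tucker3, PARAFAC, N-PLS) are not reproduced.

```python
import numpy as np

def log_power_spectrum(signal):
    """Log10 power spectrum of a real-valued 1-D signal via the FFT."""
    spectrum = np.fft.rfft(signal)
    power = np.abs(spectrum) ** 2
    # Small offset avoids log10(0) for empty frequency bins.
    return np.log10(power + 1e-12)

def pca_scores(X, k):
    """Project the rows of X onto the first k principal components."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]

rng = np.random.default_rng(1)
t = np.arange(1024) / 44100.0  # assumed sampling rate of 44.1 kHz

# Two synthetic "snack types": noisy recordings dominated by different tones.
group_1 = [np.sin(2 * np.pi * 2000 * t) + 0.5 * rng.normal(size=t.size)
           for _ in range(20)]
group_2 = [np.sin(2 * np.pi * 8000 * t) + 0.5 * rng.normal(size=t.size)
           for _ in range(20)]

X = np.array([log_power_spectrum(s) for s in group_1 + group_2])
scores = pca_scores(X, 2)
print(X.shape, scores.shape)
```

    The score matrix has one row per recording, so the first two columns can be plotted against each other to check visually whether the groups separate.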

  15. Evaluation of Coptidis Rhizoma-Euodiae Fructus couple and Zuojin products based on HPLC fingerprint chromatogram and simultaneous determination of main bioactive constituents.

    PubMed

    Gao, Xin; Yang, Xiu-Wei; Marriott, Philip J

    2013-11-01

    Coptidis Rhizoma-Euodiae Fructus couple (CEC) is a classic traditional Chinese medicine preparation consisting of Coptidis Rhizoma and Euodiae Fructus at the ratio of 6:1, and used to treat gastro-intestinal disorders. Alkaloids are the main bioactive component. This research provides comprehensive analysis information for the quality control of CEC. To develop a high-performance liquid chromatography-diode array detection fingerprint for chemical composition characteristics of CEC and its products. The samples were separated with a Gemini C18 column by using gradient elution with water-formic acid (100:0.03) and acetonitrile as mobile phase. Flow rate was 1.0 mL/min and detection wavelength was 250 nm. Similarity analysis and principal component analysis (PCA) were employed to evaluate quality consistencies of analytes. Mean chromatograms and correlation coefficients of analytes were calculated by the software "Similarity Evaluation System for Chromatographic Fingerprint of Traditional Chinese Medicine". Fingerprint chromatogram comparison determined 20 representative general fingerprint peaks, and the fingerprint chromatogram resemblances are all better than 0.988. Consistent results were obtained to show that CEC and its related samples could be successfully divided into three groups. Contribution plots generated by PCA were performed to interpret differences among the sample groups while peaks which significantly contributed to classification were identified. Seven bioactive constituents in the samples were verified by quantitative analysis. The chromatographic fingerprint with similarity evaluation and PCA assay combined with quantification of seven compounds could be utilized as a quality control method for the herbal couple.

  16. Human Classification Based on Gestural Motions by Using Components of PCA

    NASA Astrophysics Data System (ADS)

    Aziz, Azri A.; Wan, Khairunizam; Za'aba, S. K.; B, Shahriman A.; Adnan, Nazrul H.; H, Asyekin; R, Zuradzman M.

    2013-12-01

    Studying human capabilities with the aim of integrating them into machines is currently a widely discussed topic. Humans are blessed with special abilities: they can hear, see, sense, speak, think and understand each other. Giving such abilities to machines is the aim of researchers seeking a better quality of life in the future. This research concentrated on human gestures, specifically arm motions, for distinguishing between individuals, which led to the development of a hand gesture database. We try to differentiate human physical characteristics based on hand gestures represented by arm trajectories. Subjects with different body sizes are selected, and the acquired data then undergo a resampling process. The results discuss the classification of humans based on arm trajectories using principal component analysis (PCA).

  17. From the ocean to a salt marsh: towards understanding iron reduction processes with FORC-PCA.

    NASA Astrophysics Data System (ADS)

    Muraszko, J. R.; Lascu, I.; Collins, S. M.; Harrison, R. J.

    2017-12-01

    Biogenic magnetic minerals are a high fidelity recorder of climate change. Their sensitivity to sedimentary redox conditions and bottom water ventilation has the potential to provide useful insights into past diagenetic conditions. However, the mechanisms controlling preservation and dissolution of magnetosomes are not fully understood, thus undermining the reliability of the paleomagnetic records in marine environments. Recovering information about the diagenetic past of the sediment is a crucial challenge; specifically, the biogenic components need to be identified and unmixed from the bulk magnetic signal. We address the issue in this study by applying Principal Component Analysis on First Order Reversal Curve diagrams (FORC-PCA) in case studies of cores obtained from the Iberian Margin and the sedimentologically active coastal salt marshes of Norfolk. We demonstrate the applicability of FORC-PCA as a new environmental proxy, yielding a high resolution temporal marine record of environmental changes reflected in magnetic composition over the last 194 kyr. The strongest variations are observed in the microbially derived components, the bulk properties of the sediment being controlled by a low coercivity SP-SD component which is generally anticorrelated with the magnetosome signal. Supported by TEM studies, we suggest the prevalence of clusters of nano-particles of magnetite associated with iron reduction. To further investigate the mechanisms controlling these processes, the active sedimentary environment of Norfolk was chosen as a case study of early diagenesis controlled by strong vertical geochemical gradients.

  18. Triggers in advanced neurological conditions: prediction and management of the terminal phase.

    PubMed

    Hussain, Jamilla; Adams, Debi; Allgar, Victoria; Campbell, Colin

    2014-03-01

    The challenge to provide a palliative care service for individuals with advanced neurological conditions is compounded by variability in disease trajectories and symptom profiles. The National End of Life Care Programme (2010) recommended seven 'triggers' for a palliative approach to care for patients with advanced neurological conditions. To establish the frequency of triggers in the palliative phase, and if they could be reduced to fewer components. Management of the terminal phase was also evaluated. Retrospective study of 62 consecutive patients under the care of a specialist palliative neurology service, who had died. Principal component analysis (PCA) was performed to establish the interrelationship between triggers. Frequency of triggers increased as each patient approached death. PCA found that four symptom components explained 76.8% of the variance. These represented: rapid physical decline; significant complex symptoms, including pain; infection in combination with cognitive impairment; and risk of aspiration. Median follow-up under the palliative care service was 336 days. In 56.5% of patients, the cause of death was pneumonia. The terminal phase was recognised in 72.6%. The duration of the terminal phase was 8.8 days on average, and the Liverpool Care of the dying Pathway was commenced in 33.9%. All carers were offered bereavement support. Referral criteria based on the triggers can facilitate appropriate and timely patient access to palliative care. The components deduced through PCA have face validity; however, larger studies prospectively validating the triggers are required. Closer scrutiny of the terminal phase is necessary to optimise management.
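
    A statement such as "four components explained 76.8% of the variance" refers to the cumulative explained-variance ratio of the leading principal components. The sketch below shows how that quantity is typically computed. The data are random binary indicators shaped like the study (62 patients by 7 triggers) and are purely synthetic; whether the authors standardised the variables or rotated the components is not stated in the abstract, so neither is done here.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    var = s ** 2  # proportional to the variance along each component
    return var / var.sum()

def n_components_for(X, threshold):
    """Smallest number of components whose cumulative ratio reaches threshold."""
    cum = np.cumsum(explained_variance_ratio(X))
    # searchsorted finds the first position where cum >= threshold.
    return min(int(np.searchsorted(cum, threshold)) + 1, cum.size)

rng = np.random.default_rng(2)
# Synthetic stand-in: 62 patients x 7 binary trigger indicators.
triggers = rng.integers(0, 2, size=(62, 7)).astype(float)

ratios = explained_variance_ratio(triggers)
print(np.round(np.cumsum(ratios), 3))
print(n_components_for(triggers, 0.75))
```

    On real data with correlated triggers, the cumulative ratio rises faster than on independent random indicators, which is what makes a reduction to a few components plausible.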

  19. Principal component analysis of chemical shift perturbation data of a multiple-ligand-binding system for elucidation of respective binding mechanism.

    PubMed

    Konuma, Tsuyoshi; Lee, Young-Ho; Goto, Yuji; Sakurai, Kazumasa

    2013-01-01

    Chemical shift perturbations (CSPs) in NMR spectra provide useful information about the interaction of a protein with its ligands. However, in a multiple-ligand-binding system, determining quantitative parameters such as a dissociation constant (K(d)) is difficult. Here, we used a method we named CS-PCA, a principal component analysis (PCA) of chemical shift (CS) data, to analyze the interaction between bovine β-lactoglobulin (βLG) and 1-anilinonaphthalene-8-sulfonate (ANS), which is a multiple-ligand-binding system. The CSP on the binding of ANS involved contributions from two distinct binding sites. PCA of the titration data successfully separated the CSP pattern into contributions from each site. Docking simulations based on the separated CSP patterns provided the structures of βLG-ANS complexes for each binding site. In addition, we determined the K(d) values as 3.42 × 10⁻⁴ M² and 2.51 × 10⁻³ M for Sites 1 and 2, respectively. In contrast, it was difficult to obtain reliable K(d) values for respective sites from the isothermal titration calorimetry experiments. Two ANS molecules were found to bind at Site 1 simultaneously, suggesting that the binding occurs cooperatively with a partial unfolding of the βLG structure. On the other hand, the binding of ANS to Site 2 was a simple attachment without a significant conformational change. From the present results, CS-PCA was confirmed to provide not only the positions and the K(d) values of binding sites but also information about the binding mechanism. Thus, it is anticipated to be a general method to investigate protein-ligand interactions. Copyright © 2012 Wiley Periodicals, Inc.

  20. Use of electrospray ionization ion-trap tandem mass spectrometry and principal component analysis to directly distinguish monosaccharides.

    PubMed

    Xia, Bing; Zhou, Yan; Liu, Xin; Xiao, Juan; Liu, Qing; Gu, Yucheng; Ding, Lisheng

    2012-06-15

    Carbohydrates are a good source of drugs and play important roles in metabolic processes and cellular interactions in organisms. Distinguishing monosaccharide isomers in saccharide derivatives is an important and elementary task in investigating saccharides. It is important to develop a fast, simple and direct method for this purpose, which is described in this study. Stock solutions of monosaccharide with a concentration of 400 μM and sodium chloride at a concentration of 10 μM were made in water/methanol (50:50, v/v). The samples were subjected to electrospray ionization ion-trap tandem mass spectrometry (ESI-MS) and the detected [2M + Na - H(2)O](+) ions were further investigated by tandem mass spectrometry (MS/MS), followed by applying principal component analysis (PCA) to the obtained MS/MS data sets. The MS/MS spectra of the [2M + Na - H(2)O](+) ions at m/z 365 for hexoses and m/z 305 for pentoses yielded unambiguous fragment patterns, while rhamnose can be directly identified by its ESI-MS [M + Na](+) ion at m/z 187. PCA showed clustering of MS/MS data of identical monosaccharide samples obtained from different experiments. By using this method, the monosaccharide in daucosterol hydrolysate was successfully identified. A new strategy was developed for differentiation of the monosaccharides using ESI-MS/MS and PCA. In MS/MS spectra, the [2M + Na - H(2)O](+) ions yielded unambiguous distinction. PCA of the archived MS/MS data sets was applied to demonstrate the spatial resolution of the studied samples. This method presented a simple and reliable way for distinguishing monosaccharides by ESI-MS/MS. Copyright © 2012 John Wiley & Sons, Ltd.
